update quarto report rendering pipeline
SoloSynth1 committed Jun 25, 2024
1 parent a28fef1 commit a6c6d2f
Showing 5 changed files with 20 additions and 15 deletions.
6 changes: 4 additions & 2 deletions Makefile
@@ -19,11 +19,12 @@ data/processed/ground_truth.csv : analysis/preprocess_batch_run_result.py data/b

# Build 'report/docs/index.html' by rendering the Jupyter notebooks using Quarto.
report/docs/index.html : data/processed/ground_truth.csv
-	quarto render
+	quarto render --cache-refresh
	awk '{gsub(/proposal/,"final_report"); print}' ./report/docs/index.html > tmp && mv tmp ./report/docs/index.html

.PHONY : publish
publish : data/processed/ground_truth.csv
-	quarto publish gh-pages
+	quarto publish gh-pages ./report

# The 'clean' target is used to clean up generated files and directories.
.PHONY : clean
@@ -35,4 +36,5 @@ clean :
	rm -rf data/batch_run/batch_run_4o
	rm -rf data/processed/ground_truth.csv
	rm -rf data/processed/score_*csv
+	rm -rf report/.quarto/

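For context, a minimal sketch of how the updated targets might be invoked locally (assuming GNU make and a recent Quarto release that supports `--cache-refresh`):

```sh
# Rebuild the report site; --cache-refresh forces Quarto to re-execute
# cached computations instead of reusing stale results.
make report/docs/index.html

# Publish the rendered site under ./report to the gh-pages branch.
make publish
```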
15 changes: 9 additions & 6 deletions README.md
@@ -63,15 +63,15 @@ Run `fixml --help` for more details.

> [!IMPORTANT]
> By default, this tool uses OpenAI's `gpt3.5-turbo` for evaluation. To run any
-command that requires calls to LLM (i.e. `fixml evaluate`, `fixml generate`),
-an environment variable `OPENAI_API_KEY` needs to be set. To do so, either use
-`export` to set the variable in your current session, or create a `.env` file
-with a line `OPENAI_API_KEY={your-api-key}` saved in your working directory.
+> command that requires calls to LLM (i.e. `fixml evaluate`, `fixml generate`),
+> an environment variable `OPENAI_API_KEY` needs to be set. To do so, either use
+> `export` to set the variable in your current session, or create a `.env` file
+> with a line `OPENAI_API_KEY={your-api-key}` saved in your working directory.

> [!TIP]
> Currently, only calls to OpenAI endpoints are supported. This tool is still in
-ongoing development and integrations with other service providers and locally
-hosted LLMs are planned.
+> ongoing development and integrations with other service providers and locally
+> hosted LLMs are planned.

#### Test Evaluator

@@ -208,4 +208,7 @@ learning engineers, educators, practitioners, and hobbyists alike. Your
contributions and feedback are invaluable in making this package a reliable
resource for the community.

-Special thanks to the University of British Columbia (UBC) and the University of Wisconsin-Madison for their support and resources. We extend our gratitude to Dr. Tiffany and Dr. Simon for their guidance and expertise, which have been instrumental in the development of this project.
+Special thanks to the University of British Columbia (UBC) and the University of
+Wisconsin-Madison for their support and resources. We extend our gratitude to
+Dr. Tiffany Timbers and Dr. Simon Goring for their guidance and expertise, which
+have been instrumental in the development of this project.
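
As the README note above explains, any command that calls the LLM needs `OPENAI_API_KEY` set. A minimal sketch of both options it describes (the key value is a placeholder, not a real credential):

```sh
# Option 1: export the key for the current shell session.
export OPENAI_API_KEY="your-api-key"

# Option 2: save it to a .env file in the working directory.
echo 'OPENAI_API_KEY=your-api-key' > .env
```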
6 changes: 4 additions & 2 deletions _quarto.yml
@@ -1,10 +1,12 @@
project:
  type: website
  render:
-    - "report/*qmd"
-  output-dir: report/docs
+    - report/final_report.qmd
+    - report/proposal.qmd
+  output-dir: "report/docs/"

website:
  title: "FixML - Checklists and LLM prompts for efficient and effective test creation in data analysis"
  sidebar:
    style: "docked"
    logo: "img/logo.png"
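With the glob pattern replaced by an explicit file list, only the two named notebooks are rendered. A hedged way to sanity-check this locally (assuming Quarto is installed; the exact output paths are inferred from the config above and may vary):

```sh
# Re-render and list the generated HTML; expect output for the two listed
# .qmd files (plus supporting assets) under the configured output directory.
quarto render --cache-refresh
find report/docs -name '*.html'
```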
3 changes: 1 addition & 2 deletions report/final_report.qmd
@@ -1,12 +1,11 @@
---
title: "Final Report - Checklists and LLM prompts for efficient and effective test creation in data analysis"
format:
  html:
    code-fold: true
bibliography: references.bib
---

-# Final Report - Checklists and LLM prompts for efficient and effective test creation in data analysis

by John Shiu, Orix Au Yeung, Tony Shum, Yingzi Jin

## Executive Summary
5 changes: 2 additions & 3 deletions report/proposal.qmd
@@ -1,13 +1,12 @@
---
title: "Proposal Report - Checklists and LLM prompts for efficient and effective test creation in data analysis"
format:
  html:
    code-fold: true
bibliography: references.bib
jupyter: python3
---

-# Proposal Report - Checklists and LLM prompts for efficient and effective test creation in data analysis

by John Shiu, Orix Au Yeung, Tony Shum, Yingzi Jin

## Executive Summary
@@ -36,7 +35,7 @@ We propose to develop testing suites diagnostic tools based on Large Language Mo

Our solution offers an end-to-end application for evaluating and enhancing the robustness of users' ML systems.

-![Main components and workflow of the proposed system. The checklist would be written in [YAML](https://yaml.org/) to maximize readability for both humans and machines. We hope this will encourage researchers/users to read, understand and modify the checklist items, while keeping the checklist closely integrated with other components in our system.](../img/proposed_system_overview.png)
+![Main components and workflow of the proposed system. The checklist would be written in [YAML](https://yaml.org/) to maximize readability for both humans and machines. We hope this will encourage researchers/users to read, understand and modify the checklist items, while keeping the checklist closely integrated with other components in our system.](../img/proposed_system_overview.png){.lightbox}

One big challenge in utilizing LLMs to reliably and consistently evaluate ML systems is their tendency to generate illogical and/or factually wrong information known as hallucination [@zhang2023sirens].

