docs: Add credits to ubc and university of wisconsin in readme (#193)
* add credits to UBC and university of Wisconsin in Readme

* update Acknowledgements

* update quarto report rendering pipeline

---------

Co-authored-by: SoloSynth1 <solosynth1@gmail.com>
jinyz8888 and SoloSynth1 authored Jun 25, 2024
1 parent 0c06c94 commit d89f38c
Showing 5 changed files with 22 additions and 14 deletions.
6 changes: 4 additions & 2 deletions Makefile

```diff
@@ -19,11 +19,12 @@ data/processed/ground_truth.csv : analysis/preprocess_batch_run_result.py data/b

 # Build 'report/docs/index.html' by rendering the Jupyter notebooks using Quarto.
 report/docs/index.html : data/processed/ground_truth.csv
-	quarto render
+	quarto render --cache-refresh
 	awk '{gsub(/proposal/,"final_report"); print}' ./report/docs/index.html > tmp && mv tmp ./report/docs/index.html

 .PHONY : publish
 publish : data/processed/ground_truth.csv
-	quarto publish gh-pages
+	quarto publish gh-pages ./report

 # The 'clean' target is used to clean up generated files and directories.
 .PHONY : clean
@@ -35,4 +36,5 @@ clean :
 	rm -rf data/batch_run/batch_run_4o
 	rm -rf data/processed/ground_truth.csv
 	rm -rf data/processed/score_*csv
+	rm -rf report/.quarto/
```

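The `awk` recipe above rewrites every occurrence of `proposal` to `final_report` in the rendered HTML, then swaps the result back into place via a temporary file. A minimal sketch of the same substitution on sample input (the file name here is illustrative, not from the repository):

```shell
# Create a sample file standing in for the rendered HTML (illustrative name).
printf 'See the proposal page.\nproposal and proposal again.\n' > sample.html

# Same pattern as the Makefile recipe: gsub replaces every match on each
# line, then the temporary file overwrites the original.
awk '{gsub(/proposal/,"final_report"); print}' sample.html > tmp && mv tmp sample.html

cat sample.html
# Prints:
# See the final_report page.
# final_report and final_report again.
```

The temporary-file dance is needed because redirecting `awk` output straight back into its own input file would truncate it before it is read.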
16 changes: 11 additions & 5 deletions README.md

```diff
@@ -68,15 +68,15 @@ Run `fixml --help` for more details.

 > [!IMPORTANT]
 > By default, this tool uses OpenAI's `gpt3.5-turbo` for evaluation. To run any
-command that requires calls to LLM (i.e. `fixml evaluate`, `fixml generate`),
-an environment variable `OPENAI_API_KEY` needs to be set. To do so, either use
+> command that requires calls to LLM (i.e. `fixml evaluate`, `fixml generate`),
+> an environment variable `OPENAI_API_KEY` needs to be set. To do so, either use
 `export` to set the variable in your current session, or create a `.env` file
-with a line `OPENAI_API_KEY={your-api-key}` saved in your working directory.
+> with a line `OPENAI_API_KEY={your-api-key}` saved in your working directory.

 > [!TIP]
 > Currently, only calls to OpenAI endpoints are supported. This tool is still in
-ongoing development and integrations with other service providers and locally
-hosted LLMs are planned.
+> ongoing development and integrations with other service providers and locally
+> hosted LLMs are planned.

 #### Test Evaluator

@@ -199,6 +199,7 @@ deliverable product during our capstone project of the UBC-MDS program in
 collaboration with Dr. Tiffany Timbers and Dr. Simon Goring. It is licensed
 under the terms of the MIT license for software code. Reports and instructional
 materials are licensed under the terms of the CC-BY 4.0 license.
+
 ## Citation

 If you use fixml in your work, please cite:
@@ -221,3 +222,8 @@ welcome it to be read, revised, and supported by data scientists, machine
 learning engineers, educators, practitioners, and hobbyists alike. Your
 contributions and feedback are invaluable in making this package a reliable
 resource for the community.
+
+Special thanks to the University of British Columbia (UBC) and the University of
+Wisconsin-Madison for their support and resources. We extend our gratitude to
+Dr. Tiffany Timbers and Dr. Simon Goring for their guidance and expertise, which
+have been instrumental in the development of this project.
```
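The README's `[!IMPORTANT]` note describes two ways to supply the API key: `export` for the current session, or a `.env` file in the working directory. A minimal sketch (the key value is a placeholder, not a real credential):

```shell
# Option 1: set the variable for the current shell session (placeholder key).
export OPENAI_API_KEY="sk-placeholder"

# Option 2: persist it in a .env file in the working directory, matching the
# OPENAI_API_KEY={your-api-key} format the README describes.
printf 'OPENAI_API_KEY=%s\n' "$OPENAI_API_KEY" > .env

cat .env
# Prints:
# OPENAI_API_KEY=sk-placeholder
```

The exported variable lasts only for the session, while the `.env` file persists; either form satisfies the tool's requirement before running `fixml evaluate` or `fixml generate`.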
6 changes: 4 additions & 2 deletions _quarto.yml

```diff
@@ -1,10 +1,12 @@
 project:
   type: website
   render:
-    - "report/*qmd"
-  output-dir: report/docs
+    - report/final_report.qmd
+    - report/proposal.qmd
+  output-dir: "report/docs/"

 website:
   title: "FixML - Checklists and LLM prompts for efficient and effective test creation in data analysis"
   sidebar:
     style: "docked"
     logo: "img/logo.png"
```
3 changes: 1 addition & 2 deletions report/final_report.qmd

```diff
@@ -1,12 +1,11 @@
 ---
 title: "Final Report - Checklists and LLM prompts for efficient and effective test creation in data analysis"
 format:
   html:
     code-fold: true
+bibliography: references.bib
 ---

-# Final Report - Checklists and LLM prompts for efficient and effective test creation in data analysis
-
 by John Shiu, Orix Au Yeung, Tony Shum, Yingzi Jin

 ## Executive Summary
```
5 changes: 2 additions & 3 deletions report/proposal.qmd

```diff
@@ -1,13 +1,12 @@
 ---
 title: "Proposal Report - Checklists and LLM prompts for efficient and effective test creation in data analysis"
 format:
   html:
     code-fold: true
+bibliography: references.bib
 jupyter: python3
 ---

-# Proposal Report - Checklists and LLM prompts for efficient and effective test creation in data analysis
-
 by John Shiu, Orix Au Yeung, Tony Shum, Yingzi Jin

 ## Executive Summary
@@ -36,7 +35,7 @@ We propose to develop testing suites diagnostic tools based on Large Language Mo

 Our solution offers an end-to-end application for evaluating and enhancing the robustness of users' ML systems.

-![Main components and workflow of the proposed system. The checklist would be written in [YAML](https://yaml.org/) to maximize readability for both humans and machines. We hope this will encourage researchers/users to read, understand and modify the checklist items, while keeping the checklist closely integrated with other components in our system.](../img/proposed_system_overview.png)
+![Main components and workflow of the proposed system. The checklist would be written in [YAML](https://yaml.org/) to maximize readability for both humans and machines. We hope this will encourage researchers/users to read, understand and modify the checklist items, while keeping the checklist closely integrated with other components in our system.](../img/proposed_system_overview.png){.lightbox}

 One big challenge in utilizing LLMs to reliably and consistently evaluate ML systems is their tendency to generate illogical and/or factually wrong information known as hallucination [@zhang2023sirens].
```