Implement job_result_download for experiment service #632

Merged: veekaybee merged 24 commits into main from 572_experiment_results on Jan 28, 2025

@veekaybee (Member) commented Jan 15, 2025

What's changing

Results can currently be downloaded only per job. This PR adds a way to download all results for an experiment.

  1. We look in the jobs table for all jobs that match a specific experiment id (inference + eval) and
  2. return the results as JSON.

To make this change, we need to update the jobs service, the experiment service, and the related API calls.
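
The lookup described above (find every job for an experiment id, then return the results as JSON) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `JobRecord`, its fields, `experiment_results`, and the shape of the returned dictionary are hypothetical and are not taken from the PR's code, and an in-memory list stands in for the jobs table.

```python
import json
from dataclasses import dataclass, field


@dataclass
class JobRecord:
    """Hypothetical stand-in for one row of the jobs table."""
    id: str
    experiment_id: str
    job_type: str  # "inference" or "evaluation" in this sketch
    results: dict = field(default_factory=dict)


def experiment_results(jobs: list[JobRecord], experiment_id: str) -> dict:
    """Collect the results of every job that belongs to one experiment."""
    # Step 1: keep only the jobs whose experiment id matches.
    matching = [job for job in jobs if job.experiment_id == experiment_id]
    if not matching:
        raise LookupError(f"no jobs found for experiment {experiment_id}")
    # Step 2: build a JSON-serializable summary of those jobs.
    return {
        "id": experiment_id,
        "jobs": [
            {"id": job.id, "type": job.job_type, "results": job.results}
            for job in matching
        ],
    }


if __name__ == "__main__":
    demo_jobs = [
        JobRecord("job-1", "exp-1", "inference", {"rows": 10}),
        JobRecord("job-2", "exp-1", "evaluation", {"rouge1": 0.4}),
        JobRecord("job-3", "exp-2", "inference"),
    ]
    print(json.dumps(experiment_results(demo_jobs, "exp-1"), indent=2))
```

Raising on an unknown experiment id is a design choice made for this sketch; a service could just as well return an empty list.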

  • Changing jobs service
  • Changing experiment service
  • Changing experiment service API calls

See #572

How to test it

Steps to test the changes:

  1. Upload dialogsum and run experiment:
#!/bin/bash
# Dataset CSV path: first argument if given, otherwise the default dialogsum location.
if [ "$#" -gt 0 ]; then
    DATA_CSV_PATH="$1"
else
    DATA_CSV_PATH="$HOME/lumigator-datasets/dialogsum/dialogsum.csv"
fi

# Fall back to a local backend when BACKEND_URL is unset or empty.
if [[ -z "${BACKEND_URL}" ]]; then
  BACKEND_URL=http://localhost:8000
fi

echo "Connecting to $BACKEND_URL..."

# Upload the dataset and capture its id from the JSON response.
DATASET_ID=$(curl -s "$BACKEND_URL/api/v1/datasets/" \
  -H 'Accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'dataset=@'"$DATA_CSV_PATH"';type=text/csv' \
  -F 'format=job' | jq -r '.id')

EVAL_NAME="test_experiment_bart"
EVAL_DESC="Test experiment (inference + eval) with BART"
EVAL_MODEL="hf://facebook/bart-large-cnn"
EVAL_MAX_SAMPLES="10"

# Build the request body with jq so that every value is JSON-escaped.
JSON_STRING=$(jq -n \
                --arg name "$EVAL_NAME" \
                --arg desc "$EVAL_DESC" \
                --arg model "$EVAL_MODEL" \
                --arg dataset_id "$DATASET_ID" \
                --arg max_samples "$EVAL_MAX_SAMPLES" \
                '{name: $name, description: $desc, model: $model, dataset: $dataset_id, max_samples: $max_samples}')

echo "Starting new experiment..."

echo "$JSON_STRING"

# Create the experiment (inference + eval).
curl -s "$BACKEND_URL/api/v1/experiments_new/" \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d "$JSON_STRING"
  2. Once the experiment has run, hit the new experiments endpoint to get the results:
curl -X 'GET' \
  'http://localhost:8000/api/v1/experiments_new/{yourid}/result/download' \
  -H 'accept: application/json'

You should get URLs for two files:

{
  "id": "65dfc643-b35a-4b0b-8491-3089d2fab944",
  "download_url": "http://localhost:4566/lumigator-storage/jobs/results/test_experiment_bart-evaluation/65dfc643-b35a-4b0b-8491-3089d2fab944/results.json?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=test%2F20250120%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20250120T211805Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=32b8a08273e6406534ffd7887813006f556b950c60270732a2cbe136bd1cdfa9"
}
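
To script this check instead of reading the response by eye, a minimal Python sketch follows. It assumes the response carries the `id` and `download_url` fields shown in the sample above; the function names `describe_download` and `fetch_results` are hypothetical. `X-Amz-Expires` is the standard SigV4 query parameter that gives the lifetime of a presigned URL in seconds.

```python
import json
from urllib.parse import parse_qs, urlparse
from urllib.request import urlopen


def describe_download(response_body: str) -> dict:
    """Summarize the endpoint's JSON response without touching the network."""
    payload = json.loads(response_body)
    url = urlparse(payload["download_url"])
    query = parse_qs(url.query)
    return {
        "id": payload["id"],
        # Path of the stored object, e.g. /<bucket>/jobs/results/.../results.json
        "object_path": url.path,
        # How long the presigned URL stays valid, in seconds.
        "expires_in_seconds": int(query["X-Amz-Expires"][0]),
    }


def fetch_results(download_url: str) -> dict:
    """Download and parse the results file behind the presigned URL."""
    with urlopen(download_url) as response:
        return json.load(response)
```

With the sample response above, `describe_download` would report an expiry of 3600 seconds, so `fetch_results` has to be called within an hour of requesting the URL.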

Additional notes for reviewers

I already...

  • [x] Tested the changes in a working environment to ensure they work as expected
  • Added some tests for any new functionality - would like to address in a follow-up PR
  • Updated the documentation (both comments in code and product documentation under /docs)
  • Checked if a (backend) DB migration step was required and included it if required

@github-actions bot added the backend and api (changes which impact API/presentation layer) labels Jan 15, 2025
@veekaybee force-pushed the 572_experiment_results branch from bbab0b5 to 9283c23 January 15, 2025 12:40
@github-actions bot added the schemas (changes to schemas, which may be public facing) label Jan 16, 2025
@veekaybee force-pushed the 572_experiment_results branch from a725415 to 3ee4aac January 16, 2025 16:46
@veekaybee requested a review from aittalam January 16, 2025 19:59
@veekaybee marked this pull request as ready for review January 16, 2025 19:59

@javiermtorres (Contributor) left a comment

LGTM. Some comments but I'm pre-ok with your responses :)

@veekaybee force-pushed the 572_experiment_results branch from 589e764 to 7d92dd7 January 20, 2025 14:42

@veekaybee (Member, Author) commented

Rebasing based on changes from #657 and #604

@veekaybee self-assigned this Jan 21, 2025
@veekaybee force-pushed the 572_experiment_results branch 2 times, most recently from 3e0f901 to 31d95f1 January 27, 2025 15:57

@veekaybee (Member, Author) commented

@aittalam - ready for a re-review, I've incorporated all the changes since the last time this was reviewed and noted a place we might want to be careful

@dpoulopoulos (Contributor) left a comment

Thanks!

@veekaybee force-pushed the 572_experiment_results branch from ee93b2b to 21da5d6 January 28, 2025 14:56

@aittalam (Member) left a comment

TY Vicki! I think this is ready to merge after syncing with main.
Many thanks for addressing my comments 🙏

@veekaybee force-pushed the 572_experiment_results branch from 4a49e79 to 670387b January 28, 2025 20:04
@github-actions bot added the documentation (improvements or additions to documentation) label Jan 28, 2025
@veekaybee merged commit 1c73e22 into main Jan 28, 2025
13 checks passed
@veekaybee deleted the 572_experiment_results branch January 28, 2025 20:13