
🐕 Batch: Refactoring Test workflows in models #1484

Open · 1 of 7 tasks

DhanshreeA opened this issue Jan 3, 2025 · 40 comments · Fixed by #1515

@DhanshreeA (Member) commented Jan 3, 2025

Summary

This issue will encompass efforts to reconcile, clean up, and enhance our test (and build) pipelines for individual models.

We currently have a test module and CLI command (ersilia test ...) that can check a given model for functionality, completeness, and correctness. In addition to this, we also have a testing playground - a test utility which checks a given model for functionality, completeness, and correctness; and is able to simulate running one or more models on a user's system.

The existing test step in our model pipeline is redundant in the face of these functionalities: it is naive in comparison, as it only tests for nullity in model predictions, and it is not robust to how a model might serialize its outputs. Moreover, the Docker build pipelines are bloated with code that can be removed in favor of a single workflow that tests the built images. We also need to handle testing for ARM and AMD builds more smartly: currently we only test the AMD images, but recently we have seen some models build successfully for the ARM platform and then not actually work.

Furthermore, we need to revisit H5 serialization within Ersilia, and also include tests for this functionality at the level of testing models.

Each of the objectives below should be considered individual tasks, and should be addressed in separate PRs referencing this issue.

Objective(s)

  • Consolidate the following input-output combinations in the testing scenarios covered by the ersilia test command (see the sketch after this list):
  1. Input = CSV - Output = CSV
  2. Input = CSV - Output = HDF5
  3. Input = CSV - Output = JSON
  4. Input = SMILES - Output = CSV
  5. Input = SMILES - Output = HDF5
  6. Input = SMILES - Output = JSON
  • For the test-model.yml workflow, we should remove the current testing logic (L128-L144) in favor of only using the ersilia test command. We also want to upload the logs generated by this command, as well as its results, as artifacts with a retention period of 14 days.
  • Same as above for the test-model-pr.yml workflow: we should use only the ersilia test command. The same conditions apply for handling and uploading the logs and results as artifacts with 14-day retention.
  • Refactor the upload-ersilia-pack.yml and upload-bentoml.yml workflows to only build and publish model images (both for ARM and AMD), i.e. we can remove the testing logic from these workflows. These images should be tagged dev.
  • Refactor the testing playground to work with specific model ids, as well as image tags.
  • Create a new test workflow for Docker builds that is triggered after the Upload model to DockerHub workflow. This workflow should utilise the Testing Playground utility from Ersilia and test the built model image (however it gets built, i.e. using Ersilia Pack or legacy approaches). It should run on a matrix of ubuntu-latest and macos-latest, to ensure that we are also testing the ARM images. Based on the results of this workflow, we can tag the images latest and identify which architectures they successfully work on.
  • The Post model upload workflow should run last and update the necessary metadata stores (Airtable, S3 JSON) and the README. We can remove the step that creates testing issues for community members from this workflow at this point.
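
As a rough illustration of the consolidation above (a sketch only; names are illustrative, not from the ersilia codebase), the six combinations can be enumerated for a parametrized test:

from itertools import product

INPUT_KINDS = ["csv", "smiles"]          # a CSV file of inputs, or raw SMILES strings
OUTPUT_FORMATS = ["csv", "h5", "json"]   # the three serialization formats to cover

# Six (input, output) pairs to parametrize the test scenarios over.
TEST_MATRIX = list(product(INPUT_KINDS, OUTPUT_FORMATS))
# [('csv', 'csv'), ('csv', 'h5'), ('csv', 'json'),
#  ('smiles', 'csv'), ('smiles', 'h5'), ('smiles', 'json')]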

Documentation

  1. ModelTester class used in the test CLI command: https://ersilia.gitbook.io/ersilia-book/ersilia-model-hub/developer-docs/model-tester
  2. Testing Playground utility: https://ersilia.gitbook.io/ersilia-book/ersilia-model-hub/developer-docs/testing-playground
@Abellegese Abellegese self-assigned this Jan 3, 2025
@GemmaTuron GemmaTuron changed the title 🐕 Batch: Refactoring Test workfllows in models 🐕 Batch: Refactoring Test workflows in models Jan 3, 2025
@GemmaTuron (Member)

Hi @Abellegese or @DhanshreeA

Can you clarify whether the test command needs to be modified (according to point 1), or both the test command and the playground?
The test command only tests the model from source, right? And the only modification we will make to it currently is to test all the different combinations of input and output, which was not happening before? Once an output is generated, whichever the format, the next step is to check that the output has the required length, is not null, etc.?
What are the modifications to make in the testing playground, more specifically? Maybe opening one issue with more details for each task would be helpful as those get tackled.
I would also add that documenting in GitBook is an important part of each task.

@Abellegese (Contributor)

Hey @GemmaTuron, our plan is to update both pipelines for this functionality. I am creating one issue for both.

@Abellegese (Contributor)

A few more details about the features have been given here: #1488.

@GemmaTuron (Member) commented Jan 9, 2025

We have re-evaluated our strategy for Model Testing and this is what we have finally agreed on:
The Model Testing happens through workflows in two repositories: eos-template and ersilia-maintenance. All those workflows should simply use different flavours of the ersilia test command, the playground will be reserved for testing the Ersilia CLI itself.
The test command will work at three levels. Each level can be run on either macOS or Linux; for consistency, we want to test models on both platforms.

  1. Basic: ersilia test [model_id] --from_dir/from_github/from_s3
    • The default test will not fetch the model through Ersilia, only download it locally unless it already exists (from_dir).
    • When using the --from_dir flag, the user needs to pass the path to the local directory. When using from_github/from_s3, the model will automatically be downloaded into the following directory: eos/tmp/[model_id]
    • This command is designed to perform high-level checks, and can be run for example as part of the ersilia-maintenance. The high level checks include:
      • File integrity
      • All metadata fields compliant with the set rules (for metadata that is only included upon the first model incorporation, Contributor, S3, DockerHub and Docker Architecture, the test will only be performed if these fields are available in the .json or .yml files)
      • URLs (model repository, DockerHub if existing, S3 if existing) are checked
      • Dependencies pinned: all dependencies have a version specified in the .yml or Dockerfile.
      • Model size (total size of all folders, incl. checkpoints). This information will be saved as metadata
  2. Shallow: ersilia test [model_id] --shallow --from_dir/from_github/from_s3/from_dockerhub
    • First the model is downloaded and the basic tests are performed. If --from_dockerhub is indicated, the model will be downloaded from_github into the eos/tmp directory.
    • Next the model is fetched and served through Ersilia's CLI. So that:
      • If ersilia test [model_id] --shallow --from_dockerhub, what will actually happen is: model test --from_github + model fetch --from_dockerhub
      • If ersilia test [model_id] --shallow --from_dir/from_github/from_s3, what will actually happen is: model test --from_dir/from_github/from_s3 + model fetch --from_dir [path_to_dir]/[path_to_tmp] (see the sketch after this list)
    • The environment size is calculated (if from_dir/from_github/from_s3) and saved as metadata
    • The container size is calculated (if from_dockerhub) and saved as metadata
    • Output correctness:
      • All formats: .json, .csv, .h5
      • Consistency between runs
      • Not nulls
  3. Deep: ersilia test [model_id] --deep --from_dir/from_github/from_s3/from_dockerhub
    • All tests in basic and shallow are performed the same way as in --shallow but in addition the computational performance for 1, 50, 100 inputs is measured and stored as metadata (to decide, maybe just one performance, 100, needs to be stored?)
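
As a minimal sketch of the source-resolution rule described above (the function name and return convention are assumptions for illustration, not the actual implementation):

def resolve_shallow_sources(origin, dir_path=None):
    """Map a --from_* origin at the shallow level to (basic-test source, fetch source)."""
    if origin == "dockerhub":
        # Basic checks need the repository, so it is first downloaded from GitHub;
        # the model itself is then fetched from DockerHub.
        return "--from_github", "--from_dockerhub"
    # from_dir / from_github / from_s3: basic tests run on the given source,
    # then the fetch uses the local copy (the passed dir or eos/tmp/[model_id]).
    path = dir_path or "eos/tmp/[model_id]"
    return "--from_{}".format(origin), "--from_dir {}".format(path)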

Optional flags to the test command:

  • as_json: save the output of the test command as an easily parsable .json file
  • version: image tag for the Docker image. Default: dev

Once the test command is refactored, the workflows on eos-template need to be modified. In broad terms:

  • model-test-on-pr.yml will use model test --shallow --from_dir
  • model-test-source.yml will use model test --shallow --from_github
  • model-test-image.yml will use model test --deep --from_dockerhub
    Because some metadata fields will be updated, Airtable will also need to be updated. We need to agree on which fields will be created to prepare the columns on the Airtable board. @miquelduranfrigola, can you take care of the naming of these?

Responsibilities

  • @Abellegese will refactor the test command
  • @DhanshreeA will consolidate the workflows (please revise what we have said to make sure everything makes sense)
  • @miquelduranfrigola will oversee the general dev and take care of new metadata fields

@miquelduranfrigola (Member) commented Jan 10, 2025

Thanks @GemmaTuron
Yes, I will summarize for all of you how the new and old AirTable fields are called
I will let you know as soon as I have made sufficient progress

@DhanshreeA (Member, Author) commented Jan 13, 2025

So far, work has been completed for 1) default testing, and 2) shallow testing with all combinations for fetching a model. @Abellegese is currently implementing the optional flag as_json (version is implemented).

@GemmaTuron (Member)

Thanks @DhanshreeA and @Abellegese. It would be very useful if you could provide here an example, with one model, of the commands that can be run at the basic and --shallow levels and their output with the different flags, so we can see if any edits need to be made before moving on.

@Abellegese (Contributor)

Okay @GemmaTuron, we will post it here.

@Abellegese (Contributor)

Hi @GemmaTuron, here are the sample commands.

Basic

  • ersilia test eos3b5e --from_dir/--from_s3/--from_github

[screenshot: basic_check]

@Abellegese (Contributor) commented Jan 13, 2025

Shallow Dockerhub

ersilia test eos3b5e --shallow --from_dockerhub

Edited
There were two table title mistakes, and they have been corrected:

  1. From Shallow Check Summary to Validation and Size Check Summary
  2. From Model Output Content Validation Summary to Consistency Summary Between Ersilia and Bash Execution Outputs

[screenshot: shallow]

@Abellegese (Contributor) commented Jan 13, 2025

Shallow from dir/GitHub/S3: Env Size is reported instead of Docker Image Size in this case.

Edited
There was a table title mistake, and it has been corrected:

  1. From Shallow Check Summary to Validation and Size Check Summary

ersilia test eos3b5e --shallow --from_dir/--from_github/--from_s3

[screenshot: git]

@Abellegese (Contributor) commented Jan 13, 2025

Deep:

ersilia test eos3b5e --deep --from_dir/--from_github/--from_s3

[screenshot: comp]

@miquelduranfrigola (Member)

On a quick look, this is quite amazing.

@DhanshreeA (Member, Author) commented Jan 14, 2025

Hi @Abellegese some observations from running the test command locally:

  • For the model eos9gg2, I see the Model Input and Output Type tests failing even though those fields are present in the metadata, as you can see through the logs as well. This is when running ersilia test eos9gg2 --from_s3/--from_github:
{'Identifier': 'eos9gg2', 'Slug': 'chemical-space-projections-drugbank', 'Status': 'In progress', 'Title': 'Chemical space 2D projections against DrugBank', 'Description': 'This tool performs PCA, UMAP and tSNE projections taking the DrugBank chemical space as a reference. The Ersilia Compound Embeddings as used as descriptors. Four PCA components and two UMAP and tSNE components are returned.', 'Mode': 'In-house', 'Task': 'Representation', 'Input': 'Compound', 'Input Shape': 'Single', 'Output': 'Descriptor', 'Output Type': 'Float', 'Output Shape': 'List', 'Interpretation': 'Coordinates of 2D projections, namely PCA, UMAP and tSNE.', 'Tag': ['Embedding'], 'Publication': 'https://academic.oup.com/nar/article/52/D1/D1265/7416367', 'Source Code': 'https://github.com/ersilia-os/compound-embedding', 'License': 'GPL-3.0-or-later', 'S3': 'https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos9gg2.zip'}
  • Personal note for me: The DockerHub Architecture field will always fail when fetching the model from S3 because the field will be unset since the DockerHub building happens after S3 upload. (We can change this order).
  • The --shallow flag is not running the tests it is supposed to run, and this is happening with all the modes. I ran ersilia test eos3b5e --shallow --from_github/--from_dockerhub/--from_s3. I only see the same output as in the default behavior of the test command.
  • The --shallow flag works with the model eos9gg2 in the DockerHub configuration, ie, ersilia test eos9gg2 --shallow --from_dockerhub, but exits with the following error:
Performing shallow checks                Validation and Size Check Results                
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Check              ┃                                  Status ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Docker Image Size  │                              1134.70 MB │
├────────────────────┼─────────────────────────────────────────┤
│ Check Single Input │                                ✔ PASSED │
├────────────────────┼─────────────────────────────────────────┤
│ Check Predefined   │                                ✔ PASSED │
│ Example Input      │                                         │
├────────────────────┼─────────────────────────────────────────┤
│ Check Consistency  │                                ✔ PASSED │
│ of Model Output    │                                         │
└────────────────────┴─────────────────────────────────────────┘
⠦ Performing shallow checks 🚨🚨🚨 Something went wrong with Ersilia 🚨🚨🚨

Error message:

Failed to read CSV from /tmp/tmpjnz08pks/bash_output.csv.
If this error message is not helpful, open an issue at:
 - https://github.com/ersilia-os/ersilia
Or feel free to reach out to us at:
 - hello[at]ersilia.io

If you haven't, try to run your command in verbose mode (-v in the CLI)
 - You will find the console log file in: /home/dee/eos/current.log
Run process finished successfully.
  • For the GitHub, S3, or Dir options, the shallow flag seems to be doing nothing for the model eos9gg2. I see the following output on the terminal:
Performing shallow checks
⠙ Performing shallow checks Run process finished successfully.

@Abellegese (Contributor)

Hi @DhanshreeA okay right, what was the command?

@DhanshreeA (Member, Author)

Hi @DhanshreeA okay right, what was the command?

@Abellegese updated my comment.

@DhanshreeA (Member, Author)

Okay, I think the shallow checks are working but not being displayed for some reason? Here are the logs when I run the test command in verbose mode:

ersilia -v test eos3b5e --shallow --from_dockerhub:

eos3b5e_test.log

What's more weird is that when I run it without verbosity, the process simply appears to exit.

@Abellegese (Contributor)

@DhanshreeA from this log it seems nothing went wrong.

@Abellegese (Contributor) commented Jan 14, 2025

Updates: eos9gg2 with json file report

[screenshot: eos9gg2 test report]

eos9gg2-test.json

@Abellegese (Contributor)

Updates

  • I pushed what I believe is the final refactoring, which has everything specified in the task
  • I kept the as_json flag as a bool, and the report is saved in the pwd, named `eosxxxx-test.json`
  • I tested eos7jio from Docker / from Git / from S3, but the S3 copy does not have run.sh in the model/framework folder. That folder has nothing in it except a README file. We can't run --from_s3 in this case.
  • The command can update either of the metadata files when run with --deep --from_dir/--from_github/--from_s3, and it can save Env Size and CP (for three runs)
  • For --deep --from_dockerhub it saves the Image Size and CP in the metadata

Note: I tested it locally extensively, but you never know what will happen when you try it out, and I will take responsibility for that.

@DhanshreeA (Member, Author) commented Jan 15, 2025

@DhanshreeA from this log it seems nothing went wrong.

@Abellegese that's the point. In verbose mode it was fine, but without the flag, at least for eos3b5e, the process simply exited with nothing printed on the screen.

However, with the recent push, I see some changes, but for some reason, Ersilia crashed with a Subprocess execution failed error. Attaching logs:

           Model Information Checks            
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Check                          ┃     Status ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Model ID                       │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Slug                     │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Status                   │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Title                    │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Description              │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Task                     │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Input                    │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Input Shape              │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Output                   │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Output Type              │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Output Shape             │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Interpretation           │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Tag                      │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Publication              │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Source Code              │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Contributor              │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model Dockerhub URL            │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ Model S3 URL                   │   ✔ PASSED │
└────────────────────────────────┴────────────┘
               Model File Checks               
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Check                          ┃     Status ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ File: Dockerfile               │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ File: metadata.json            │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ File: model/framework/run.sh   │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ File: src/service.py           │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ File: pack.py                  │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ File: README.md                │   ✔ PASSED │
├────────────────────────────────┼────────────┤
│ File: LICENSE                  │   ✔ PASSED │
└────────────────────────────────┴────────────┘
             Model Directory Sizes             
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Check                          ┃       Size ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Directory                      │        1MB │
└────────────────────────────────┴────────────┘
                                   Dependency Check                                    
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Check                          ┃                                             Status ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Dockerfile Check               │                                           ✔ PASSED │
├────────────────────────────────┼────────────────────────────────────────────────────┤
│ Check Details                  │                 Dockerfile dependencies are valid. │
└────────────────────────────────┴────────────────────────────────────────────────────┘
* Basic checks completed!
Performing shallow checks...
⠦ Performing shallow checks No predefined examples found for the model. Generating random examples.
⠇ Performing shallow checks                            Validation and Size Check Results                           
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Check                          ┃                                             Status ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Image Size                     │                                          353.22 MB │
├────────────────────────────────┼────────────────────────────────────────────────────┤
│ Check Single Input             │                                           ✔ PASSED │
├────────────────────────────────┼────────────────────────────────────────────────────┤
│ Check Predefined Example Input │                                           ✔ PASSED │
├────────────────────────────────┼────────────────────────────────────────────────────┤
│ Check Consistency of Model     │                                           ✔ PASSED │
│ Output                         │                                                    │
└────────────────────────────────┴────────────────────────────────────────────────────┘
⠙ Performing shallow checks 🚨🚨🚨 Something went wrong with Ersilia 🚨🚨🚨

Error message:

Subprocess execution failed.
If this error message is not helpful, open an issue at:
 - https://github.com/ersilia-os/ersilia
Or feel free to reach out to us at:
 - hello[at]ersilia.io

If you haven't, try to run your command in verbose mode (-v in the CLI)
 - You will find the console log file in: /home/dee/eos/current.log
Run process completed.

I think this specifically failed in the CSV test, because I do not see a file.csv having been created; however, I do see file.h5 and file.json.

@DhanshreeA (Member, Author) commented Jan 15, 2025

I am testing the following models, and I will attach logs or share results from each model in a separate comment:

  • eos3b5e
  • eos4e40
  • eos7d58
  • eos9gg2
  • eos3cf4
  • eos7w6n
  • eos4wt0
  • eos2gw4
  • eos7jio
  • eos5axz
  • eos4u6p

@DhanshreeA (Member, Author)

I tested eos7jio from Docker / from Git / from S3, but the S3 copy does not have run.sh in the model/framework folder. [...] We can't run --from_s3 in this case.

@Abellegese The S3 issue with eos7jio is understandable, the model was refactored last week, and since our workflows are not working properly, the refactored version which has run.sh could not get uploaded to S3.

@DhanshreeA (Member, Author) commented Jan 15, 2025

Model eos3b5e

  • ersilia test eos3b5e --from_github
  • ersilia test eos3b5e --from_s3
  • ersilia test eos3b5e --from_dir
  • ersilia test eos3b5e --from_dockerhub
  • ersilia test eos3b5e --from_github --as_json
  • ersilia test eos3b5e --from_s3 --as_json
  • ersilia test eos3b5e --from_dir --as_json
  • ersilia test eos3b5e --from_dockerhub --as_json
  • ersilia test eos3b5e --from_dockerhub --version
  • ersilia test eos3b5e --from_dockerhub --version dev --as_json
  • ersilia test eos3b5e --shallow --from_github
  • ersilia test eos3b5e --shallow --from_s3
  • ersilia test eos3b5e --shallow --from_dir
  • ersilia test eos3b5e --shallow --from_dockerhub
  • ersilia test eos3b5e --shallow --from_dockerhub --version
  • ersilia test eos3b5e --shallow --from_github --as_json
  • ersilia test eos3b5e --shallow --from_s3 --as_json
  • ersilia test eos3b5e --shallow --from_dir --as_json
  • ersilia test eos3b5e --shallow --from_dockerhub --as_json
  • ersilia test eos3b5e --shallow --from_dockerhub --version
  • ersilia test eos3b5e --deep --from_github
  • ersilia test eos3b5e --deep --from_s3
  • ersilia test eos3b5e --deep --from_dir
  • ersilia test eos3b5e --deep --from_dockerhub
  • ersilia test eos3b5e --deep --from_dockerhub --version

Notes:

  • The as_json flag seems to have no effect, as I do not see any JSON file being serialized.
  • Again without the -v flag, the shallow checks don't display on the terminal, and I see the following logs:
* Basic checks completed!
Performing shallow checks...
⠸ Performing shallow checks Run process completed.
  • I think I figured out the issue with the shallow checks exiting without printing anything: when the command runs into an error, it silently exits instead of printing it, and the error is only caught in verbose mode.

@DhanshreeA (Member, Author)

@Abellegese we can format the output of this table so it displays better:

                           Computational Performance Summary                           
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Check                          ┃                                             Status ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Computational Performance      │                                           ✔ PASSED │
│ Tracking                       │                                                    │
├────────────────────────────────┼────────────────────────────────────────────────────┤
│ Computational Performance      │         1 predictions executed in 9.88 seconds. 10 │
│ Tracking Details               │          predictions executed in 9.92 seconds. 100 │
│                                │             predictions executed in 10.01 seconds. │
└────────────────────────────────┴────────────────────────────────────────────────────┘

We simply need a newline character after each sentence in the Computational Performance Tracking Details.
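
For instance, assuming the details are assembled as a list of strings before rendering (an assumption about the implementation):

details = [
    "1 predictions executed in 9.88 seconds.",
    "10 predictions executed in 9.92 seconds.",
    "100 predictions executed in 10.01 seconds.",
]
# Joining with "\n" instead of " " renders each run on its own row of the cell.
cell_text = "\n".join(details)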

@Abellegese (Contributor)

@DhanshreeA yes nice idea.

@DhanshreeA (Member, Author)

Another thing, when running deep checks, this is what I see:

* Basic checks completed!
Performing deep checks...
⠇ Performing shallow checks

We shouldn't see the line Performing deep checks until the checks have actually started.

@DhanshreeA (Member, Author)

I am not sure I understand this table:

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Check                          ┃                                             Status ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Environment Size               │                                              475MB │
├────────────────────────────────┼────────────────────────────────────────────────────┤
│ Check Single Input             │                                           ✔ PASSED │
├────────────────────────────────┼────────────────────────────────────────────────────┤
│ Check Predefined Example Input │                                           ✔ PASSED │
├────────────────────────────────┼────────────────────────────────────────────────────┤
│ Check Consistency of Model     │                                           ✔ PASSED │
│ Output                         │                                                    │
└────────────────────────────────┴────────────────────────────────────────────────────┘

Especially the Check Single Input and Check Predefined Example Input fields. This comes from the shallow checks.

@DhanshreeA (Member, Author)

Note on --as_json:

{
    "ModelInformationChecks": [
        {
            "Check": "Model ID",
            "Status": "PASSED"
        },
        {
            "Check": "Model Slug",
            "Status": "PASSED"
        },
        {
            "Check": "Model Status",
            "Status": "PASSED"
        },
        {
            "Check": "Model Title",
            "Status": "PASSED"
        },
        {
            "Check": "Model Description",
            "Status": "PASSED"
        },
        {
            "Check": "Model Task",
            "Status": "PASSED"
        },
        {
            "Check": "Model Input",
            "Status": "PASSED"
        },
        {
            "Check": "Model Input Shape",
            "Status": "PASSED"
        },
        {
            "Check": "Model Output",
            "Status": "PASSED"
        },
        {
            "Check": "Model Output Type",
            "Status": "PASSED"
        },
        {
            "Check": "Model Output Shape",
            "Status": "PASSED"
        },
        {
            "Check": "Model Interpretation",
            "Status": "PASSED"
        },
        {
            "Check": "Model Tag",
            "Status": "PASSED"
        },
        {
            "Check": "Model Publication",
            "Status": "PASSED"
        },
        {
            "Check": "Model Source Code",
            "Status": "PASSED"
        },
        {
            "Check": "Model Contributor",
            "Status": "PASSED"
        },
        {
            "Check": "Model Dockerhub URL",
            "Status": "PASSED"
        },
        {
            "Check": "Model S3 URL",
            "Status": "PASSED"
        },
        {
            "Check": "Model Docker Architecture",
            "Status": "PASSED"
        }
    ],
    "ModelFileChecks": [
        {
            "Check": "File: Dockerfile",
            "Status": "PASSED"
        },
        {
            "Check": "File: metadata.json",
            "Status": "PASSED"
        },
        {
            "Check": "File: model/framework/run.sh",
            "Status": "PASSED"
        },
        {
            "Check": "File: src/service.py",
            "Status": "PASSED"
        },
        {
            "Check": "File: pack.py",
            "Status": "PASSED"
        },
        {
            "Check": "File: README.md",
            "Status": "PASSED"
        },
        {
            "Check": "File: LICENSE",
            "Status": "PASSED"
        }
    ],
    "ModelDirectorySizes": [
        {
            "Check": "Directory",
            "Size": "1MB"
        }
    ],
    "DependencyCheck": [
        {
            "Check": "Dockerfile Check",
            "Status": "PASSED"
        },
        {
            "Check": "Check Details",
            "Status": "Dockerfile dependencies are valid."
        }
    ],
    "ValidationandSizeCheckResults": [
        {
            "Check": "Environment Size",
            "Status": "475MB"
        },
        {
            "Check": "Check Single Input",
            "Status": "PASSED"
        },
        {
            "Check": "Check Predefined Example Input",
            "Status": "PASSED"
        },
        {
            "Check": "Check Consistency of Model Output",
            "Status": "PASSED"
        }
    ],
    "ConsistencySummaryBetweenErsiliaandBashExecutionOutputs": [],
    "ModelOutputContentValidationSummary": [
        {
            "Check": "str : CSV",
            "Detail": "Valid Content",
            "Status": "PASSED"
        },
        {
            "Check": "str : JSON",
            "Detail": "Valid Content",
            "Status": "PASSED"
        },
        {
            "Check": "str : HDF5",
            "Detail": "Valid Content",
            "Status": "PASSED"
        },
        {
            "Check": "list : CSV",
            "Detail": "Valid Content",
            "Status": "PASSED"
        },
        {
            "Check": "list : JSON",
            "Detail": "Valid Content",
            "Status": "PASSED"
        },
        {
            "Check": "list : HDF5",
            "Detail": "Valid Content",
            "Status": "PASSED"
        },
        {
            "Check": "csv : CSV",
            "Detail": "Valid Content",
            "Status": "PASSED"
        },
        {
            "Check": "csv : JSON",
            "Detail": "Valid Content",
            "Status": "PASSED"
        },
        {
            "Check": "csv : HDF5",
            "Detail": "Valid Content",
            "Status": "PASSED"
        }
    ],
    "ComputationalPerformanceSummary": [
        {
            "Check": "Computational Performance Tracking",
            "Status": "PASSED"
        },
        {
            "Check": "Computational Performance Tracking Details",
            "Status": "1 predictions executed in 9.86 seconds. 10 predictions executed in 9.93 seconds. 100 predictions executed in 10.01 seconds."
        }
    ]
}

This needs to be changed to something more machine readable, where it's straightforward to do json[key] and get the value. Also, the file is not being saved in the PWD; it's being saved in PWD/.. (i.e. a directory above it). This needs to change.

@DhanshreeA (Member, Author)

Okay, so the next steps involve updating ersilia such that we can actually push the new metadata fields to Airtable. Mainly the airtableops.py script takes care of that: the function update_metadata_to_airtable reads the metadata file from the repo and then uses it to update the corresponding fields for a model in Airtable.

This function utilizes the RepoMetadataFile class, which in turn uses the BaseInformation class to read these fields. This class is mainly what we want to update with the new fields. @Abellegese should open a PR to make this change, and also add a unit test for it. The test fixture should have an ideal metadata.json file, and similarly a metadata.yaml file, with these new fields, and RepoMetadataFile should be able to read from these files correctly (a rough fixture sketch follows below the field list).

The updated fields include:

  1. Docker Pack Method
  2. Environment Size
  3. Image Size
  4. Computational Performance 1
  5. Computational Performance 10
  6. Computational Performance 100

As per @miquelduranfrigola these fields have been created in the Airtable DB.
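
A rough sketch of what such a fixture-based test could look like (the field values are made up, and the final assertion is a placeholder; the real test would parse the file through RepoMetadataFile/BaseInformation):

import json
import pytest

# Hypothetical values for the new fields; the real fixture would mirror an ideal metadata.json.
NEW_FIELDS = {
    "Docker Pack Method": "fastapi",
    "Environment Size": 475,
    "Image Size": 353.22,
    "Computational Performance 1": 9.86,
    "Computational Performance 10": 9.93,
    "Computational Performance 100": 10.01,
}

@pytest.fixture
def metadata_json(tmp_path):
    path = tmp_path / "metadata.json"
    path.write_text(json.dumps({"Identifier": "eos3b5e", **NEW_FIELDS}, indent=4))
    return path

def test_new_fields_present(metadata_json):
    # Placeholder assertion: the real test would load this path through
    # RepoMetadataFile and check each new field on the resulting BaseInformation.
    data = json.loads(metadata_json.read_text())
    for field in NEW_FIELDS:
        assert field in data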

@DhanshreeA (Member, Author)

Moreover, just for neatness, we should filter these warning logs from fuzzywuzzy:

/Users/mduranfrigola/miniconda3/envs/ersilia/lib/python3.11/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
 warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
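
One way to silence this (a sketch; where the filter should live in ersilia's startup path is an open question) is a standard warnings filter installed before fuzzywuzzy is imported:

import warnings

# Must run before `import fuzzywuzzy`; the message argument is a regex
# matched against the start of the warning text.
warnings.filterwarnings(
    "ignore",
    message="Using slow pure-python SequenceMatcher",
    category=UserWarning,
)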

@DhanshreeA (Member, Author)

Let's update the JSON Format to something like:

{
    "model_information_checks": {
        "model_id": true,
        "model_slug": true,
        "model_status": true,
        "model_title": true,
        "model_description": true,
        "model_task": true,
        "model_input": true,
        "model_input_shape": true,
        "model_output": true,
        "model_output_type": true,
        "model_output_shape": true,
        "model_interpretation": true,
        "model_tag": true,
        "model_publication": true,
        "model_source_code": true,
        "model_contributor": true,
        "model_dockerhub_url": true,
        "model_s3_url": true,
        "model_docker_architecture": true
    },
    "model_file_checks": {
        "dockerfile": true,
        "metadata_json": true,
        "model_framework_run_sh": true,
        "src_service_py": true,
        "pack_py": true,
        "readme_md": true,
        "license": true
    },
    "model_directory_sizes": {
        "directory_size_mb": 1
    },
    "dependency_check": {
        "dockerfile_check": true,
        "check_details": "Dockerfile dependencies are valid."
    },
    "validation_and_size_check_results": {
        "environment_size_mb": 475,
        "check_single_input": true,
        "check_predefined_example_input": true,
        "check_consistency_of_model_output": true
    },
    "consistency_summary_between_ersilia_and_bash_execution_outputs": [],
    "model_output_content_validation_summary": {
        "str_csv": true,
        "str_json": true,
        "str_hdf5": true,
        "list_csv": true,
        "list_json": true,
        "list_hdf5": true,
        "csv_csv": true,
        "csv_json": true,
        "csv_hdf5": true
    },
    "computational_performance_summary": {
            "pred_1": 9.86,
            "pred_10": 9.93,
            "pred_100": 10.01
        }
}
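
With this shape, consuming the report becomes a plain dictionary lookup; for example (assuming the file is saved as eos3b5e-test.json in the working directory):

import json

with open("eos3b5e-test.json") as fh:
    report = json.load(fh)

# Booleans and numbers can be read directly; no string parsing needed.
assert report["model_information_checks"]["model_id"] is True
env_size_mb = report["validation_and_size_check_results"]["environment_size_mb"]  # 475
pred_100_s = report["computational_performance_summary"]["pred_100"]              # 10.01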

@Abellegese (Contributor)

Noted @DhanshreeA.

@Abellegese (Contributor) commented Jan 16, 2025

Updates

  1. I refactored the eos3b5e model locally to make it return invalid values for several data structure outputs, such as Single, List, Flexible List, Matrix, and Serializable Object
  2. I added detailed traceback-based error tracking, to be able to display errors without verbose mode
  3. I updated the way the JSON report is created: it can now record runtime errors, and the code moved into the finally clause of the try-except block, so the result JSON file is created even in the failure case, with the detailed runtime error in it
  4. File content validation for each combination of input and output was implemented in the check_model_output_content function, and now the other checks in the model tester that execute the run command also validate their output content
  5. Performed code cleanup

Here is sample output from testing invalid contents:

[screenshots: invalid-content check output]

@DhanshreeA (Member, Author) commented Jan 21, 2025

Results from Model eos4e40

  • ersilia test eos4e40 --from_github
  • ersilia test eos4e40 --from_s3
  • ersilia test eos4e40 --from_dir
  • ersilia test eos4e40 --from_dockerhub
  • ersilia test eos4e40 --from_github --as_json
  • ersilia test eos4e40 --from_s3 --as_json
  • ersilia test eos4e40 --from_dir --as_json
  • ersilia test eos4e40 --from_dockerhub --as_json
  • ersilia test eos4e40 --from_dockerhub --version
  • ersilia test eos4e40 --from_dockerhub --version dev --as_json
  • ersilia test eos4e40 --shallow --from_s3
  • ersilia test eos4e40 --shallow --from_github
  • ersilia test eos4e40 --shallow --from_dir
  • ersilia test eos4e40 --shallow --from_dockerhub
  • ersilia test eos4e40 --shallow --from_dockerhub --version
  • ersilia test eos4e40 --shallow --from_github --as_json
  • ersilia test eos4e40 --shallow --from_s3 --as_json
  • ersilia test eos4e40 --shallow --from_dir --as_json
  • ersilia test eos4e40 --shallow --from_dockerhub --as_json
  • ersilia test eos4e40 --shallow --from_dockerhub --version
  • ersilia test eos4e40 --deep --from_github
  • ersilia test eos4e40 --deep --from_s3
  • ersilia test eos4e40 --deep --from_dir
  • ersilia test eos4e40 --deep --from_dockerhub
  • ersilia test eos4e40 --deep --from_dockerhub --version

@GemmaTuron (Member) commented Feb 3, 2025

Quick questions to complete the documentation around the Model Test. Sorry if this has already been discussed between you @Abellegese and @DhanshreeA, but it is not entirely clear to me.

Just to be clear, I am using eos3b5e (BentoML) and eos3mk2 (FastAPI) as examples. There are issues with Docker related to problems I have already described before, but they are not smartly caught by the test module.

  • In the basic test command ersilia test model_id: if no flag is passed, what is the model fetched from? Same for the other commands.
  • The dependency check in the basic command only checks whether versions are pinned, but does not install anything, right?
  • The dependency check in the basic command for FastAPI models needs to be modified, as it is currently looking for the same format as the old Dockerfile. See the attached example of the error for model eos3mk2, which actually does have its dependencies pinned @Abellegese : eos3mk2_deps.txt
  • When checking the metadata fields, if a metadata field does not exist because it has not yet been created (i.e. URLs), will it still appear as FAILED or will it show something else?
  • How many molecules are tested in --shallow and --deep? Deep uses 1, 50, and 100 finally? Which one is saved as metadata?

Using eos3b5e as example: ersilia test eos3b5e --shallow --from_github

  • In the shallow checks, the warning No predefined examples found for the model. Generating random examples. pops up, but then the Check Predefined Example Input appears as Passed. Is it meant to be this way: a random example is generated and then the check passes?
  • What is the difference between "Check Consistency of Model Output" and "Consistency Summary Between Ersilia and Bash Execution Outputs"? I am not sure I follow what happens in the second one.
  • Where does it check that there are no null outputs? I assume this is checked at some point, but no specific report is generated?
  • I noticed that if you pass an incorrect model ID it does not give an informative error; maybe this could be improved: AttributeError: 'NoneType' object has no attribute 'get'

Fetching models from dockerhub is giving me a few errors:

Using eos3b5e as example: ersilia test eos3b5e --shallow --from_dockerhub

  • The checks are not correctly saved in the JSON file nor printed on the terminal? This is the saved JSON file, eos3b5e-test-dockerhub.json, and this is what I see on the terminal right below the Dependency Check table: ⠙ Performing shallow checks Run process completed. I have tried using the -v option and I see an error I have already reported many times:
13:58:17 | INFO     | Detailed error:
13:58:17 | INFO     | Standard model run from CSV was not possible for model eos3b5e
13:58:17 | INFO     | Output file /home/gturon/eos/dest/eos3b5e/example_standard_output.csv was not created successfully
13:58:17 | INFO     | Hints:
13:58:17 | INFO     | If you fetch this model from Docker Hub, or you are running it through URL, this is the first time run is executed in your local computer. Reach out to Ersilia to get specific help.

The model tester did not give me any hint that there was a problem, only that the shallow checks were completed. Full error message here: eos3b5e-test-dockerhub.json

Using eos3mk2 as example: ersilia test eos3mk2 --shallow --from_dockerhub

  • The first shallow tests pass but then it runs into an error (see attached) eos3mk2_shallow_error.txt
  • Using the -v flag, this is the information I get. I don't know why the Completed status on Check Consistency of Model Output logs an ERROR if the check is passed, and I do not know what is happening with the bash output CSV:
14:08:21 | INFO     | Model output is consistent
14:08:21 | ERROR    | Completed status: [('Check Consistency of Model Output', 'Model Output Was Consistent', '[green]✔ PASSED[/green]')]
                                      Validation and Size Check Results                                       
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Check                          ┃       Details        ┃                                             Status ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Image Size Mb                  │      634.33 MB       │                       Size Successfully Calculated │
├────────────────────────────────┼──────────────────────┼────────────────────────────────────────────────────┤
│ Check Predefined Example Input │       ✔ PASSED       │                                                    │
├────────────────────────────────┼──────────────────────┼────────────────────────────────────────────────────┤
│ Check Consistency of Model     │   Model Output Was   │                                           ✔ PASSED │
│ Output                         │      Consistent      │                                                    │
└────────────────────────────────┴──────────────────────┴────────────────────────────────────────────────────┘
14:08:21 | DEBUG    | Randomly sampling input
14:08:21 | DEBUG    | Default environment: ersiliatest
14:08:21 | DEBUG    | Running bash script: /tmp/tmp6xyq_v0q/script.sh
⠴ Performing shallow checks 14:08:21 | DEBUG    | Subprocess output: 
14:08:21 | INFO     | Bash script subprocess output: 
🚨🚨🚨 Something went wrong with Ersilia 🚨🚨🚨

Error message:

Failed to read CSV from /tmp/tmp6xyq_v0q/bash_output.csv.
If this error message is not helpful, open an issue at:
 - https://github.com/ersilia-os/ersilia
Or feel free to reach out to us at:
 - hello[at]ersilia.io

If you haven't, try to run your command in verbose mode (-v in the CLI)
 - You will find the console log file in: /home/gturon/eos/current.log
14:08:21 | INFO     | Performance Extraction is started
Run process completed.

Deep command:
Using eos3b5e as example: ersilia test eos3b5e --deep --from_github works fine.
Using eos3mk2 as example: ersilia test eos3mk2 --deep --from_github only prints the basic tests and then prints an error I do not understand. This error only shows if you use the -v flag; otherwise the test command simply says the shallow tests Run process completed, which is totally misleading. If I manually delete the place where the model is stored, it works fine again. This might be because the model was previously fetched from DockerHub for the test. Maybe this temp directory always needs to be deleted.

⠋ Performing shallow checks 14:17:36 | INFO     | 🚨🚨🚨 Something went wrong with Ersilia 🚨🚨🚨
14:17:36 | INFO     | Error message:
14:17:36 | INFO     | 'NoneType' object is not subscriptable
14:17:36 | INFO     | If this error message is not helpful, open an issue at:
14:17:36 | INFO     | - https://github.com/ersilia-os/ersilia
14:17:36 | INFO     | Or feel free to reach out to us at:
14:17:36 | INFO     | - hello[at]ersilia.io
14:17:36 | INFO     | If you haven't, try to run your command in verbose mode (-v in the CLI)
14:17:36 | INFO     | - You will find the console log file in: /home/gturon/eos/current.log
⠸ Performing shallow checks 14:17:36 | INFO     | /home/gturon/miniconda3/envs/ersiliatest/lib/python3.12/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
14:17:36 | INFO     | warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
14:17:36 | INFO     | 14:17:36 | DEBUG    | Repo path specified: /home/gturon/eos/temp/eos3mk2
14:17:36 | INFO     | 14:17:36 | DEBUG    | Absolute path: /home/gturon/eos/temp/eos3mk2
14:17:36 | ERROR    | Error executing command: Command 'ersilia -v fetch eos3mk2 --from_dir /home/gturon/eos/temp/eos3mk2' returned non-zero exit status 1.
14:17:36 | DEBUG    | Output: 🚨🚨🚨 Something went wrong with Ersilia 🚨🚨🚨
Error message:
'NoneType' object is not subscriptable
If this error message is not helpful, open an issue at:
- https://github.com/ersilia-os/ersilia
Or feel free to reach out to us at:
- hello[at]ersilia.io
If you haven't, try to run your command in verbose mode (-v in the CLI)
- You will find the console log file in: /home/gturon/eos/current.log
14:17:36 | ERROR    | Error: /home/gturon/miniconda3/envs/ersiliatest/lib/python3.12/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
14:17:36 | DEBUG    | Repo path specified: /home/gturon/eos/temp/eos3mk2
14:17:36 | DEBUG    | Absolute path: /home/gturon/eos/temp/eos3mk2
14:17:36 | INFO     | Performance Extraction is started
Run process completed.

The documentation is now updated here. Please have a look and we will refine it with the answers to these questions

@GemmaTuron (Member)

Some more comments:

  1. If a model is already fetched from DockerHub and you simply run a shallow test, it will fail, as it does not then download the model from GitHub and the basic tests do not happen. When I specified that the model was fetched from_dockerhub, it did work.
    Can we simply use ersilia test model_id --shallow, or do we always need to specify a from_xxx flag, @Abellegese?
(ersiliatest) gturon@pujarnol:~$ ersilia test eos3ujl --shallow
/home/gturon/miniconda3/envs/ersiliatest/lib/python3.12/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
  warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
Model testing started for: eos3ujl
🚨🚨🚨 Something went wrong with Ersilia 🚨🚨🚨

Error message:

[Errno 2] No such file or directory: '/home/gturon/eos/temp/eos3ujl/metadata.yml'
If this error message is not helpful, open an issue at:
 - https://github.com/ersilia-os/ersilia
Or feel free to reach out to us at:
 - hello[at]ersilia.io

If you haven't, try to run your command in verbose mode (-v in the CLI)
 - You will find the console log file in: /home/gturon/eos/current.log
Traceback (most recent call last):
  File "/home/gturon/github/ersilia-os/ersilia/ersilia/utils/exceptions_utils/throw_ersilia_exception.py", line 25, in inner_function
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/gturon/github/ersilia-os/ersilia/ersilia/publish/test.py", line 1331, in check_information
    data = self._get_metadata()
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/gturon/github/ersilia-os/ersilia/ersilia/publish/test.py", line 1109, in _get_metadata
    with open(path, "r") as file:
         ^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/gturon/eos/temp/eos3ujl/metadata.yml'
  2. The session files keep giving problems; it is extremely annoying:
    I cannot complete the test because the session was not overwritable:
PermissionError: [Errno 13] Permission denied: '/home/gturon/eos/sessions/session_301099/_logs/tmp/ersilia-wxcqbz7a'

More important than this, @Abellegese: again, I had to run the test command in verbose mode to actually see that, as the terminal only printed Run process completed and did not show that all the shallow checks had failed. This is quite critical.

@GemmaTuron (Member)

The issue with the bash run of FastAPI-packaged models seems consistent; I have found the same problem for model eos3ujl from DockerHub:

17:50:21 | DEBUG    | Randomly sampling input
17:50:21 | DEBUG    | Default environment: ersiliatest
17:50:21 | DEBUG    | Running bash script: /tmp/tmpp5yn_lnc/script.sh
⠇ Performing shallow checks 17:50:21 | DEBUG    | Subprocess output: 
17:50:21 | INFO     | Bash script subprocess output: 
🚨🚨🚨 Something went wrong with Ersilia 🚨🚨🚨

Error message:

Failed to read CSV from /tmp/tmpp5yn_lnc/bash_output.csv.
If this error message is not helpful, open an issue at:
 - https://github.com/ersilia-os/ersilia
Or feel free to reach out to us at:
 - hello[at]ersilia.io

If you haven't, try to run your command in verbose mode (-v in the CLI)
 - You will find the console log file in: /home/gturon/eos/current.log
17:50:21 | INFO     | Performance Extraction is started
Run process completed.

If I then try:
ersilia test eos3ujl --shallow --from_github
it fails, because even if I try to delete the model:

(ersiliatest) gturon@pujarnol:~$ ersilia delete eos3ujl
/home/gturon/miniconda3/envs/ersiliatest/lib/python3.12/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
  warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
💁 Model eos3ujl is not available locally, no delete necessary.

The temporary folder is not passing the checks for some reason:

 Performing shallow checks 17:55:14 | INFO     | Fetching model from: ['--from_dir', '/home/gturon/eos/temp/eos3ujl']
⠋ Performing shallow checks 17:55:14 | INFO     | 🚨🚨🚨 Something went wrong with Ersilia 🚨🚨🚨
17:55:14 | INFO     | Error message:
17:55:14 | INFO     | 'NoneType' object is not subscriptable
17:55:14 | INFO     | If this error message is not helpful, open an issue at:
17:55:14 | INFO     | - https://github.com/ersilia-os/ersilia
17:55:14 | INFO     | Or feel free to reach out to us at:
17:55:14 | INFO     | - hello[at]ersilia.io
17:55:14 | INFO     | If you haven't, try to run your command in verbose mode (-v in the CLI)
17:55:14 | INFO     | - You will find the console log file in: /home/gturon/eos/current.log
⠙ Performing shallow checks 17:55:15 | INFO     | /home/gturon/miniconda3/envs/ersiliatest/lib/python3.12/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
17:55:15 | INFO     | warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
17:55:15 | INFO     | 17:55:14 | DEBUG    | Repo path specified: /home/gturon/eos/temp/eos3ujl
17:55:15 | INFO     | 17:55:14 | DEBUG    | Absolute path: /home/gturon/eos/temp/eos3ujl
17:55:15 | ERROR    | Error executing command: Command 'ersilia -v fetch eos3ujl --from_dir /home/gturon/eos/temp/eos3ujl' returned non-zero exit status 1.
17:55:15 | DEBUG    | Output: 🚨🚨🚨 Something went wrong with Ersilia 🚨🚨🚨
Error message:
'NoneType' object is not subscriptable
If this error message is not helpful, open an issue at:
- https://github.com/ersilia-os/ersilia
Or feel free to reach out to us at:
- hello[at]ersilia.io
If you haven't, try to run your command in verbose mode (-v in the CLI)
- You will find the console log file in: /home/gturon/eos/current.log
17:55:15 | ERROR    | Error: /home/gturon/miniconda3/envs/ersiliatest/lib/python3.12/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
17:55:14 | DEBUG    | Repo path specified: /home/gturon/eos/temp/eos3ujl
17:55:14 | DEBUG    | Absolute path: /home/gturon/eos/temp/eos3ujl
17:55:15 | INFO     | Performance Extraction is started
Run process completed.

I have to manually delete the temp folder and run the command again for it to run. Then it does run, and it does not hit the same issues as --from_dockerhub.
Again, without the -v flag I only see Run process completed.

@Abellegese (Contributor) commented Feb 3, 2025

Hi @GemmaTuron, let me summarize the first comment (using LLMs):


General Questions & Observations

  • Model Fetching Behavior: if no flag is passed, it is unclear where the test fetches the model from.
  • Dependency Check: the current check only verifies whether versions are pinned but does not install dependencies.
  • FastAPI Model Dependency Check Issue: the test module expects the old Dockerfile format and incorrectly reports missing dependencies for FastAPI models.
  • Metadata Field Failures: if certain metadata fields (e.g. URLs) do not exist, it is unclear whether they fail the test or show a different message.
  • Molecule Testing Counts: clarification is needed on how many molecules are tested in --shallow vs. --deep and which values are stored in metadata.
  • Predefined Example Check Inconsistency: the test warns that no predefined examples are found, but then marks the check as "Passed."

Test Execution & Errors

DockerHub Fetching Issues

  • JSON Output Not Saved Properly – When testing from DockerHub, the JSON file does not correctly reflect the checks performed.
  • Missing Error Handling – If a model fails due to Docker-related issues, the test output does not clearly indicate the failure.

Specific Model Issues

  • eos3b5e (--shallow --from_dockerhub) – The test appears to complete but does not provide meaningful hints when failures occur.
  • eos3mk2 (--shallow --from_dockerhub) – Reports a passed check for consistency but still outputs an error message related to CSV reading.
  • eos3mk2 (--deep --from_github) – Test fails with a NoneType error but does not indicate the issue unless run with -v. Clearing the model’s stored directory resolves the issue, suggesting a potential problem with cached directories.

@Abellegese (Contributor) commented Feb 3, 2025

Summarizing the second and third ones:

  • Shallow Test Failure: When running a shallow test on a model that’s already fetched from DockerHub, it fails because it doesn’t attempt to download from GitHub, preventing basic tests from running. If the --from_dockerhub flag is specified, the test works. However, there’s a question of whether we need to always specify from_xxx, or can we use ersilia test model_id --shallow without it. Running the test without the -v flag hides critical errors, such as a missing metadata.yml file, making debugging more difficult.

  • Session File Issues: Problems with session files are causing tests to fail due to a PermissionError when overwriting session logs. This error isn’t visible without using the -v flag, which should be a critical fix, as it currently just prints "Run process completed" without giving adequate information about shallow check failures.

  • FastAPI Packaged Models Bash Run Issue: The issue with FastAPI packaged models failing consistently in the bash run (e.g., for model eos3ujl) remains unresolved. The error indicates that CSV files cannot be read from the expected temporary path (/tmp/tmpp5yn_lnc/bash_output.csv), which prevents the process from completing. Using the --from_github flag also fails when the model doesn’t exist locally, with attempts to delete it resulting in "Model not available locally."

  • Temporary Folder Problems: When fetching a model from a temporary folder, the shallow checks fail with a 'NoneType' object is not subscriptable error. This error isn’t visible without verbose mode. It’s necessary to manually delete the temporary folder and rerun the command for it to work, even though the same errors are suppressed in non-verbose mode.
