[ENH] Load pipeline catalog dynamically and error out when it is empty/not found #391

Merged
alyssadai merged 23 commits into main from issue383
Nov 20, 2024

Conversation

surchs
Contributor

@surchs surchs commented Nov 15, 2024

No longer needed

Changes proposed in this pull request:

  • new getter function that uses requests to load the catalog
  • failing that, falls back to a local copy of the catalog (that we'll need to keep up to date)
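The two bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it uses the standard library's `urllib` in place of an HTTP client library so that it runs without extra dependencies, the `fetch` parameter and the `fetch_json` helper are hypothetical, and treating an empty remote catalog as a failure is an assumption. The names `get_url` and `get_path` follow the signature shown in the reviewer's guide.

```python
import json
import urllib.request
from pathlib import Path


def fetch_json(url: str, timeout: float = 5.0):
    """Download and decode a JSON document (stdlib stand-in for an HTTP client)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def get_pipeline_catalog(get_url: str, get_path: Path, fetch=fetch_json):
    """Return the remote catalog, or the local copy if the remote is unusable."""
    try:
        catalog = fetch(get_url)
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError and timeouts;
        # ValueError covers malformed URLs and JSON decode errors.
        catalog = None
    if catalog:
        return catalog

    # Remote fetch failed or returned nothing: fall back to the local copy.
    try:
        catalog = json.loads(Path(get_path).read_text(encoding="utf-8"))
    except (OSError, ValueError) as err:
        raise RuntimeError(
            f"Could not load the pipeline catalog from {get_url} or {get_path}."
        ) from err
    if not catalog:
        raise RuntimeError(f"The pipeline catalog at {get_path} is empty.")
    return catalog
```

Passing the fetcher as an argument is one way to keep the fallback path testable without touching the network.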

For the reviewer: it seems that mocking requests is fairly involved. My unit test mainly ensures that our local fallback file works if the request to the GitHub pipeline catalog fails. We may want to be a bit smarter about testing this in the future; maybe you have ideas.
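One possible way to test the fallback without real network access is to patch the HTTP call itself. The sketch below is an illustration only: `load_catalog` is a hypothetical stand-in for the loader (written with the standard library's `urllib` rather than the client used in this PR), and the tests use `unittest.mock` to simulate both a failed and a successful request.

```python
import json
import tempfile
import unittest
import urllib.error
import urllib.request
from pathlib import Path
from unittest import mock


def load_catalog(url: str, backup_path: Path):
    """Hypothetical loader under test: remote first, local file second."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        return json.loads(backup_path.read_text(encoding="utf-8"))


class TestCatalogFallback(unittest.TestCase):
    def setUp(self):
        # Write a small local backup file for each test.
        self.tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self.tmp.cleanup)
        self.backup = Path(self.tmp.name) / "processing_pipelines.json"
        self.backup.write_text(json.dumps([{"name": "local"}]), encoding="utf-8")

    def test_falls_back_when_request_fails(self):
        # Simulate an unreachable server by making urlopen raise.
        with mock.patch(
            "urllib.request.urlopen",
            side_effect=urllib.error.URLError("offline"),
        ) as fake_urlopen:
            catalog = load_catalog("https://example.invalid/catalog.json", self.backup)
        fake_urlopen.assert_called_once()
        self.assertEqual(catalog, [{"name": "local"}])

    def test_uses_remote_when_request_succeeds(self):
        # Simulate a response object usable as a context manager.
        fake_response = mock.MagicMock()
        fake_response.__enter__.return_value.read.return_value = b'[{"name": "remote"}]'
        with mock.patch("urllib.request.urlopen", return_value=fake_response):
            catalog = load_catalog("https://example.invalid/catalog.json", self.backup)
        self.assertEqual(catalog, [{"name": "remote"}])
```

The same pattern should carry over to other HTTP clients by patching whichever call the loader actually makes.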

Checklist

This section is for the PR reviewer

  • PR has an interpretable title with a prefix ([ENH], [FIX], [REF], [TST], [CI], [MNT], [INF], [MODEL], [DOC]) (see our Contributing Guidelines for more info)
  • PR has a label for the release changelog or skip-release (to be applied by maintainers only)
  • PR links to GitHub issue with mention Closes #XXXX
  • Tests pass
  • Checks pass

For new features:

  • Tests have been added

For bug fixes:

  • There is at least one test that would fail under the original bug conditions.

Summary by Sourcery

Enhance the CLI to dynamically load the pipeline catalog from a remote source, with a fallback to a local copy if the remote is unavailable. Update the project dependencies to include 'httpx' for HTTP requests and add tests to ensure the robustness of the catalog loading process.

Enhancements:

  • Implement dynamic loading of the pipeline catalog in the CLI, allowing it to fetch the catalog from a remote source and fall back to a local copy if the remote is unavailable.

Build:

  • Add 'httpx' and its dependencies to the project requirements for handling HTTP requests.

Tests:

  • Add tests to ensure the pipeline catalog is loaded from a local backup if the remote source is unreachable, and raise an exception if both sources fail.
  • Add tests to verify successful loading of the pipeline catalog from a remote source.

Collaborator

coveralls commented Nov 15, 2024

Pull Request Test Coverage Report for Build 11940477817

Details

  • 55 of 55 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.06%) to 98.425%

Totals Coverage Status
Change from base Build 11937505138: 0.06%
Covered Lines: 1000
Relevant Lines: 1016

💛 - Coveralls


codecov bot commented Nov 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.42%. Comparing base (0fa03ee) to head (2fde7f0).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #391      +/-   ##
==========================================
+ Coverage   98.36%   98.42%   +0.06%     
==========================================
  Files          18       18              
  Lines         977     1016      +39     
==========================================
+ Hits          961     1000      +39     
  Misses         16       16              


@surchs surchs marked this pull request as ready for review November 18, 2024 16:17
Contributor Author

surchs commented Nov 18, 2024

@alyssadai this should be ready for you now

@alyssadai alyssadai self-requested a review November 18, 2024 23:00

sourcery-ai bot commented Nov 19, 2024

Reviewer's Guide by Sourcery

This PR implements dynamic loading of the pipeline catalog by first attempting to fetch it from a remote GitHub URL, with a fallback to a local backup file if the remote fetch fails. The implementation uses the httpx library for HTTP requests and includes comprehensive error handling and testing.

Sequence diagram for dynamic loading of pipeline catalog

sequenceDiagram
    participant User
    participant CLI
    participant RemoteServer
    participant LocalBackup

    User->>CLI: Request pipeline catalog
    CLI->>RemoteServer: Fetch catalog from remote URL
    alt Remote fetch successful
        RemoteServer-->>CLI: Return catalog data
    else Remote fetch fails
        CLI->>LocalBackup: Load catalog from local backup
        LocalBackup-->>CLI: Return catalog data
    end
    CLI-->>User: Provide catalog data

Updated class diagram for pipeline catalog loading

classDiagram
    class Mappings {
        +Namespace NP
        +Path PROCESSING_PIPELINE_PATH
        +String PROCESSING_PIPELINE_URL
        +parse_pipeline_catalog()
        +get_pipeline_catalog(get_url: str, get_path: Path) dict
        +get_pipeline_uris(in_arr: list) dict
        +get_pipeline_versions(in_arr: list) dict
    }
    class FileUtils {
        +load_json(path: Path) dict
    }
    Mappings --> FileUtils : uses
    note for Mappings "Handles dynamic loading of pipeline catalog"
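To illustrate how the two lookup helpers in the class diagram might derive their dictionaries from a loaded catalog, here is a rough sketch. The entry fields `name` and `versions`, the `NP_PREFIX` value, and the sample pipelines are assumptions made for this example and are not taken from the PR.

```python
# Hypothetical namespace prefix, used only for illustration.
NP_PREFIX = "np"


def get_pipeline_uris(in_arr: list) -> dict:
    """Map each pipeline name to a namespaced identifier."""
    return {entry["name"]: f"{NP_PREFIX}:{entry['name']}" for entry in in_arr}


def get_pipeline_versions(in_arr: list) -> dict:
    """Map each pipeline name to its list of known versions."""
    return {entry["name"]: entry["versions"] for entry in in_arr}


# Assumed catalog shape: a list of objects with "name" and "versions" keys.
sample_catalog = [
    {"name": "fmriprep", "versions": ["20.2.7", "23.1.3"]},
    {"name": "freesurfer", "versions": ["7.3.2"]},
]

uris = get_pipeline_uris(sample_catalog)
versions = get_pipeline_versions(sample_catalog)
```

Taking the catalog as an argument, rather than reading a module-level constant, keeps both helpers independent of where the catalog was loaded from.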

File-Level Changes

Change: Implemented dynamic pipeline catalog loading with remote/local fallback mechanism
Details:
  • Added new get_pipeline_catalog function that attempts to fetch from remote URL first
  • Added fallback to local JSON file if remote fetch fails
  • Added error handling for HTTP errors and JSON decode failures
  • Refactored pipeline URI and version loading to work with the new catalog format
Files:
  • bagel/mappings.py

Change: Added comprehensive unit tests for the new pipeline catalog loading functionality
Details:
  • Added test for successful remote catalog fetching
  • Added test for fallback to local backup when remote fails
  • Added test for error when both remote and local fetches fail
  • Updated existing pipeline URI and version tests to work with new implementation
Files:
  • tests/unit/test_derivative_utils.py

Change: Updated project dependencies and configuration
Details:
  • Added httpx library as a new dependency
  • Added local backup of pipeline catalog
  • Removed pipeline-catalog git submodule
Files:
  • requirements.txt
  • setup.cfg
  • .gitmodules
  • bagel/local_vocab_backup/processing_pipelines.json

Assessment against linked issues

Issue | Objective
#383 | Fetch pipeline-catalog dynamically via HTTP GET request from GitHub
#383 | Use local submodule as fallback if remote request fails
#374 | Implement error handling when pipeline-catalog is not found or returns empty index
#374 | Implement dynamic loading of pipeline catalog with fallback mechanism
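The "not found or empty" objective could be enforced with a small guard applied to whatever the loader returns. This is only a sketch under assumptions: the function name `ensure_catalog_usable`, the choice of `LookupError`, and the expectation that the catalog is a JSON list are all hypothetical.

```python
def ensure_catalog_usable(catalog, source: str) -> list:
    """Return the catalog unchanged, or raise if it is missing or empty."""
    if catalog is None:
        raise LookupError(f"Pipeline catalog not found at {source}.")
    if not isinstance(catalog, list) or len(catalog) == 0:
        raise LookupError(f"Pipeline catalog from {source} is empty or malformed.")
    return catalog
```

Failing early with a message that names the source makes it clearer to the user which copy of the catalog was unusable.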


@sourcery-ai sourcery-ai bot left a comment

Hey @surchs - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider using the responses library for mocking HTTP requests in tests - it provides a cleaner way to mock requests and responses without the current workarounds needed for httpx.Response objects.

Here's what I looked at during the review:
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟡 Testing: 3 issues found
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Contributor

@alyssadai alyssadai left a comment

Thanks for the PR @surchs!

The overall logic of the changes makes sense to me. I left some suggestions and a few comments for code clarity, and I think there's one potential modification to be made to ensure we don't emit an irrelevant error message by using one of the existing CLI utilities.

See what makes sense to you!

Contributor Author

surchs commented Nov 19, 2024

Thanks Alyssa, I agree with all your comments. Unfortunately, the airport network is too bad for me to accept your changes, so if you could push directly to this branch and then accept for me, I'd appreciate it!

@alyssadai alyssadai added the pr-minor (Non-breaking feature or enhancement, will increment minor version (0.+1.0)) and pr-patch (Incremental feature improvement, will increment patch version when merged (0.0.+1)) labels and removed the pr-minor label Nov 20, 2024
@alyssadai alyssadai added the release Create a release when this PR is merged label Nov 20, 2024
@alyssadai alyssadai self-requested a review November 20, 2024 19:30
Contributor

@alyssadai alyssadai left a comment

🧑‍🍳

@alyssadai alyssadai changed the title from "[ENH] CLI loads pipeline catalog dynamically" to "[ENH] Loads pipeline catalog dynamically and error out when it is empty/not found" Nov 20, 2024
@alyssadai alyssadai changed the title from "[ENH] Loads pipeline catalog dynamically and error out when it is empty/not found" to "[ENH] Load pipeline catalog dynamically and error out when it is empty/not found" Nov 20, 2024
@alyssadai alyssadai merged commit 5a17f81 into main Nov 20, 2024
11 checks passed
@alyssadai alyssadai deleted the issue383 branch November 20, 2024 19:43
Contributor

🚀 PR was released in v0.3.4 🚀

@neurobagel-bot neurobagel-bot bot added the released This issue/pull request has been released. label Nov 20, 2024
Labels
  • pr-patch: Incremental feature improvement, will increment patch version when merged (0.0.+1)
  • release: Create a release when this PR is merged
  • released: This issue/pull request has been released.
Projects
None yet
3 participants