Test both pinned and unpinned versions of IREE dependencies #760

ScottTodd · 2025-01-06T18:07:09Z

We have many workflows that test only unpinned versions of the IREE packages, such as:

shark-ai/.github/workflows/ci-sharktank.yml

Lines 79 to 84 in 5f08cb2

    # Install nightly IREE packages.
    # We could also pin to a known working or stable version.
    pip install -f https://iree.dev/pip-release-links.html --pre \
      iree-base-compiler \
      iree-base-runtime \
      iree-turbine

Some workflows explicitly test pinned versions, such as:

shark-ai/.github/workflows/ci-sglang-benchmark.yml

Lines 71 to 75 in 5f08cb2

    # Pin to known-working versions.
    pip install -f https://iree.dev/pip-release-links.html --pre --upgrade \
      iree-base-compiler==3.1.0rc20241204 \
      iree-base-runtime==3.1.0rc20241204 \
      "numpy<2.0"

Having inconsistent version pinning results in fragmented PRs updating pins like #757, #746, and #721.

We should further consolidate where version pins are defined and also refactor some workflows to test both pinned and unpinned versions. Workflows that pin versions can be marked as "required checks", giving us confidence that workflow failures are a result of the changes in a PR/commit and not of changes in dependencies. We do still want early and regular signal for upcoming API breaks, regressions, and other issues coming from dependencies though, so the unpinned variants of these workflows could still run on pull requests or at least on schedules (e.g. nightly).

ScottTodd · 2025-01-06T20:30:46Z

Brainstorming a few strategies for this...

Possible strategies

A) Test with only pinned, use dependabot to send PRs that try new versions

This would involve switching all workflows that install packages to use requirements files with pinned versions in them. Then we would have some automation (likely dependabot, but there are other options too) send pull requests at some regular frequency attempting to bump to the latest versions.

References:

B) Add a `matrix` to each job to run with multiple different versions

We use matrix strategies in some workflows already:

shark-ai/.github/workflows/ci-sharktank.yml

Lines 27 to 43 in 195b4dc

    strategy:
      matrix:
        python-version: ["3.11", "3.12"]
        torch-version: ["2.3.0", "2.4.1", "2.5.1"]
        os: [ubuntu-24.04]
        include:
          - os: windows-2022
            python-version: "3.11"
            torch-version: "2.3.0"
          - os: windows-2022
            python-version: "3.12"
            torch-version: "2.4.1"
        exclude:
          - python-version: "3.12"
            # `torch.compile` requires torch>=2.4.0 for Python 3.12+
            torch-version: "2.3.0"
      fail-fast: false

We could add new variables for versions like

    strategy:
      matrix:
        iree-requirements: ["requirements-iree-pinned.txt", "requirements-iree-nightly.txt"]

That would result in many more workflow jobs on each commit but would give us a complete view of job status for every event.
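To get a rough feel for the job-count cost of (B), here is a minimal Python sketch that expands the matrix above as a plain cartesian product minus the `exclude` entries. It deliberately ignores the `include` entries, since GitHub Actions applies extra rules for those, so treat the numbers as an approximation. The helper name `expand_matrix` is made up for illustration.

```python
from itertools import product


def expand_matrix(axes, exclude=()):
    """Cartesian product of the matrix axes, minus excluded combinations.

    Simplified model: `include` entries are NOT handled here.
    """
    names = list(axes)
    combos = [dict(zip(names, values)) for values in product(*axes.values())]

    def is_excluded(combo):
        # Excluded when the combination matches every key of any exclude entry.
        return any(
            all(combo.get(key) == value for key, value in entry.items())
            for entry in exclude
        )

    return [combo for combo in combos if not is_excluded(combo)]


axes = {
    "python-version": ["3.11", "3.12"],
    "torch-version": ["2.3.0", "2.4.1", "2.5.1"],
    "os": ["ubuntu-24.04"],
}
exclude = [{"python-version": "3.12", "torch-version": "2.3.0"}]

before = expand_matrix(axes, exclude)
after = expand_matrix(
    {
        **axes,
        "iree-requirements": [
            "requirements-iree-pinned.txt",
            "requirements-iree-nightly.txt",
        ],
    },
    exclude,
)
print(len(before), len(after))
```

Under this simplified model, each added two-value axis doubles the number of jobs generated from the base matrix.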

C) Fork workflows to run pinned/unpinned

Similar to how we have splits like .github/workflows/ci_eval.yaml and .github/workflows/ci_eval_short.yaml, we could have ci_sharktank_pinned.yml and ci_sharktank_unpinned.yml. That way, we would have separate run history pages like https://github.com/nod-ai/shark-ai/actions/workflows/ci_eval.yaml and https://github.com/nod-ai/shark-ai/actions/workflows/ci_eval_short.yaml.

D) Use reusable workflows to run pinned/unpinned

Reusable workflows would be similar to (C), but without as much copy/paste. We could still set different triggers for each variant and track run histories separately.

Thoughts

I like the simplicity of (A), as this would let us bring up new workflows with minimal changes and keep all workflows predictable, with source code always being the source of truth for versions. Options (C) and (D) would give us independent workflow run history for each variant. I think we could simulate that with event/branch/actor filters and (A) though.

Note that for dependabot updates (A), jobs that only run nightly and not on individual pull requests would not pick up those changes by default. We could find a way to opt those PRs in to running those jobs, or go with one of the other options for those jobs.

ScottTodd · 2025-01-06T22:56:06Z

Chatted with @marbre a bit. Leaning towards option (A) for workflows that run on push and pull_request events, then option (B) for workflows that run on schedule.

The actual mechanism for (A) is TBD. I'm testing dependabot but having a hard time getting it to understand a single requirements.txt file that uses

    --find-links https://iree.dev/pip-release-links.html
    --pre
    iree-base-compiler==3.1.0rc20250103
    iree-base-runtime==3.1.0rc20250103

Might instead write a workflow explicitly, like https://github.com/iree-org/iree/blob/main/.github/workflows/bump_torch_mlir.yml. Or, if it works, use Renovate (https://docs.renovatebot.com/).
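If we go the explicit-workflow route, the pin-rewriting step could be fairly small. Below is a minimal sketch, assuming the pins live in a requirements file shaped like the one above. The function name `bump_pins` and the `PINNED_PACKAGES` tuple are hypothetical, not taken from any existing script.

```python
import re

# Packages whose pins should move together (assumption for this sketch).
PINNED_PACKAGES = ("iree-base-compiler", "iree-base-runtime")


def bump_pins(requirements_text, new_version, packages=PINNED_PACKAGES):
    """Rewrite `name==version` lines for `packages`; keep all other lines as-is."""
    updated_lines = []
    for line in requirements_text.splitlines():
        match = re.fullmatch(r"\s*([A-Za-z0-9_.-]+)==(\S+)\s*", line)
        if match and match.group(1) in packages:
            line = f"{match.group(1)}=={new_version}"
        updated_lines.append(line)
    return "\n".join(updated_lines) + "\n"


example = """\
--find-links https://iree.dev/pip-release-links.html
--pre
iree-base-compiler==3.1.0rc20250103
iree-base-runtime==3.1.0rc20250103
"""

bumped = bump_pins(example, "3.2.0rc20250109")
print(bumped)
```

Option lines such as `--find-links` and `--pre` contain no `==`, so they pass through untouched.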

cc @stbaione @archana-ramalingam (I saw a separate discussion at #757 (comment))

archana-ramalingam · 2025-01-06T23:16:12Z

We discussed earlier which workflows require pinned vs. latest versions in this PR. If that works, we can stick with it. The overarching idea is that pre-submits use pinned versions and nightly jobs use latest/nightly versions.

ScottTodd · 2025-01-06T23:21:37Z

I'm planning to have:

Jobs running on pull_request and push test pinned versions.
Jobs running on schedule test both pinned versions and the latest versions.
A PR updating the pinned version would be updated once a day (whenever a new version is available to test).
- If the PR passes all tests then we can merge it to update the pins.
- If the PR has failures we can address them as needed.

This simplification will help with #760.

Pros:
- Now there are fewer places that use a ref pin
- Workflows are now simpler

Cons:
- ~~Workflows will be several seconds slower since FetchContent always fetches all submodules~~
- The `SHORTFIN_IREE_SOURCE_DIR` option is no longer tested

This is prep work for #760. I also considered putting the files under `build_tools/` or `shark-ai/`, but we already have a few requirements files in the repository root. Still not as many as https://github.com/vllm-project/vllm though 😛.

Progress on #760. The idea here is that we will test with only pinned versions in all workflows that run on `pull_request` and `push` triggers, then we will create pull requests (ideally via automation like dependabot) that attempt to update the pinned versions. This will give us confidence that test regressions are _only_ due to the code changes in the pull request and not due to a dependency changing. Workflows will also be more reproducible as the versions they fetch will come from source code and not an external, time-dependent source.

ScottTodd · 2025-01-09T00:06:01Z

Made good progress on this.

Remaining tasks:

Switch presubmit workflows to use pinned versions
Automate version pin updates (dependabot or scripting that scrapes the latest versions once a day, updates the pins, and sends or updates a PR)
Switch nightly/scheduled workflows to use unpinned dependency file
Switch nightly/scheduled workflows to use both a pinned and unpinned dependency file
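For the "scrape the latest versions" part of the automation task, a rough sketch follows. It assumes the release index is HTML containing links to wheel files named like `iree_base_compiler-3.2.0rc20250109-...whl`, and that versions follow the `X.Y.ZrcYYYYMMDD` shape seen in the pins. The sample HTML, its `example.invalid` URLs, and the `latest_version` helper are made up for illustration. A real script would more likely ask pip itself for the newest version.

```python
import re


def latest_version(index_html, package):
    """Return the newest `X.Y.ZrcYYYYMMDD` version of `package` in wheel file names."""
    wheel_name = package.replace("-", "_")  # wheel file names use underscores
    pattern = re.compile(
        rf"{re.escape(wheel_name)}-(\d+)\.(\d+)\.(\d+)rc(\d{{8}})-"
    )
    found = {
        tuple(int(part) for part in match.groups())
        for match in pattern.finditer(index_html)
    }
    if not found:
        return None
    # Tuples compare element-wise: major, minor, patch, then build date.
    major, minor, patch, date = max(found)
    return f"{major}.{minor}.{patch}rc{date}"


# Hypothetical index content, only for demonstrating the parsing.
sample_html = """
<a href="https://example.invalid/iree_base_compiler-3.1.0rc20250103-cp311-cp311-manylinux_2_28_x86_64.whl">a</a>
<a href="https://example.invalid/iree_base_compiler-3.2.0rc20250109-cp311-cp311-manylinux_2_28_x86_64.whl">b</a>
<a href="https://example.invalid/iree_base_runtime-3.1.0rc20241204-cp311-cp311-manylinux_2_28_x86_64.whl">c</a>
"""

print(latest_version(sample_html, "iree-base-compiler"))
```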

See #760 for context. We want to stay close to the latest versions while still pinning versions for predictability. Updating version pins is currently a manual process but we plan on automating it in the future. We can decide how noisy we want these dependency updates to be:
- new PRs daily or less frequently
- do or don't reuse existing PRs
- merge ASAP or let them sit for multiple days

Progress on #760. We could make the scheduled jobs test both pinned and unpinned versions like on #767.

Cleanup included here:
- Dropped the "Installing the PyTorch CPU wheels saves multiple minutes and a lot of bandwidth on runner setup." comments since they are repetitive. Could add them back if people find them useful.
- Stopped installing from the root `requirements.txt` in some workflows, instead opting to just install from the more specific `sharktank/requirements-tests.txt`

I did not test the changes to scheduled workflows. Could do that on request, or just revert if we see issues.

Progress on #760, built off of the work in iree-org/iree-turbine#388. This adds a new workflow that runs once a day to update all pinned IREE versions. I also looked into using Dependabot but found that it struggles with `--find-links`, `--index-url`, and with there being multiple `requirements.txt` files in a repository. While I would love to not need to reinvent this wheel, I do like keeping full control over the process.

This PR includes:
- A new `build_tools/update_iree_requirement_pins.py` script handles updating the pins in `requirements-iree-pinned.txt` and `shortfin/CMakeLists.txt`. The script also sets some variables in `GITHUB_ENV`.
- A new `.github/workflows/update_iree_requirement_pins.yml` workflow runs that script then calls https://github.com/peter-evans/create-pull-request to create or update a pull request if there are local changes after running that script. The commit message and pull request body are constructed using the variables set by the script.

Test action run: https://github.com/ScottTodd/shark-ai/actions/runs/12777789320
Test pull request: ScottTodd#1
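For readers unfamiliar with `GITHUB_ENV`: GitHub Actions exposes the path of a file through that environment variable, and `NAME=value` lines appended to it become environment variables for later steps in the same job. The sketch below shows that mechanism only. It is not the code from `update_iree_requirement_pins.py`, and the variable name `NEW_IREE_VERSION` is a placeholder.

```python
import os
import tempfile


def export_to_github_env(variables, env_file=None):
    """Append single-line NAME=value pairs to the file named by GITHUB_ENV."""
    env_file = env_file or os.environ.get("GITHUB_ENV")
    if not env_file:
        # Not running under GitHub Actions, so there is nowhere to export to.
        return False
    with open(env_file, "a", encoding="utf-8") as f:
        for name, value in variables.items():
            f.write(f"{name}={value}\n")
    return True


# Local demonstration using a temporary file in place of the real GITHUB_ENV.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "github_env")
    export_to_github_env({"NEW_IREE_VERSION": "3.2.0rc20250109"}, env_file=path)
    with open(path, encoding="utf-8") as f:
        written = f.read()

print(written)
```

Multi-line values need the delimiter syntax described in the GitHub Actions docs, which this sketch does not cover.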

ScottTodd mentioned this issue Jan 6, 2025

Delegate IREE source fetching to shortfin/CMakeLists.txt. #762

Merged

This was referenced Jan 6, 2025

Move requirements-iree-*.txt to top level. #765

Merged

Add [pinned, unpinnned] matrix to ci_eval.yaml. #767

Draft

ScottTodd mentioned this issue Jan 7, 2025

Switch presubmit CI workflows to use pinned IREE versions. #774

Merged

ScottTodd mentioned this issue Jan 9, 2025

Bump IREE version pins to 3.2.0rc20250109. #802

Merged

ScottTodd mentioned this issue Jan 10, 2025

Switch workflows to use new requirements-iree-*.txt files. #813

Merged

ScottTodd mentioned this issue Jan 14, 2025

Add script+workflow to update IREE requirement pins. #827

Merged
