chore!: Update PyTorch to 2.5 #465
base: main
Conversation
This change raises the upper bound on the PyTorch requirement to allow version 2.5. Resolves instructlab#462 Signed-off-by: Fabien Dupont <[email protected]>
E2E (NVIDIA L40S x4) workflow launched on this PR: View run
@@ -17,6 +17,6 @@ openai>=1.13.3,<2.0.0
 sentencepiece>=0.2.0
 tabulate>=0.9.0
 tenacity>=8.3.0,!=8.4.0
-torch>=2.3.0,<2.5.0
+torch>=2.3.0,<2.6.0
Shouldn't this be
-torch>=2.3.0,<2.6.0
+torch>=2.5.0,<2.6.0
...to enforce the use of 2.5? If other required packages in this file change (or their versions change), then pip may end up pulling 2.3.z or 2.4.z when trying to automatically resolve dependencies.
I'm referencing the title: Update PyTorch to 2.5
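(Not part of this PR, just a sketch with the packaging library to show why the lower bound matters; the version numbers are arbitrary examples.)

# Sketch: which torch versions satisfy each specifier set.
from packaging.specifiers import SpecifierSet

proposed = SpecifierSet(">=2.3.0,<2.6.0")   # range in this PR
strict = SpecifierSet(">=2.5.0,<2.6.0")     # range suggested above

for candidate in ["2.3.1", "2.4.1", "2.5.1"]:
    print(candidate, candidate in proposed, candidate in strict)
# 2.3.1 True False
# 2.4.1 True False
# 2.5.1 True True

So with the proposed range, the resolver is still free to pick a 2.3.z or 2.4.z build if another pin pushes it there.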
Allowing an older version is fine if some other component still requires it. That's unlikely to happen, though, unless a library downgrades its dependencies. It's also what we did for the instructlab/instructlab repo: instructlab/instructlab#2865.
Gotcha. So we're allowing 2.5 if other packages support it, but not enforcing the use of 2.5.
I was confused by the PR title because it suggested we were actually updating the version.
Thanks for highlighting this @courtneypacheco ! Looking at the CI logs, even though this PR raises the cap, CI is actually installing torch 2.4.x and running with that. So we may have more work to do here to figure out what dependency is forcing torch to 2.4.x instead of 2.5.x.
Ahh, I see in the logs:
2025-01-08T12:55:07.4571834Z Collecting torch<2.5.0,>=2.3.0 (from instructlab==0.21.0a2.dev258)
2025-01-08T12:55:07.4574409Z Obtaining dependency information for torch<2.5.0,>=2.3.0 from
Our CI tests install instructlab from latest main, so we must have picked up instructlab before the torch bump was merged there. Kicking this off again to ensure it picks up torch 2.5.x now.
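(Illustrative only, assuming a local virtualenv built the same way as the CI job; this is not a command the CI scripts actually run.)

# Check what the resolver actually installed and what the installed
# instructlab distribution declares for torch.
from importlib.metadata import requires, version

print(version("torch"))
print([r for r in (requires("instructlab") or []) if r.lower().startswith("torch")])

If the pre-bump instructlab was picked up, the second line still shows a <2.5.0 cap.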
I suspect that it's because we're still on vLLM 0.6.2, which depends on PyTorch 2.4.
Ahh, yes. So in other words, neither instructlab/instructlab nor instructlab/sdg can actually bump the torch version in use until we move from vLLM 0.6.2 to a newer release that depends on torch 2.5.x. Regardless of the range here, our vLLM dependency pins torch to exactly 2.4.0.
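(Again just an illustrative check, assuming vLLM is installed in the environment.)

# Confirm which installed package pins torch exactly.
from importlib.metadata import requires

print([r for r in (requires("vllm") or []) if r.lower().startswith("torch")])
# With vLLM 0.6.2 this should show an exact torch 2.4.0 pin, per the comment above.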
Yes, that seems to be the case. I have opened a new PR in instructlab to update vLLM to 0.6.6.post1: instructlab/instructlab#2892.
e2e workflow succeeded on this PR: View run, congrats!
Approving this so it can merge. It turns out we have a fairly complicated process for bumping PyTorch versions: the end-to-end CI tests for SDG actually install instructlab/instructlab and drive everything from that CLI, which means we end up using the PyTorch and vLLM versions from that repo as long as they don't directly conflict with what's in here.
So, raising the cap here does not mean we've run CI against PyTorch 2.5.x. However, it does allow instructlab/instructlab to then bump to a newer vllm which raises the minimum pytorch requirement to 2.5.x. If we didn't bump here first then instructlab/instructlab would not be able to bump vllm because it would depend on a PyTorch outside our allowed range.
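(To make the ordering concrete, a small sketch with the packaging library; the future vLLM requirement below is an assumption, not a published pin.)

from packaging.specifiers import SpecifierSet

sdg_before = SpecifierSet(">=2.3.0,<2.5.0")   # cap before this PR
sdg_after = SpecifierSet(">=2.3.0,<2.6.0")    # cap after this PR
future_vllm = SpecifierSet(">=2.5.0")         # hypothetical newer vLLM requirement

candidate = "2.5.1"
print(candidate in sdg_after and candidate in future_vllm)   # True: resolvable
print(candidate in sdg_before and candidate in future_vllm)  # False: no overlap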