Changeset 9d623ce introduced a bug: nodes with no GPU now signal gpufail=1. Since that patch, the gpu probe only distinguishes between found and not found, and the PS code wrongly takes the latter to be an error. Before the patch, the probe returned a Result, with an empty array for no GPUs, and gpufail=1 was signalled only if the probe returned an Err.
This is a problem on main only; 0.12 is unaffected.
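To make the difference concrete, here is a minimal sketch of the two contracts described above. All names (`Card`, `gpufail_from_result`, `gpufail_from_option`) are hypothetical and not taken from the codebase, and modelling the post-patch "found / not found" probe as an `Option` is an assumption made only for illustration.

```rust
// Hypothetical stand-in for whatever the real probe reports per card.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct Card {
    model: String,
}

// Pre-patch contract as described in this issue: Ok(vec![]) means
// "probe ran, no GPUs present"; only Err means the probe itself failed.
fn gpufail_from_result(probe: &Result<Vec<Card>, String>) -> u8 {
    match probe {
        Ok(_) => 0,  // includes the empty-array case: no GPUs is not a failure
        Err(_) => 1, // genuine probe failure
    }
}

// Post-patch contract (assumed shape): only found / not found.
// A node without GPUs and a failed probe both collapse into None.
fn gpufail_from_option(probe: &Option<Vec<Card>>) -> u8 {
    match probe {
        Some(_) => 0,
        None => 1, // a GPU-less node lands here, which is the bug
    }
}
```

With these definitions, a GPU-less node passes `Ok(vec![])` to the first function and gets 0, but can only pass `None` to the second and gets 1. Any fix along these lines needs the probe to keep "no GPUs" and "probe failed" as separate outcomes.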
This bug made it through because we don't have good test cases for the gpu probes, #217. We need to be able to run a test suite on hardware that has no GPUs, NVIDIA GPUs, and AMD GPUs, and then be able to state expectations for the tests on the various nodes. It's a little involved and won't work as part of CI, but it's definitely something we want.