Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rocm jaxlib v0.4.30 qa cleanup #35

Merged
merged 10 commits into from
Aug 29, 2024

Conversation

hsharsha
Copy link

No description provided.

@hsharsha
Copy link
Author

@draganmladjenovic take a look at the failures for MLIR tests at 79b3692

Most of them are failing due to bigger thread size and smaller block size in AMD GPUs compared to NVIDIA. Some tests run into infinite loop on MI200 and hence I have commented out RunAndCompare

Copy link

@i-chaochen i-chaochen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're ok on CublasDot at 0.4.28, but it's failed on 0.4.30?

wondering do we have a ticket to track on it?

@hsharsha
Copy link
Author

I think we're ok on CublasDot at 0.4.28, but it's failed on 0.4.30?

wondering do we have a ticket to track on it?

It needs more investigation. It depends on the choice of autotune https://github.com/ROCm/frameworks-internal/issues/9088

@hsharsha hsharsha force-pushed the rocm-jaxlib-v0.4.30-qa-cleanup branch from e1d02ab to be97509 Compare August 23, 2024 20:07
@hsharsha hsharsha force-pushed the rocm-jaxlib-v0.4.30-qa-cleanup branch from dd85325 to 4bc36d5 Compare August 28, 2024 20:02
@hsharsha hsharsha force-pushed the rocm-jaxlib-v0.4.30-qa-cleanup branch from 4bc36d5 to 339dde0 Compare August 28, 2024 20:05
switch (algorithm) {
case PrecisionConfig::ALG_DOT_ANY_F8_ANY_F8_F32:
case PrecisionConfig::ALG_DOT_ANY_F8_ANY_F8_F32_FAST_ACCUM:
// Other F8 types are actually not supported by NVIDIA GPUs.
return is_cuda_ge_ada &&
return (is_cuda_ge_ada || is_rocm_mi100_and_above) &&

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this is really correct. I guess FP8 support begins from MI300 arch? But I see that we have the same check also on upstream

Copy link

@pemeliya pemeliya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed gemm-related changes and buffer_comparator, leaving the remaining LLVM changes to Dragan and Chao

@hsharsha hsharsha merged commit 5945307 into rocm-jaxlib-v0.4.30-qa Aug 29, 2024
3 of 5 checks passed
@pramenku
Copy link

pramenku commented Nov 7, 2024

Is this PR merged in QA-31 too? @hsharsha

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants