R0.4.38 launch dims pjrt fix #98

Merged
hsharsha merged 4 commits into rocm-jaxlib-v0.4.38 on Jan 23, 2025

hsharsha
No description provided.

hsharsha and others added 4 commits January 22, 2025 13:35
Imported from GitHub PR openxla#19582

Owing to the checks in https://github.com/openxla/xla/blob/main/xla/service/gpu/parallel_loop_emitter.cc#L169-L171, the launch dimension should be of the form ((block.x, 1, 1), (thread.x, thread.y, 1)). In addition, ROCm expects that (block.x * thread.x) <= 0xFFFFFFFF.
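The two conditions stated above can be sketched as standalone predicates. This is only an illustration: `Dim3`, `LaunchDims`, `HasExpectedForm`, and `FitsRocmLimit` are hypothetical names made up for this sketch, not XLA's or HIP's actual types, and the code is not the change made in this PR.

```cpp
#include <cstdint>

// Hypothetical types for illustration only; not XLA's LaunchDimensions API.
struct Dim3 {
  uint64_t x, y, z;
};

struct LaunchDims {
  Dim3 blocks;   // number of blocks in the grid
  Dim3 threads;  // number of threads per block
};

// True when the dimensions have the form
// ((block.x, 1, 1), (thread.x, thread.y, 1)).
bool HasExpectedForm(const LaunchDims& d) {
  return d.blocks.y == 1 && d.blocks.z == 1 && d.threads.z == 1;
}

// True when block.x * thread.x <= 0xFFFFFFFF.
// Division is used instead of multiplication so the check itself
// cannot overflow 64-bit arithmetic.
bool FitsRocmLimit(const LaunchDims& d) {
  constexpr uint64_t kLimit = 0xFFFFFFFFull;
  if (d.threads.x == 0) return true;  // product is zero
  return d.blocks.x <= kLimit / d.threads.x;
}
```

With 256 threads in x, for example, the largest block.x that passes `FitsRocmLimit` is 16777215, since 16777216 * 256 is exactly 0x100000000.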
Copybara import of the project:

--
9a46402 by Harsha HS <[email protected]>:

[ROCm] Fix kernel launch dimension

Launch dimension should be of the form
((block.x, 1, 1), (thread.x, thread.y, 1)) to accommodate checks in
[parallel_loop_emitter.cc](https://github.com/openxla/xla/blob/main/xla/service/gpu/parallel_loop_emitter.cc#L169-L171)

Merging this change closes openxla#19582

COPYBARA_INTEGRATE_REVIEW=openxla#19582 from ROCm:ci_fix_launch_dim_20241121 9a46402
PiperOrigin-RevId: 709138523
MIOpen needs more memory for autotuning when matrix sizes are larger
@hsharsha

The following tests are failing:

//xla/service/gpu/fusions/triton:dot_algorithms_test_gpu_amd_any
//xla/service/gpu/fusions/triton:triton_fusion_emitter_device_legacy_test_gpu_amd_any
//xla/service/gpu/fusions/triton:triton_fusion_emitter_parametrized_test_gpu_amd_any
//xla/service/gpu/fusions/triton:triton_support_legacy_test_gpu_amd_any
//xla/service/gpu/fusions/triton:triton_support_test
//xla/service/gpu/tests:gpu_kernel_tiling_test_gpu_amd_any
//xla/service/gpu/tests:gpu_triton_custom_call_test_gpu_amd_any
//xla/service/gpu/transforms:dot_dimension_sorter_test_gpu_amd_any

@Ruturaj4 Ruturaj4 left a comment

We probably need to cherry-pick this onto the 0.5.0 branch. I didn't run it myself, but it looks good.

@hsharsha hsharsha merged commit 1f9643e into rocm-jaxlib-v0.4.38 Jan 23, 2025
6 of 9 checks passed
4 participants