CUDA LLVM Debug Info Segfault #576
So James and I just ran into this when playing around with your reproducer from #511 (comment). Running the above with …
This happens for simple CUDA code, and presumably is related to the recent GPUCompiler.jl/Enzyme debug info pieces:

```julia
using CUDA
using Enzyme

function mul_kernel(A)
    i = threadIdx().x
    if i <= length(A)
        A[i] *= A[i]
    end
    return nothing
end

function grad_mul_kernel(A, dA)
    Enzyme.autodiff_deferred(mul_kernel, Const, Duplicated(A, dA))
    return nothing
end

A = CUDA.ones(64)
dA = similar(A)
dA .= 1
@cuda threads=length(A) grad_mul_kernel(A, dA)
```
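For anyone trying to get more context out of the failing launch, one option (a sketch, reusing `grad_mul_kernel`, `A`, and `dA` from the block above) is to turn on Enzyme's IR dump via `Enzyme.API.printall!(true)`, the switch mentioned further down in this thread, before re-running the kernel:

```julia
using CUDA, Enzyme

# Ask Enzyme to print the IR it produces while differentiating;
# this is the flag referenced later in the thread.
Enzyme.API.printall!(true)

# Same launch as in the reproducer above (grad_mul_kernel, A, dA as defined there).
@cuda threads=length(A) grad_mul_kernel(A, dA)
```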
The function has no debuginfo attached, and the NVPTX backend failed on it. The function itself has no caller anymore and got inlined into the kernel function.
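One way to double-check the missing-debuginfo observation (a sketch, not something prescribed in this thread, and assuming the reproducer above is loaded) is to dump the LLVM module that gets handed to the NVPTX backend with CUDA.jl's `@device_code_llvm` reflection macro and look for `!dbg` attachments on the Enzyme-generated function:

```julia
using CUDA, Enzyme

# Dump the full optimized LLVM module for the launch below; grepping the
# output for "!dbg" shows whether debug locations survived differentiation.
CUDA.@device_code_llvm dump_module=true begin
    @cuda threads=length(A) grad_mul_kernel(A, dA)
end
```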
@wsmoses already post-Enzyme we are missing …
I am on Enzyme 0.10.15, CUDA 3.12.1 and Julia 1.8.2. Using `sync_threads` in a GPU kernel causes a segfault.

Error with `Enzyme.API.printall!(true)`: …

CUDA version info: …
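The reproducer from the original report is not included in the text above; as a rough sketch of the kind of kernel being described (one that calls `sync_threads()`, differentiated via `autodiff_deferred` in the same style as the code earlier in the thread), it would look something like the following. The names `sync_mul_kernel` and `grad_sync_mul_kernel` are illustrative, not the exact code from the report:

```julia
using CUDA
using Enzyme

# Illustrative kernel: same shape as mul_kernel above, but with a
# sync_threads() call, which is what the report says triggers the
# segfault when the kernel is differentiated.
function sync_mul_kernel(A)
    i = threadIdx().x
    if i <= length(A)
        A[i] *= A[i]
    end
    sync_threads()
    return nothing
end

function grad_sync_mul_kernel(A, dA)
    Enzyme.autodiff_deferred(sync_mul_kernel, Const, Duplicated(A, dA))
    return nothing
end

A = CUDA.ones(64)
dA = similar(A)
dA .= 1
@cuda threads=length(A) grad_sync_mul_kernel(A, dA)
```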