quant_cuda_kernel.cu(212): error: identifier "__hfma2" is undefined #23

Open
HueCheng1021 opened this issue May 11, 2023 · 1 comment

Comments

@HueCheng1021

The error in the title is reported when compiling the quant_cuda kernel.

My CUDA toolkit version (from nvcc --version):
Cuda compilation tools, release 12.0, V12.0.140

@efrantar
Member

Our kernels were developed with CUDA 11.4. However, __hfma2 still seems to exist in the newest CUDA API, so unfortunately I am not sure what is causing the error. If you don't need our fastest FP16 kernels (e.g. if you aren't on an A100, for which they were developed), you could try commenting out the corresponding code in quant_cuda.cpp and quant_cuda_kernel.cu and using the FP32 version (i.e. omitting the --faster-kernel option).
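One possible cause, not confirmed in this thread: the half2 arithmetic intrinsics in cuda_fp16.h, including __hfma2, are only available in device code compiled for compute capability 5.3 or higher, so a build that also targets an older architecture can fail with exactly this "undefined" error even on a recent toolkit. The sketch below assumes the extension is built through PyTorch and that the target architectures come from a TORCH_CUDA_ARCH_LIST-style string; the helper name archs_without_half2 is hypothetical and only illustrates the check.

```python
import re

# Assumption: the build targets are given as a TORCH_CUDA_ARCH_LIST-style
# string, e.g. "5.2;6.1;8.0+PTX" (entries separated by ';' or whitespace).
# __hfma2 and the other half2 arithmetic intrinsics need compute
# capability 5.3 or higher in device code.
MIN_HALF2_CAPABILITY = (5, 3)


def archs_without_half2(arch_list):
    """Return the numeric entries of arch_list that are below 5.3.

    Hypothetical helper for illustration; named entries such as
    "Ampere" are skipped rather than interpreted.
    """
    too_old = []
    for entry in re.split(r"[;\s]+", arch_list.strip()):
        match = re.fullmatch(r"(\d+)\.(\d+)(\+PTX)?", entry)
        if match is None:
            continue  # empty or non-numeric entry
        capability = (int(match.group(1)), int(match.group(2)))
        if capability < MIN_HALF2_CAPABILITY:
            too_old.append(entry)
    return too_old
```

If the helper returns any entries for your build configuration, restricting the list to the capability of the GPU you actually use (8.0 for an A100) before rebuilding may be worth trying before falling back to the FP32 kernels.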
