Describe the bug
When I use WMMA with uint8_t data, the program does not make the call to the local_join function and reports a misaligned-address error: code=716(cudaErrorMisalignedAddress) "cudaDeviceSynchronize()"
Is there support for uint8_t or int8_t matrix calculations in local_join?
I get the error when using uint8_t on the following setup:
NVIDIA GeForce RTX 4090 Driver Version: 560.28.03 CUDA Version: 12.6
I'd also like to ask whether the current NNDescent supports local_join calculations with types other than __half.
Hi @zben777, thanks for opening an issue. Can you please provide a minimal reproducible code snippet or example that we can use to reproduce this behavior? NN-Descent is only instantiated for specific supported types, and it's not header-only. Are you calling public APIs or internal APIs?
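While putting a reproducer together, one thing that may be worth ruling out is the alignment of the arguments passed to the WMMA loads. The CUDA programming guide requires the pointer given to load_matrix_sync to be 256-bit (32-byte) aligned, and the leading dimension to span a multiple of 16 bytes (stated there for __half and float; applying the same byte rule to uint8_t is an assumption here). Whether this is what triggers code=716 in this case is not known. The helper below is a hypothetical host-side check, not part of the library or of CUDA:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a library or CUDA API): returns true when a matrix
// pointer and leading dimension satisfy the alignment rules documented for
// nvcuda::wmma::load_matrix_sync.
//   ptr       : address of the first matrix element handed to WMMA
//   ldm_elems : leading dimension, counted in elements
//   elem_size : sizeof the element type (1 for uint8_t, 2 for __half)
inline bool wmma_args_look_aligned(const void* ptr,
                                   std::size_t ldm_elems,
                                   std::size_t elem_size)
{
  // The pointer must be 256-bit (32-byte) aligned.
  const bool ptr_ok = reinterpret_cast<std::uintptr_t>(ptr) % 32 == 0;
  // The row/column stride must be a multiple of 16 bytes
  // (assumed to carry over to 1-byte element types).
  const bool ldm_ok = (ldm_elems * elem_size) % 16 == 0;
  return ptr_ok && ldm_ok;
}
```

Running this on the device pointer and stride right before the kernel launch (for example, with elem_size = 1 after switching from __half to uint8_t) would show whether halving the element size has changed a byte offset or stride that used to be aligned.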