We hard-code a workgroup size of 64, even if the group has capacity 1. In that case 63 of the 64 GPU threads go to waste. We should cap the workgroup size at the group's capacity.
Good point. On the other hand, the hardware never spawns fewer threads than its internal scheduling granularity (the warp or wavefront size), which is at least 32 for some vendors and 64 for others, so most of those threads go to waste either way. A hard-coded size also slightly simplifies a few places (hard-coded conversions instead of having to inject a size).
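To put rough numbers on the trade-off, here is a minimal sketch of the arithmetic, not the project's actual code. The function names are hypothetical, and the hardware widths of 32 and 64 are simply the figures quoted above; real scheduling behavior varies by vendor and driver.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounding up."""
    return (a + b - 1) // b


def capped_workgroup_size(capacity: int, max_size: int = 64) -> int:
    """Proposed fix: cap the workgroup size at the group's capacity (at least 1)."""
    return max(1, min(capacity, max_size))


def idle_lanes(capacity: int, workgroup_size: int, hw_width: int) -> int:
    """Lanes left idle when dispatching enough workgroups to cover `capacity`.

    Assumption: the hardware rounds each workgroup up to a multiple of
    `hw_width` (the warp or wavefront size).
    """
    groups = ceil_div(capacity, workgroup_size)
    lanes_per_group = ceil_div(workgroup_size, hw_width) * hw_width
    return groups * lanes_per_group - capacity


if __name__ == "__main__":
    capacity = 1
    for hw_width in (32, 64):
        fixed = idle_lanes(capacity, 64, hw_width)
        capped = idle_lanes(capacity, capped_workgroup_size(capacity), hw_width)
        print(f"width {hw_width}: hard-coded 64 idles {fixed}, capped idles {capped}")
```

Under these assumptions, capping helps only on 32-wide hardware (31 idle lanes instead of 63) and changes nothing on 64-wide hardware, which is consistent with the point that threads are wasted either way.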