Hi,
First of all, thanks for setting up this package :) It's super helpful.
I'm wondering, is there a way to use a smaller block size? I tried modifying the Python code so that no errors are thrown, but I'm hitting a
RuntimeError: CUDA error: an illegal memory access was encountered
error when calling the CUDA kernel. I looked a bit into the kernel code, and it seems that the block_size argument is not used. So I'm curious how the kernel knows to expect a minimum block size of 32.
Any clarifications would be super helpful!
Thanks