[Reduction algorithm in Chapter 8] Why is there no out-of-bounds access when the array length is not an integer multiple of block_size? #38

Open
Hukongtao opened this issue Nov 28, 2024 · 5 comments

Comments

@Hukongtao

Code snippet:
https://github.com/brucefan1983/CUDA-Programming/blob/master/src/08-shared-memory/reduce2gpu.cu#L48
image
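If the array length is not evenly divisible by block_size, then for the last thread block, won't tid + offset exceed block_size? Why doesn't this go out of bounds? The computed result is also correct.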

@fever-Wong

fever-Wong commented Nov 28, 2024 via email

@brucefan1983
Owner

If it is not evenly divisible, there will indeed be a problem. That is why the book goes on to discuss the shared-memory approach.
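To make the shared-memory approach mentioned above concrete, here is a minimal sketch of a block-level reduction whose load into shared memory is guarded by a bounds check, so the last block stays safe when the array length is not a multiple of the block size. The names `reduce_shared_guarded`, `sum_on_gpu`, and `BLOCK_SIZE` are hypothetical and chosen for illustration; this is not a copy of the repository's code, and error checking on the CUDA calls is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumed block size; it must be a power of two for the halving loop below.
constexpr int BLOCK_SIZE = 128;

// Each block sums up to BLOCK_SIZE elements of d_x into d_y[blockIdx.x].
__global__ void reduce_shared_guarded(const float *d_x, float *d_y, int n)
{
    const int tid = threadIdx.x;
    const int idx = blockIdx.x * blockDim.x + tid;

    __shared__ float s_y[BLOCK_SIZE];
    // Threads past the end of the array contribute 0 instead of reading d_x.
    s_y[tid] = (idx < n) ? d_x[idx] : 0.0f;
    __syncthreads();

    // Tree reduction entirely inside shared memory, so every index is < BLOCK_SIZE.
    for (int offset = blockDim.x >> 1; offset > 0; offset >>= 1)
    {
        if (tid < offset)
        {
            s_y[tid] += s_y[tid + offset];
        }
        __syncthreads();
    }

    if (tid == 0)
    {
        d_y[blockIdx.x] = s_y[0];
    }
}

// Hypothetical host helper: launches the kernel, then adds the per-block
// partial sums on the CPU. Assumes h_x is non-empty.
float sum_on_gpu(const std::vector<float> &h_x)
{
    const int n = static_cast<int>(h_x.size());
    const int grid_size = (n + BLOCK_SIZE - 1) / BLOCK_SIZE; // round up

    float *d_x = nullptr;
    float *d_y = nullptr;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, grid_size * sizeof(float));
    cudaMemcpy(d_x, h_x.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    reduce_shared_guarded<<<grid_size, BLOCK_SIZE>>>(d_x, d_y, n);

    std::vector<float> h_y(grid_size);
    cudaMemcpy(h_y.data(), d_y, grid_size * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);

    float total = 0.0f;
    for (float partial : h_y)
    {
        total += partial;
    }
    return total;
}

int main()
{
    std::vector<float> h_x(1000, 1.0f); // 1000 is not a multiple of 128
    printf("sum = %f\n", sum_on_gpu(h_x));
    return 0;
}
```

The design choice is that the only global-memory read happens behind `idx < n`; padding the tail with zeros leaves the sum unchanged, and the reduction loop itself never touches global memory.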

@brucefan1983
Owner

The book emphasizes this point as well. You can read the PDF manuscript in the src/ folder.

@Hukongtao
Author

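> The book emphasizes this point as well. You can read the PDF manuscript in the src/ folder.

Right, I saw that the book says the length needs to divide evenly. But when I ran it with an example that does not divide evenly, the result was still fine, so I am rather confused.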

@brucefan1983
Owner

Then perhaps it was just luck.
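To see where the risk comes from, the host-only sketch below counts the reads in the last block that would land at or beyond the end of the array. It assumes the linked line has the form `x[tid] += x[tid + offset]` with `x = d_x + blockDim.x * blockIdx.x`, and that the grid size is rounded up; both are assumptions, since the thread does not quote the code. The function name `count_out_of_range_reads` is hypothetical.

```cuda
#include <cstdio>

// Counts how many reads x[tid + offset] in the last block fall at or beyond
// global index n, for the assumed in-place global-memory reduction.
int count_out_of_range_reads(int n, int block_size)
{
    const int grid_size = (n + block_size - 1) / block_size;  // assumed: rounded up
    const int block_start = (grid_size - 1) * block_size;     // first element of the last block
    int count = 0;
    for (int offset = block_size >> 1; offset > 0; offset >>= 1)
    {
        for (int tid = 0; tid < offset; ++tid)
        {
            // The local index tid + offset is always < block_size,
            // but the global index can still reach past n.
            if (block_start + tid + offset >= n)
            {
                ++count;
            }
        }
    }
    return count;
}

int main()
{
    printf("n = 1000: %d out-of-range reads\n", count_out_of_range_reads(1000, 128));
    printf("n = 1024: %d out-of-range reads\n", count_out_of_range_reads(1024, 128));
    return 0;
}
```

Under these assumptions the local index `tid + offset` never exceeds the block size; it is the global index that passes the end of the array. Such reads are undefined behavior rather than a guaranteed failure, so a correct-looking sum depends on whatever happens to sit in that memory, which is consistent with the "luck" explanation.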
