Compile error when atomic operations are first introduced for reduction in Chapter 9 #35

Open
zzyuanyi opened this issue Sep 22, 2024 · 2 comments

Comments

@zzyuanyi

The compile command I used was: nvcc -O3 -DUSE_DP xxx.cu. The error output is as follows:
9.cu(37): error: no instance of overloaded function "atomicAdd" matches the argument list
argument types are: (real *, real)
atomicAdd(&d_y[0],x[0]);
^
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\include\sm_20_atomic_functions.hpp(82): note #3326-D: function "atomicAdd(float *, float)" does not match because argument #1 does not match parameter
static __inline __declspec(device) float atomicAdd(float *address, float val)
^
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\include\device_atomic_functions.hpp(224): note #3326-D: function "atomicAdd(unsigned long long *, unsigned long long)" does not match because argument #1 does not match parameter
static __inline __declspec(device) unsigned long long int atomicAdd(unsigned long long int *address, unsigned long long int val)
^
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\include\device_atomic_functions.hpp(110): note #3326-D: function "atomicAdd(unsigned int *, unsigned int)" does not match because argument #1 does not match parameter
static __inline __declspec(device) unsigned int atomicAdd(unsigned int *address, unsigned int val)
^
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\include\device_atomic_functions.hpp(105): note #3326-D: function "atomicAdd(int *, int)" does not match because argument #1 does not match parameter
static __inline __declspec(device) int atomicAdd(int *address, int val)

The GPU is a 3060 Laptop and the OS is Windows 11.

@brucefan1983
Owner

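The book should have mentioned that the double-precision atomic add function requires compiling for architecture 6.0 or higher. So the compile command needs to be: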

nvcc -O3 -DUSE_DP -arch=sm_60 xxx.cu
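
For anyone who has to target a GPU below compute capability 6.0, or who wants the file to compile whatever -arch is passed, here is a minimal sketch of one possible workaround. It assumes the book's convention that -DUSE_DP switches `real` to double. The names `atomic_add_real`, `sum_kernel`, and `sum_with_atomics` are hypothetical and not taken from the book. The fallback branch uses the compare-and-swap loop shown in the CUDA C++ Programming Guide, which is typically slower than the native instruction.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumption: precision is selected with -DUSE_DP, as in the issue above.
#ifdef USE_DP
typedef double real;
#else
typedef float real;
#endif

// Hypothetical helper: atomic add on `real` that also compiles when the
// target architecture is below sm_60 and `real` is double.
__device__ real atomic_add_real(real *address, real val)
{
#if defined(USE_DP) && defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 600)
    // No native atomicAdd(double *, double) here: emulate it with atomicCAS.
    unsigned long long int *address_as_ull = (unsigned long long int *)address;
    unsigned long long int old = *address_as_ull;
    unsigned long long int assumed;
    do {
        assumed = old;
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old); // retry if another thread changed the value
    return __longlong_as_double(old);
#else
    // float on any architecture, or double on sm_60 and above.
    return atomicAdd(address, val);
#endif
}

// Each thread adds one element of x into the single accumulator y[0].
__global__ void sum_kernel(const real *x, real *y, int n)
{
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomic_add_real(&y[0], x[i]);
}

// Sums n ones on the GPU and returns the result to the host.
double sum_with_atomics(int n)
{
    real *x = nullptr;
    real *y = nullptr;
    cudaMallocManaged(&x, n * sizeof(real));
    cudaMallocManaged(&y, sizeof(real));
    for (int i = 0; i < n; ++i) x[i] = 1.0;
    y[0] = 0.0;

    const int block_size = 128;
    const int grid_size = (n + block_size - 1) / block_size;
    sum_kernel<<<grid_size, block_size>>>(x, y, n);
    cudaDeviceSynchronize(); // wait before reading managed memory on the host

    const double result = (double)y[0];
    cudaFree(x);
    cudaFree(y);
    return result;
}

int main()
{
    printf("sum = %f\n", sum_with_atomics(1024));
    return 0;
}
```

Built with nvcc -O3 -DUSE_DP xxx.cu and a default target below sm_60, the atomicCAS branch is compiled. Adding -arch=sm_60 or higher selects the native atomicAdd instead, which is the simpler fix when the GPU supports it, as a 3060 does.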

@zzyuanyi
Author

Thank you very much, the problem is solved. Sorry for the trouble.
