Unable to compile for SU(4) for A100 #94
Comments
Trying some basic shortening reveals a subsequent |
Hi @edbennett, the issue you are encountering is related to the actual size of the parameters, not their names. |
Thanks for explaining, Antonin; I made assumptions based on the compiler's relatively unhelpful message.
Places where this occurs:
Modules/MContraction/Gamma3pt:
Modules/MContraction/WeakEye3pt:
Modules/MSource/Gauss:
For Alessandro's code the first two aren't needed, but the last one is. In each case this is the complete error (In the above function names This is the
|
Thanks. Figuring out which lines of code these lambda functions came from would be useful, and could help produce a minimal reproducible example to report on the Grid side. On the Hadrons side we can't do much about it, but if you do not need these specific modules you could deactivate them in the list of things to compile. Also, can you confirm that you compiled the Nc=3 version without encountering this error? |
I have now tested that and indeed do not encounter the issue.
OK, I've narrowed the error in
…which seems a very strange place to encounter an issue. Minimal failing example based on this:
Compiled with (or, more accurately, attempted unsuccessfully to compile with):
(This does compile successfully when using a Grid build with Nc=3.) |
Hi @edbennett, thanks, I think I understand now. In Grid, all arguments of an expression are captured by value and made into a CUDA kernel. A double-precision SU(4) propagator has size 16 (Nc² colour) x 16 (spin) x 16 B = 4096 B, and in the expression you shared the identity matrix has this type. Avoiding this kind of constant site value in expressions might solve the issue. |
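The size arithmetic above can be checked with a short sketch. The helper `propagatorSiteBytes` and the constants are hypothetical names introduced here for illustration; they are not part of Grid or Hadrons. The 4096-byte figure assumes the 4 KB kernel parameter limit that CUDA toolkits have traditionally enforced, which is presumably what the compiler error refers to.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not a Grid or Hadrons function.
// One propagator site holds (Nc x Nc) colour times (Ns x Ns) spin complex numbers.
constexpr std::size_t propagatorSiteBytes(std::size_t nc,
                                          std::size_t bytesPerComplex,
                                          std::size_t ns = 4)
{
    return (nc * nc) * (ns * ns) * bytesPerComplex;
}

// Assumption: 4 KB of parameter space is available per CUDA kernel launch.
constexpr std::size_t kKernelParamLimitBytes = 4096;

// A double-precision complex number is two 8-byte doubles.
constexpr std::size_t kComplexDoubleBytes = 16;

// Nc = 3: 9 x 16 x 16 B = 2304 B, comfortably below the assumed limit.
static_assert(propagatorSiteBytes(3, kComplexDoubleBytes) == 2304,
              "SU(3) double-precision propagator site");

// Nc = 4: 16 x 16 x 16 B = 4096 B, which already uses the whole assumed limit
// and leaves no room for any other kernel argument.
static_assert(propagatorSiteBytes(4, kComplexDoubleBytes) == kKernelParamLimitBytes,
              "SU(4) double-precision propagator site");
```

This is consistent with the Nc=3 build compiling while the Nc=4 build does not: a single site-valued constant captured by value already fills the parameter space for SU(4).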
In principle, presumably setting this in single precision would halve the size, which would allow going up to SU(5), and since in this case it is an identity matrix, this should result in no loss of precision. (Of course, a better solution would be needed to go to general
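As a rough check of the SU(5) estimate, the sketch below finds the largest Nc for which one propagator site stays strictly below an assumed 4096-byte kernel parameter limit, in double and in single precision. `siteBytes` and `largestFittingNc` are hypothetical helpers written for this illustration, not Grid or Hadrons functions, and the strict inequality is an assumption reflecting that a kernel also needs room for its other arguments.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

// Hypothetical helper: bytes in one propagator site with 4 x 4 spin
// and Nc x Nc colour components of std::complex<Real>.
template <typename Real>
constexpr std::size_t siteBytes(std::size_t nc)
{
    return (nc * nc) * 16 * sizeof(std::complex<Real>);
}

// Hypothetical helper: largest Nc whose site is strictly smaller than
// limitBytes (assumed 4 KB CUDA kernel parameter limit by default).
template <typename Real>
std::size_t largestFittingNc(std::size_t limitBytes = 4096)
{
    std::size_t nc = 0;
    while (siteBytes<Real>(nc + 1) < limitBytes) {
        ++nc;
    }
    return nc;
}
```

Under these assumptions, `largestFittingNc<double>()` gives 3 and `largestFittingNc<float>()` gives 5, since a single-precision SU(5) site takes 25 x 16 x 8 B = 3200 B while SU(6) would take 4608 B.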
I'm working with @LupoA trying to benchmark Hadrons on Tursa, and am hitting an issue where the nest of templates seemingly prevents Hadrons from compiling with CUDA. A number of modules (including MGauss, which we need) give the error:
CPU compilation is fine.
Do you have any idea how fixing this could be approached, beyond going into Grid and renaming all of the type names to something shorter?
Thanks!