regent_cuda.cu build failure on PPC #1511
Comments
@elliottslaughter could you please add this to #1032? thanks |
@muraj What was the motivation for adding the |
Same questions for other parts of this merge request for PPC: |
@cmelone as a workaround you could try building with |
I get the same issue building with that flag |
Ok, unset that and try commenting out: https://gitlab.com/StanfordLegion/legion/-/blob/master/runtime/realm/timers.h?ref_type=heads#L31 |
that allows it to build successfully |
Ok, so then we seem to have two issues:
|
I think this is not actually the issue. As you can see in the commit you referenced, the commit not only added the
|
@cmelone it might help us track this down to know what OS, distro and compiler you are on (including versions, as appropriate). |
Lassen is using RHEL 7.9 and GCC 8.3.1
|
@cmelone Sorry about this, I don't have a PPC build or test in CI for this. __ppc_get_timebase_freq should come from sys/platform/ppc.h, which is included on line 44 of timers.inl. I had originally used some inline assembly for this path, but there wasn't a lot of concrete documentation on how to retrieve the PPC timebase frequency for calibration, so I figured it would be okay to use the glibc builtins for this. As for why REALM_TIMERS_USE_RDTSC=0 doesn't work, I'm not sure; I'm looking into it now. I think there are still some #ifdefs rather than #ifs in timers.inl, which is causing the problem. Gimme just a moment to fix. |
@cmelone can you also provide the version of glibc you have? I am seeing the following in ppc.h: This is where the __builtin_ppc_get_timebase is being referenced, which seems to require at least gcc 4.8. What I wonder is if nvcc is actually causing an issue here. Can you try compiling the following code snippet from godbolt (which seems to work on all the gcc versions supported there) and see if that works for you? If so, then maybe we need to not use nvcc as our main compiler for all our source files (that's usually not a good idea in general...). https://godbolt.org/z/6e7hMG9ja Thanks and apologies for the issue. |
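For readers without a PPC machine, the glibc path discussed above can be sketched roughly as follows. This is not the godbolt snippet linked in the comment and not Realm's actual timer code; `demo_ticks`, `demo_ticks_per_second`, and `demo_elapsed_seconds` are hypothetical names, and the `std::chrono` fallback is an assumption added only so the sketch compiles on other architectures.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// On PowerPC with glibc, sys/platform/ppc.h declares __ppc_get_timebase()
// and __ppc_get_timebase_freq(). Other platforms take the chrono fallback.
#if defined(__powerpc__) && defined(__GLIBC__)
#include <sys/platform/ppc.h>
#define DEMO_HAVE_PPC_TIMEBASE 1
#else
#define DEMO_HAVE_PPC_TIMEBASE 0
#endif

// Hypothetical helper: read a raw tick counter.
inline std::uint64_t demo_ticks() {
#if DEMO_HAVE_PPC_TIMEBASE
  return __ppc_get_timebase();
#else
  return static_cast<std::uint64_t>(
      std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Hypothetical helper: ticks per second for the counter above.
inline std::uint64_t demo_ticks_per_second() {
#if DEMO_HAVE_PPC_TIMEBASE
  return __ppc_get_timebase_freq();
#else
  // Assumes the clock period is a fraction of a second (den >= num).
  using period = std::chrono::steady_clock::period;
  return static_cast<std::uint64_t>(period::den / period::num);
#endif
}

// Convert a tick interval to seconds using the reported frequency.
inline double demo_elapsed_seconds(std::uint64_t start, std::uint64_t end) {
  const std::uint64_t freq = demo_ticks_per_second();
  assert(freq != 0);
  return static_cast<double>(end - start) / static_cast<double>(freq);
}
```

Compiling a file like this with both g++ and nvcc on the affected machine is one way to check whether the header itself is usable in that environment.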
@elliottslaughter the build still fails with the RDTSC define set to zero because the include itself is what causes the failure, and it was only guarded by an #ifdef, not an #if. Fix incoming for that. |
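The difference between the two guards can be shown with a minimal sketch. `DEMO_USE_RDTSC` is a hypothetical stand-in for a flag such as REALM_TIMERS_USE_RDTSC=0; this is not the code in timers.inl.

```cpp
#include <cassert>

// Hypothetical stand-in for passing -DDEMO_USE_RDTSC=0 on the command line.
#define DEMO_USE_RDTSC 0

// #ifdef only asks whether the macro is defined, so a value of 0 still
// enables whatever is guarded (for example, a platform-specific #include).
#ifdef DEMO_USE_RDTSC
constexpr bool guarded_by_ifdef = true;
#else
constexpr bool guarded_by_ifdef = false;
#endif

// #if evaluates the macro's value, so 0 disables the guarded code.
#if DEMO_USE_RDTSC
constexpr bool guarded_by_if = true;
#else
constexpr bool guarded_by_if = false;
#endif

// True when the two guards disagree, which is the situation described above.
inline bool guards_disagree() {
  return guarded_by_ifdef != guarded_by_if;
}

// Runtime check of the expected outcome for a macro defined as 0.
inline void check_guards() {
  assert(guarded_by_ifdef);
  assert(!guarded_by_if);
}
```

With the macro defined as 0, the `#ifdef` branch is still compiled while the `#if` branch is skipped, so a problematic include placed under `#ifdef` would still be pulled in.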
@muraj, you're all good! I'm happy to test anything on PPC in the future if that would be helpful. Lassen's glibc version is 2.17. I ran the test snippet and it compiles with both g++ and nvcc. |
That's really weird that the test snippet works, but the same thing in realm doesn't. Also, I double checked, glibc code here hasn't changed in 2.17. I'm really not sure why this would fail. Are you sure you're using the same environment? Also, noticed this: |
Yup, 100% sure they are the same environment. this is what I'm running to compile the snippet:
both succeed |
Okay, I have no idea how that is possible. Anyway, I just merged a change to master to be able to skip over this path and effectively disable rdtsc for ppc, so you can try disabling it as Sean suggested earlier with the latest master branch. |
sounds good, thanks. to illustrate, this is how I'm setting the environment. I doubt it, but I'm not sure if
|
Can confirm that the extra flag works to compile Legion. The solver itself fails to build with the same error, and adding the flag doesn't seem to help. This machine is where the vast majority of our users run the code and, if possible, I'd like to avoid adding to our already complex build instructions given that this issue wasn't present a couple of weeks ago. Is there anything else I can do to help debug this, or is this something on the system side that needs to be resolved? Thanks again for all your efforts. Edit: I went back to re-verify the old builds, which rules out any issue with the system itself. Succeeds: |
Hi @muraj just following up |
Running control replication on Lassen. This error started popping up after 79ef214c. I think it might be related to this change, but not too sure.