Fix 825 #850
@roiser, ok this is good to be squashed and merged. Olivier
…aph5#850 - NB Parameters.h changes in (almost?) all processes!
Hi @oliviermattelaer, I had a look since I saw that you posted this. I confirm that #825 seems fixed: the cross section from Fortran and cuda/c++ now matches in susy_gg_tt. Conversely, for #826 I still see that there is NO cross section for susy_gg_t1t1, so I am a bit surprised when you say that you do see one. Maybe it depends on the number of events and the exact setup? Anyway, I agree with you that this needs more investigation. One point: I see that this patch also changes the Parameters.h files in other processes, including eemumu. Is this expected? I will run a few tests on all processes and see what this gives. Just in case, this is in https://github.com/valassi/madgraph4gpu/tree/susy (but I will not make a PR out of this, at least not yet). Thanks! Andrea
Hi Andrea, a change in Parameters.h sounds surprising (I actually do not see any of those files changed in your repo). For #826, I will "first" try to fix the mismatch that I do observe on my laptop, and then we might need to work together to understand why your machine is just crashing (if it still happens at that stage, obviously).
OOPS! SORRY! Yes, I meant that HelAmps.h changed, not Parameters.h!
…aph5#850 - note that HelAmps.h changes in (almost?) all processes! (NB it is HelAmps.h that changes and not Parameters.h as I wrote elsewhere by mistake)
…adgraph5#850 fixing madgraph5#825 - all ok no change
… susy xsec mismatch madgraph5#825: add Ccoeff to all HelAmps.h files
For me this is good to go and should be merged. However, it will then also need an update of the reference files for runTest (#859), which is in my PR #860. So I suggest,
I merged my #860. As I expected, GitHub automatically closed this one as merged too. Thanks Olivier!
…raph5#850 and madgraph5#860 for xsec mismatch madgraph5#825) into susy2 Fix conflicts: epochX/cudacpp/CODEGEN/generateAndCompare.sh
…raph5#850 and madgraph5#860 for xsec mismatch madgraph5#825) into tmad
…raph5#850 and madgraph5#860 for xsec mismatch madgraph5#825) into susy Fix conflicts: epochX/cudacpp/CODEGEN/generateAndCompare.sh
…aph5#860 and madgraph5#850 for Ccoeff madgraph5#825)

STARTED AT Mon Jun 3 05:51:20 PM CEST 2024
./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean
ENDED(1) AT Mon Jun 3 06:12:34 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean
ENDED(2) AT Mon Jun 3 06:20:51 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean
ENDED(3) AT Mon Jun 3 06:29:05 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst
ENDED(4) AT Mon Jun 3 06:31:55 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst
ENDED(5) AT Mon Jun 3 06:34:42 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -common
ENDED(6) AT Mon Jun 3 06:37:37 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -mix -hrd -makej -susyggtt -susyggt1t1 -smeftggtttt -heftggbb -makeclean
ENDED(7) AT Mon Jun 3 06:47:12 PM CEST 2024 [Status=0]

No errors found in logs
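A summary like the one above can be double-checked mechanically by scanning for any step whose status is not zero. This is only a minimal sketch, assuming the `ENDED(n) AT <date> [Status=k]` marker format visible in the log; the function name `failed_steps` is hypothetical and is not part of the tput scripts.

```python
import re

# Matches markers such as "ENDED(3) AT Mon Jun 3 06:29:05 PM CEST 2024 [Status=0]".
# The marker format is taken from the log above; everything else is illustrative.
ENDED_RE = re.compile(r"ENDED\((\d+)\) AT (.*?) \[Status=(-?\d+)\]")


def failed_steps(log_text):
    """Return the step numbers whose reported status is non-zero."""
    return [
        int(match.group(1))
        for match in ENDED_RE.finditer(log_text)
        if int(match.group(3)) != 0
    ]


sample = (
    "ENDED(1) AT Mon Jun 3 06:12:34 PM CEST 2024 [Status=0]\n"
    "ENDED(2) AT Mon Jun 3 06:20:51 PM CEST 2024 [Status=0]\n"
)
print(failed_steps(sample))
```

Because the pattern uses `finditer`, it works whether the markers are on separate lines or run together on one line.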
…g AS-IS Olivier's patches from the latest fix_826 branch for PR madgraph5#850

The gg_ttgg test still crashes (rotxxx madgraph5#855?)
./tmad/madX.sh -ggttgg -iconfig 104 -makeclean
*** (2-none) EXECUTE MADEVENT_CPP x1 (create events.lhe) ***
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
Backtrace for this error:
0 0x7fce5ec23860 in ???
1 0x7fce5ec22a05 in ???
2 0x7fce5e854def in ???
3 0x44b5ff in ???
4 0x4087df in ???
5 0x409848 in ???
6 0x40bb83 in ???
7 0x40d1a9 in ???
8 0x45c804 in ???
9 0x434269 in ???
10 0x40371e in ???
11 0x7fce5e83feaf in ???
12 0x7fce5e83ff5f in ???
13 0x403844 in ???
14 0xffffffffffffffff in ???
./tmad/madX.sh: line 387: 3913008 Floating point exception(core dumped) $timecmd $cmd < ${tmpin} > ${tmp}

The susy_gg_t1t1 test also still crashes (see madgraph5#826?), this looks like the same crash as ggttgg above
./tmad/madX.sh -susyggt1t1 -iconfig 2 -makeclean
*** (2-none) EXECUTE MADEVENT_CPP x1 (create events.lhe) ***
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
Backtrace for this error:
0 0x7f9f03423860 in ???
1 0x7f9f03422a05 in ???
2 0x7f9f03054def in ???
3 0x43809f in ???
4 0x40581f in ???
5 0x4067b1 in ???
6 0x408c71 in ???
7 0x40a0a9 in ???
8 0x444fdf in ???
9 0x42bb38 in ???
10 0x40371e in ???
11 0x7f9f0303feaf in ???
12 0x7f9f0303ff5f in ???
13 0x403844 in ???
14 0xffffffffffffffff in ???
./tmad/madX.sh: line 387: 3907179 Floating point exception(core dumped) $timecmd $cmd < ${tmpin} > ${tmp}

The gqttq test also still crashes intermittently, i.e. only on the second execution (madgraph5#845?)
./tmad/teeMadX.sh -gqttq +10x -fltonly -makeclean
./tmad/teeMadX.sh -gqttq +10x -fltonly
Executing ' ./build.512z_f_inl0_hrd0/madevent_cpp < /tmp/avalassi/input_gqttq_x1_cudacpp > /tmp/avalassi/output_gqttq_x1_cudacpp'
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
Backtrace for this error:
0 0x7fbafa623860 in ???
1 0x7fbafa622a05 in ???
2 0x7fbafa254def in ???
3 0x7fbafad24034 in ???
4 0x7fbafa9a1575 in ???
5 0x7fbafad20c89 in ???
6 0x7fbafad2abfd in ???
7 0x7fbafad30491 in ???
8 0x43008b in ???
9 0x431c10 in ???
10 0x432d47 in ???
11 0x433b1e in ???
12 0x44a921 in ???
13 0x42ebbf in ???
14 0x40371e in ???
15 0x7fbafa23feaf in ???
16 0x7fbafa23ff5f in ???
17 0x403844 in ???
18 0xffffffffffffffff in ???
./madX.sh: line 387: 3922797 Floating point exception(core dumped) $timecmd $cmd < ${tmpin} > ${tmp}
ERROR! ' ./build.512z_f_inl0_hrd0/madevent_cpp < /tmp/avalassi/input_gqttq_x1_cudacpp > /tmp/avalassi/output_gqttq_x1_cudacpp' failed
This does fix #825, where the problem was that the unary handling was not propagated within the aloha routine.
This patch now includes such handling, but I'm a bit worried about other processes that can be impacted.
This PR (via the CI) is a simple way to test many processes to spot potential issues; if it goes through, I will anyway have to check the corner cases I'm worried about.
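To illustrate why a dropped unary minus on a coupling can show up as a cross section mismatch, here is a toy sketch. It is not the aloha code and not the actual patch: the helper names, the coupling names `GC_A` and `GC_B`, and their values are all made up, and it assumes (which this thread does not spell out) that the fix amounts to carrying a sign coefficient alongside the coupling.

```python
def split_unary(coupling_expr):
    """Split an expression such as '-GC_B' into (sign coefficient, coupling name)."""
    expr = coupling_expr.strip()
    if expr.startswith("-"):
        return -1.0, expr[1:].strip()
    return 1.0, expr


def amplitude(coupling_expr, couplings, propagate_sign=True):
    """Toy amplitude: the coupling value times its sign coefficient."""
    coeff, name = split_unary(coupling_expr)
    if not propagate_sign:
        coeff = 1.0  # mimics a routine that silently drops the unary minus
    return coeff * couplings[name]


# Hypothetical coupling values, chosen only so that the two results differ.
couplings = {"GC_A": 1.0 + 0.5j, "GC_B": 0.5 + 0.25j}
diagrams = ["GC_A", "-GC_B"]

# Squared sum of amplitudes with and without the sign being propagated.
good = abs(sum(amplitude(d, couplings) for d in diagrams)) ** 2
bad = abs(sum(amplitude(d, couplings, propagate_sign=False) for d in diagrams)) ** 2
print(good, bad)
```

The two interfering terms partially cancel when the sign is kept and add up when it is lost, so the squared amplitude changes even though each individual coupling value is the same.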