Change the "compute good helicities" algorithm in cudacpp? #565

Open
valassi opened this issue Dec 11, 2022 · 2 comments

Comments

Member

valassi commented Dec 11, 2022

While following up on #563 and #564 (printing out how many good helicities are used in cudacpp and Fortran, and understanding how the calculation of good helicities differs between the two), I realised something I do not like:

  • in cudacpp, the calculation of good helicities was designed for the initial standalone app, where we run cycles over very large grids of events
  • as a consequence, the calculation of good helicities in cudacpp is always done on ONLY one grid of events through the Bridge
  • currently this is not a problem, because we always use at least 16 or 32 events, even in C++; but could there be cases where we use much smaller grids?
  • in any case, there is always the alternative of doing it as in Fortran? There, there is no separate computegoodhelicity call: you just keep computing matrix elements and then at some point you stop?

Probably not... We should probably just hardcode that at least 16 events must be computed, and this must be documented everywhere.
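To make the idea concrete, here is a minimal host-side sketch of helicity filtering on a single grid, with the "at least 16 events" rule enforced as a hard check. This is not the actual cudacpp code: the function name `computeGoodHelicities`, the constant `kMinEventsForHelicityFilter` and the `meContrib[ievt][ihel]` layout are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical minimum grid size, taken from the "at least 16 events" suggestion above
constexpr std::size_t kMinEventsForHelicityFilter = 16;

// meContrib[ievt][ihel] is assumed to hold the |M|^2 contribution of helicity ihel for event ievt.
// Returns one flag per helicity: true if at least one event in the grid has a non-zero contribution.
std::vector<bool> computeGoodHelicities( const std::vector<std::vector<double>>& meContrib )
{
  if( meContrib.size() < kMinEventsForHelicityFilter )
    throw std::invalid_argument( "helicity filtering needs at least 16 events" );
  const std::size_t nhel = meContrib.front().size();
  std::vector<bool> isGoodHel( nhel, false );
  for( const auto& evt : meContrib )
    for( std::size_t ihel = 0; ihel < nhel; ++ihel )
      if( evt[ihel] != 0. ) isGoodHel[ihel] = true; // one non-zero event is enough
  return isGoodHel;
}
```

With a check like this, a caller passing a grid that is too small fails loudly instead of silently getting an unreliable helicity list.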

Opening this issue anyway for info...
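For comparison, the Fortran-style alternative mentioned in the list above (no separate call, just keep computing and stop at some point) could be sketched as below. This is only an illustration of the idea, not the Fortran or cudacpp implementation: the class `IncrementalHelicityFilter` and the `nEventsToScan` threshold are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: scans all helicities for the first nEventsToScan events,
// then freezes the list and only the helicities flagged as good are computed.
class IncrementalHelicityFilter
{
public:
  IncrementalHelicityFilter( std::size_t nhel, std::size_t nEventsToScan )
    : m_isGoodHel( nhel, false ), m_nEventsToScan( nEventsToScan ), m_nEventsSeen( 0 ) {}
  // True while all helicities must still be computed
  bool scanning() const { return m_nEventsSeen < m_nEventsToScan; }
  // Should helicity ihel be computed for the next event?
  bool compute( std::size_t ihel ) const { return scanning() || m_isGoodHel[ihel]; }
  // Record the per-helicity contributions of one event (ignored once scanning is over)
  void record( const std::vector<double>& contribPerHel )
  {
    if( !scanning() ) return;
    for( std::size_t ihel = 0; ihel < m_isGoodHel.size(); ++ihel )
      if( contribPerHel[ihel] != 0. ) m_isGoodHel[ihel] = true;
    ++m_nEventsSeen;
  }
private:
  std::vector<bool> m_isGoodHel;
  std::size_t m_nEventsToScan;
  std::size_t m_nEventsSeen;
};
```

The trade-off is that the scan is spread over ordinary matrix element calls, so it works for any grid size, but the first events are computed with all helicities and are therefore slower.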

Member Author

valassi commented Dec 11, 2022

In any case, it may be useful to remove sigmaKin_getGoodHel. It makes it seem that there is some different, magic operation, while actually it is as simple as this: you compute the first grid of events, and from those you get the good helicities.

The only reason why this has been kept separate is to be able to quote throughputs in MEs/s that are computed only with the correct number of helicities...

Or maybe this is actually a reasonable requirement and we had better keep sigmaKin_getGoodHel...

Member Author

valassi commented Dec 11, 2022

Yes, actually this was added explicitly, see #461. With CUDA, and only one cycle of one grid, your apparent throughputs are a factor 2 lower if you do not precompute helicities.

**Essentially: in a cudacpp Bridge, the first grid goes through the ME calculation twice, the first time for helicity filtering, the second time for MEs.**

Functionally, the above makes no sense. But since we are focusing on throughputs, and we often run only one grid in CUDA, it is important to keep the two separate.
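The factor 2 can be illustrated with a small back-of-the-envelope sketch. It assumes, purely for illustration, that the helicity filtering pass and the ME pass over the single grid take about the same time. The struct `GridTimings` and both function names are hypothetical and are not part of cudacpp.

```cpp
#include <cstddef>

// Hypothetical timings for a run with one cycle of one grid
struct GridTimings
{
  std::size_t nEvents;     // number of events in the grid
  double helFilterSeconds; // time spent in the helicity filtering pass
  double meSeconds;        // time spent in the ME pass
};

// Throughput in MEs/s when helicity filtering is timed separately
double throughputMEsOnly( const GridTimings& t )
{
  return t.nEvents / t.meSeconds;
}

// Apparent throughput in MEs/s when the filtering pass is counted in the ME time
double throughputInclHelFilter( const GridTimings& t )
{
  return t.nEvents / ( t.helFilterSeconds + t.meSeconds );
}
```

For example, with 16384 events and 0.5 s in each pass, the first function gives 32768 MEs/s and the second 16384 MEs/s. Over many cycles the filtering cost is amortised and the two numbers converge, which is why the distinction matters mostly for single-grid runs.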
