Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable parthenon::par_reduce for MD loops with Kokkos 1D Range #1130

Merged

Conversation

acreyes
Copy link
Contributor

@acreyes acreyes commented Jun 24, 2024

PR Summary

Replaces KOKKOS_LAMBDAs with templated functors in par_dispatch(LoopPatternFlatRange, ...) that deduce the signature of the provided function to allow for extra arguments used in reductions.

Makes it possible to use par_reduce(DEFAULT_LOOP_PATTERN, ..) & par_reduce(name, ...)

PR Checklist

  • Code passes cpplint
  • New features are documented.
  • Adds a test for any bugs fixed. Adds tests for new features.
  • Code is formatted
  • Changes are summarized in CHANGELOG.md
  • Change is breaking (API, behavior, ...)
    • Change is additionally added to CHANGELOG.md in the breaking section
    • PR is marked as breaking
    • Short summary API changes at the top of the PR (plus optionally with an automated update/fix script)
  • CI has been triggered on Darwin for performance regression tests.
  • Docs build
  • (@lanl.gov employees) Update copyright on changed files

Copy link
Collaborator

@Yurlungur Yurlungur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice fix. Thanks for implementing it!

Copy link
Collaborator

@lroberts36 lroberts36 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the fix!

Comment on lines 269 to 288
template <typename Function, typename R, typename T, typename Index, typename... FArgs>
class FlatFunctor<Function, R (T::*)(Index, Index, Index, FArgs...) const> {
int NjNi, Nj, Ni, kl, jl, il;
Function function;

public:
FlatFunctor(const Function _function, const int _NjNi, const int _Nj, const int _Ni,
const int _kl, const int _jl, const int _il)
: function(_function), NjNi(_NjNi), Nj(_Nj), Ni(_Ni), kl(_kl), jl(_jl), il(_il) {}
KOKKOS_INLINE_FUNCTION
void operator()(const int &idx, FArgs &&...fargs) const {
int k = idx / NjNi;
int j = (idx - k * NjNi) / Ni;
int i = idx - k * NjNi - j * Ni;
k += kl;
j += jl;
i += il;
function(k, j, i, std::forward<FArgs>(fargs)...);
}
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not for this PR, but I wonder if we could make this work for an arbitrary dimensional index space using IndexRange in utils/ with a little bit of template magic.

Copy link
Contributor Author

@acreyes acreyes Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could work. This seems to work and at least gets part way there https://godbolt.org/z/fbYP5vW6r

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (https://godbolt.org/z/n48951jGK) seems to work, but I didn't explicitly pull out the function signature like you did (which I think matters for references).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I remember cuda builds had given me trouble with that. This one uses some more template helper structs to also handle IndexRanges in the constructor. I imagine you could use similar things to handle all the par_dispatch overloads as well

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saved for future reference here #1134

@Yurlungur Yurlungur enabled auto-merge June 24, 2024 19:46
auto-merge was automatically disabled June 25, 2024 06:10

Head branch was pushed to by a user without write access

@pdmullen pdmullen enabled auto-merge June 25, 2024 12:09
Copy link
Collaborator

@pgrete pgrete left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Out of curiosity, was there a specific use and/or performance consideration that inspired these changes?

src/kokkos_abstraction.hpp Outdated Show resolved Hide resolved
auto-merge was automatically disabled July 3, 2024 13:24

Head branch was pushed to by a user without write access

@acreyes
Copy link
Contributor Author

acreyes commented Jul 3, 2024

Thanks for the PR!

Out of curiosity, was there a specific use and/or performance consideration that inspired these changes?

Not any performance consideration. I was wanting to use par_reduce with similar default arguments as I was with par_for

@pgrete pgrete enabled auto-merge (squash) July 4, 2024 07:11
@pgrete pgrete disabled auto-merge July 4, 2024 12:49
@pgrete
Copy link
Collaborator

pgrete commented Jul 4, 2024

All tests pass. I'm force merging.

@pgrete pgrete merged commit 801766e into parthenon-hpc-lab:develop Jul 4, 2024
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants