Draft kokkos-comm initialization/finalization using MPI_Sessions #68

Draft
wants to merge 20 commits into
base: develop

dssgabriel
Collaborator

This PR is a first attempt at initialization and finalization for kokkos-comm relying on MPI_Sessions.

Some noteworthy additions brought by this PR:

  • a Communicator class that wraps the MPI_Comm and a Kokkos execution space
  • a Universe class that holds the MPI session handle, as well as a communicator associated with that session.

I expect lots of changes in the API, which I find kind of clunky at the moment. Reviews and comments on how to improve are welcome.

@dssgabriel marked this pull request as draft May 23, 2024 14:40

comm_kind = CommunicatorKind::User;
} else {
fprintf(stderr, "[KokkosComm] error: intercommunicators are not supported (yet).\n");
std::terminate();

Collaborator

Don't just terminate. Either throw an exception or call MPI_Abort. I've seen applications hang because one process aborted without calling MPI_Abort, and it's nasty.

Collaborator Author

Any specific error code you suggest?

Collaborator

Not really, anything non-zero would work. This ties in with #29 so anything you do here is probably temporary anyway :)
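
As a rough illustration of the "throw an exception" option, here is a minimal sketch. The helper `classify_communicator` and its `is_inter` parameter are hypothetical and not part of the PR; `is_inter` stands for the flag that `MPI_Comm_test_inter` would report. The enum is a stand-in that lists only the `User` enumerator visible in the diff, so the sketch compiles without MPI.

```cpp
#include <stdexcept>

// Stand-in enum: only `User` appears in the diff under review.
enum class CommunicatorKind { User };

// Hypothetical helper. `is_inter` stands for the flag that
// MPI_Comm_test_inter would report for the communicator being wrapped.
inline auto classify_communicator(bool is_inter) -> CommunicatorKind {
  if (!is_inter) {
    return CommunicatorKind::User;
  }
  // Throwing lets the caller decide how to shut down. The other option from
  // the review is MPI_Abort(comm, 1); any non-zero error code works.
  throw std::runtime_error("[KokkosComm] error: intercommunicators are not supported (yet).");
}
```

If the exception is never caught, the process still ends through `std::terminate`, so a top-level handler that calls `MPI_Abort` is probably still needed to avoid the hang described above.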


static auto dup(const Communicator &other) -> Communicator { return Communicator::dup_raw(other.as_raw()); }

static auto split_raw(MPI_Comm raw, Color color, Key key) -> Communicator {

Collaborator

Is there a reason why this isn't just an overload of split?

Collaborator Author

I wanted to make it explicit that we're splitting from a raw MPI_Comm handle, not a wrapped KokkosComm::Communicator, but I guess we can also simply overload.

Collaborator

I guess that's a design decision that the Kokkos community should make (to be consistent with the other projects)

Collaborator

You can drop the Raw. We typically are happy with overloads.
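
The overload-based shape suggested in this thread could look roughly like the sketch below. It is an illustration only: `RawComm` is a stand-in for `MPI_Comm` so the code compiles without MPI, `Color` and `Key` are assumed to be plain integers, and the "split" just fabricates a new handle value where real code would call `MPI_Comm_split`.

```cpp
#include <cassert>

// Stand-in for MPI_Comm (assumption, so the sketch compiles without MPI).
struct RawComm {
  int id;
};

using Color = int;  // assumed alias; the PR's actual types are not shown
using Key = int;    // assumed alias

class Communicator {
 public:
  explicit Communicator(RawComm raw) : raw_(raw) {}

  auto as_raw() const -> RawComm { return raw_; }

  // Overload 1: split from a raw handle.
  // Real code would call MPI_Comm_split(raw, color, key, &out) here.
  static auto split(RawComm raw, Color color, Key key) -> Communicator {
    assert(color >= 0);  // MPI_UNDEFINED handling is omitted in this sketch
    (void)key;           // the key only affects rank ordering in real MPI
    return Communicator(RawComm{raw.id * 100 + color});  // fake handle value
  }

  // Overload 2: split from a wrapped communicator by forwarding to overload 1.
  static auto split(const Communicator &other, Color color, Key key) -> Communicator {
    return split(other.as_raw(), color, key);
  }

 private:
  RawComm raw_;
};
```

Because the constructor is `explicit`, a raw handle never converts silently into a `Communicator`, so the two overloads stay unambiguous at the call site.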

Comment on lines 86 to 88
inline static auto self(void) -> Communicator { return Communicator::from_raw_unchecked(MPI_COMM_SELF); }

inline static auto world(void) -> Communicator { return Communicator::from_raw_unchecked(MPI_COMM_WORLD); }

Collaborator

That doesn't belong in the communicator. Since we use sessions, there should be no use of the world process model. It could even be that no one has called MPI_Init.

}

template <KokkosExecutionSpace ExecSpace>
auto initialize(int &argc, char *argv[]) -> Universe<ExecSpace> {

Collaborator

It's not obvious that one version of initialize uses the world process model (WPM) and the other uses the sessions process model (SPM). I'd rather stay away from the WPM tbh.

Collaborator Author

I'll remove the one using the WPM so that we only initialize using the SPM.

src/KokkosComm_communicator.hpp (conversation resolved)
@dssgabriel
Collaborator Author

Thank you very much for the first review, @devreal!

Regarding C++ destruction and the SPM finalization semantics: is MPI_Comm_free required to be called before MPI_Session_finalize? And if so, how can I destroy the communicator without having to call its dtor explicitly?

@devreal
Collaborator

devreal commented May 23, 2024

Just a PSA: OMPI in combination with UCX won't support Sessions until the next major release of OMPI (6.0): open-mpi/ompi#12566 (comment)

@dssgabriel
Collaborator Author

It looks like the OpenMPI package on Linux is too old to support MPI_Sessions (initial support was added in OpenMPI 5.0), hence the failing CI tests.

@cwpearson Is there any way we can specifically install OpenMPI 5.x in the CI? The package from the Ubuntu repos is simply not up to date.

@devreal
Collaborator

devreal commented May 24, 2024

> Regarding C++ destruction and the SPM finalization semantics: is MPI_Comm_free required to be called before MPI_Session_finalize? And if so, how can I destroy the communicator without having to call its dtor explicitly?

The standard says that the application is required to clean up its objects before MPI_Session_finalize:

> The call to MPI_SESSION_FINALIZE does not free objects created by MPI calls; these objects are freed using MPI_XXX_FREE, MPI_COMM_DISCONNECT, or MPI_FILE_CLOSE calls.
> Once MPI_SESSION_FINALIZE returns, no MPI procedure may be called in the Sessions Model that are related to this session (not even freeing objects that are derived from this session), except for those listed in Section 11.4.1.

We had a similar issue in other projects and settled on a registry that references every object needing destruction. There are two ways such an object can get destroyed:

  1. Their destructor is called before the Session is destroyed. Then the object just removes itself from the registry.
  2. They are released as part of the destruction of the session object (before MPI_Session_finalize is called) and the object becomes an empty shell, whose destructor does not do anything.

All of this must be thread-safe, etc., but communicators are heavy objects that are not created and destroyed frequently, so I think the overhead involved in such a scheme is acceptable.
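
The registry scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from those other projects: `Registry`, `Handle`, and `Session` are hypothetical names, `Handle` stands in for a communicator wrapper, and a plain counter replaces the real `MPI_Comm_free` call so the sketch runs without MPI. The mutex protects only the registry map; destroying a handle on one thread while its session is destroyed on another would need additional synchronization.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <unordered_map>
#include <utility>

// Hypothetical registry owned by the session wrapper.
class Registry {
 public:
  using Id = std::size_t;

  // Register a release callback and return an id for later removal.
  auto add(std::function<void()> release) -> Id {
    std::lock_guard<std::mutex> lock(mutex_);
    const Id id = next_id_++;
    entries_.emplace(id, std::move(release));
    return id;
  }

  // Option 1: the object is destroyed first and removes itself.
  // Returns true if the entry was still registered.
  auto remove(Id id) -> bool {
    std::lock_guard<std::mutex> lock(mutex_);
    return entries_.erase(id) > 0;
  }

  // Option 2: the session is going away, so release whatever is left.
  void release_all() {
    std::unordered_map<Id, std::function<void()>> pending;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      pending.swap(entries_);
    }
    for (auto &entry : pending) entry.second();
  }

  auto size() const -> std::size_t {
    std::lock_guard<std::mutex> lock(mutex_);
    return entries_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_map<Id, std::function<void()>> entries_;
  Id next_id_ = 0;
};

// Stand-in for a communicator wrapper. `live` counts unreleased handles so
// the effect is observable without MPI.
class Handle {
 public:
  Handle(Registry &registry, int &live) : registry_(&registry), live_(&live) {
    ++*live_;
    id_ = registry_->add([this] { release(); });
  }
  Handle(const Handle &) = delete;
  Handle &operator=(const Handle &) = delete;

  ~Handle() {
    // Does nothing if the session already released this handle.
    if (registry_ != nullptr && registry_->remove(id_)) release();
  }

  // True once the handle has become an empty shell.
  auto empty() const -> bool { return registry_ == nullptr; }

 private:
  // Real code would call MPI_Comm_free here.
  void release() {
    assert(registry_ != nullptr && "release must run exactly once");
    --*live_;
    registry_ = nullptr;
  }

  Registry *registry_;
  int *live_;
  Registry::Id id_ = 0;
};

// Stand-in for the session wrapper (the role Universe plays in the PR).
class Session {
 public:
  ~Session() {
    registry_.release_all();
    // Real code would call MPI_Session_finalize here, after everything is freed.
  }
  auto registry() -> Registry & { return registry_; }

 private:
  Registry registry_;
};
```

Whichever of the two is destroyed first, the release callback runs exactly once: either the handle's destructor removes the entry and releases, or the session releases it and the later destructor finds an empty shell.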

@janciesko
Collaborator

janciesko commented Oct 9, 2024

Would you like to rebase this on top of current develop?
@devreal, what's the status of sessions in some common MPI implementations? Can we drop this in or rather put it in an Experimental namespace?

@devreal
Collaborator

devreal commented Oct 9, 2024

> @devreal, what's the status of sessions in some common MPI implementations? Can we drop this in or rather put it in an Experimental namespace?

Open MPI's session support has not been released yet. It will be included in the next major release (tbd when that comes out)
