-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zero overhead low level api #50
Conversation
Different problems requires different communication strategies. Therefore, using |
In general: what is the benefit of having two ways of doing the same thing? There are free-standing functions and this introduces member functions that (ideally) do the same. Can you explain a little more what the upside of member functions is? |
@devreal What I advocate for is not having two ways of doing the same thing. Once again this is not meant to be merged as is, and the only reason why I did not use the already implemented internals is because of their tight coupling with the type erased The first idea I advocate for is to decouple the wrapping of the raw MPI primitives from the communication scheme considerations. The current implementation forces the user to use the type-erased solution, which comes with some overhead, and which is not the best solution for all problems. I would even argue that this would be a strong argument against using The second idea I advocate for is that a member function API would be better than the current free function implementation for the following reasons:
|
I have made the necessary changes to illustrate the potential of the approach. Also tagging @cwpearson for reference |
Do you mean "decoupling communication and resource management"? We could have a way to allow users to query the requirements for a given view to manage resources explicitly, both for datatypes and temporary buffers: // query whether we need a temporary buffer or a derived datatype
std::variant<size_t, KokkosComm::Type> opt = KokkosComm::query(view);
if (std::holds_alternative<size_t>(opt)) {
// provide a temporary buffer for packing, takes ownership until completion
ensure_size(std::get<size_t>(opt), my_properly_sized_buf);
KokkosComm::isend(view, comm, ..., my_properly_sized_buf);
} else {
// provides a datatype we have cached (could be built-in for contiguous views)
KokkosComm::isend(view, comm, ..., std::get<KokkosComm::Type>(opt));
}
// no cached information for another_view, let kokkos figure it out
KokkosComm::isend(another_view, comm, ...,); |
Yes @devreal ! We should have a zero-overhead abstraction layer on top of MPI which simply does the encapsulation with the same semantics, but exposing a nicer and more kokkos-friendly API, and then build on top of that all the additional fancy functionalities, rather than have do-it-all wrappers as is currently the case...
This seems reasonable to me, although I dont quite get where you get your |
@devreal considering your example, an interesting approach could be to have the // no cached information for another_view, let kokkos figure it out
KokkosComm::isend(another_view, comm, ...,); |
I like this approach. We had already considered adding a
I think this is indeed an elegant solution that reduces the verbosity of the code, and it's also the approach taken in other languages implementing MPI bindings (e.g., see rsmpi). This effort needs a few more iterations, but we should consider it. |
some unit tests please! |
Do you prefer to test by source file or by feature/overload set? (I would go for the latter) |
src/impl/KokkosComm_mpi.hpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a good reason to consolidate a bunch of things into a single header here. Let's split it up into
- The special handling around #include <mpi.h>
- KokkosComm_config.h (version-related stuff, a future home for other config things)
- the
mpi_type
thing - The communicator class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by the version-related stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a good reason to consolidate a bunch of things into a single header here
The idea was to keep contained in a single place the raw MPI_***
primitives, and have all the other file use the type-safe API
template <KokkosView SendView> | ||
void send(const SendView &sv, int dest, int tag, MPI_Comm comm) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay to have the proposed interface (again, should call below with DefaultExecutionSpace()
as first argument, but keep this one as well as a very low-level wrapper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about this the other way around: have the more complex APIs be implemented in terms of the low level ones
namespace KokkosComm::Impl { | ||
template <KokkosExecutionSpace ExecSpace, KokkosView SendView, KokkosView RecvView> | ||
void reduce(const ExecSpace &space, const SendView &sv, const RecvView &rv, MPI_Op op, int root, MPI_Comm comm) { | ||
template <KokkosView SendView, KokkosView RecvView> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should just call the below with DefaultExecutionSpace()
as the first argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And at least a skeleton of documentation, and if the documentation is incomplete, open one or more issues against yourself 😄
@cwpearson Noted. Maybe I'll wait to be sure we are all on the same page before writing the doc though 😉 |
This PR proposes a design which we've had some success with in our in-house solution. This does not have to be integrated as is, the goal is rather to initiate a discussion on the design aspects. I believe this design enables to support the current type-erased solution implemented in the
Req
class, but is also compatible with other semantics/tradeoffs