
WIP: automatic bulk handle #287

Open · wants to merge 2 commits into main
Conversation

mdorier (Contributor) commented Sep 3, 2024

Most often, code that uses bulk operations looks as follows (the conventional server side is sketched in code right after this list):

  • The client wraps its local memory into an hg_bulk_t;
  • The client sends that hg_bulk_t to a server;
  • The server allocates a local buffer;
  • The server wraps the local buffer into its own hg_bulk_t;
  • The server issues margo_bulk_transfer operations between the two hg_bulk_t.
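
For reference, here is what that conventional pattern typically looks like on the server side, using the existing margo API (the RPC input type and handler name are made up for illustration; error checking and RPC registration boilerplate are omitted):

```c
#include <stdlib.h>
#include <margo.h>

/* Illustrative RPC input type; real code would typically generate this
 * with MERCURY_GEN_PROC. */
typedef struct {
    hg_bulk_t bulk;   /* client's exposed memory */
    hg_size_t size;   /* number of bytes to transfer */
} my_in_t;

/* Conventional server-side handler: the last three bullets of the list above. */
static void my_rpc_ult(hg_handle_t h)
{
    margo_instance_id     mid  = margo_hg_handle_get_instance(h);
    const struct hg_info* info = margo_get_info(h);

    my_in_t in;
    margo_get_input(h, &in);

    /* allocate a local buffer and wrap it in the server's own hg_bulk_t */
    void*     buffer     = malloc(in.size);
    hg_bulk_t local_bulk = HG_BULK_NULL;
    margo_bulk_create(mid, 1, &buffer, &in.size,
                      HG_BULK_READWRITE, &local_bulk);

    /* explicit transfer from the client's memory into the local buffer */
    margo_bulk_transfer(mid, HG_BULK_PULL, info->addr, in.bulk, 0,
                        local_bulk, 0, in.size);

    /* ... operate on buffer, optionally push it back, respond ... */

    margo_bulk_free(local_bulk);
    free(buffer);
    margo_free_input(h, &in);
    margo_destroy(h);
}
```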

The feature proposed by this PR, called margo_auto_bulk_t, is a way to simplify this. With this feature, the above pattern changes into the following:

  • The client wraps its local memory into a margo_auto_bulk_t;
  • The client sends that margo_auto_bulk_t to a server;
  • The server can immediately call margo_auto_bulk_access to get a pointer to a local buffer (allocated by Margo) mirroring the client's buffer. If the client has specified MARGO_PULL_ON_ACCESS as a flag, the first call to margo_auto_bulk_access will pull the content of the client's memory into the server's.
  • The server can use margo_auto_bulk_pull to pull data from the client's memory to the server's local buffer, and margo_auto_bulk_push to push data from the server's local buffer to the client's memory;
  • If the client has specified MARGO_PUSH_ON_DESTROY as a flag, margo_auto_bulk_push is implicitly called when the margo_auto_bulk_t is destroyed.

In most cases, margo_auto_bulk_push and margo_auto_bulk_pull are not needed, and transfers will happen automatically on margo_auto_bulk_access and/or when calling margo_free_input on an RPC input that contains a margo_auto_bulk_t.

Importantly, margo_auto_bulk_t optimizes the case where client and server live in the same process: margo_auto_bulk_access simply gives the server the address of the buffer provided by the client, rather than performing an extra allocation, and all copies are bypassed.
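
To make the proposed flow concrete, here is the earlier server-side handler rewritten with the API sketched in this PR. None of this exists yet: the type, function names, and flags come from the description above, but every signature is a guess rather than a final design:

```c
/* Hypothetical sketch only: margo_auto_bulk_t and margo_auto_bulk_access
 * are the names proposed above, but their signatures are guesses. */
typedef struct {
    margo_auto_bulk_t bulk;   /* auto-bulk sent by the client */
} my_in_t;

static void my_rpc_ult(hg_handle_t h)
{
    my_in_t in;
    margo_get_input(h, &in);   /* deserializes the margo_auto_bulk_t */

    void*  buffer = NULL;
    size_t size   = 0;
    /* If the client set MARGO_PULL_ON_ACCESS, this first access also pulls
     * the client's data; if client and server share a process, buffer points
     * directly at the client's memory and no copy happens. (guessed signature) */
    margo_auto_bulk_access(in.bulk, &buffer, &size);

    /* ... read or modify buffer in place ... */

    /* If the client set MARGO_PUSH_ON_DESTROY, releasing the input pushes
     * the buffer back to the client implicitly. */
    margo_free_input(h, &in);
    margo_destroy(h);
}
```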

@carns Before I implement this feature, do you have any comment?

codecov bot commented Sep 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 57.65%. Comparing base (e973cd6) to head (733ca89).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #287   +/-   ##
=======================================
  Coverage   57.65%   57.65%           
=======================================
  Files          70       70           
  Lines       10146    10146           
  Branches     1335     1335           
=======================================
  Hits         5850     5850           
  Misses       3462     3462           
  Partials      834      834           


carns (Member) commented Sep 24, 2024

This is a bit more complicated than I expected; I'm trying to think about what could make it simpler.

From the use case description, I think the biggest difference between the conventional pattern and the proposed pattern is on the server side.

Could we accomplish something similar without adding a new bulk type, but instead just adding two convenience functions on the server, something like margo_auto_bulk_create() and margo_auto_bulk_destroy()? The former would create a local hg_bulk_t, allocate memory for it, and, if a PULL flag is set, do the bulk_transfer, all in one call from an API point of view. The latter would (if a PUSH flag is set) do a bulk transfer, destroy the hg_bulk_t, and free the memory.

I could see that being helpful to hide the transfer and allocation steps. I'm not sure that we need a new type or new push or pull functions for it, though? The underlying functions are available for anyone doing more sophisticated patterns.

mdorier (Contributor, Author) commented Sep 24, 2024

In your scenario, you still need a new bulk type wrapping the local bulk + local buffer, don't you? Then the only difference between my API and yours is that (1) your solution doesn't involve creating an auto-bulk on the client side, and (2) my solution does the transfer when first calling margo_auto_bulk_access (because the auto_bulk will have already been created via deserialization), while in your API it's done when creating the auto-bulk?

carns (Member) commented Sep 24, 2024

margo_bulk_mirror_create() and margo_bulk_mirror_free() seem like better names after offline discussion, just recording here so we don't forget.

The create() can provide both an hg_bulk_t* and a void* output argument, which may refer to a new allocation or may reference the peer's memory if the peer is local.
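
For the record, one possible shape consistent with that description (hypothetical signatures only; nothing here is implemented or agreed upon):

```c
/* Hypothetical declarations recording the shape under discussion;
 * every parameter is a guess. */
hg_return_t margo_bulk_mirror_create(
        margo_instance_id mid,
        hg_addr_t         origin_addr,  /* peer exposing the memory          */
        hg_bulk_t         origin_bulk,  /* peer's bulk handle                */
        size_t            size,
        uint32_t          flags,        /* e.g. pull-on-create, push-on-free */
        hg_bulk_t*        mirror_bulk,  /* out: local bulk handle            */
        void**            mirror_ptr);  /* out: newly allocated buffer, or a
                                           direct pointer to the peer's
                                           memory when the peer is local     */

/* Pushes the mirror back to the origin if requested at creation, then
 * destroys the hg_bulk_t and frees the memory. */
hg_return_t margo_bulk_mirror_free(hg_bulk_t mirror_bulk);
```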

Still under discussion: how (and should we) handle different allocation modes, like being able to use existing memory, or being able to use poolsets, or being able to use a provider-specific allocator?
