-
-
Notifications
You must be signed in to change notification settings - Fork 718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Relax NumPy requirement in UCX #3731
Conversation
While it works to have this be a single `int` (as it will be coerced to a `tuple`), go ahead and make it a `tuple` for clarity and to match more closely to the Numba case.
This is equivalent to using NumPy's `uint8`, but has the added benefit of not requiring NumPy be imported to work.
Matches the variable name in the `send` case to make things easier to follow.
As `struct.pack` and `struct.unpack` are able to build `bytes` objects containing the frame metadata needed by UCX easily, just use these functions instead of creating NumPy arrays each time. Helps soften the NumPy requirement a bit.
Matches more closely to the name used by RMM and Numba.
To relax the NumPy requirement completely, add a function to allocate arrays on host. If NumPy is not present, this falls back to just allocating `bytearray` objects, which work just as well.
6e67585
to
258bdca
Compare
Is there any expected performance improvement by using struct vs numpy ? cc @madsbk |
Mostly hoping this simpler implementation avoids red herrings while debugging issues. |
In terms of performance this does improve things a little bit, but it probably doesn't impact the overall send/receive process meaningfully (though I could be wrong about that). Again this isn't really the motivation for it. In [1]: import struct
In [2]: import numpy
In [3]: l = 4 * [True, False]
In [4]: %timeit struct.pack(len(l) * "?", *l)
389 ns ± 4.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: %timeit numpy.array(l)
838 ns ± 2.67 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Avoids multiple calls to `len(frames)`, is a bit easier to read, and matches the receive code path more closely.
To send fewer and larger messages, pack both which frames are on device and how large each frame is into one message.
That's a good point.
…On Mon, Apr 20, 2020, 10:14 PM jakirkham ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In distributed/comm/ucx.py
<#3731 (comment)>:
> @@ -59,34 +60,42 @@ def init_once():
ucp.init(options=ucx_config, env_takes_precedence=True)
+ # Find the function, `host_array()`, to use when allocating new host arrays
+ try:
That may be so. However that would be like saying NumPy can rely on Pandas
dependencies for code that Pandas uses from NumPy. In general we can't
expect that to hold and it's a bit fragile. We should either make NumPy a
requirement here or add a fallback. Seems safer to add this fallback here
and it doesn't cost us much. Plus it should save us suffering later should
things change in UCX-Py and we forget about this line here 🙂
—
You are receiving this because your review was requested.
Reply to this email directly, view it on GitHub
<#3731 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAKWW6CZFEHUQUSFF7Z3ILTRNT6RDANCNFSM4MMXVHHA>
.
|
Went ahead and grabbed the header consolidation commit from PR ( #3732 ) and pushed it here. That doesn't really change how the frames are serialized, but it does allow us to send all of the per frame metadata in one message. |
test_ucx.py passed locally with one exception:
though this is probably due to hostname issues. While I don't think this changes the overall scheme described in the notes, lines like:
will be hard to read tomorrow. Would you mind adding a note or two describe what packing is doing ? |
19aabd7
to
4cc672e
Compare
4cc672e
to
c59f95d
Compare
Thank you @jakirkham for taking the time to add those comments |
Thanks Ben! 😄 Yeah I've never gotten that test to run successfully. 😞 Probably we should figure out how the config issue should be fixed and talk to OPS so it can be integrated into their provisioning scripts. Was just about to ask if those comments seemed reasonable. Happy to adjust if needed 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Feel free to merge when you're ready.
Thanks all! 😄 |
This relaxes the NumPy requirement in UCX communication. We do this by leveraging things like
struct.pack
andstruct.unpack
for metadata about the frames, which gives us simplebytes
objects to send over the wire. Also we create ahost_array
function that will use NumPy when available, but will fallback to things likebytearray
if not.