Consolidate messages in UCX #3732
Commits on Apr 21, 2020
Provides a function that coerces our underlying `__cuda_array_interface__` objects into something that behaves more like an array. Prefers CuPy if possible, but will fall back to Numba if it's not available.
Commit 070a19f
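A minimal sketch of the preference order described in that commit message, assuming only that `cupy.asarray` and `numba.cuda.as_cuda_array` accept objects exposing `__cuda_array_interface__`. The names `pick_backend` and `as_device_array` are hypothetical and are not taken from the PR.

```python
import importlib.util


def pick_backend():
    """Return "cupy" if it is importable, else "numba", else None."""
    for name in ("cupy", "numba"):
        if importlib.util.find_spec(name) is not None:
            return name
    return None


def as_device_array(obj):
    """Wrap an object exposing __cuda_array_interface__ in an array type.

    Hypothetical helper: prefers CuPy and falls back to Numba.
    """
    backend = pick_backend()
    if backend == "cupy":
        import cupy

        return cupy.asarray(obj)
    if backend == "numba":
        import numba.cuda

        return numba.cuda.as_cuda_array(obj)
    raise ImportError("neither CuPy nor Numba is installed")
```

Deferring the imports into the function keeps the module importable on machines without a GPU stack.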
Send/recv host and device frames in a message each
To cut down on the number of send/recv operations and to transmit larger amounts of data at a time, this condenses all frames into a host buffer and a device buffer, which are sent as two separate transmissions.
Commit bee6f0b
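One way to picture the host/device grouping is sketched below. It assumes a frame counts as a device frame when it exposes `__cuda_array_interface__`; the function names are hypothetical and the PR's actual bookkeeping may differ.

```python
def partition_frames(frames):
    """Group frames into (host, device) lists and record the original order."""
    host, device, is_device = [], [], []
    for frame in frames:
        on_device = hasattr(frame, "__cuda_array_interface__")
        is_device.append(on_device)
        (device if on_device else host).append(frame)
    return host, device, is_device


def restore_order(host, device, is_device):
    """Interleave received host and device frames back into sender order."""
    host_it, device_it = iter(host), iter(device)
    return [next(device_it) if flag else next(host_it) for flag in is_device]
```

Each group can then be packed into a single buffer and sent as one message, with the `is_device` flags and per-frame sizes travelling alongside so the receiver can rebuild the frame list.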
Commits on Apr 23, 2020
Commit a9b3161
Fast path cases with 0 or 1 frames
No need to concatenate them together in this case.
Commit 5ed7332
To optimize concatenation when NumPy and CuPy are available, just use their `concatenate` functions. When they are absent, fall back to hand-rolled concatenate routines.
Commit 0473527
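The host-side half of this idea, together with the 0- or 1-frame fast path from the earlier commit, could look roughly like the sketch below. `host_concat` is a hypothetical name, and the device-side analogue would use `cupy.concatenate` instead.

```python
try:
    import numpy
except ImportError:  # NumPy is treated as optional in this sketch
    numpy = None


def host_concat(frames):
    """Join byte-like frames into one contiguous buffer."""
    if len(frames) == 0:
        return bytearray()
    if len(frames) == 1:
        return frames[0]  # fast path: nothing to join
    if numpy is not None:
        arrays = [numpy.frombuffer(f, dtype="u1") for f in frames]
        return numpy.concatenate(arrays)
    # Hand-rolled fallback: preallocate once, then copy each frame in place.
    views = [memoryview(f).cast("B") for f in frames]
    out = bytearray(sum(v.nbytes for v in views))
    offset = 0
    for v in views:
        out[offset:offset + v.nbytes] = v
        offset += v.nbytes
    return out
```

Both branches produce an object that supports the buffer protocol, so callers can treat the result uniformly via `memoryview`.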
To optimize the case where NumPy and CuPy are around, simply use their `split` function to pull apart large frames into smaller chunks.
Commit 610e864
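A rough host-side sketch of the splitting step, assuming the receiver knows each frame's size in bytes. `host_split` is a hypothetical name; a device-side version would use `cupy.split` in the same way.

```python
from itertools import accumulate

try:
    import numpy
except ImportError:  # NumPy is treated as optional in this sketch
    numpy = None


def host_split(buf, sizes):
    """Split `buf` into consecutive chunks with the given byte sizes."""
    if not sizes:
        return []
    # Boundaries between chunks; the end of the final chunk is left implicit.
    indices = list(accumulate(sizes))[:-1]
    if numpy is not None:
        return numpy.split(numpy.frombuffer(buf, dtype="u1"), indices)
    # Hand-rolled fallback: zero-copy slices of a memoryview.
    view = memoryview(buf).cast("B")
    starts = [0] + indices
    stops = indices + [view.nbytes]
    return [view[a:b] for a, b in zip(starts, stops)]
```

In both branches the final chunk runs to the end of `buf`, so the sizes are expected to add up to the buffer length.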
Commits on Apr 28, 2020
Commit 87c85cf
Only return `DeviceBuffer`s/`memoryview`s
Make sure that we extract and return the underlying `DeviceBuffer`s/`memoryview`s.
Commit 0dc0bb0
Commit 3d325f8
Commit 107a2db
Commit dbd57cf
Commit 6fba794
Commit c04bb39
Commit 820fbc4
Commit 18d4331
Move sync before send/recv of device buffers
This limits synchronization to only those cases where non-trivial device buffers are being sent.
Commit 19dfbf6
Commit 1a4a324
Commit 5bf32e0
Commit fb6ba72
Commit 0dcbd5c
This will result in an extra empty frame being added to the back of the list of frames, which we don't need. So drop the last length, as `split` will already grab everything through the end.
Commit 5c3ad3a
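The extra empty frame can be reproduced with a small stand-in. `split_at` below is a hypothetical pure-Python helper that mirrors how `numpy.split` treats a list of indices (pieces between consecutive indices, plus a final piece running to the end); it is an illustration rather than the code changed in this commit.

```python
from itertools import accumulate


def split_at(buf, indices):
    """Cut `buf` at each index; the final piece runs to the end of `buf`."""
    view = memoryview(buf)
    bounds = [0, *indices, len(view)]
    return [bytes(view[a:b]) for a, b in zip(bounds, bounds[1:])]


sizes = [2, 3]
offsets = list(accumulate(sizes))  # cumulative end offsets: [2, 5]

with_last = split_at(b"abcde", offsets)          # [b"ab", b"cde", b""]
without_last = split_at(b"abcde", offsets[:-1])  # [b"ab", b"cde"]
```

Passing the final cumulative offset asks for a cut at the very end of the buffer, which is what yields the trailing empty piece.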
Commit 5663983
Commit 791fb26
Commit 6eac4d4
Commits on Apr 30, 2020
Commit 99a73a0
Commit c4c6801
Commit 62f7f12
Commit 877dab4
Commit 81f718b
Commit ee528d5
Commit af94abb
Commit 94aee85
Commit a9900a0
Rewrite `device_split` to use `cupy.copyto`
As `.copy()` calls `memcpy`, which is synchronous, performance is worse because we synchronize after copying each part of the buffer. To fix this, we switch to `cupy.copyto`, which calls `memcpyasync`. This lets us avoid synchronizing after each copy.
Commit f37951a
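The shape of such a rewrite might resemble the sketch below. It is not the PR's `device_split`: `split_with_copyto` is a hypothetical name, and the array module is passed in as `xp` so the same logic can be dry-run on the host with NumPy, whose `empty` and `copyto` take the same arguments as CuPy's.

```python
def split_with_copyto(buf, sizes, xp):
    """Copy consecutive chunks of `buf` into freshly allocated arrays.

    `xp` is an array module such as cupy (or numpy for a host-side dry run).
    """
    parts = []
    offset = 0
    for n in sizes:
        part = xp.empty((n,), dtype=buf.dtype)
        # Per the commit message, cupy.copyto issues an asynchronous copy,
        # so no synchronization is requested inside the loop.
        xp.copyto(part, buf[offset:offset + n])
        parts.append(part)
        offset += n
    return parts


# With CuPy (requires a GPU), a single synchronize after all copies could be:
#   import cupy
#   parts = split_with_copyto(cupy.arange(6, dtype="u1"), [2, 4], cupy)
#   cupy.cuda.get_current_stream().synchronize()
```

Allocating each part up front and copying into it separates allocation from the copy, which is what allows the copies to be queued back to back.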
This shouldn't be needed: the copy is stream-ordered, so it should occur before the original buffer is deleted.
Commit a236401
Commit b26b58d
Commit a05293b
Commit 0d046d7
Commit e4a6d1e