The script fails with OOM and sometimes seemingly random UCX errors (after hours of waiting) with the somewhat new cudf_spilling manager, using either explicit-comms or regular tasks. Note that the script limits the number of GPUs to 4, the equivalent of 128GB of GPU memory (significantly less than the total data size):
time DASK_EXPLICIT_COMM=False CUDF_SPILL=1 python ooc-merge.py 2>&1 | tee merge-res.txt
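To put the memory constraint in numbers, here is a rough back-of-the-envelope calculation using only the figures quoted in this issue (295GB and 466GB dataframes, 4 GPUs, 128GB of GPU memory, two int64 columns). It assumes 1 GB = 10**9 bytes and that the quoted sizes are in-memory sizes; neither is stated explicitly, so treat the results as approximate.

```python
# Rough sizing of the merge inputs against the available GPU memory.
# Assumptions (not stated in the issue): 1 GB = 10**9 bytes, and the
# quoted dataframe sizes are in-memory sizes.
GB = 10**9

tables = {
    "df_base": {"size_gb": 295, "partitions": 10674},
    "df_other": {"size_gb": 466, "partitions": 2576},
}
TOTAL_GPU_MEMORY_GB = 128   # 4 GPUs, as limited by the script
BYTES_PER_ROW = 2 * 8       # two int64 columns: Key and Payload

# Total input size and how many times it exceeds GPU memory.
total_data_gb = sum(t["size_gb"] for t in tables.values())
oversubscription = total_data_gb / TOTAL_GPU_MEMORY_GB


def partition_mb(table):
    """Average partition size in MB."""
    return table["size_gb"] * 1000 / table["partitions"]


def approx_rows(table):
    """Approximate row count, ignoring any per-column overhead."""
    return table["size_gb"] * GB // BYTES_PER_ROW


print(f"total input: {total_data_gb} GB, "
      f"{oversubscription:.1f}x the available GPU memory")
for name, table in tables.items():
    print(f"{name}: ~{partition_mb(table):.0f} MB per partition, "
          f"~{approx_rows(table) / 1e9:.1f} billion rows")
```

Under these assumptions the inputs alone are roughly six times larger than the combined GPU memory, so the merge can only succeed if most of the data is spilled at any given time.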
The hope is that p2p shuffling provides stable infrastructure for accomplishing this out-of-core shuffle/merge.
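For readers less familiar with why a shuffle makes an out-of-core merge possible, here is a toy, pure-Python sketch of the general idea: hash-partition both tables on the key so that matching keys land in the same bucket, then join one pair of buckets at a time. This is only an illustration of the concept; it is not how Dask's p2p shuffle, dask-cuda, or cuDF implement it, and all function names are hypothetical.

```python
from collections import defaultdict


def hash_partition(rows, num_partitions):
    """Route each (key, payload) row to the bucket chosen by its key."""
    buckets = [[] for _ in range(num_partitions)]
    for key, payload in rows:
        buckets[hash(key) % num_partitions].append((key, payload))
    return buckets


def merge_bucket(left_rows, right_rows):
    """Inner hash join of two buckets that were partitioned the same way."""
    index = defaultdict(list)
    for key, payload in left_rows:
        index[key].append(payload)
    return [
        (key, left_payload, right_payload)
        for key, right_payload in right_rows
        for left_payload in index.get(key, ())
    ]


def shuffle_merge(left, right, num_partitions):
    """Shuffle both inputs on the key, then merge bucket by bucket.

    Only one pair of buckets is joined at a time, which is what lets a
    real system keep the rest of the data spilled out of device memory.
    """
    left_buckets = hash_partition(left, num_partitions)
    right_buckets = hash_partition(right, num_partitions)
    merged = []
    for left_rows, right_rows in zip(left_buckets, right_buckets):
        merged.extend(merge_bucket(left_rows, right_rows))
    return merged


if __name__ == "__main__":
    base = [(1, 10), (2, 20), (3, 30), (2, 21)]
    other = [(2, 200), (3, 300), (4, 400)]
    print(sorted(shuffle_merge(base, other, num_partitions=4)))
```

The result does not depend on the number of partitions; more partitions only means smaller buckets, and therefore a smaller working set per join step.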
I have a somewhat representative (and currently failing) example of merging two dataframes in a resource-constrained environment:
df_base = 295GB and 10674 partitions
df_other = 466GB and 2576 partitions
Each dataframe has two random int columns, `Key` and `Payload`, both int64.
Here's a more complete script: