Is your feature request related to a problem? Please describe.
The unification of the API layers and the removal of the mg namespace, as described in rapidsai/cuvs#357, require some changes on the RAFT side. Namely, the NCCL clique should now be a core type, and its presence in the resource handle should determine which algorithm implementation runs. The PR resolving this issue on the RAFT side should:
Set the NCCL clique as a core type
Separate NCCL clique initialization from its access and improve the initialization process
Leave a separate access function to be used internally by the cuVS library
Describe the solution you'd like
The nccl_clique.hpp file should be placed in the raft/core directory and the nccl_clique struct should be placed in the raft::core namespace.
A raft::resource::initialize_nccl_clique() function to initialize a NCCL clique and add it to a resource handle. This function would be called before calling an algorithm implementation. The presence of the NCCL clique resource on the handle would signal that the algorithms should run in multi-GPU mode. The function could also let the caller configure which GPUs to include in the clique and what percentage of device memory to pre-allocate as a memory pool on each GPU.
A raft::resource::get_nccl_clique() function to access the NCCL clique from inside the algorithm implementations.
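To make the proposed split between initialization and access concrete, here is a minimal, self-contained C++ sketch. It does not use RAFT or NCCL: everything in the `sketch` namespace (the `resources` and `nccl_clique` stand-ins, the parameter names `device_ids` and `pool_percent`, their default value, and the `select_mode` helper) is a hypothetical illustration of the behavior described above, not the actual or final RAFT API.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the proposed raft::core::nccl_clique.
// A real clique would own NCCL communicators and per-device resources.
struct nccl_clique {
  std::vector<int> device_ids;  // GPUs taking part in the clique
  int pool_percent;             // share of device memory to pre-allocate
};

// Hypothetical stand-in for a RAFT resource handle.
struct resources {
  std::optional<nccl_clique> clique;  // empty means single-GPU mode
};

// Initialization: called by the user before running an algorithm.
// The parameters and the default of 80 are assumptions for illustration.
inline void initialize_nccl_clique(resources& res,
                                   std::vector<int> device_ids,
                                   int pool_percent = 80)
{
  if (device_ids.empty()) {
    throw std::invalid_argument("at least one device is required");
  }
  if (pool_percent < 0 || pool_percent > 100) {
    throw std::invalid_argument("pool_percent must be in [0, 100]");
  }
  res.clique = nccl_clique{std::move(device_ids), pool_percent};
  assert(res.clique.has_value());
}

// Access: used internally by algorithm implementations.
// It never creates the clique; it only returns an existing one.
inline const nccl_clique& get_nccl_clique(const resources& res)
{
  if (!res.clique.has_value()) {
    throw std::logic_error("NCCL clique has not been initialized");
  }
  return *res.clique;
}

// Dispatch: the presence of the clique selects the implementation.
inline std::string select_mode(const resources& res)
{
  return res.clique.has_value() ? "multi-gpu" : "single-gpu";
}

}  // namespace sketch
```

In this sketch, a caller that never calls `initialize_nccl_clique` gets the single-GPU path, while calling it with, say, `{0, 1}` and `50` switches the same handle to the multi-GPU path. Keeping the accessor free of side effects is one way to ensure the clique is only ever created by an explicit user request.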