Use CUDA virtual memory for pinned memory allocator. #10850
Merged
Conversation
trivialfis force-pushed the virtual-mem branch 3 times, most recently from 468b590 to 3fafe8e on September 26, 2024 at 08:15
- Add a grow-only virtual memory allocator.
- Define a driver API wrapper. Split up the runtime API wrapper.
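The commit message above describes a grow-only allocator built on the CUDA driver API. As a point of reference, here is a minimal sketch of that pattern using the standard `cuMemAddressReserve`/`cuMemCreate`/`cuMemMap` flow, assuming a current CUDA context. The class and member names are illustrative, not the ones used in this PR, and error handling plus teardown (`cuMemUnmap`/`cuMemRelease`/`cuMemAddressFree`) are omitted for brevity:

```cpp
#include <cuda.h>

#include <cstddef>
#include <vector>

class GrowOnlyVirtualMemVec {
  CUdeviceptr va_{0};           // base of the reserved virtual address range
  std::size_t reserved_{0};     // size of the reserved range
  std::size_t mapped_{0};       // bytes currently backed by physical memory
  std::size_t granularity_{0};  // minimum mapping granularity
  CUmemAllocationProp prop_{};
  std::vector<CUmemGenericAllocationHandle> handles_;

 public:
  GrowOnlyVirtualMemVec(int device, std::size_t va_size) {
    prop_.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop_.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop_.location.id = device;
    cuMemGetAllocationGranularity(&granularity_, &prop_,
                                  CU_MEM_ALLOC_GRANULARITY_MINIMUM);
    // Reserve a large virtual range once. Growing later only maps more
    // physical chunks into this range, so the base pointer never moves.
    reserved_ = RoundUp(va_size);
    cuMemAddressReserve(&va_, reserved_, /*alignment=*/0, /*addr=*/0, 0);
  }

  // Grow the physically backed prefix of the range to at least new_size.
  // Grow-only: chunks are mapped at the end and never unmapped until the
  // allocator itself is destroyed.
  void GrowTo(std::size_t new_size) {
    new_size = RoundUp(new_size);
    if (new_size <= mapped_) {
      return;
    }
    std::size_t delta = new_size - mapped_;
    CUmemGenericAllocationHandle handle;
    cuMemCreate(&handle, delta, &prop_, 0);  // new physical chunk
    cuMemMap(va_ + mapped_, delta, /*offset=*/0, handle, 0);
    CUmemAccessDesc access{};
    access.location = prop_.location;
    access.flags = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
    cuMemSetAccess(va_ + mapped_, delta, &access, 1);
    handles_.push_back(handle);
    mapped_ = new_size;
  }

  void* data() const { return reinterpret_cast<void*>(va_); }
  std::size_t size() const { return mapped_; }

 private:
  std::size_t RoundUp(std::size_t n) const {
    return (n + granularity_ - 1) / granularity_ * granularity_;
  }
};
```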
trivialfis force-pushed the virtual-mem branch from 6142fcf to dc9b87d on September 26, 2024 at 14:33
trivialfis changed the title from [WIP] Use CUDA virtual memory for pinned memory allocator. to Use CUDA virtual memory for pinned memory allocator. on Sep 26, 2024
At the moment, this is not being tested on the CI, as the feature requires CTK 12.5+ to be stable.

cc @hcho3 @rongou @RAMitchell
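A hypothetical guard for that version requirement could look like the following; the function name is made up, and `cudaRuntimeGetVersion` encodes 12.5 as 12050:

```cpp
#include <cuda_runtime_api.h>

// Returns true if the linked CUDA runtime is new enough for the virtual
// memory pinned allocator path (CTK 12.5+, per the comment above).
bool VirtualMemPinnedIsSupported() {
  int runtime_version = 0;
  if (cudaRuntimeGetVersion(&runtime_version) != cudaSuccess) {
    return false;
  }
  return runtime_version >= 12050;
}
```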
trivialfis commented on Sep 27, 2024
@@ -118,6 +120,14 @@ std::int32_t OmpGetNumThreads(std::int32_t n_threads) {
  return n_threads;
}

[[nodiscard]] bool GetCpuNuma(unsigned int* cpu, unsigned int* numa) {
This is not used; I keep it to check the NUMA node when debugging.
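For illustration, a helper with this signature can be written on Linux with the getcpu(2) system call. This is only a sketch of what such a debugging helper might look like, not necessarily the PR's implementation:

```cpp
#include <sys/syscall.h>
#include <unistd.h>

// Reports the CPU and NUMA node the calling thread is currently running
// on; both can change at any time due to scheduling. Returns true on
// success. The third getcpu argument (tcache) is unused and passed as null.
[[nodiscard]] bool GetCpuNuma(unsigned int* cpu, unsigned int* numa) {
  return syscall(SYS_getcpu, cpu, numa, nullptr) == 0;
}
```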
rongou approved these changes on Sep 27, 2024
This is to avoid obtaining CUDA memory-related global locks (like those used by cudaFree), and to fix #10312. Only works with cudart>=12.5. The implementation is based on this blog post, with additional support for host NUMA allocation.

Running MGPU column split tests, I haven't been able to observe any hang with the PR (yet).