Add expected behavior to unified memory
neon60 committed Sep 26, 2024
1 parent 8aa80d7 commit 7e27ba9
Showing 2 changed files with 58 additions and 0 deletions.
56 changes: 56 additions & 0 deletions docs/how-to/unified_memory.rst
@@ -104,6 +104,62 @@ system requirements` and :ref:`checking unified memory management support`.
offers an easy transition from a CPU written C++ code to a HIP code as the
same system allocation API is used.

For unified memory features to work correctly on graphics cards that support Heterogeneous Memory Management (HMM), XNACK must be enabled by setting the environment variable ``HSA_XNACK=1``. Without this setting, the behavior is similar to that of systems without HMM support. For more details, see `GPU memory <https://rocm.docs.amd.com/en/latest/conceptual/gpu-memory.html#xnack>`_.
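
The rule in the paragraph above can be sketched in plain C++. This is a minimal illustration, not part of the HIP API: the helper names ``xnack_requested``, ``xnack_requested_from_env`` and ``page_fault_migration_expected`` are hypothetical, and the sketch assumes the XNACK setting reaches the process through the ``HSA_XNACK`` environment variable, with only the value ``1`` counting as enabled. It reports what was requested, not whether the hardware and kernel actually honor it.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: interprets the value of HSA_XNACK.
// Assumption: only the literal "1" requests XNACK; an unset
// variable or any other value is treated as disabled.
bool xnack_requested(const char* value) {
    return value != nullptr && std::string(value) == "1";
}

// Hypothetical convenience wrapper that reads the process environment.
bool xnack_requested_from_env() {
    return xnack_requested(std::getenv("HSA_XNACK"));
}

// Hypothetical helper: per the text, an HMM-capable system without
// XNACK behaves like a system without HMM, so page-fault migration
// is expected only when both conditions hold.
bool page_fault_migration_expected(bool hmm_supported, bool xnack) {
    return hmm_supported && xnack;
}
```

A program could call ``page_fault_migration_expected(hmm_supported, xnack_requested_from_env())`` at startup to decide which column of the tables below applies, where ``hmm_supported`` comes from a separate capability check.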

The tables below show the expected behavior of managed and unified memory functions in ROCm and CUDA environments, both with and without HMM support:

.. list-table:: Comparison of expected behavior of managed and unified memory functions in ROCm
    :widths: 40, 25, 25
    :header-rows: 2

    * - call
      - ROCm 5 or 6 without HMM or with ``XNACK=0``
      - ROCm 5 or 6 with HMM and with ``XNACK=1``
    * - OS support
      - RHEL 8.4 or SLES 15 SP2
      - RHEL 8.7+ or SLES 15 SP4
    * - ``malloc()``, ``new``, system allocator
      - host (not accessible on device)
      - host, page-fault migration
    * - ``hipMalloc()``
      - device, zero copy
      - device, zero copy
    * - ``hipMallocManaged()``, ``__managed__``
      - host, pinned, zero copy
      - host, page-fault migration
    * - ``hipHostRegister()``
      - undefined behavior
      - host, page-fault migration
    * - ``hipHostMalloc()``
      - host, pinned, zero copy
      - host, pinned, zero copy

.. list-table:: Comparison of expected behavior of managed and unified memory functions in CUDA
    :widths: 40, 25, 25
    :header-rows: 2

    * - call
      - CUDA 11 or 12 without HMM
      - CUDA 11 or 12 with HMM
    * - OS support
      - RHEL 7.9 or SLES 15 SP2
      - kernel 6.1.24+, 6.2.11+ or 6.3+, x64, CUDA 12.2+
    * - ``malloc()``, ``new``, system allocator
      - host (not accessible on device)
      - first touch, page-fault migration
    * - ``cudaMalloc()``
      - device (not accessible on host)
      - device, page-fault migration
    * - ``cudaMallocManaged()``, ``__managed__``
      - host, page-fault migration
      - first touch, page-fault migration
    * - ``cudaHostRegister()``
      - host, page-fault migration
      - host, page-fault migration
    * - ``cudaMallocHost()``
      - host, pinned, zero copy
      - host, pinned, zero copy

.. _checking unified memory management support:

Checking unified memory management support
2 changes: 2 additions & 0 deletions include/hip/hip_runtime_api.h
@@ -3523,6 +3523,8 @@ hipError_t hipExtHostAlloc(void** ptr, size_t size, unsigned int flags);
*
* The API returns the allocation pointer, managed by HMM, which can then be used to execute kernels
* on the device and to fetch data between the host and device as needed.
*
* If HMM is not supported, the function behaves the same as @p hipMallocHost .
*
* @note It is recommended to perform the capability check before calling this API.
*