Commit

Docs: Unified memory review update
MKKnorr committed Nov 7, 2024
1 parent 461f7d8 commit 0c4abec
Showing 1 changed file with 17 additions and 19 deletions.
36 changes: 17 additions & 19 deletions docs/how-to/hip_runtime_api/memory_management/unified_memory.rst
@@ -12,7 +12,7 @@ Unified memory management
In conventional architectures CPUs and attached devices have their own memory
space and dedicated physical memory backing it up, e.g. normal RAM for CPUs and
VRAM on GPUs. This way each device can have physical memory optimized for its
-use case. Especially GPUs usually have specialized memory whose bandwidth is a
+use case. GPUs usually have specialized memory whose bandwidth is a
magnitude higher than the RAM attached to CPUs.

While providing exceptional performance, this setup typically requires explicit
@@ -74,13 +74,12 @@ System requirements
================================================================================

Unified memory is supported on Linux by all modern AMD GPUs from the Vega
-series onward. Unified memory management can be achieved with managed memory
-allocation and, for the latest GPUs, with a system allocator.
+series onward. Unified memory management can be achieved by explicitly
+allocating managed memory using :cpp:func:`hipMallocManaged` or marking variables
+with the ``__managed__`` attribute, or, for the latest GPUs, with the normal
+system allocator, as shown in the following table.

-The table below lists the supported allocators for the different architectures.
-The allocators are described in the next section.
-
-.. list-table:: Supported Unified Memory Allocators
+.. list-table:: Supported Unified Memory Allocators by GPU architecture
   :widths: 40, 25, 25
   :header-rows: 1
   :align: center
@@ -139,9 +138,10 @@ system requirements` and :ref:`checking unified memory management support`.

To ensure the proper functioning of system allocated unified memory on supported
graphics cards, it is essential to configure the environment variable
-``XNACK=1`` and use a kernel that supports Heterogeneous Memory Management
-(HMM). Without this configuration, the behavior will be similar to that of
-systems without HMM support. For more details, visit
+``XNACK=1`` and use a kernel that supports `Heterogeneous Memory Management
+(HMM) <https://www.kernel.org/doc/html/latest/mm/hmm.html>`_. Without this
+configuration, the behavior will be similar to that of systems without HMM
+support. For more details, visit
`GPU memory <https://rocm.docs.amd.com/en/latest/conceptual/gpu-memory.html#xnack>`_.

The table below illustrates the expected behavior of managed and unified memory
@@ -571,8 +571,8 @@ Memory advice
Unified memory runtime hints can be set with :cpp:func:`hipMemAdvise()` to help
improve the performance of your code if you know the memory usage pattern. There
are several different types of hints as specified in the enum
-:cpp:enum:`hipMemoryAdvise`, e.g. whether a certain device mostly reads the
-memory region, where it should ideally be located, and even whether that
+:cpp:enum:`hipMemoryAdvise`, for example, whether a certain device mostly reads
+the memory region, where it should ideally be located, and even whether that
specific memory region is accessed by a specific device.

For the best performance, profile your application to optimize the
@@ -728,10 +728,8 @@ memory range. The attributes are given in :cpp:enum:`hipMemRangeAttribute`.
Asynchronously attach memory to a stream
--------------------------------------------------------------------------------

-The :cpp:func:`hipStreamAttachMemAsync()` function is able to asynchronously attach
-memory to a stream, which can help concurrent execution when using streams.
-
-Currently, this function is a no-operation (NOP) function on AMD GPUs. It simply
-returns success after the runtime memory validation passed. This function is
-necessary on Microsoft Windows, and HMM is not supported on this operating
-system with AMD GPUs at the moment.
+The :cpp:func:`hipStreamAttachMemAsync()` function attaches memory to a stream,
+which can reduce the amount of memory transferred, when managed memory is used.
+When the memory is attached to a stream using this function, it only gets
+transferred between devices, when a kernel that is launched on this stream needs
+access to the memory.
