Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UniqueToken.md: API documentation for UniqueToken.md #179

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions docs/source/API/core/ParallelDispatch.rst
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,7 @@ The following parallel pattern tags are used to call the correct overload for te
./parallel-dispatch/parallel_reduce
./parallel-dispatch/parallel_scan
./parallel-dispatch/fence
./parallel-dispatch/UniqueToken.rst
./parallel-dispatch/ParallelForTag
./parallel-dispatch/ParallelReduceTag
./parallel-dispatch/ParallelScanTag
71 changes: 71 additions & 0 deletions docs/source/API/core/parallel-dispatch/UniqueToken.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
``UniqueToken``
===============

.. role:: cppkokkos(code)
:language: cppkokkos

Header File: ``Kokkos_Core.hpp``


Description
------------

``UniqueToken`` is a portable way to acquire a unique ID for calling a thread (``thread-id`` is not portable execution environments). ``UniqueToken`` is thus analogous to ``thread-id``, and has a ``UniqueTokenScope`` template parameter (default: ``Instance``, but can be ``Global``).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
``UniqueToken`` is a portable way to acquire a unique ID for calling a thread (``thread-id`` is not portable execution environments). ``UniqueToken`` is thus analogous to ``thread-id``, and has a ``UniqueTokenScope`` template parameter (default: ``Instance``, but can be ``Global``).
``UniqueToken`` is a portable way to acquire a unique ID identifying a thread. The scope of the unique ID is selected via the ``UniqueTokenScope`` template parameter (defaults to ``Instance``, but can be set to``Global``).


Interface
---------

.. cppkokkos:class:: template <class ExecutionSpace, UniqueTokenScope :: Global> UniqueToken
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.. cppkokkos:class:: template <class ExecutionSpace, UniqueTokenScope :: Global> UniqueToken
.. cppkokkos:class:: template <class ExecutionSpace=DefaultExecutionSpace, UniqueTokenScope = UniqueTokenScope::Instance> UniqueToken




Parameters
-----------

* ``ExecutionSpace``: See `Execution Spaces <../execution_spaces.html>`_

.. note::
In a parallel region, before the main computation, a pool of ``UniqueToken`` (integer) Id is generated, and each Id is released following iteration.

.. warning::
``UniqueToken <ExecutionSpace> token`` *can* be called inside a parallel region, *but* must be released at the end of *each* iteration.


Comment on lines +22 to +33
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Parameters
-----------
* ``ExecutionSpace``: See `Execution Spaces <../execution_spaces.html>`_
.. note::
In a parallel region, before the main computation, a pool of ``UniqueToken`` (integer) Id is generated, and each Id is released following iteration.
.. warning::
``UniqueToken <ExecutionSpace> token`` *can* be called inside a parallel region, *but* must be released at the end of *each* iteration.
Parameters
-----------
* ``ExecutionSpace``: See `Execution Spaces <../execution_spaces.html>`_
* ``UniqueTokenScope``: defaults to ``Instance`` which results in every instance of ``UniqueToken`` being independent. In contrast ``Global`` uses one set of IDs for all instances.

* ``UniqueTokenScope``: defaults to ``Instance``, but ``Global`` can be employed when thread awareness is needed for more than one ``ExecutionSpace`` instance, as in the case of submitting concurrent kernels to CUDA streams.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* ``UniqueTokenScope``: defaults to ``Instance``, but ``Global`` can be employed when thread awareness is needed for more than one ``ExecutionSpace`` instance, as in the case of submitting concurrent kernels to CUDA streams.

moved to the parameters above



Constructors
-------------
.. cppkokkos:function:: UniqueToken(size_t max_size, ExecutionSpace execution_space, UniqueTokenScope :: Global)




Comment on lines +37 to +43
Copy link
Contributor

@JBludau JBludau Sep 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Constructors
-------------
.. cppkokkos:function:: UniqueToken(size_t max_size, ExecutionSpace execution_space, UniqueTokenScope :: Global)
Constructors
-------------
.. cppkokkos:function:: UniqueToken(ExecutionSpace execution_space = ExecutionSpace())
Additionally for ``UniqueTokenScope==Instance``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. cppkokkos:function:: UniqueToken(size_t max_size, ExecutionSpace execution_space=ExecutionSpace())
Sets an upper bound to the number of tokens. Requires ``max_size <= execution_space.concurrency()``
Token Interface
---------------
.. cppkokkos:function:: size_type size()
Returns the size of the token pool
.. cppkokkos:function:: size_type acquire()
Returns the token for the executing tread
.. warning::
Acquired tokens *must* be released at the end of the parallel region in which they were acquired.
.. cppkokkos:function:: void release(size_type idx)
Releases the passed token

@crtrott can you verify that it is max_size <= execution_space.concurrency() ... the comment in https://github.com/kokkos/kokkos/blob/d4af7c8302ecb4280c6215497bb2e39f9ed83c3f/core/src/Kokkos_UniqueToken.hpp#L78 would suggest otherwise, but the following lines suggest the opposite in my understanding. But I might be wrong here :-D

Examples
---------

.. code-block:: cpp

// UniqueToken on an Execution Space Instance
UniqueToken < ExecutionSpace > token ;
int number_of_uniqe_ids = token.size ();
RandomGenPool pool ( number_of_unique_ids , seed );

parallel_for ("L", N, KOKKOS_LAMBDA ( int i) {
int id = token . acquire ();
RandomGen gen = pool (id );
// Computation Body
token . release (id );
Comment on lines +55 to +58
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int id = token . acquire ();
RandomGen gen = pool (id );
// Computation Body
token . release (id );
auto id = token.acquire ();
RandomGen gen = pool (id );
// Computation Body
token.release (id );

});

// Submitting concurrent kernels to (e.g., CUDA) streams

void foo () {
UniqueToken < ExecSpace , UniqueTokenScope :: Global > token_foo ;
parallel_for ("L", RangePolicy < ExecSpace >( stream1 ,0,N), functor_a ( token_foo ));}
Comment on lines +63 to +65
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void foo () {
UniqueToken < ExecSpace , UniqueTokenScope :: Global > token_foo ;
parallel_for ("L", RangePolicy < ExecSpace >( stream1 ,0,N), functor_a ( token_foo ));}
void foo () {
UniqueToken < ExecSpace , UniqueTokenScope :: Global > token_foo ;
parallel_for ("L", RangePolicy < ExecSpace >( stream1 ,0,N), functor_a ( token_foo ));
}


void bar () {
UniqueToken < ExecSpace , UniqueTokenScope :: Global > token_bar ;
parallel_for ("L", RangePolicy < ExecSpace >( stream2 ,0,N), functor_b ( token_bar ));}
Comment on lines +67 to +69
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void bar () {
UniqueToken < ExecSpace , UniqueTokenScope :: Global > token_bar ;
parallel_for ("L", RangePolicy < ExecSpace >( stream2 ,0,N), functor_b ( token_bar ));}
void bar () {
UniqueToken < ExecSpace , UniqueTokenScope :: Global > token_bar ;
parallel_for ("L", RangePolicy < ExecSpace >( stream2 ,0,N), functor_b ( token_bar ));
}



Loading