Commit

Update docs
sleeepyjack committed Oct 17, 2024
1 parent 69a4223 commit fbbee7d
Showing 4 changed files with 54 additions and 34 deletions.
29 changes: 28 additions & 1 deletion include/cuco/detail/open_addressing/kernels.cuh
@@ -399,7 +399,34 @@ CUCO_KERNEL __launch_bounds__(BlockSize) void find(InputIt first,
  }
  }

- // TODO docs
+ /**
+ * @brief Retrieves the equivalent container elements of all keys in the range `[input_probe,
+ * input_probe + n)`.
+ *
+ * If key `k = *(input_probe + i)` has one or more matches in the container, copies `k` to
+ * `output_probe` and associated slot contents to `output_match`, respectively. The output order is
+ * unspecified.
+ *
+ * @tparam IsOuter Flag indicating whether it's an outer count or not
+ * @tparam block_size The size of the thread block
+ * @tparam InputProbeIt Device accessible input iterator
+ * @tparam OutputProbeIt Device accessible input iterator whose `value_type` is
+ * convertible to the `InputProbeIt`'s `value_type`
+ * @tparam OutputMatchIt Device accessible input iterator whose `value_type` is
+ * convertible to the container's `value_type`
+ * @tparam AtomicCounter Integral atomic type that follows the same semantics as
+ * `cuda::(std::)atomic(_ref)`
+ * @tparam Ref Type of non-owning device ref allowing access to storage
+ *
+ * @param input_probe Beginning of the sequence of input keys
+ * @param n Number of the keys to query
+ * @param output_probe Beginning of the sequence of keys corresponding to matching elements in
+ * `output_match`
+ * @param output_match Beginning of the sequence of matching elements
+ * @param atomic_counter Pointer to an atomic object of integral type that is used to count the
+ * number of output elements
+ * @param ref Non-owning container device ref used to access the slot storage
+ */
  template <bool IsOuter,
  int32_t BlockSize,
  class InputProbeIt,
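A note on the retrieve documentation added above: `atomic_counter` is described as counting the number of output elements, which suggests that each match claims its output position by incrementing the counter. The host-side C++ sketch below illustrates that pattern under stated assumptions. It is not cuco's kernel: `retrieve_sketch`, the flat `slots` vector, and `empty_sentinel` are hypothetical stand-ins, `std::atomic` takes the place of `cuda::atomic`, and the outer-mode behavior (emitting an unmatched probe key paired with a sentinel) is an assumption rather than something this diff states.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for the retrieve operation; NOT cuco's implementation.
// `slots` models the container storage as a flat list of keys (multiset semantics).
template <bool IsOuter>
void retrieve_sketch(std::vector<int> const& probes,
                     std::vector<int> const& slots,
                     int empty_sentinel,  // assumed marker for "no match" in outer mode
                     std::vector<int>& output_probe,
                     std::vector<int>& output_match,
                     std::atomic<std::size_t>& counter)
{
  for (int key : probes) {
    bool found = false;
    for (int slot : slots) {
      if (slot == key) {
        // fetch_add hands out a unique output index, so the counter ends up
        // equal to the total number of output elements.
        std::size_t const idx = counter.fetch_add(1, std::memory_order_relaxed);
        assert(idx < output_probe.size() && idx < output_match.size());
        output_probe[idx] = key;   // probe key that produced the match
        output_match[idx] = slot;  // matching slot content
        found = true;
      }
    }
    // Assumption: an outer retrieve also emits probe keys that matched nothing.
    if (IsOuter && !found) {
      std::size_t const idx = counter.fetch_add(1, std::memory_order_relaxed);
      assert(idx < output_probe.size() && idx < output_match.size());
      output_probe[idx] = key;
      output_match[idx] = empty_sentinel;
    }
  }
}
```

Because every writer obtains its index from `fetch_add`, the two output ranges stay aligned element by element. In a concurrent setting the order in which matches land depends on scheduling, which is consistent with the "output order is unspecified" wording in the doc comment.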
10 changes: 4 additions & 6 deletions include/cuco/detail/open_addressing/open_addressing_impl.cuh
@@ -585,10 +585,9 @@ class open_addressing_impl {
  *
  * This function synchronizes the given CUDA stream.
  *
- * @tparam InputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * @tparam InputProbeIt Device accessible input iterator
  * @tparam OutputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * convertible to the `InputProbeIt`'s `value_type`
  * @tparam OutputMatchIt Device accessible input iterator whose `value_type` is
  * convertible to the container's `value_type`
  * @tparam Ref Type of non-owning device container ref allowing access to storage
@@ -630,10 +629,9 @@ class open_addressing_impl {
  *
  * This function synchronizes the given CUDA stream.
  *
- * @tparam InputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * @tparam InputProbeIt Device accessible input iterator
  * @tparam OutputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * convertible to the `InputProbeIt`'s `value_type`
  * @tparam OutputMatchIt Device accessible input iterator whose `value_type` is
  * convertible to the container's `value_type`
  * @tparam Ref Type of non-owning device container ref allowing access to storage
39 changes: 18 additions & 21 deletions include/cuco/detail/open_addressing/open_addressing_ref_impl.cuh
@@ -989,23 +989,22 @@ class open_addressing_ref_impl {
  * Use `count()` to determine the size of the output range.
  *
  * @tparam BlockSize Size of the thread block this operation is executed in
- * @tparam InputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * @tparam InputProbeIt Device accessible input iterator
  * @tparam OutputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * convertible to the `InputProbeIt`'s `value_type`
  * @tparam OutputMatchIt Device accessible input iterator whose `value_type` is
  * convertible to the container's `value_type`
- * @tparam AtomicCounter Atomic counter type that follows the same semantics as
- * `cuda::atomic(_ref)`
+ * @tparam AtomicCounter Integral atomic counter type that follows the same semantics as
+ * `cuda::(std::)atomic(_ref)`
  *
  * @param block Thread block this operation is executed in
  * @param input_probe_begin Beginning of the input sequence of keys
  * @param input_probe_end End of the input sequence of keys
  * @param output_probe Beginning of the sequence of keys corresponding to matching elements in
  * `output_match`
  * @param output_match Beginning of the sequence of matching elements
- * @param atomic_counter Counter that is used to determine the next free position in the output
- * sequences
+ * @param atomic_counter Pointer to an atomic object of integral type that is used to count the
+ * number of output elements
  */
  template <int32_t BlockSize,
  class InputProbeIt,
@@ -1039,23 +1038,22 @@ class open_addressing_ref_impl {
  * to the output sequence.
  *
  * @tparam BlockSize Size of the thread block this operation is executed in
- * @tparam InputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * @tparam InputProbeIt Device accessible input iterator
  * @tparam OutputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * convertible to the `InputProbeIt`'s `value_type`
  * @tparam OutputMatchIt Device accessible input iterator whose `value_type` is
  * convertible to the container's `value_type`
- * @tparam AtomicCounter Atomic counter type that follows the same semantics as
- * `cuda::atomic(_ref)`
+ * @tparam AtomicCounter Integral atomic counter type that follows the same semantics as
+ * `cuda::(std::)atomic(_ref)`
  *
  * @param block Thread block this operation is executed in
  * @param input_probe_begin Beginning of the input sequence of keys
  * @param input_probe_end End of the input sequence of keys
  * @param output_probe Beginning of the sequence of keys corresponding to matching elements in
  * `output_match`
  * @param output_match Beginning of the sequence of matching elements
- * @param atomic_counter Counter that is used to determine the next free position in the output
- * sequences
+ * @param atomic_counter Pointer to an atomic object of integral type that is used to count the
+ * number of output elements
  */
  template <int32_t BlockSize,
  class InputProbeIt,
@@ -1090,23 +1088,22 @@ class open_addressing_ref_impl {
  *
  * @tparam IsOuter Flag indicating if an inner or outer retrieve operation should be performed
  * @tparam BlockSize Size of the thread block this operation is executed in
- * @tparam InputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * @tparam InputProbeIt Device accessible input iterator
  * @tparam OutputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * convertible to the `InputProbeIt`'s `value_type`
  * @tparam OutputMatchIt Device accessible input iterator whose `value_type` is
  * convertible to the container's `value_type`
- * @tparam AtomicCounter Atomic counter type that follows the same semantics as
- * `cuda::atomic(_ref)`
+ * @tparam AtomicCounter Integral atomic type that follows the same semantics as
+ * `cuda::(std::)atomic(_ref)`
  *
  * @param block Thread block this operation is executed in
  * @param input_probe_begin Beginning of the input sequence of keys
  * @param input_probe_end End of the input sequence of keys
  * @param output_probe Beginning of the sequence of keys corresponding to matching elements in
  * `output_match`
  * @param output_match Beginning of the sequence of matching elements
- * @param atomic_counter Counter that is used to determine the next free position in the output
- * sequences
+ * @param atomic_counter Pointer to an atomic object of integral type that is used to count the
+ * number of output elements
  */
  template <bool IsOuter,
  int32_t BlockSize,
10 changes: 4 additions & 6 deletions include/cuco/static_multiset.cuh
@@ -487,10 +487,9 @@ class static_multiset {
  *
  * This function synchronizes the given CUDA stream.
  *
- * @tparam InputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * @tparam InputProbeIt Device accessible input iterator
  * @tparam OutputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * convertible to the `InputProbeIt`'s `value_type`
  * @tparam OutputMatchIt Device accessible input iterator whose `value_type` is
  * convertible to the container's `value_type`
  *
@@ -524,10 +523,9 @@ class static_multiset {
  *
  * This function synchronizes the given CUDA stream.
  *
- * @tparam InputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * @tparam InputProbeIt Device accessible input iterator
  * @tparam OutputProbeIt Device accessible input iterator whose `value_type` is
- * convertible to the container's `key_type`
+ * convertible to the `InputProbeIt`'s `value_type`
  * @tparam OutputMatchIt Device accessible input iterator whose `value_type` is
  * convertible to the container's `value_type`
  *