Implementation of histogram with sycl kernel #2027

AlexanderKalistratov · 2024-09-11T21:36:30Z

Implementation of histogram with a SYCL kernel.
This PR adds a generic histogram kernel that can be used in the future to implement other histogram variants such as bincount, histogram2d and histogramdd, or to specialize the kernel for special cases like uniform bins.

The SYCL kernel covers only specific data types and USM memory types. Unsupported cases are handled by making an additional copy.

oleksandr-pavlyk · 2024-09-13T16:03:44Z

Quick validation via independent implementation:

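import dpctl.tensor as dpt
import dpnp

def histogram1d_impl_tensor(data: dpt.usm_ndarray, bins: dpt.usm_ndarray) -> dpt.usm_ndarray:
    assert data.ndim == 1
    assert bins.ndim == 1
    assert bins.shape[0] > 1
    # Insertion index of each sample into the sorted bin edges.
    bin_idx = dpt.searchsorted(bins, data)
    # Count how many samples share each insertion index.
    _, c = dpt.unique_counts(dpt.sort(bin_idx))
    return c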


In [22]: x = dpnp.random.randn(10**7).get_array()

In [23]: bins = dpnp.asarray([-10, -4, -3, -2, -1, -0.5, -0.25, 0, 0.25, 0.5, 1, 2, 3, 4, 6], dtype=x.dtype).get_array()

In [24]: %time c, _ = dpnp.histogram(dpnp.asarray(x), bins=dpnp.asarray(bins)); print(c)
[    284   13222  214243 1359725 1497540  927356  987069  987352  927886
 1498601 1358597  214704   13114     307]
CPU times: user 10.6 ms, sys: 10.4 ms, total: 21 ms
Wall time: 15.2 ms

In [25]: %time c = histogram1d_impl_tensor(x, bins=bins); print(c)
[    284   13222  214243 1359725 1497540  927356  987069  987352  927886
 1498601 1358597  214704   13114     307]
CPU times: user 711 ms, sys: 581 ms, total: 1.29 s
Wall time: 396 ms
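
A note on the cross-check above: unique_counts only reports indices that actually occur, so the two results line up only when every bin is populated and no sample falls outside the edges, as is the case for this data. A minimal host-side sketch of a reference that keeps explicit zeros, assuming the NumPy bin convention (half-open bins, last bin closed on the right); histogram_reference is a hypothetical helper, not part of this PR:

```python
from bisect import bisect_right


def histogram_reference(data, edges):
    """Count samples per bin: [lo, hi) for every bin except the last, which is [lo, hi]."""
    n_bins = len(edges) - 1
    counts = [0] * n_bins  # empty bins keep an explicit zero
    for v in data:
        # Skip out-of-range samples; NaN fails this comparison and is skipped too.
        if not (edges[0] <= v <= edges[-1]):
            continue
        # Rightmost edge <= v selects the bin; clamp so v == edges[-1] lands in the last bin.
        idx = min(bisect_right(edges, v) - 1, n_bins - 1)
        counts[idx] += 1
    return counts


edges = [-2.0, 1.0, 2.0, 4.0]
data = [2.0, 1.0, 4.0, -5.0, 0.5]
print(histogram_reference(data, edges))  # prints [1, 1, 2]
```

This is far too slow for 10**7 samples; it is only meant for checking small inputs, including ones with empty bins or out-of-range values.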

AlexanderKalistratov · 2024-10-09T03:47:48Z

@antonwolfy

oleksandr-pavlyk · 2024-10-16T20:11:44Z

I think this is a bug:


In [4]: data, edges = dpt.concat([dpt.full(10**7, fill_value=2., dtype="f"), dpt.full(10**7, fill_value=1., dtype="f"), dpt.full(10**7, fill_value=4., dtype="f")]), dpt.asarray([-2,1, 2, 4], dtype="f")

In [5]: dpnp.histogram(data, edges)
Out[5]:
(array([       0, 10000000, 20000000]),
 usm_ndarray([-2.,  1.,  2.,  4.], dtype=float32))

In [6]: dpnp.histogram(data, edges, density=True)
Out[6]:
(array([0.       , 0.3333333, 0.3333333], dtype=float32),
 usm_ndarray([-2.,  1.,  2.,  4.], dtype=float32))

The density should be (0, 1/3, 2/3), instead of (0, 1/3, 1/3).

AlexanderKalistratov · 2024-10-17T16:07:34Z

@oleksandr-pavlyk it is not a bug. NumPy shows the same behavior:

>>> import numpy
>>> data, edges = numpy.concatenate([numpy.full(10**7, fill_value=2., dtype="f"), numpy.full(10**7, fill_value=1., dtype="f"), numpy.full(10**7, fill_value=4., dtype="f")]), numpy.asarray([-2,1, 2, 4], dtype="f")
>>> numpy.histogram(data, edges)
(array([       0, 10000000, 20000000]), array([-2.,  1.,  2.,  4.], dtype=float32))
>>> numpy.histogram(data, edges, density=True)
(array([0.        , 0.33333333, 0.33333333]), array([-2.,  1.,  2.,  4.], dtype=float32))

antonwolfy · 2024-10-22T11:51:57Z

dpnp/backend/extensions/sycl_ext/histogram.cpp

+                uint32_t max_local_copies = local_mem_size / bins_count;
+                uint32_t local_hist_count = std::max(
+                    std::min(
+                        int(std::ceil((float(4 * local_size) / bins_count))),


Would it help to add a comment or named constexpr variables to clarify the meaning of the constants?

antonwolfy · 2024-10-22T12:00:39Z

dpnp/backend/extensions/sycl_ext/histogram_common.hpp

+{
+    static bool isnan(const T &v)
+    {
+        if constexpr (std::is_floating_point<T>::value) {


Suggested change

if constexpr (std::is_floating_point<T>::value) {

if constexpr (std::is_floating_point_v<T>) {

antonwolfy · 2024-10-22T12:10:49Z

dpnp/backend/extensions/sycl_ext/histogram_common.cpp

+            return arr->get_queue() != exec_q;
+        });
+
+    if (unequal_queue != arrays.cend()) {


Was it intentional not to use the utility functions from dpctl here, such as dpctl::utils::queues_are_compatible? (to also print the parameter name if the check fails)

antonwolfy · 2024-10-22T13:34:22Z

dpnp/dpnp_iface_histograms.py

+        if hist_dtype == dpnp.complex128:
+            a_bin_dtype = dpnp.float64
+        elif hist_dtype == dpnp.float64:
+            a_bin_dtype = dpnp.complex128


Doesn't this do the same thing as the block above (lines 357-360)?

antonwolfy · 2024-10-22T13:37:43Z

dpnp/dpnp_iface_histograms.py

+    has_fp64 = device.has_aspect_fp64
+    a_bin_dtype = _result_type_for_device(a_dtype, bins_dtype, device)
+
+    supported_types = (dpnp.float32, dpnp.int64, numpy.uint64, dpnp.complex64)


Wouldn't it be helpful to return a mapper function based on ContigFactory, as is done for ufuncs? Then we would only need to check whether the input arrays can be cast to the expected matching dtype, if any, and the mapper would return the dtype of the result histogram array we have to allocate.

antonwolfy · 2024-10-22T13:45:12Z

dpnp/dpnp_iface_histograms.py

+    if hist_dtype == numpy.uint64:
+        hist_dtype = dpnp.int64
+
+    if (a_bin_dtype in float_types and hist_dtype in float_types) or (


Should this be elif?

Suggested change

if (a_bin_dtype in float_types and hist_dtype in float_types) or (

elif (a_bin_dtype in float_types and hist_dtype in float_types) or (

antonwolfy · 2024-10-22T13:46:45Z

dpnp/dpnp_iface_histograms.py

+    # host usm memory
+    n_usm_type = "device" if usm_type == "host" else usm_type
+
+    n_casted = dpnp.zeros(


Do we need to fill the memory with zeros?

Suggested change

n_casted = dpnp.zeros(

n_casted = dpnp.empty(

antonwolfy · 2024-10-22T13:48:30Z

dpnp/dpnp_iface_histograms.py

+        a_usm,
+        bins_usm,
+        weights_usm,
+        n_usm,


We know that n_casted is a dpnp.ndarray, so we can use the get_array method:

Suggested change

n_usm,

n_casted.get_array(),

oleksandr-pavlyk · 2024-10-22T14:22:47Z

dpnp/backend/extensions/sycl_ext/sycl_ext_py.cpp

+
+#include "histogram.hpp"
+
+PYBIND11_MODULE(_sycl_ext_impl, m)


I think we should rename the extension (and the folder containing the implementation) to something less generic, like histogram or bin_counting.

I was planning to add other functions, such as correlate, to the same module.

That is probably a bad idea, but I'm not sure that creating a new extension module for each implemented function (such as a separate module just for correlate) is good either.

So I would rely on your and @antonwolfy's opinion on this question.

It would probably be good to rename the extension to something like statistics, to group by functional block.
Moreover, the new kernels will have a similar implementation approach.

oleksandr-pavlyk · 2024-10-22T14:24:03Z

dpnp/dpnp_iface_histograms.py

-        db = dpnp.diff(bin_edges).astype(dpnp.default_float_type())
+        db = dpnp.diff(bin_edges).astype(
+            dpnp.default_float_type(sycl_queue=queue)
+        )
        return n / db / n.sum(), bin_edges


Perhaps n.sum() should be replaced with dpnp.sum(n).

oleksandr-pavlyk · 2024-10-24T00:08:01Z

.pre-commit-config.yaml

@@ -52,7 +52,6 @@ repos:
    rev: 24.4.2
    hooks:
    -   id: black
-        args: ["--check", "--diff", "--color"]


Just curious, was this change intentional?

AlexanderKalistratov requested a review from oleksandr-pavlyk September 11, 2024 21:36

AlexanderKalistratov requested review from antonwolfy, npolina4, vlad-perevezentsev and vtavana as code owners September 11, 2024 21:36

AlexanderKalistratov force-pushed the histogram branch 9 times, most recently from bc18622 to 088beb5 Compare September 13, 2024 04:05

oleksandr-pavlyk reviewed Sep 13, 2024

oleksandr-pavlyk reviewed Sep 13, 2024

AlexanderKalistratov force-pushed the histogram branch 2 times, most recently from e6c66d1 to 020ea2c Compare September 17, 2024 14:10

AlexanderKalistratov force-pushed the histogram branch from 020ea2c to ad8291f Compare October 9, 2024 03:46

vtavana reviewed Oct 15, 2024

antonwolfy reviewed Oct 16, 2024

AlexanderKalistratov force-pushed the histogram branch 2 times, most recently from 3721b6e to bfc7ede Compare October 17, 2024 15:51

AlexanderKalistratov requested review from oleksandr-pavlyk, antonwolfy and vtavana October 17, 2024 23:42

antonwolfy reviewed Oct 22, 2024

oleksandr-pavlyk reviewed Oct 22, 2024

AlexanderKalistratov added 7 commits October 23, 2024 19:45

Implementation of histogram with sycl kernel

409be7c

Add more checks and test

d950c78

Fix review comments

177953e

Remove dpnp.uint64

22f1715

Review comments fixes

4087d2e

Fix empty case for cuda device

905489b

Remove black options

5d2e4b5

AlexanderKalistratov force-pushed the histogram branch 3 times, most recently from f4206a9 to cb562ae Compare October 23, 2024 22:44

oleksandr-pavlyk reviewed Oct 24, 2024

Module renaming & small fixes

cca7fe4

AlexanderKalistratov force-pushed the histogram branch from cb562ae to cca7fe4 Compare October 24, 2024 01:17

Code movement and utility functions

bbfa7f4
