-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation of histogram with sycl kernel #2027
base: master
Are you sure you want to change the base?
Implementation of histogram with sycl kernel #2027
Conversation
bc18622
to
088beb5
Compare
Quick validation via independent implementation: def histogram1d_impl_tensor(data : dpt.usm_ndarray, bins : dpt.usm_ndarray) -> dpt.usm_ndarray:
assert data.ndim == 1
assert bins.ndim == 1
assert bins.shape[0] > 1
bin_idx = dpt.searchsorted(bins, data)
_, c = dpt.unique_counts(dpt.sort(bin_idx))
return c
|
e6c66d1
to
020ea2c
Compare
020ea2c
to
ad8291f
Compare
I think this is a bug:
The density should be |
3721b6e
to
bfc7ede
Compare
@oleksandr-pavlyk it is not a bug. Numpy demonstrates the same behavior:
|
uint32_t max_local_copies = local_mem_size / bins_count; | ||
uint32_t local_hist_count = std::max( | ||
std::min( | ||
int(std::ceil((float(4 * local_size) / bins_count))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be helpful to place any comment or named constexpr variables to clarify the meaning of constants?
{ | ||
static bool isnan(const T &v) | ||
{ | ||
if constexpr (std::is_floating_point<T>::value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if constexpr (std::is_floating_point<T>::value) { | |
if constexpr (std::is_floating_point_v<T>) { |
return arr->get_queue() != exec_q; | ||
}); | ||
|
||
if (unequal_queue != arrays.cend()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was it intentional not to use the utils functions from dpctl here (like dpctl::utils::queues_are_compatible
)? (to print parameter name also in case of check failure)
if hist_dtype == dpnp.complex128: | ||
a_bin_dtype = dpnp.float64 | ||
elif hist_dtype == dpnp.float64: | ||
a_bin_dtype = dpnp.complex128 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that do the same like in above block 357-360?
has_fp64 = device.has_aspect_fp64 | ||
a_bin_dtype = _result_type_for_device(a_dtype, bins_dtype, device) | ||
|
||
supported_types = (dpnp.float32, dpnp.int64, numpy.uint64, dpnp.complex64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be helpful to return a mapper function based on ContigFactory
? Like it's done for ufunc, and then it will only need to check if we can cast the input arrays to expected matching dtype if any. And it will return the dtype of result histogram array we have to allocate.
if hist_dtype == numpy.uint64: | ||
hist_dtype = dpnp.int64 | ||
|
||
if (a_bin_dtype in float_types and hist_dtype in float_types) or ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be else if
?
if (a_bin_dtype in float_types and hist_dtype in float_types) or ( | |
elif (a_bin_dtype in float_types and hist_dtype in float_types) or ( |
# host usm memory | ||
n_usm_type = "device" if usm_type == "host" else usm_type | ||
|
||
n_casted = dpnp.zeros( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to fill the memory with zeros?
n_casted = dpnp.zeros( | |
n_casted = dpnp.empty( |
a_usm, | ||
bins_usm, | ||
weights_usm, | ||
n_usm, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We know that n_casted
is dpnp.ndarray, so we can use get_array
method:
n_usm, | |
n_casted.get_array(), |
|
||
#include "histogram.hpp" | ||
|
||
PYBIND11_MODULE(_sycl_ext_impl, m) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should change the name of the extension (and the folder containing the implementation) into something less generic, like histogram
, or bin_counting
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was planning to add other functions as correlate to the same module.
Probably it is bad idea, but i'm not sure that idea to make new extension module for each implemented function (like having separate module just for correlate) is good either.
So I would rely on your and @antonwolfy opinion in this question
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably it would be good to rename the extension something like statistics
to group by functional block.
Moreover considering that new kernels will have a similar implementation approach.
db = dpnp.diff(bin_edges).astype(dpnp.default_float_type()) | ||
db = dpnp.diff(bin_edges).astype( | ||
dpnp.default_float_type(sycl_queue=queue) | ||
) | ||
return n / db / n.sum(), bin_edges |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps n.sum()
should be replaced with dpnp.sum(n)
.
f4206a9
to
cb562ae
Compare
@@ -52,7 +52,6 @@ repos: | |||
rev: 24.4.2 | |||
hooks: | |||
- id: black | |||
args: ["--check", "--diff", "--color"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, was this change intentional?
cb562ae
to
cca7fe4
Compare
Implemention of histogram with sycl_kernel.
This PR adds generic histogram kernel which can be used in the future to implement other versions of histogram such as
bincount
,histogram2d
andhistogramdd
or specialize kernel for special cases like uniform bins.sycl kernel covers only specific datatype and usm memory types. Unsupported cases are covered by additional copy.