Lazy, distributed boost-histogramming of Dask collections with dask-histogram #570
douglasdavis started this conversation in General
Replies: 1 comment 5 replies
-
While discussing a potential move to dask-contrib, the question came up of whether there is interest in upstreaming Dask support into boost-histogram. @henryiii, I'd be interested to hear your thoughts (I'd be happy to help maintain Dask support for histograms wherever it ends up living).
-
Hi Henry, Hans, et al.,
I've spent a bit of time working on histogram support in the Dask universe: first inside the dask.array module (dask/dask#7387, dask/dask#7634, with an upcoming PR for histogram2d support), and, more pertinent to the discussion here, by using the API provided by boost-histogram to create a new Python library, dask-histogram. As of today, the API I had in mind (a lazily fillable Histogram object, and a mirroring of the classic np.histogram{,2d,dd} functions) is fully implemented, but I'm sure there are still some things to iron out. I just wanted to mention the library here and say I'd appreciate any feedback! The library may also end up in the dask-contrib organization at some point.
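For readers unfamiliar with why histograms suit lazy, distributed filling, here is a minimal NumPy-only sketch of the underlying idea: with fixed bin edges, each chunk can be histogrammed independently and the partial counts summed. This is not dask-histogram's API; the helper names partial_histogram and chunked_histogram are hypothetical, and a real Dask version would defer the per-chunk work into a task graph instead of looping eagerly.

```python
import numpy as np


def partial_histogram(chunk, bins, value_range):
    """Histogram a single chunk using fixed, shared bin edges."""
    counts, _ = np.histogram(chunk, bins=bins, range=value_range)
    return counts


def chunked_histogram(chunks, bins, value_range):
    """Sum per-chunk counts; valid because every chunk shares the same edges."""
    total = np.zeros(bins, dtype=np.int64)
    for chunk in chunks:
        total += partial_histogram(chunk, bins, value_range)
    return total


# Hypothetical data: 1000 samples split into 4 chunks, standing in for
# the partitions of a Dask collection.
rng = np.random.default_rng(0)
data = rng.uniform(-1.0, 1.0, size=1000)
chunks = np.array_split(data, 4)

counts = chunked_histogram(chunks, bins=10, value_range=(-1.0, 1.0))

# Reference result computed on the full array in one pass.
reference, edges = np.histogram(data, bins=10, range=(-1.0, 1.0))
```

Because the aggregation step is a plain sum, the partial results can be combined in any order, which is what makes the operation straightforward to distribute.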