Replies: 1 comment 6 replies
-
I think this creates a new ambiguity, should a ufunc operate on the flow bins or not. It is harmless for transformations, you want them to operate on all bins equally, but for reductions like np.sum, it is potentially a huge difference whether flow bins are included or not. It is also deprecating the existing interface: h.axes.edges(flow=flow)
[...] |
Beta Was this translation helpful? Give feedback.
6 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Boost-Histogram Enhancement Proposal: (the only time you'll see me capitalize the H for the Python version ;) )
I've been working on this idea for a while, originally came up from a discussion with @jpivarski. This potentially could simplify quite a bit, pulling common logic into a single interface. Since it doesn't actually invalidate any of the current external API, it doesn't have to be be added in the 0.x series (which I want to finish very soon). I think this might also help with #498 - there's less pressure on
to_numpy
to be the one and only way to get axis edges that match the.view
(which it's not, it returns NumPy edges, not boost-histogram ones).We introduce a new FlowArray class, a subclass of ndarray, which has the following properties:
flow=False
ndarray, and returns an ndarray for any operation, except for the points below.__call__
operator, with a single optional keyword argument; if you call it withflow=True
, it gives you the full ndarray with flow.All of the histogram methods,
h.view
,h.values
, etc would become properties that return a FlowArray; such thath.view()
continues to work exactly the same way, and returns a normal ndarray. All of the axis operations return FlowArrays now too, such thatax.edges
continues to work, though the edge is a FlowArray rather than a vanilla ndarray.ax.edges()
would be a vanilla ndarary. Due to point 1. above, this should have no effect.This would tie the concept of "flow" to FlowArray, and pull it out of Histogram and the C++ Histogram wrapper, providing better separation of concepts. It also would be unit testable without having to make histograms and test each buffer return value separately. It would allow edges (or any other axis operation) to be explicitly synced with view:
Or you could get the FlowArrays and pick flow later:
One other benefit is that the following can now be simplified:
Due to point 3 above, this would fully support all possible
flow=True
orflow=False
assignments, since the decision is made based on the RHS shape, just like the current Histogram supports it!One possible future modification to the above could be to change the FlowArray to allow Ufuncts to apply to all values, and return a FlowArray. But this could be added later without changing the valid uses.
The hardest part of this is actually implementing it, but we've already paved the way with View, and the logic for this is already there as well, just spread around (mostly in the C++ wrapper code). Like the current system, it holds the memory for the whole array, it just uses indexing to present the flow=False view of it.
@HDembinski, thoughts?
Beta Was this translation helpful? Give feedback.
All reactions