Hello, I guess this discussion belongs with the C++ Boost.Histogram project, but this community seems more active.
I work on ATLAS performance, and many of my workflows are similar to:
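import hist

axis = hist.axis.Regular(...)
# group rows by the bin index of var1, then aggregate the quantity per bin
result = df.groupby([
    axis.index(df['var1'])
])['quantity'].agg([myfunction1, myfunction2])
result = result.reindex(range(len(axis)))  # to be sure I have all the bins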
It would be very nice to be able to do that with Boost histograms.
Is it possible to have multiple storages? I would like to loop over my data only once and store different estimators of the same quantity (e.g. number of entries, mean, std, ...).
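To make the single-pass idea concrete, here is a minimal pure-Python sketch of per-bin accumulators that track count, mean, and standard deviation together using Welford's update. It is not the boost-histogram API: `BinStats`, `regular_index`, and `fill_stats` are hypothetical names invented for this illustration, and `regular_index` is only a stand-in for what a regular axis lookup does.

```python
import math
from dataclasses import dataclass


@dataclass
class BinStats:
    """Running count, mean and variance for one bin (Welford's algorithm)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def fill(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def std(self) -> float:
        # Sample standard deviation (ddof=1); NaN with fewer than two entries.
        return math.sqrt(self.m2 / (self.count - 1)) if self.count > 1 else math.nan


def regular_index(x, nbins, start, stop):
    """Hypothetical stand-in for an axis lookup: bin number, or None if out of range."""
    if not (start <= x < stop):
        return None
    return min(int((x - start) / (stop - start) * nbins), nbins - 1)


def fill_stats(xs, values, nbins, start, stop):
    """Loop over the data once, updating every estimator of the matching bin."""
    bins = [BinStats() for _ in range(nbins)]
    for x, v in zip(xs, values):
        i = regular_index(x, nbins, start, stop)
        if i is not None:
            bins[i].fill(v)
    return bins


stats = fill_stats([0.1, 0.2, 0.7, 0.9], [1.0, 3.0, 10.0, 20.0],
                   nbins=2, start=0.0, stop=1.0)
```

Each bin holds a small fixed-size state, so memory does not grow with the number of entries; any estimator that can be updated one value at a time fits this pattern.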
If myfunction1 is size and myfunction2 is mean then these two would be equivalent to building a normal histogram or a profile histogram, which are already supported. What if I want a different estimator (e.g. the max)? What about estimators that need to have all the data in memory (e.g. quantiles)?
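The difference between the two kinds of estimator can be sketched in plain Python. A maximum only needs one running value per bin, while a quantile (here the median) needs every value of the bin kept until the end. This is only an illustration under my own assumptions, not an existing boost-histogram feature: `fill_max_and_median` is a hypothetical helper, and the bin indices are assumed to be already computed (e.g. by an axis lookup).

```python
import statistics


def fill_max_and_median(bin_indices, values, nbins):
    """One pass over the data: running max per bin, plus stored values for the median."""
    running_max = [None] * nbins          # fixed-size state, updated in place
    stored = [[] for _ in range(nbins)]   # grows with the data: needed for quantiles
    for i, v in zip(bin_indices, values):
        if 0 <= i < nbins:                # skip underflow/overflow indices
            if running_max[i] is None or v > running_max[i]:
                running_max[i] = v
            stored[i].append(v)
    # Quantiles can only be evaluated once all values of a bin are known.
    medians = [statistics.median(vals) if vals else None for vals in stored]
    return running_max, medians


maxima, medians = fill_max_and_median([0, 0, 1, 0], [1.0, 3.0, 10.0, 2.0], nbins=3)
```

Empty bins come back as `None`, which plays the same role as the reindex step in the pandas version: every bin is present in the output even if it received no entries.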
I imagine something like this