Proposal to add a value_range #482
Comments
I am not strongly committed to this proposal; I mainly added it here as an anchor for discussion. I am personally fine with not adding too many ways of doing the same thing; bh.loc probably does the job just fine.
I think we should make sure UHI is powerful enough that users could create a smart locator - see #485. A value_range, by contrast, would greatly complicate UHI by having to track multiple types of slices and/or having a slice transform step, and would not be composable like the current UHI objects are (mixing bins and values is very useful in one special case - start to value or value to end; these are not easy to write any other way - None includes flow). It also does not have Python syntax support like slices do; that's why we use getitem in the first place over a random Python method, and UHI takes advantage of this unless you want to use dict-based access (if the keyword getitem PEP gets accepted, you won't need verbose syntax for that anymore either!).
The problem with a "smart locator" is that it breaks some fundamental design principles. In well-designed interfaces, everything is plain and obvious and consistent. We hate it when stuff happens behind the scenes, like global state being modified, objects depending on each other without being explicitly coupled, etc. A "smart locator" knows whether it is passed to Now you want to break all that with a "smart locator" that is context-aware and behaves differently in a Replacing the
@jpivarski Henry wants smart locators; I think they are a bad idea. @henryiii seems to believe that you were also in favor of "context-aware" locators, but I think that goes against consistency and against what is idiomatic Python, both of which you have favored in the past. If Henry insists on his context-aware locators, we need another call to arbitrate, because I am strongly against them.
Tiny clarification: I want the ability for users to write smart locators; I'm proposing the default locators like

    h[bh.overflow] # -> overflow bin
    h[bh.underflow:bh.overflow] # returns the whole histogram (normal slicing always brings along flow bins anyway)
    h[bh.underflow:bh.overflow:sum] # Does not include the overflow! Slices do not include the endpoint, so no overflow bin
    h[::sum] # correct syntax

If a locator can detect where it is in the slice, we can fix Users were not supposed to use these this way, but I fear that some users, wanting to be explicit, will find At the end of our meeting, we said we wanted
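To make the endpoint issue above concrete, here is a minimal toy sketch in plain Python and NumPy. It is not boost-histogram's implementation: `ToyHist`, `underflow`, and `overflow` are hypothetical names invented for illustration, and the toy only models the summing case (it does not reproduce how real slicing carries flow bins along). It shows that Python hands `h[a:b:sum]` to `__getitem__` as `slice(a, b, sum)`, and that a context-free locator used as an exclusive stop leaves the overflow bin out of the sum.

```python
import numpy as np


def underflow(hist):
    """Hypothetical locator: storage index of the underflow bin."""
    return 0


def overflow(hist):
    """Hypothetical locator: storage index of the overflow bin."""
    return len(hist.counts) - 1


class ToyHist:
    """Toy 1D histogram; counts[0] is underflow, counts[-1] is overflow."""

    def __init__(self, counts_with_flow):
        self.counts = np.asarray(counts_with_flow, dtype=float)

    def _resolve(self, loc, default):
        # None falls back to the default; callables are treated as locators;
        # plain ints are regular bin indices, shifted past the underflow slot.
        if loc is None:
            return default
        if callable(loc):
            return loc(self)
        return loc + 1

    def __getitem__(self, key):
        if not isinstance(key, slice):
            return float(self.counts[self._resolve(key, None)])
        start = self._resolve(key.start, 0)
        stop = self._resolve(key.stop, len(self.counts))
        selected = self.counts[start:stop]  # stop is exclusive, as in any slice
        if key.step is sum:
            return float(selected.sum())
        return selected


h = ToyHist([1, 10, 20, 30, 2])  # underflow=1, three regular bins, overflow=2
print(h[overflow])                # the overflow bin alone
print(h[underflow:overflow:sum])  # stop is exclusive: overflow is left out
print(h[::sum])                   # None on both ends: flow bins included
```

In this toy, the locator has no way to know it sits in the stop position, which is exactly the information a context-aware locator would need in order to include the overflow bin.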
If we're talking about "the ability for users to write" smart locators, then absolutely yes. We can apply maintainability and design constraints to ourselves and our own libraries, but restricting a user's freedom is known as an "opinionated library," usually in a negative light. Most applications are short-lived: maintainability should not be the primary concern for code that I write in my

The ability to write locators is a public interface that should allow for gradual complexity: simple things should be easy and complex things should be possible. So, for instance, we shouldn't require locator functions to take this "placement context" as an argument because most locators aren't going to use it. Attaching it to the object that gets passed in sounds like a great solution because users who write locators only have to think about it if they need it.

On a different topic, the standard set of locators provided by boost-histogram, such as

Coming back to the locators, this means they should be highly constrained by mathematical simplicity, but mathematical simplicity should not win over user expectations. As a user, if h[bh.underflow:bh.overflow] returns the whole histogram, I would expect that h[bh.underflow:bh.overflow:sum] sums over the whole histogram. If

If this sounds like the start of a slippery slope, we can just insist that it won't be and apply some self-constraint. Having the ability to use context doesn't mean that we will be tempted to use context just because it's there. Each rule can be approached from the point of view of "what would a user expect?" and the answer to that question will usually veer toward mathematical simplicity. This, I would argue, was the guiding principle in designing the Python language, not the insistence on consistency. I can think of much more mathematical languages than Python (Haskell leaps to mind).

The choices Python made to optimize user expectation usually also optimize mathematical simplicity (such as "only one way to do it... mostly"), but as an instrumental good that serves the intrinsic good of making it easier for programmers to write programs. Python doesn't look like Perl because mathematical simplicity is highly correlated with ease of programming, but it doesn't look like Haskell either (or LISP, or Forth, ...) because these two goals aren't perfectly correlated.
It may be useful to have a way to do an inclusive selection on a value range which follows the rule "keep all bins that overlap with the value range".
I think the clean way of introducing that is via a special slice-like object. I suggest `value_range` as a placeholder for a proper name. You would use it instead of `slice` inside `__getitem__`. `bh.sum` and `bh.rebin` and friends are supported as an optional third argument. `value_range` can be transformed internally into an ordinary `slice`.
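As a rough illustration of the "keep all bins that overlap with the value range" rule and of the internal transformation into an ordinary slice, here is a minimal NumPy sketch. The helper `value_range_to_slice` is a hypothetical name, not part of boost-histogram. It assumes sorted bin edges, half-open bins `[edges[i], edges[i+1])`, and a half-open value range `[lo, hi)`, and it ignores flow bins entirely.

```python
import numpy as np


def value_range_to_slice(edges, lo, hi):
    """Return a slice over bin indices whose bins overlap [lo, hi).

    Hypothetical helper: bin i covers [edges[i], edges[i + 1]).
    """
    edges = np.asarray(edges, dtype=float)
    # First bin whose upper edge lies strictly above lo.
    start = int(np.searchsorted(edges[1:], lo, side="right"))
    # One past the last bin whose lower edge lies strictly below hi.
    stop = int(np.searchsorted(edges[:-1], hi, side="left"))
    return slice(start, max(start, stop))


edges = [0.0, 1.0, 2.0, 3.0, 4.0]
counts = np.array([10, 20, 30, 40])

sel = value_range_to_slice(edges, 0.5, 2.5)
print(sel)                # bins 0, 1 and 2 all overlap [0.5, 2.5)
print(counts[sel].sum())  # inclusive selection, then sum
```

Because the result is a plain `slice`, any third-argument action (summing, rebinning) could in principle be applied afterwards in the same way as for an index-based slice; how flow bins should be treated is left open here.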