searchsorted (and variants) do not check equality? #44102
Comments
Just for posterity, the reason for this is that
julia> isless(-0.0, 0.0)
true
julia> -0.0 < 0.0
false
Clearly, we need a total order for sorting an array; otherwise we get a nondeterministic result back.
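To make the consequence for `searchsorted` concrete, here is a minimal sketch, assuming the default ordering is `isless` as described above. The variable names `v`, `r_default` and `r_lt` are only for illustration.

```julia
# isless is a total order in which -0.0 sorts strictly before 0.0,
# whereas < and == follow IEEE 754 and treat the two zeros as equal.
@show isless(-0.0, 0.0)
@show -0.0 < 0.0
@show -0.0 == 0.0

v = [0.0]

# Default ordering (isless): -0.0 belongs strictly before v[1],
# so the range of "matching" indices is empty.
r_default = searchsorted(v, -0.0)

# With lt = <, neither zero is less than the other, so v[1] is a match.
r_lt = searchsorted(v, -0.0; lt = <)

@show r_default isempty(r_default)
@show r_lt
```

So the empty result comes from the ordering used for the comparison, not from a missing equality check as such.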
as I have said in Slack,
|
Did you mean "strict" order? In any case: when I read "which compare as equal to All I'm saying is that, if there's room to make the docs a little clearer, it may avoid other people running into a similar issue. |
I agree with @mtanneau that the mention of "or equal" in the docstring is incredibly confusing. This is made even worse by the fact that the default value for IIUC "less than or equal" actually means "not greater than according to @mtanneau Would you feel like making a PR to propose clarifications? This has been reported before at #35179. It also caused a bug in the StatsBase implementation of histograms (JuliaStats/StatsBase.jl#766). |
I just mean it semantically, whatever "less than" definition you use (plug in different But if you use |
It seems to be related to (kind of duplicate of) #11429. The behavior is undefined for ill-behaved user-specified comparisons, and Julia may even crash during sorting. |
Yes, I'll try and come up with something this week |
I think that checking equality is a good idea. |
This also came up in JuliaMath/IntervalSets.jl#111. The
# doesn't find -0.0, but fast:
julia> @btime searchsorted(-1000.0:1:1000, -0.0)
2.310 ns (0 allocations: 0 bytes)
1001:1000
# finds -0.0, but much slower:
julia> @btime searchsorted(-1000.0:1:1000, -0.0; lt=<)
87.270 ns (0 allocations: 0 bytes)
1001:1001
Would be nice to check equality by default, or make the |
No behavior change, just a performance improvement: searchsorted(range, lt=<) now performs exactly as fast as searchsorted(range). Passing lt=< allows searchsorted to find floating point values such as -0.0, so it's useful to make it fast - see #44102 (comment).
The following behaviors have been causing bugs in some code of mine:
In all cases I encountered, I could get away with it by using searchsortedXXXX(args... ; lt=<).
However, assuming the above behavior is what's intended, I believe there's an inconsistency in the docs.
For instance, for searchsorted:
It's hard to believe that [0.0] does not contain any value equal to 0.0 😅
If it's a bug, I probably can't fix it, but if it's intended and clarifying the docs would be satisfactory, then I'm happy to do so.
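For reference, the `lt=<` workaround can be sketched across the three variants like this. It is only an illustration: the vector `v` and query `x` are made-up inputs, and `<` is not a total order once `NaN` is involved, so this assumes the data contains no `NaN`.

```julia
v = [-1.0, 0.0, 1.0]   # sorted vector containing positive zero (illustrative data)
x = -0.0               # query value: negative zero

# Default ordering (isless): -0.0 sorts strictly between -1.0 and 0.0.
first_default = searchsortedfirst(v, x)
last_default  = searchsortedlast(v, x)
range_default = searchsorted(v, x)

# Workaround from this issue: compare with < so both zeros count as equal.
first_lt = searchsortedfirst(v, x; lt = <)
last_lt  = searchsortedlast(v, x; lt = <)
range_lt = searchsorted(v, x; lt = <)

@show first_default last_default range_default
@show first_lt last_lt range_lt
```

With the default ordering the returned range is empty, even though `v[2] == x` holds; with `lt = <` the range covers index 2.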