Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Julia master is on LLVM 16, so no real rush here.
The
!unpredictable
should stop the cmov converter from converting the cmov into a branch.This PR is a follow up to #10
Basically, that PR shows that disabling the cmov-converter yields a huge performance boost here. The cmov-converted converts branch-free
cmov
s into branches, following the logic that if branch prediction rates are good, the branches are faster.In real world use, binary search's branches are totally unpredictable: for each split, the chance is 50/50 whether the answer is in the upper or lower half.
Note, however, that if you have a microbenchmark where you're searching for the same number in the same vector over and over again, the branch predictor can probably memorize that exact sequence, so you'll need to do much more rigorous benchmark to replicate real-world use, where you're not searching for the same number in the same vector every time (if you are doing that, stop! Calculate it once, and save the answer!).
Because of that, we really do want cmov, which the README benchmarks added in #10 demonstrate: we go from roughly 85 microseconds to just 26 microseconds via using
cmov
!This PR adds unpredictable metadata to the
select
statement.Using __builtin_unpredictable in Clang 17 or newer adds this metadata to the LLVM IR, and also causes
cmov
s to be preserved. So hopefully adding the metadata like this PR does will work for Julia, once we upgrade to LLVM 17.Converting
cmov
s to branches is a bad idea only when branches are hard to predict, so telling LLVM the branch/select is hard to predict is the ideal solution.