-
Notifications
You must be signed in to change notification settings - Fork 178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Set/Map balance calls faster #1053
Comments
Interesting, and clever. I think you're right that slowing down the slow path is generally okay because it's relatively rare. How does this affect sets with one or two elements? |
insDelTiny :: Int -> S.Set Int
insDelTiny n = foldl' f (S.fromList [0,1]) (take n ops)
where
f s (Add x) = S.insert x s
f s (Del x) = S.delete x s
ops = cycle [Del 0, Add 0, Del 1, Add 1]
data Op = Add !Int | Del !Int
|
Looks pretty darn good then. |
Do you mean inlining the whole thing? We're not getting improvements on the small sets if we only inline the Bin-Bin branch. |
We're not slowing them down, so I think we're good to do the partial inlining. We'd need more code size stats to justify inlining everything, I imagine. |
Awesome, I'll benchmark and see what else gets affected.
How would one gather stats like that to make the decision? |
Very good question. Maybe that was a bad way to put it.... The most important effect of code size is on caching, so benchmarking more complex algorithms that use sets should give us useful info. |
I see, we should try adding such algorithms at some point then. Right now we don't really have complex benchmarks. |
I've been looking at #980 which improves insertion times by ~22% in the no-rebalance case. This made me wonder: why is rebalancing slow?
Let's look at
Set
because it is simpler.Map
is similar.From the code, we can see that
balanceL
andbalanceR
are marked asNOINLINE
.containers/containers/src/Data/Set/Internal.hs
Lines 1885 to 1906 in 9b1d9d4
The
NOINLINE
s were added in b96ffeb citing code bloat issues. I think that's reasonable,balanceL
andbalanceR
are moderately large functions. But let's try inlining them to see what happens.We'll use the existing
insert
benchmark, which is pretty good for this purpose. It repeatedly inserts a maximum element into the set. This is close enough to typical usage (repeated insertion) but also a bad case (because it always makes one branch heavier). Benchmarks use GHC 9.6.6.That's a significant improvement, but it might come at the cost of code bloat.
But, there is a compromise:
Most calls to
balanceR
will be on already balanced trees. Performing rebalancing is the rare case. Using some unsafe IO, I gathered some counts to show this:Putting it another way, the number of calls to
balanceR
is O(n log n), but the number of rebalancings is O(n) (amortized O(1) per insert).Which brings me to the suggestion: only inline the fast no-rebalance branch of
balanceR
. And to make that smaller, only the no-rebalance Bin-Bin branch.With this, we get
TL;DR:
To make balanceL/R calls faster, we can
Edit: I should mention that I expect many Set/Map functions will benefit from this, but at the moment I have tested out the idea only on
insert
.The text was updated successfully, but these errors were encountered: