
perf: optimize term::relation_with() #59

Closed · wants to merge 1 commit

Conversation

@Eh2406 (Member) commented Nov 3, 2020

In #27 it was measured that we spend a lot of our runtime in intersections in relation_with. For that use case we do not need the exact list of versions in the intersection, just a ternary state. So this adds a function to Range to compute that directly.

Criterion thinks this is a ~20% improvement in the runtime of the benchmarks from #34.
This is mostly from skipping the cache misses and the allocation of making a new Range, but also because we can short-circuit when we see a Relation::Inconclusive.
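A minimal sketch of the idea (hypothetical names and types, not the actual pubgrub API): classify how a set of candidate versions relates to an allowed set in one pass, short-circuiting as soon as the answer is known to be inconclusive, instead of allocating a full intersection Range and inspecting it afterwards.

```rust
// Hypothetical simplified model: versions are plain u32 and the
// allowed set is any predicate, standing in for a real Range.
#[derive(Debug, PartialEq)]
enum Relation {
    Satisfied,    // every candidate version is allowed
    Contradicted, // no candidate version is allowed
    Inconclusive, // some but not all candidate versions are allowed
}

fn relation_with(candidate: &[u32], allowed: impl Fn(u32) -> bool) -> Relation {
    let mut any = false;
    let mut all = true;
    for &v in candidate {
        if allowed(v) {
            any = true;
        } else {
            all = false;
        }
        // Short-circuit: once we have seen both an allowed and a
        // disallowed version, the answer can no longer change.
        if any && !all {
            return Relation::Inconclusive;
        }
    }
    if all {
        Relation::Satisfied
    } else {
        Relation::Contradicted
    }
}
```

The real intersection-based path has to build and retain the whole intersection set; this one-pass form needs no allocation and can exit early, which is where the claimed savings come from.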

This still needs wordsmithing.

@mpizenberg (Member)
At a first glance, it seems to be a non-negligible readability/performance tradeoff. Would you mind waiting a bit for those kind of perf improvements? It also would complicate the promising Range trait approach I believe

@Eh2406 (Member, Author) commented Nov 3, 2020

> It also would complicate the promising Range trait approach I believe

Yes (cc #49), it would need to be added to the proposed trait. We could provide a default implementation that uses intersection as we do now, and users can override it when performance matters.
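Hypothetically (all names here are illustrative; the actual trait from #49 is still being designed), the default-plus-override pattern could look like this:

```rust
#[derive(Debug, PartialEq)]
enum Relation {
    Subset,       // self is entirely contained in other
    Disjoint,     // no overlap at all
    Inconclusive, // partial overlap
}

// Illustrative version-set trait, not the real proposed one.
trait VersionSet: Sized + PartialEq {
    fn intersection(&self, other: &Self) -> Self;
    fn is_empty(&self) -> bool;

    // Default: allocate the intersection, then compare. Implementors
    // can override this to compute the answer directly, without the
    // intermediate allocation.
    fn relation(&self, other: &Self) -> Relation {
        let inter = self.intersection(other);
        if &inter == self {
            Relation::Subset
        } else if inter.is_empty() {
            Relation::Disjoint
        } else {
            Relation::Inconclusive
        }
    }
}

// Toy implementation: a bitmask where bit i means "version i allowed".
#[derive(PartialEq)]
struct Mask(u64);

impl VersionSet for Mask {
    fn intersection(&self, other: &Self) -> Self {
        Mask(self.0 & other.0)
    }
    fn is_empty(&self) -> bool {
        self.0 == 0
    }
}
```

This keeps the trait easy to implement (only two required methods) while letting a performance-sensitive implementor replace `relation` wholesale.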

> Would you mind waiting a bit for those kind of perf improvements?

Yes, we can definitely wait. Let's get 0.2 out.

@Eh2406 force-pushed the fasster_relation branch 2 times, most recently from e1bd299 to 407c58e on November 15, 2020 03:43
@aleksator changed the title from "Faster relation" to "perf: optimize term::relation_with()" on Nov 15, 2020
@Eh2406 (Member, Author) commented Nov 16, 2020

Using the elm benchmark, the improvement was smaller than the noise; i.e. Criterion thought this was faster, but not statistically significantly so. So this should be put on the back burner for a while. When we find other ways to make the elm benchmark faster, this may pull its weight.

@Eh2406 (Member, Author) commented Feb 4, 2021

Running on top of #74, using the benchmarks from #75:

```
Benchmarking large_cases/elm-packages_str_SemanticVersion.ron: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 20.0s. You may wish to increase target time to 31.2s, or reduce sample count to 60.
large_cases/elm-packages_str_SemanticVersion.ron
                        time:   [310.00 ms 310.27 ms 310.55 ms]
                        change: [-6.6554% -6.5353% -6.4141%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
large_cases/large_case_u16_NumberVersion.ron
                        time:   [16.331 ms 16.344 ms 16.357 ms]
                        change: [-9.1269% -9.0116% -8.8980%] (p = 0.00 < 0.05)
                        Performance has improved.
```

So it now looks like an improvement. Can the code be cleaned up to make it worth it? It needs work, but we have something to talk about.

@mpizenberg (Member)

I'm personally still not very attracted to this change. Thanks to all the other changes, the perf improvement here is down to 9%, where it was roughly 20% at the beginning. It also makes the code more complex to read and to explain, and it no longer matches the expression in the guide documentation. Let's keep it around and focus on features important for cargo instead. Once we have something working for cargo, we can see if that perf improvement is still worth the change.

@Eh2406 force-pushed the fasster_relation branch 2 times, most recently from 7342ce5 to 8375f23 on February 15, 2021 18:34
@Eh2406 (Member, Author) commented Apr 20, 2021

I am closing; thanks to SmallVec, this is no longer the best perf/complexity change we can make.

@Eh2406 closed this Apr 20, 2021
@Eh2406 (Member, Author) commented Dec 7, 2023

In #157 (comment), large improvements were reported from reducing the calls to V.clone(), so at some point this optimization may be helpful to them.
