perf!: use a priority queue #104
Instead of having one
The big advantage of the One drawback I can think of is that the evaluated priority is not re-evaluated when gaining new knowledge after picking new dependencies (is it?). If I understood correctly, though, any backtrack clears the prioritized potential packages, so the drawback is not that problematic. I didn't fully understand, though, the play with indices and decision levels, or exactly how you choose to update the packages in the priority list. So I have one main question after reading this. How much of this idea can be implemented without changing the current API of By the way, the changes in |
The result is that we can efficiently update the
I don't think so. To get the perf benefits |
7f15520 to 3852caa
Theoretically if there are never two things with the same |
That is great! And it is indeed a little weird that 4 crates had a different output. |
One objection to this PR is that there are algorithms that can be implemented by

    let v: Vec<_> = packages.collect();
    let mut hasher = FxHasher::default();
    // Fold every (package, range) pair into a single hash.
    v.iter().for_each(|(p, r)| (p.borrow(), r.borrow()).hash(&mut hasher));
    let hash = hasher.finish() as usize;
    // Pick an entry based on nothing but that hash.
    let (p, r) = &v[hash % (v.len() - 1)];

or

    let v: Vec<_> = packages.collect();
    let has_a = v.iter().find(|(p, _)| p == package_named_a);
    let has_b = v.iter().find(|(p, _)| p == package_named_b);
    let has_c = v.iter().find(|(p, _)| p == package_named_c);
    // The preferred package depends on which other packages are present.
    if has_a.is_some() && has_b.is_some() {
        return has_b;
    } else if has_b.is_some() && has_c.is_some() {
        return has_c;
    } else if has_a.is_some() && has_c.is_some() {
        return has_a;
    } else {
        return v.last();
    }

These examples do prove that the new API is limiting in some ways. But they are also chaotic and incoherent; they are not examples of someone doing something reasonable or justified. So are there reasonable examples, ones where someone could explain why they are writing the code? I don't think so, mostly because I cannot think of any. But I would not be surprised if the argument from this video can be adapted to prove that all implementations which do not have a consistent prioritization are internally contradictory. |
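For contrast with the two arbitrary selectors above, here is a minimal sketch of what a consistent prioritization could look like: each package gets a score that depends only on that package, and the highest score wins. This is not the code from this PR. The names `prioritize` and `pick_next`, and the "fewest candidate versions first" heuristic, are assumptions made up for illustration.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical per-package priority: a package with fewer candidate
/// versions gets a higher priority (`Reverse` flips the ordering).
fn prioritize(candidate_versions: usize) -> Reverse<usize> {
    Reverse(candidate_versions)
}

/// Picks the next package by scoring each one independently and
/// popping the maximum from a max-heap. Ties are broken by name.
fn pick_next<'a>(packages: &[(&'a str, usize)]) -> Option<&'a str> {
    let mut heap: BinaryHeap<(Reverse<usize>, &str)> = packages
        .iter()
        .map(|&(name, count)| (prioritize(count), name))
        .collect();
    heap.pop().map(|(_, name)| name)
}
```

Calling `pick_next(&[("a", 12), ("b", 1), ("c", 4)])` selects `"b"`, and that choice does not change when unrelated packages are added or removed, which is the property the hash-based and pairwise examples lack.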
5c072c7 to 6863f93
rebased. |
This assumes that the priority computed by the "utility function" (as the video calls it) has access to all the parts of the world state it needs until we ask for it to be recomputed. That is not the case: it's a tradeoff of when we ask for package priorities to be recomputed. |
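The tradeoff can be illustrated with a small sketch. It assumes a queue that evaluates a priority once at insertion and only re-evaluates when explicitly asked (for example after a backtrack). The type `CachedPriorities` and its methods are hypothetical and are not the data structure used in this PR.

```rust
use std::collections::BinaryHeap;

/// Hypothetical queue that stores each priority at insertion time.
struct CachedPriorities {
    heap: BinaryHeap<(u32, String)>,
}

impl CachedPriorities {
    fn new() -> Self {
        Self { heap: BinaryHeap::new() }
    }

    /// Evaluates the priority once and stores the result.
    fn insert(&mut self, package: &str, prioritize: impl Fn(&str) -> u32) {
        self.heap.push((prioritize(package), package.to_string()));
    }

    /// Drops every stored priority and evaluates it again.
    fn recompute(&mut self, prioritize: impl Fn(&str) -> u32) {
        let packages: Vec<String> = self.heap.drain().map(|(_, p)| p).collect();
        for p in packages {
            let priority = prioritize(p.as_str());
            self.heap.push((priority, p));
        }
    }

    /// Returns the package with the highest stored priority.
    fn peek(&self) -> Option<&str> {
        self.heap.peek().map(|(_, p)| p.as_str())
    }
}
```

Between calls to `recompute`, `peek` keeps answering from the stored values even if the priority function would now return something different. Recomputing more often gives fresher priorities at the cost of more calls to the priority function.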
I thought this PR needed rebasing but it seems ok actually. What would be nice is having better documentation of the new process detailing under which conditions a package priority is evaluated and re-evaluated, and finding the best suited place in doc comments to put these explanations. @Eh2406 once you've added more doc and feel this is ready for review, let me know and I'll make a more thorough reading and review. |
6863f93 to c931a9b
Thanks for the new doc comments! I'll start reviewing and making some naming changes during this week. Is it ok with you @Eh2406, or do you still want to apply some changes here beforehand to avoid conflicts? |
I think it's in good shape for review. |
86f7568 to a00c3a6
a00c3a6 to a4f0f9b
a4f0f9b to 20233c7
I will try to get this updated and cleaned up soon. But I would like to get away from long-lived branches. If there are problems with this, or ways it can be improved (and there are), let's merge it as is and open issues to follow up. |
20233c7 to 1663bcd
While reviewing and rebasing the code today it's clear that one complicated part of this PR is the many clever things I have done with the fact that |
I migrated our project over to this branch without much trouble. I think the separation in the API is actually pretty nice, even ignoring the potential performance benefits. |
Just a thought that came to mind. The priority list is a nice way of knowing which concurrent requests to prioritize, if we are to enable concurrent dependency provider requests while waiting for the next solver's demand. Also, since the solver would always re-ask the dependency provider for an updated priority when backtracking happens, it also gives a way for the dependency provider to know which queued requests should be cancelled. Right? |
The API in this PR so far allows a pretty good approximation of this. If a request has already been cached it gets a high priority; if the request has not yet returned it gets a low priority. The resolver will work through all of the known results before it asks for an unknown result. Unfortunately, when the resolver runs out of known results it will pick an item at random to block on, which may or may not have completed in the meantime. This could be improved by providing a way for a command to say "by the way could we reprioritize x" (maybe something on should cancel?) or a way to say "if the highest priority is below x then reprioritize everything". I will look into whether there is a dependency queuing implementation that handles the async -> conversion for us.
Theoretically yes. It would be more ergonomic if we provided a reliable way for the dependency provider to know that we backtracked. Using PubGrub-rs with async code is not intuitive under 0.2, nor dev, nor this PR. Let's not let the conversation about how to make the async situation better block merging this PR. |
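The "cached results first" approximation described in this comment could be sketched roughly as follows. This is an illustration under assumptions, not the API of this PR: the `Priority` enum, `prioritize`, and `resolution_order` are hypothetical names, and the cache is modelled as a plain set of package names whose requests have already returned.

```rust
use std::collections::{BinaryHeap, HashSet};

/// Hypothetical priority levels. Derived `Ord` follows declaration order,
/// so `Cached` compares greater than `Pending`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    Pending,
    Cached,
}

/// A package whose request has already returned gets the high priority.
fn prioritize(package: &str, cache: &HashSet<&str>) -> Priority {
    if cache.contains(package) {
        Priority::Cached
    } else {
        Priority::Pending
    }
}

/// Returns the order in which a max-heap would hand out the packages.
fn resolution_order<'a>(
    packages: &[&'a str],
    cache: &HashSet<&str>,
) -> Vec<(&'a str, Priority)> {
    let mut heap: BinaryHeap<(Priority, &str)> = packages
        .iter()
        .map(|&p| (prioritize(p, cache), p))
        .collect();
    let mut order = Vec::new();
    while let Some((priority, package)) = heap.pop() {
        order.push((package, priority));
    }
    order
}
```

With a cache containing only `"b"`, the packages `["a", "b", "c"]` come out with `"b"` first and the two pending ones afterwards. The order among the pending packages here is just the tie-break on the name, which mirrors the weakness noted above: once the known results run out, the choice of what to block on is essentially arbitrary.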
I don't see anything that should be blocking this PR. Great job guys!
I'll use my superpowers to press the "Approve" button.
1663bcd to c3a4f8c
BREAKING CHANGE: Changes the API of DependencyProvider
c3a4f8c to 7cf095e
I took the liberty of rebasing and fixing conflicts. Feel free to merge if you are satisfied @Eh2406 |
I'm going to try this new merge queue thing! If/when people find problems, we can open issues and get them addressed! |
BREAKING CHANGE: Changes the API of DependencyProvider
Still WIP: it needs a lot of documentation and some thinking about the ergonomics of the API. For example, what should be able to return an error? How will this work with pubgrub-rs/advanced_dependency_providers#6?
This is a big perf win on larger benchmarks. Some previous discussion:
For example, generating a ron file from all crates in a snapshot of Crates.io went from 21,648s with v0.2.1 to 3,209s with this branch.