Fix yielding in progress loop #278
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@ Coverage Diff @@
## main #278 +/- ##
==========================================
+ Coverage 57.21% 57.26% +0.04%
==========================================
Files 69 69
Lines 9740 9762 +22
Branches 1256 1256
==========================================
+ Hits 5573 5590 +17
- Misses 3399 3404 +5
Partials 768 768
☔ View full report in Codecov by Sentry.
Thanks, will look at this soon, just wanted to comment quickly on the discussion. If the progress loop is configured to run on an ES that must multiplex between two pools, then the only choice is between two (not ideal) configurations: a) setting the progress timeout to 0, so the progress loop effectively busy-spins, or b) keeping a nonzero timeout, which can delay the other pool by up to that timeout. There are definitely use cases for taking a busy-spin approach, but that can be accomplished in any configuration by setting the progress timeout; it doesn't require setting up multiple pools to share an execution stream.
In #279 I have added a warning that the configuration will cause unnecessary delays if any pool is sharing an ES with the progress pool. If that is something a user is comfortable with (e.g. if they have set the progress timeout to 0 to busy-spin), they can simply ignore the warning. I prefer this to the solution of setting the timeout to 0 whenever there is another pool used by the same ES.
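A minimal sketch of what such a warning check could look like, assuming the progress pool's execution stream handle is available; the function name and message are illustrative and not taken from the actual #279 change:

```c
#include <abt.h>
#include <stdio.h>

/* Illustrative check (not the actual #279 code): warn if the ES that runs
 * the progress pool also schedules other pools, since a nonzero progress
 * timeout would then delay ULTs waiting in those other pools. */
static void warn_if_progress_es_is_shared(ABT_xstream progress_es)
{
    int num_pools = 0;
    if (ABT_xstream_get_num_pools(progress_es, &num_pools) != ABT_SUCCESS)
        return;

    if (num_pools > 1) {
        fprintf(stderr,
                "Warning: the progress pool shares an execution stream with %d "
                "other pool(s); ULTs in those pools may wait up to the progress "
                "timeout (set the timeout to 0 if busy-spinning is acceptable).\n",
                num_pools - 1);
    }
}
```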
- remove redundant yield that occurs after progress in some conditions
- refactored logic slightly for calculating timeout so that there is only one progress call instead of a special path for zero timeout
- moved yield from within internal progress call to main progress loop, but after triggers, to avoid yielding when the progress loop has completions ready to handle
Check my revision and see if it makes sense. The tests still pass for me with this configuration. I started to comment that there was a superfluous yield call one level up the call stack in some cases, but then I realized we could probably consolidate the calls a little to help avoid that sort of thing, and I also suggest putting the remaining yield after trigger instead of after progress (mostly to avoid splitting up the progress/trigger path when there are completions to handle).
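A minimal sketch of the loop shape these changes describe; the names (progress_loop_sketch, other_pools_share_es, DEFAULT_TIMEOUT_MS) are placeholders and the real Margo loop carries more state and error handling, but the single HG_Progress call, the trigger pass, and the yield placed after the triggers follow the description above:

```c
#include <abt.h>
#include <mercury.h>

#define DEFAULT_TIMEOUT_MS 100 /* placeholder for the configured timeout */

/* Placeholder for whatever condition the real code uses to decide that the
 * scheduler should get control back quickly (e.g. other pools on this ES). */
static int other_pools_share_es(void) { return 1; }

static void progress_loop_sketch(hg_context_t *hg_context, const int *shutdown_flag)
{
    while (!*shutdown_flag) {
        /* One progress call with a computed timeout, instead of a special
         * code path for the zero-timeout case. */
        unsigned int timeout = other_pools_share_es() ? 0 : DEFAULT_TIMEOUT_MS;
        hg_return_t ret = HG_Progress(hg_context, timeout);
        if (ret != HG_SUCCESS && ret != HG_TIMEOUT)
            break;

        /* Handle any completions before yielding, so the progress/trigger
         * path is not split up when work is ready. */
        unsigned int count = 0;
        do {
            ret = HG_Trigger(hg_context, 0, 1, &count);
        } while (ret == HG_SUCCESS && count > 0);

        /* Single yield, after the triggers. */
        ABT_thread_yield();
    }
}
```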
committed a suggested modification
Looks good to me. I'm merging.
This PR fixes an issue that appears when we have 2 pools, with the progress loop running on the second one. If we are in a situation where the progress loop is the only ULT in its pool, it is never going to yield and never going to give the scheduler a chance to check the first pool again, even if that first pool has ULTs available to run.

The fix adds ABT_thread_yield right after HG_Progress so that we give the scheduler a chance to take over.

Note that while this fixes the starvation problem, it highlights another potential issue (performance related, this time): if the progress loop is alone in its pool, HG_Progress will be called with a timeout of 100ms, even though ideally we would want to call it with a timeout of 0 so that we can immediately yield and check other pools. We could alert the user to this behavior by displaying a warning if the progress pool is sharing an execution stream with other pools.
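For context, here is a rough Argobots-only sketch of the configuration under which the starvation shows up: one execution stream whose scheduler multiplexes two pools, with the progress ULT pushed to the second. This illustrates the pool/ES layout only, not how Margo actually builds it internally:

```c
#include <abt.h>

int main(void)
{
    ABT_init(0, NULL);

    /* Two pools served by a single extra execution stream. */
    ABT_pool pools[2];
    ABT_pool_create_basic(ABT_POOL_FIFO, ABT_POOL_ACCESS_MPSC, ABT_TRUE, &pools[0]);
    ABT_pool_create_basic(ABT_POOL_FIFO, ABT_POOL_ACCESS_MPSC, ABT_TRUE, &pools[1]);

    /* One scheduler (and thus one ES) round-robining over both pools. */
    ABT_sched sched;
    ABT_sched_create_basic(ABT_SCHED_BASIC, 2, pools, ABT_SCHED_CONFIG_NULL, &sched);

    ABT_xstream xstream;
    ABT_xstream_create(sched, &xstream);

    /* RPC handler ULTs would go to pools[0] and the progress ULT to pools[1];
     * without a yield in the progress loop, the scheduler running on this ES
     * never gets back to pools[0]. */

    ABT_xstream_join(xstream);
    ABT_xstream_free(&xstream);
    ABT_finalize();
    return 0;
}
```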