[data] add backpressure reason #48009
base: master
a0e9072 to 9f8f680
Can you add a PR description? Ideally, also add some screenshots. Thanks.
d7fbffb to 557d43f
The reason string content generally makes sense to me. My main concern is that the progress bar output will become very verbose and difficult to read. Also, the notation with [...] To address both of these points, I think we should make this an advanced feature that is disabled by default. The user should enable the backpressure reason display through DataContext, e.g. [...] (Ray Data doesn't explicitly truncate the stats output for each operator; it only truncates the operator name if it is too long.)
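The opt-in behavior requested above could look roughly like the following. This is a minimal sketch, not the PR's implementation: `ContextSketch` is a stand-in for Ray Data's DataContext, and the flag name `show_backpressure_reason` and the helper `format_op_status` are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ContextSketch:
    """Stand-in for DataContext; the flag name below is hypothetical."""

    show_backpressure_reason: bool = False  # advanced feature, off by default


def format_op_status(
    name: str, in_backpressure: bool, reason: str, ctx: ContextSketch
) -> str:
    """Build the per-operator status text shown next to the progress bar."""
    if not in_backpressure:
        return name
    status = f"{name} [backpressured]"
    # Append the (potentially long) reason only when the user opted in,
    # so the default progress bar output stays short.
    if ctx.show_backpressure_reason and reason:
        status += f" ({reason})"
    return status
```

With the default context the reason is omitted; passing `ContextSketch(show_backpressure_reason=True)` appends it in parentheses.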
37730b1 to 55e6abc
Thank you for your review. I have made the changes according to your suggestions.
7720ed9 to a198e13
self._in_task_submission_backpressure = False
self._in_task_submission_backpressure_reason = ""
self._in_task_output_backpressure = False
self._in_task_output_backpressure_reason = ""
Let's wrap this into a new structure like:
self._tasks_state = TaskState(...)

class TaskState:
    submission_throttled: bool
    submission_throttled_reason: str = ...
    # ...
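A runnable version of the structure proposed in this comment might look like the sketch below. Only `TaskState`, `submission_throttled`, and `submission_throttled_reason` come from the comment; the `output_throttled*` fields (mirroring the two output-backpressure attributes in the diff) and the helper methods are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    """Groups an operator's backpressure flags with their reasons."""

    submission_throttled: bool = False
    submission_throttled_reason: str = ""
    # Assumed fields, mirroring the output-backpressure attributes in the diff.
    output_throttled: bool = False
    output_throttled_reason: str = ""

    def throttle_submission(self, reason: str) -> None:
        # Hypothetical helper: set the flag and its reason together
        # so they cannot drift apart.
        self.submission_throttled = True
        self.submission_throttled_reason = reason

    def clear_submission(self) -> None:
        # Hypothetical helper: reset both the flag and the reason.
        self.submission_throttled = False
        self.submission_throttled_reason = ""
```

Keeping each flag next to its reason in one object replaces four loosely related attributes on the operator with a single `self._tasks_state` field.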
done
e281f8d to 5e17527
1827029 to 0191e5e
Signed-off-by: jukejian <[email protected]>
9fd90d5 to 1a86bbd
Background
The concurrency of Ray Data operators is mostly set manually, because automatic tuning often fails to achieve good resource utilization. However, manually set concurrency easily triggers backpressure (rate limiting), and it is currently difficult to find out why an operator is being throttled.
Backpressure trigger conditions
There are a total of four trigger conditions:
Backpressure enhancement
Backpressure observability enhancement
Before enhancement:
After enhancement
Concurrent backpressure policy
Backpressure in non-reserved mode
Backpressure in reserved mode
Insufficient free slots of actor
Why are these changes needed?
Related issue number