Once the scheduler routes a task to a worker, it deletes the task from its own memory. In this situation:
if the worker dies, the task should be marked as failed and the result sent back with the exception WorkerDied
when rebalancing tasks, the scheduler should issue TaskCancel with retrive_task_flag=True; once it gets the task back from the worker, it can schedule it to another worker (this might add some overhead if the task is big)
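To make the rebalancing step concrete, here is a minimal sketch of the cancel-and-retrieve round trip. It is not scaler's actual code: `Task`, `Worker`, `rebalance`, and the shape of `TaskCancel` are hypothetical stand-ins, and only the `retrive_task_flag` name is taken from the proposal above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Task:
    # Hypothetical task record; the real message layout may differ.
    task_id: str
    payload: bytes


@dataclass
class TaskCancel:
    # Hypothetical message shape; flag spelling follows the proposal text.
    task_id: str
    retrive_task_flag: bool = False


@dataclass
class Worker:
    name: str
    queue: Dict[str, Task] = field(default_factory=dict)

    def on_task(self, task: Task) -> None:
        self.queue[task.task_id] = task

    def on_cancel(self, cancel: TaskCancel) -> Optional[Task]:
        # Drop the queued task; hand it back only if the scheduler asked for it.
        task = self.queue.pop(cancel.task_id, None)
        return task if cancel.retrive_task_flag else None


def rebalance(busy: Worker, idle: Worker, task_id: str) -> bool:
    """Move one queued task from busy to idle without a scheduler-side copy."""
    task = busy.on_cancel(TaskCancel(task_id, retrive_task_flag=True))
    if task is None:
        return False  # task already finished, or unknown to the busy worker
    idle.on_task(task)  # the full payload travels again: the overhead for big tasks
    return True
```

The scheduler never holds the payload for longer than the hand-off, which is the point of the mode; the cost is one extra transfer of the task content per rebalanced task.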
I closed #23 because I cannot mark it as a duplicate issue.
Currently, the scaler scheduler keeps the task object in memory, so if a worker dies, the scheduler can allocate the task to another worker. For memory efficiency, we need a mode in which the scheduler does not keep the task once it has been sent to a worker. The behavior differs between the two modes:
keep task:
when a task fails because its worker got disconnected, the scheduler reassigns it to another worker
when balancing tasks, the scheduler only needs the task ids from the busy worker and sends those tasks to other workers
do not keep task:
when a task fails because its worker got disconnected, the scheduler just returns a failed result to the Client
when balancing tasks, the scheduler does not have the task content at all, so it asks busy workers to return not only the task ids but also the task contents, so that it can reschedule them to other workers
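The disconnect handling in the two modes could look roughly like the sketch below. Everything here is an assumption for illustration: the `Scheduler` class, the `keep_tasks` switch, and `on_worker_disconnected` are hypothetical names, and `WorkerDied` is modeled as a plain exception.

```python
from typing import Dict, List, Optional, Tuple


class WorkerDied(Exception):
    """Hypothetical exception carried in a failed result."""


class Scheduler:
    def __init__(self, keep_tasks: bool) -> None:
        self.keep_tasks = keep_tasks
        self.assigned: Dict[str, str] = {}   # task_id -> worker name
        self.stored: Dict[str, bytes] = {}   # filled only in "keep task" mode
        self.results: List[Tuple[str, Optional[Exception]]] = []

    def submit(self, task_id: str, payload: bytes, worker: str) -> None:
        self.assigned[task_id] = worker
        if self.keep_tasks:
            self.stored[task_id] = payload  # second copy, next to the worker's

    def on_worker_disconnected(self, worker: str, fallback: str) -> None:
        for task_id, owner in list(self.assigned.items()):
            if owner != worker:
                continue
            if self.keep_tasks:
                # The payload is still in self.stored, so reassigning is enough.
                self.assigned[task_id] = fallback
            else:
                # Nothing left to resend: report the failure to the client.
                del self.assigned[task_id]
                self.results.append((task_id, WorkerDied(worker)))
```

In "do not keep task" mode the scheduler only tracks which worker owns each task id, so a lost worker means a lost task unless the client resubmits it.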
The scheduler currently stores each task's data until the task finishes. This roughly doubles the memory usage, as the same data is stored by the worker too.
We would like to add a new mode to the scheduler in which it does not store the task data.
This causes two issues: