Prevent the scheduler from storing task data #15

rafa-be · 2024-09-03T15:21:49Z

The scheduler currently stores each task's data until the task finishes. This roughly doubles memory usage, because the worker holds a copy of the same data.

We would like to add a new mode to the scheduler in which it does not store the task data.

This raises two issues:

- If the scheduler wants to rebalance a task, it has to request the task data from the worker.
- If the worker dies, the scheduler cannot reschedule the task and should return a task failure exception to the client.

sharpener6 · 2024-09-09T20:35:03Z

It should have 2 modes:

- Store the task in the scheduler.
- Delete the task from the scheduler's memory once it has been routed to a worker. In this mode:
  - if the worker dies, the task should be marked as failed and the result sent back with the exception WorkerDied;
  - when rebalancing tasks, the scheduler should issue TaskCancel with retrive_task_flag=True; once it gets the task back from the worker, it can schedule it to another worker (this might add some overhead if the task is big).
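A minimal sketch of how the two modes could differ when a worker dies. Everything here is hypothetical and only illustrates the idea: the `Scheduler` class, its `keep_task` switch, and the method names are assumptions, not scaler's actual API. Only the exception name `WorkerDied` comes from the proposal above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class WorkerDied(Exception):
    """Hypothetical exception sent back to the client when a task is lost."""


@dataclass
class Scheduler:
    """Toy scheduler (assumed names, not scaler's real classes)."""

    keep_task: bool  # True: store task data; False: drop it after routing
    # task_id -> payload, or None when the payload was dropped after routing
    tasks: Dict[str, Optional[bytes]] = field(default_factory=dict)
    # task_id -> name of the worker currently holding the task
    assigned: Dict[str, str] = field(default_factory=dict)
    # task_id -> exception to report back to the client
    failed: Dict[str, Exception] = field(default_factory=dict)

    def route(self, task_id: str, payload: bytes, worker: str) -> None:
        """Send a task to a worker, keeping the payload only in keep-task mode."""
        self.assigned[task_id] = worker
        self.tasks[task_id] = payload if self.keep_task else None

    def on_worker_died(self, dead_worker: str, fallback_worker: str) -> None:
        """Reassign or fail every task that was held by the dead worker."""
        for task_id, worker in list(self.assigned.items()):
            if worker != dead_worker:
                continue
            if self.keep_task:
                # The payload is still in scheduler memory: reassign it.
                self.assigned[task_id] = fallback_worker
            else:
                # The payload is gone with the worker: fail the task.
                del self.assigned[task_id]
                del self.tasks[task_id]
                self.failed[task_id] = WorkerDied(f"{dead_worker} died")
```

The trade-off is visible in `route`: the no-keep mode stores `None` instead of the payload, which saves memory but leaves `on_worker_died` with nothing to reschedule.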

rafa-be · 2024-09-30T13:40:46Z

Isn't there some overlap with issue #23?

sharpener6 · 2024-10-10T20:17:33Z

@rafa-be

I closed #23, since I cannot mark it as a duplicate.

Currently, the scaler scheduler keeps the task object in memory, so if a worker dies, the scheduler can allocate the task to another worker. For memory efficiency, we need a mode in which the scheduler does not keep the task once it has been sent to a worker. The behavior then differs between the two modes:

Keep task:

- When a task fails because its worker got disconnected, the scheduler reassigns it to another worker.
- When balancing tasks, the scheduler only needs the task IDs from the busy worker and sends those tasks to other workers.

Do not keep task:

- When a task fails because its worker got disconnected, the scheduler just returns a failed result to the client.
- When balancing tasks, the scheduler has no task content at all, so it asks busy workers to return not only the task IDs but also the task contents, so that it can reschedule them to other workers.
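The balancing difference could look roughly like the sketch below. It is an illustration under assumed names: `Worker`, `cancel`, `return_payload`, and `rebalance` are hypothetical stand-ins, not scaler's real classes or message types.

```python
from typing import Dict, List, Optional


class Worker:
    """Hypothetical worker holding queued tasks as task_id -> payload."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.queue: Dict[str, bytes] = {}

    def cancel(self, task_id: str, return_payload: bool) -> Optional[bytes]:
        """Remove a queued task; hand its payload back only when asked to."""
        payload = self.queue.pop(task_id)
        return payload if return_payload else None


def rebalance(
    busy: Worker,
    idle: Worker,
    task_ids: List[str],
    stored: Optional[Dict[str, bytes]],
) -> None:
    """Move the given tasks from the busy worker to the idle one.

    `stored` is the scheduler's own task_id -> payload map in keep-task
    mode, or None when the scheduler does not keep task data.
    """
    for task_id in task_ids:
        if stored is not None:
            # Keep-task mode: the task ID is enough, the payload is local.
            busy.cancel(task_id, return_payload=False)
            payload = stored[task_id]
        else:
            # No-keep mode: the busy worker must ship the payload back.
            payload = busy.cancel(task_id, return_payload=True)
        idle.queue[task_id] = payload
```

In the no-keep branch each moved task crosses the network twice (worker to scheduler, then scheduler to the other worker), which is the overhead mentioned earlier for big tasks.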

rafa-be self-assigned this Sep 3, 2024

sharpener6 added the enhancement label (New feature or request) Sep 27, 2024
