Queue up non-rootish tasks if they break priority ordering #7526

Closed · wants to merge 7 commits

Conversation

@fjetter (Member) commented Feb 8, 2023

First of all, I hate this, but it's a rather pragmatic solution that should only have any real effect if something bad happens.

Closes #7496

Queuing does not support any worker restrictions and can therefore break priority task ordering by assigning non-queued, lower-priority tasks to workers before it gets a chance to de-queue the queued-up tasks.

This is more or less a relic of how queuing is enabled, since it prefers scheduling non-queued tasks first. Basically, we're calling stimulus_queue_slots_maybe_opened after transitions, which causes the non-queued tasks to be transitioned first, before checking on queued tasks.

From this perspective, what I'm proposing here is a bit of an ugly workaround, but it's pretty straightforward. So far, I have only verified it on the actual P2P problem presented in #7496, and it works as expected. Will need to look into a test now.
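For illustration, here is a self-contained toy sketch of the idea (made-up names and priorities, plain Python rather than the actual scheduler code): before running a non-queued task, compare its priority against the head of the queue and queue it up instead if a queued task should go first.

import heapq

# Toy model, not the distributed scheduler API: the scheduler keeps a
# priority queue of root-ish tasks, and the proposal is to also push a
# non-queued task onto that queue whenever running it now would jump
# ahead of a higher-priority queued task.
queued = []  # min-heap of (priority, key); smaller priority runs first
heapq.heappush(queued, ((0, 1, 5), "rootish-a"))
heapq.heappush(queued, ((0, 1, 7), "rootish-b"))

def schedule_or_queue(priority, key):
    """Queue the task instead of processing it if a queued task should go first."""
    if queued and queued[0][0] < priority:
        heapq.heappush(queued, (priority, key))
        return "queued"
    return "processing"

print(schedule_or_queue((0, 1, 9), "downstream-task"))  # -> queued
print(schedule_or_queue((0, 1, 2), "urgent-task"))      # -> processing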

@gjoseph92 (Collaborator) left a comment

I think it's possible here that tasks that are queued but have restrictions could block up the queue.

Say 1 task slot opens up on worker A, so in stimulus_queue_slots_maybe_opened we peek the first task on the queue and transition it to processing. That task is restricted to only run on worker B, so we don't schedule it. But next in the queue is a task without restrictions. We should have run that immediately, but instead, worker A's thread will just remain unused. The restricted task remains at the front of the queue.

The next time a task slot opens (say on worker C), we'll peek 2 tasks, so the second, unrestricted task on the queue does get scheduled. But the first, restricted task stays at the front of the queue.

In general, we'll only schedule task_slots_available - num_unrunnable_queued_tasks tasks each time, when we should be scheduling task_slots_available tasks. The restricted tasks will slowly pile up at the front of the queue.

We'd have to switch from a peekn here to a linear search through the queue, like I mentioned in #7496 (comment). In that case, a HeapSet is probably the wrong data structure and we'd want a SortedSet instead. But also, when there are a lot of unrunnable tasks on the queue (which I think there would be in the shuffling case), an operation that's currently O(1) in the typical case would get more expensive. It could be worth implementing this and trying it out—maybe in reality the performance cost isn't so bad—but theoretically it does seem slower.
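A toy sketch of that linear-search alternative (made-up structures, not the scheduler's HeapSet or a real SortedSet): walk the priority-ordered queue and hand out the first task whose restrictions the free worker satisfies, skipping over restricted tasks instead of letting them block the head.

# Priority-ordered queue, smallest priority first.
# Each entry: (priority, key, allowed_workers); None means unrestricted.
queue = [
    ((0, 1), "restricted-task", {"worker-B"}),
    ((0, 2), "free-task", None),
]

def pick_task_for(worker, queue):
    """Linear scan: return the best task this worker may run, or None."""
    for i, (priority, key, allowed) in enumerate(queue):
        if allowed is None or worker in allowed:
            del queue[i]
            return key
    return None  # every queued task is restricted away from this worker

print(pick_task_for("worker-A", queue))  # -> "free-task"; the restricted one is skipped
print(pick_task_for("worker-B", queue))  # -> "restricted-task"

As noted above, keeping the queue in a structure that supports this kind of scan is what would push the typical cost above O(1).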

distributed/scheduler.py (review thread, outdated, resolved)
distributed/scheduler.py (review thread, resolved)
self.queued.discard(ts)
worker_msgs = self._add_to_processing(ts, ws)
# If no worker, task just stays `queued`
if self.is_rootish(ts):
@gjoseph92 (Collaborator) commented:

This if/else gives me flashbacks to #7280 / #7221 / #7262. Since we still have that inconsistency where is_rootish can change while the task is in the queue, I'd just want to think through the other possible reasons why ts might not be rootish here.

@fjetter (Member, Author) commented Feb 8, 2023

> The next time a task slot opens (say on worker C), we'll peek 2 tasks, so the second, unrestricted task on the queue does get scheduled. But the first, restricted task stays at the front of the queue.
>
> In general, we'll only schedule task_slots_available - num_unrunnable_queued_tasks tasks each time, when we should be scheduling task_slots_available tasks. The restricted tasks will slowly pile up at the front of the queue.

This change does not enable queuing for restricted tasks directly but only for lower priority tasks, i.e. the peek still works.

I think this argument is still valid but I suspect you'd need to write a pretty custom graph to actually trigger something like this.

@github-actions (bot) commented Feb 8, 2023

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

24 files ±0   24 suites ±0   10h 31m 58s ⏱️ +5m 58s
3,344 tests +7   3,231 ✔️ -2   105 💤 +1   8 ❌ +8
39,426 runs +84   37,519 ✔️ +48   1,873 💤 +2   34 ❌ +34

For more details on these failures, see this check.

Results for commit 7f6e80c. ± Comparison against base commit 63ae1db.

♻️ This comment has been updated with latest results.

@gjoseph92 (Collaborator) commented:

> This change does not enable queuing for restricted tasks directly but only for lower priority tasks, i.e. the peek still works.

Yes, the change does guarantee that there are some tasks ahead of the restricted task when it's inserted. But at some point all of those will have completed, and it'll end up at the front of the queue. At that point, the restricted task will take up space at the front of the queue for a while, until a slot opens on the right worker.

Unless I'm missing some mechanism that would pull the restricted tasks out from the middle of the queue and schedule them?

I imagine this works okay in the context of P2P. I'm more worried about other cases where people are already using restrictions. For example, if you have some huge 30-minute tasks that use resource restrictions to only run on a couple of workers, once those workers are full, other tasks with those restrictions could clog up the front of the queue for 30 minutes until a viable slot opens again.

The problem I'm describing is almost a scheduler corollary of #6136, which is kind of interesting.
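For context on the kind of restrictions being discussed, this is roughly how users attach them today (the addresses and the "GPU" resource name below are placeholders); any task submitted like this would be affected by the queue-clogging scenario described above:

from distributed import Client

client = Client("tcp://scheduler-address:8786")  # placeholder address

# Resource restriction: only run on workers started with a matching
# resource, e.g. `dask-worker ... --resources "GPU=1"`.
future = client.submit(lambda x: x + 1, 1, resources={"GPU": 1})

# Worker restriction: pin a task to specific workers by address or name.
future2 = client.submit(lambda x: x * 2, 2, workers=["tcp://worker-b:40000"])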

@fjetter (Member, Author) commented Feb 9, 2023

There is no explicit mechanism preventing the queue from blocking up. However, if a task is popped that cannot run, it should be moved back to waiting/constrained and would not block an actual slot. Whenever the next task finishes, we'd then schedule two queued tasks instead of one. Considering how unlikely this is, I'm not worried about it yet.

Before I engage in any further theoretical back and forth, I'll check what kinds of tests are failing.
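A toy illustration of the fallback described above (simplified, made-up structures; the real logic lives in stimulus_queue_slots_maybe_opened and the transition functions): a popped task that cannot run is parked aside instead of occupying a thread, so the slot it would have used is still counted as open at the next stimulus.

import heapq

TOTAL_SLOTS = 2
processing = []   # tasks currently occupying worker threads
constrained = []  # popped tasks that couldn't run; they no longer block the queue
queued = [
    ((0, 1), "needs-worker-B", {"B"}),
    ((0, 2), "free-1", None),
    ((0, 3), "free-2", None),
]
heapq.heapify(queued)

def queue_slots_maybe_opened(worker):
    """Pop as many queued tasks as there are open slots; park unrunnable ones."""
    slots_open = TOTAL_SLOTS - len(processing)
    for _ in range(min(slots_open, len(queued))):
        priority, key, allowed = heapq.heappop(queued)
        if allowed is None or worker in allowed:
            processing.append(key)   # fills a real thread
        else:
            constrained.append(key)  # moved aside; the slot stays empty this round

queue_slots_maybe_opened("A")   # 2 slots open: runs "free-1", parks "needs-worker-B"
print(processing, constrained)  # ['free-1'] ['needs-worker-B']
queue_slots_maybe_opened("A")   # the next stimulus still sees 1 open slot
print(processing)               # ['free-1', 'free-2']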

@fjetter (Member, Author) commented Feb 9, 2023

For some reason our CI runs froze during test_threadsafe_get. What's interesting is that even pytest-timeout could not kill the tests; we actually ran into the GitHub Actions timeout. I wouldn't be terribly surprised if modifications to queuing could deadlock the scheduler, but this is very strange.

gjoseph92 mentioned this pull request on Feb 9, 2023
@fjetter (Member, Author) commented Feb 13, 2023

Interestingly, I found a case where queuing breaks depth-first ordering even without any restrictions. Currently investigating.

@fjetter (Member, Author) left a comment

In one of our test cases (test_dataframe_set_index_sync) I encountered that tasks were queued up because they would otherwise break priority ordering, but the queued-up tasks didn't have any restrictions. I haven't entirely understood what's happening here yet.

What's even more interesting is that the behavior is very easy to trigger with client.compute but not with dask.compute.

Code to reproduce (this is the inner workings of set_index to calculate quantiles from the test case; I haven't tried reproducing this with a simpler example):

import dask
from dask.utils import M
from distributed import Client

with Client() as client:
    df = dask.datasets.timeseries(
        start="2000",
        end="2001",
        dtypes={"value": float, "name": str, "id": int},
        seed=1,
    )
    partition_col = df["name"]
    divisions = partition_col._repartition_quantiles(df.npartitions)
    mins = partition_col.map_partitions(M.min)
    maxes = partition_col.map_partitions(M.max)

    # client.gather(client.compute([mins, maxes, divisions]))
    dask.compute([mins, maxes, divisions])

Rendering a smaller version of this graph shows one anomaly (left hand side, number 28)

[graph rendering: quantiles]

I can see how this node is identified as being of lower priority than other root tasks (which in this specific case is not true, but possibly would be for larger graphs), although I would argue this is wrong since its dependency has to be held in memory much longer. I'm not sure whether dask.order considers fan-out dependents to be smaller/more important than their dependencies.

Indeed, we can see that with these modifications the quantile computation has a higher memory footprint, because we'd push task 28 back into the queue instead of running it eagerly as soon as possible.

I have no idea why I don't see this problem with dask.compute.
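Not part of the original comment, but one way to inspect what dask.order actually assigns here, reusing mins/maxes/divisions from the reproducer above (as far as I understand these helpers; the rendering step requires graphviz):

import dask
from dask.base import collections_to_dsk
from dask.order import order

# Merge the graphs of the three collections, the same set of tasks compute() would run.
dsk = dict(collections_to_dsk([mins, maxes, divisions]))
priorities = order(dsk)  # key -> static priority; lower numbers are meant to run earlier
for key, prio in sorted(priorities.items(), key=lambda kv: kv[1])[:10]:
    print(prio, key)

# Render the graph with nodes colored by their ordering number; this is the kind of
# rendering in which the "number 28" node above stands out.
dask.visualize(mins, maxes, divisions, color="order", filename="quantiles-order.svg")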

Comment on lines +2700 to 2702
if self._is_rootish_no_restrictions(ts):
assert not self.idle_task_count, (ts, self.idle_task_count)
self._validate_ready(ts)
@fjetter (Member, Author) commented:

In an earlier version I asserted that if a task is not rootish, it would have restrictions. This caused CI to fail very hard.

It turns out that running non-rootish tasks with lower priority than queued tasks is not impossible

@fjetter (Member, Author) commented Feb 15, 2023

I have a couple of very interesting results from benchmarking this branch against #7531, i.e. the benchmark results isolate exclusively the changes corresponding to queuing up tasks that would otherwise execute too early given their priority.

https://github.com/coiled/coiled-runtime/actions/runs/4172251035

There are a couple of minor/mediocre speedups in wall time for some tests but that's not very exciting. However, the memory footprint goes through the roof!

These are rather early results, but they lead me to think that a major contributing factor to the success of queuing is in fact that we are breaking priority ordering in a very specific way (this is not the only contributor).

Wall Clock: [benchmark chart]

Avg memory: [benchmark chart]

Peak memory: [benchmark chart]

@gjoseph92 (Collaborator) commented:

> a major contributing factor to the success of queuing is in fact that we are breaking priority ordering in a very specific way

This is interesting. I did some research on queuing and priority ordering a couple of months ago, when trying to add co-assignment.

This has motivated me to finally post that writeup (in unedited, too-long form). @fjetter you might find the videos and discussion in the middle interesting: #7555.

@fjetter (Member, Author) commented Feb 17, 2023

> This has motivated me to finally post that writeup (in unedited, too-long form). @fjetter you might find the videos and discussion in the middle interesting: #7555.

Thanks. I'll have a look as soon as I find the time.


I'm closing this PR. Not following priorities strictly is apparently an unintended(?) perk of task queuing, so this PR obviously goes in the wrong direction. However, I believe this motivates a couple of interesting experiments.

Development

Successfully merging this pull request may close these issues.

P2P shuffling and queuing combined may cause high memory usage with dask.dataframe.merge