
observability: add CPU time counter to tasks #47351

Closed
vilterp opened this issue Oct 27, 2022 · 3 comments · Fixed by #56320
Labels
feature Indicates new feature / enhancement requests

Comments

vilterp commented Oct 27, 2022

Problem

If you have spawned a lot of tasks competing for CPU cores, it's difficult to tell how much CPU time one has actually gotten over its lifetime, since @time measures wall clock time.
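For illustration, a small reproduction of the problem (`work` here is just a stand-in for any CPU-bound function):

using Base.Threads

work() = sum(abs2, rand(10^7))

# Oversubscribe the cores so tasks contend for CPU time.
tasks = [Threads.@spawn(work()) for _ in 1:4 * nthreads()]

# Wall clock time includes time spent waiting for a core, so it says
# little about how much CPU time this particular task actually received.
@time wait(tasks[1])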

What we want is something equivalent to the CPU Time counter in the OS-level activity monitor, but for individual tasks:
[screenshot: the CPU Time column in an OS-level activity monitor]

Proposed Solution

Add two counters to the Task struct (forgive the crude pseudocode):

struct task {
…
+ last_scheduled_at timestamp;
+ cpu_time_total_ns int64; // initialize to 0
}

And then use them in the scheduler:

when taking task T off the run queue and putting it on a CPU {
  …
+  T.last_scheduled_at = now()
}

…

when taking task T off the CPU due to it blocking on something {
  …
+  T.cpu_time_total_ns += now() - T.last_scheduled_at
}

The counter could then be read from Julia "userspace":

println("my task got $(t.cpu_time_total_ns) ns of CPU time")

…and then fed into other systems for logging / metrics, etc.
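One subtlety a real read path would probably need to handle: while the task is on a CPU, the counter misses the in-progress slice. A hedged sketch, using the field names from the pseudocode above (`is_running_on_cpu` is a hypothetical predicate, not an existing API):

# Hypothetical read path; the task fields are the ones proposed above.
function cpu_time_ns(t)
    total = t.cpu_time_total_ns
    if is_running_on_cpu(t)
        # Include the slice accumulated since the task was last scheduled.
        total += time_ns() - t.last_scheduled_at
    end
    return total
end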

Alternatives

  • @kpamnany's bpftrace infrastructure (see Runtime event trace visualization #36870) emits task switch events that could be used to derive this same information. It's just a lot harder to use, since you have to be on a machine that supports BPF and then set up something to consume the events. Making this info available directly in Julia userspace seems easier.

Task heritability

You might want to know how much time child tasks consumed as well. But I think this could be a problem left to Julia userspace, since it's possible to keep a tree of tasks and read these numbers off of them.
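A rough userspace sketch of that idea, assuming the proposed cpu_time_total_ns field existed on Task (TaskNode and spawn_child! are made-up helpers, not an existing API):

# Maintain your own task tree and aggregate the proposed per-task counter
# over it. `cpu_time_total_ns` is the hypothetical field proposed above.
struct TaskNode
    task::Task
    children::Vector{TaskNode}
end

function spawn_child!(parent::TaskNode, f)
    node = TaskNode(Threads.@spawn(f()), TaskNode[])
    push!(parent.children, node)
    return node
end

# CPU time of a task plus all of its descendants, in nanoseconds.
total_cpu_time_ns(node::TaskNode) =
    node.task.cpu_time_total_ns +
    sum(total_cpu_time_ns, node.children; init=Int64(0))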

vchuravy (Member) commented:

Prior art for getting cpu-time: JuliaCI/BenchmarkTools.jl#94

I think you could do this with bpftrace (or at least you could before we removed the task probe points). Alternatively, I would use Profile, which now has some metadata in the form of the cycle count and thread id, so you could post-process that.

@brenhinkeller brenhinkeller added the feature Indicates new feature / enhancement requests label Nov 26, 2022
vilterp commented Dec 16, 2022

FWIW, the folks at CockroachDB just put up a PR adding this to Go: golang/go#51347

Design doc: https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20220602_fine_grained_cpu_attribution.md#design. There's a pretty thorough discussion at the bottom comparing it to the other approaches they considered, which are pretty much exactly the ones you recommended, Valentin: bpftrace and a sampling profiler with task id labels.

Blog post describing how they used it: https://www.cockroachlabs.com/blog/rubbing-control-theory/#33-integrated-cooperative-scheduling

kpamnany added a commit that referenced this issue Dec 6, 2024
Closes #47351 (builds on top of #48416).

Adds two per-task metrics:
- running time = the amount of time the task was actually running (according to our scheduler). Note: this is currently inclusive of GC time; it would be good to be able to separate that out in a future PR.
- wall time = the amount of time between the scheduler becoming aware of this task and the task entering a terminal state (i.e. done or failed).

We record running time in `wait()`, where the scheduler stops running the task, as well as in `yield(t)`, `yieldto(t)`, and `throwto(t)`, which bypass the scheduler. The other places where a task stops running (`Channel`, `ReentrantLock`, `Event`, `Timer`, and `Semaphore`) are all implemented in terms of `wait(Condition)`, which in turn calls `wait()`. `LibuvStream` similarly calls `wait()`.

This should capture everything (albeit slightly over-counting task CPU time by including any enqueuing work done before we hit `wait()`).

The various metrics counters could be a separate inlined struct if we think that's a useful abstraction, but for now I've just put them directly in `jl_task_t`. They are all atomic, except the `metrics_enabled` flag itself (which we now have to check on task start/switch/done even if metrics are not enabled), which is set on task construction and marked `const` on the Julia side.

In future PRs we could add more per-task metrics, e.g. compilation time, GC time, allocations, a wait-time breakdown (time waiting on locks, channels, in the scheduler run queue, etc.), and potentially the number of yields.

Perhaps in the future there could be ways to enable this on a per-thread or per-task basis. These same timings could also potentially be used by `@time` (e.g. writing this same timing data to a ScopedValue like in #55103, but only for tasks lexically scoped to inside the `@time` block).

Timings are off by default but can be turned on globally, either by starting Julia with `--task-metrics=yes` or by calling `Base.Experimental.task_metrics(true)`. Metrics are collected for all tasks created while metrics are enabled. In other words, enabling/disabling timings via `Base.Experimental.task_metrics` does not affect existing `Task`s, only new `Task`s.

The other new APIs are `Base.Experimental.task_running_time_ns(::Task)`
and `Base.Experimental.task_wall_time_ns(::Task)` for retrieving the new
metrics. These are safe to call on any task (including the current task,
or a task running on another thread). All these are in
`Base.Experimental` to give us room to change up the APIs as we add more
metrics in future PRs (without worrying about release timelines).

cc @NHDaly @kpamnany @d-netto

---------

Co-authored-by: Pete Vilter <[email protected]>
Co-authored-by: K Pamnany <[email protected]>
Co-authored-by: Nathan Daly <[email protected]>
Co-authored-by: Valentin Churavy <[email protected]>
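
A minimal usage sketch putting these APIs together (the API names are from this PR; the workload is arbitrary):

# Enable metrics for tasks created from here on (or start Julia with
# --task-metrics=yes to enable them from startup).
Base.Experimental.task_metrics(true)

t = Threads.@spawn sum(abs2, rand(10^7))
wait(t)

# Both return times in nanoseconds for the given task.
running_ns = Base.Experimental.task_running_time_ns(t)
wall_ns = Base.Experimental.task_wall_time_ns(t)
println("running: $running_ns ns, wall: $wall_ns ns")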

vilterp commented Dec 6, 2024

🎉

kpamnany added a commit to RelationalAI/julia that referenced this issue Dec 7, 2024
kpamnany added a commit to RelationalAI/julia that referenced this issue Dec 9, 2024
* Add per-task metrics (JuliaLang#56320)
* Address review comments