Add timing tree #657

mcalancea · 2024-11-29T15:50:41Z

Fixes #655.

Integrates tracing-forest for its nice timing-tree summary.
In addition, I added a trick to control "granularity": spans have a profiling_{LEVEL} field which one can prune from the command line. For example, compare:

cargo run --release --package ceno_zkvm --bin e2e -- --profiling=1 --max-steps=10000 --platform=sp1 ceno_zkvm/examples/fibonacci.elf 

INFO     ZKVM_create_proof [ 790ms | 0.00% / 100.00% ]
INFO     ┝━ commit_to_fixed_commit [ 129µs | 0.02% ] profiling_1: true
INFO     ┝━ commit_to_traces [ 147ms | 18.62% ] profiling_1: true
INFO     ┕━ main_proofs [ 643ms | 81.36% ] profiling_1: true

to

cargo run --release --package ceno_zkvm --bin e2e -- --profiling=2 --max-steps=10000 --platform=sp1 ceno_zkvm/examples/fibonacci.elf 

INFO     ZKVM_create_proof [ 1.10s | 0.00% / 100.00% ]
INFO     ┝━ commit_to_fixed_commit [ 162µs | 0.01% ] profiling_1: true
INFO     ┝━ commit_to_traces [ 151ms | 0.02% / 13.78% ] profiling_1: true
INFO     │  ┝━ commit to iteration [ 7.66ms | 0.70% ] circuit_name: "ADD" | profiling_2: true
INFO     │  ┝━ commit to iteration [ 4.17ms | 0.38% ] circuit_name: "ADDI" | profiling_2: true
INFO     │  ┝━ commit to iteration [ 163ns | 0.00% ] circuit_name: "AND" | profiling_2: true
INFO     │  ┝━ commit to iteration [ 297µs | 0.03% ] circuit_name: "ANDI" | profiling_2: true
INFO     │  ┝━ commit to iteration [ 201µs | 0.02% ] circuit_name: "AUIPC" | profiling_2: true
[... more lines follow]

I haven't gone too far in assigning adequate levels to spans, I'm not sure what's relevant.

matthiasgoergens · 2024-12-02T02:49:47Z

ceno_zkvm/src/bin/e2e.rs

@@ -40,6 +42,11 @@ struct Args {
    #[arg(short, long)]
    max_steps: Option<usize>,

+    // Profiling granularity.
+    // Setting any value restricts logs to profiling information


I think you might want to control log verbosity and profiling information independently?

(It's reasonable to default log verbosity to silent, if you turn on profiling, though.)

Ideally you would, but the only way to keep the profiling data readable is to filter out some of the spans. And filtering out some of the spans has an unclear impact (to me at least) on the general logging context. I think it's simpler and not that inconvenient to keep them separate.

matthiasgoergens

Looks mostly good. Please see comments.

ceno_zkvm/src/bin/e2e.rs

matthiasgoergens · 2024-12-02T02:53:51Z

ceno_zkvm/src/bin/e2e.rs

+        .with(fmt_layer)
+        // if some profiling granularity is specified, use the profiling filter,
+        // otherwise use the default
+        .with((args.profiling.is_some()).then_some(filter_by_profiling_level))


The parens look a bit confusing here?

Btw, instead of is_some().then_some() you can probably just use .map(|_| filter_by_profiling_level)?

Removed parens. .map() is terser, but not as readable in this case I'd say.

matthiasgoergens · 2024-12-02T02:56:21Z

ceno_zkvm/src/scheme/prover.rs

    pub fn create_proof(
        &self,
        witnesses: ZKVMWitnesses<E>,
        pi: PublicValues<u32>,
        mut transcript: Transcript<E>,
    ) -> Result<ZKVMProof<E, PCS>, ZKVMError> {
+        let span = entered_span!("commit_to_fixed_commit", profiling_1 = true);


Instead of pairs of and entered_span and exit_span like this

let span = entered_span(some_name); // [... some code ...] exit_span!(span);

in most cases we can probably use a less error-prone mechanism:

with_span!(some_name) { // [... some code ...] }

(That requires to quickly write a with_span macro. The exact syntax is only a suggestion.)

I get your point, but I dislike the effect on indentation. Often, [... some code...] is not an expression, but a very large block. Because it's not a clear-cut decision, I would defer it to a separate issue/PR.

matthiasgoergens · 2024-12-02T02:58:55Z

Btw, does this produce wallclock time or report CPU seconds spent? Both can be useful, depending on what you want to investigate.

See https://docs.rs/cpu-time/latest/cpu_time/index.html for the latter.

mcalancea · 2024-12-02T07:05:53Z

It's wall time. It uses https://doc.rust-lang.org/std/time/struct.Instant.html and elapsed.

mcalancea added 3 commits November 29, 2024 17:34

add tracing-forest; add some profiling fields

faf967b

clean

f10fbc7

add more fields

599b855

mcalancea requested review from naure and kunxian-xia November 29, 2024 15:51

mcalancea added 2 commits November 29, 2024 17:55

Merge branch 'master' into feat/tracing-forest

a4cbfc7

clippy

3ef9afa

matthiasgoergens reviewed Dec 2, 2024

View reviewed changes

address some comments

be28dd8

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add timing tree #657

Add timing tree #657

mcalancea commented Nov 29, 2024 •

edited

Loading

matthiasgoergens Dec 2, 2024

mcalancea Dec 2, 2024

matthiasgoergens left a comment

matthiasgoergens Dec 2, 2024

mcalancea Dec 2, 2024

matthiasgoergens Dec 2, 2024

mcalancea Dec 2, 2024

matthiasgoergens commented Dec 2, 2024 •

edited

Loading

mcalancea commented Dec 2, 2024

Add timing tree #657

Are you sure you want to change the base?

Add timing tree #657

Conversation

mcalancea commented Nov 29, 2024 • edited Loading

matthiasgoergens Dec 2, 2024

Choose a reason for hiding this comment

mcalancea Dec 2, 2024

Choose a reason for hiding this comment

matthiasgoergens left a comment

Choose a reason for hiding this comment

matthiasgoergens Dec 2, 2024

Choose a reason for hiding this comment

mcalancea Dec 2, 2024

Choose a reason for hiding this comment

matthiasgoergens Dec 2, 2024

Choose a reason for hiding this comment

mcalancea Dec 2, 2024

Choose a reason for hiding this comment

matthiasgoergens commented Dec 2, 2024 • edited Loading

mcalancea commented Dec 2, 2024

mcalancea commented Nov 29, 2024 •

edited

Loading

matthiasgoergens commented Dec 2, 2024 •

edited

Loading