Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add timing tree #657

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

Add timing tree #657

wants to merge 6 commits into from

Conversation

mcalancea
Copy link
Collaborator

@mcalancea mcalancea commented Nov 29, 2024

Fixes #655.

Integrates tracing-forest for its nice timing-tree summary.
In addition, I added a trick to control "granularity": spans have a profiling_{LEVEL} field which one can prune from the command line. For example, compare:

cargo run --release --package ceno_zkvm --bin e2e -- --profiling=1 --max-steps=10000 --platform=sp1 ceno_zkvm/examples/fibonacci.elf 

INFO     ZKVM_create_proof [ 790ms | 0.00% / 100.00% ]
INFO     ┝━ commit_to_fixed_commit [ 129µs | 0.02% ] profiling_1: true
INFO     ┝━ commit_to_traces [ 147ms | 18.62% ] profiling_1: true
INFO     ┕━ main_proofs [ 643ms | 81.36% ] profiling_1: true

to

cargo run --release --package ceno_zkvm --bin e2e -- --profiling=2 --max-steps=10000 --platform=sp1 ceno_zkvm/examples/fibonacci.elf 

INFO     ZKVM_create_proof [ 1.10s | 0.00% / 100.00% ]
INFO     ┝━ commit_to_fixed_commit [ 162µs | 0.01% ] profiling_1: true
INFO     ┝━ commit_to_traces [ 151ms | 0.02% / 13.78% ] profiling_1: true
INFO     │  ┝━ commit to iteration [ 7.66ms | 0.70% ] circuit_name: "ADD" | profiling_2: true
INFO     │  ┝━ commit to iteration [ 4.17ms | 0.38% ] circuit_name: "ADDI" | profiling_2: true
INFO     │  ┝━ commit to iteration [ 163ns | 0.00% ] circuit_name: "AND" | profiling_2: true
INFO     │  ┝━ commit to iteration [ 297µs | 0.03% ] circuit_name: "ANDI" | profiling_2: true
INFO     │  ┝━ commit to iteration [ 201µs | 0.02% ] circuit_name: "AUIPC" | profiling_2: true
[... more lines follow]

I haven't gone too far in assigning adequate levels to spans, I'm not sure what's relevant.

@@ -40,6 +42,11 @@ struct Args {
#[arg(short, long)]
max_steps: Option<usize>,

// Profiling granularity.
// Setting any value restricts logs to profiling information
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might want to control log verbosity and profiling information independently?

(It's reasonable to default log verbosity to silent, if you turn on profiling, though.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally you would, but the only way to keep the profiling data readable is to filter out some of the spans. And filtering out some of the spans has an unclear impact (to me at least) on the general logging context. I think it's simpler and not that inconvenient to keep them separate.

Copy link
Collaborator

@matthiasgoergens matthiasgoergens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good. Please see comments.

ceno_zkvm/src/bin/e2e.rs Show resolved Hide resolved
ceno_zkvm/src/bin/e2e.rs Outdated Show resolved Hide resolved
.with(fmt_layer)
// if some profiling granularity is specified, use the profiling filter,
// otherwise use the default
.with((args.profiling.is_some()).then_some(filter_by_profiling_level))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parens look a bit confusing here?

Btw, instead of is_some().then_some() you can probably just use .map(|_| filter_by_profiling_level)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed parens. .map() is terser, but not as readable in this case I'd say.

pub fn create_proof(
&self,
witnesses: ZKVMWitnesses<E>,
pi: PublicValues<u32>,
mut transcript: Transcript<E>,
) -> Result<ZKVMProof<E, PCS>, ZKVMError> {
let span = entered_span!("commit_to_fixed_commit", profiling_1 = true);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of pairs of and entered_span and exit_span like this

let span = entered_span(some_name);
// [... some code ...]
exit_span!(span);

in most cases we can probably use a less error-prone mechanism:

with_span!(some_name) {
  // [... some code ...]
}

(That requires to quickly write a with_span macro. The exact syntax is only a suggestion.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get your point, but I dislike the effect on indentation. Often, [... some code...] is not an expression, but a very large block. Because it's not a clear-cut decision, I would defer it to a separate issue/PR.

@matthiasgoergens
Copy link
Collaborator

matthiasgoergens commented Dec 2, 2024

Btw, does this produce wallclock time or report CPU seconds spent? Both can be useful, depending on what you want to investigate.

See https://docs.rs/cpu-time/latest/cpu_time/index.html for the latter.

@mcalancea
Copy link
Collaborator Author

It's wall time. It uses https://doc.rust-lang.org/std/time/struct.Instant.html and elapsed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integrate plonky3-like timing tree
2 participants