feat(ssa): Pass to preprocess functions #7072
base: master
Conversation
Compilation Memory Report | Execution Memory Report | Compilation Report | Execution Report
Some observations based on the post-preprocessed SSA of
The last would benefit from a generalization of this PR, but we can break out other PRs to tackle the other two cases.

Posting for SSA tests (obviously once cut down): SSA
Changes to number of Brillig opcodes executed
🧾 Summary (10% most significant diffs)
Full diff report 👇

Changes to Brillig bytecode sizes
🧾 Summary (10% most significant diffs)
Full diff report 👇

Changes to circuit sizes
🧾 Summary (10% most significant diffs)
Full diff report 👇
Some nits; overall the PR is really nice.
It would be nice to nail down the exact cause of the regressions for higher-order functions (which I see you have been working on) before merging.

It's because the loops are not small enough to be unrolled in Brillig: #7100 (comment)

Found out the reason why.
⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Compilation Time'. Benchmark result of this commit is worse than the previous benchmark result, exceeding threshold 1.10.

| Benchmark suite | Current: 04c8395 | Previous: dd70845 | Ratio |
|---|---|---|---|
| ram_blowup_regression | 18.6 s | 16.1 s | 1.16 |

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench
Looks good to me! As it is Friday evening and this is a pretty meaningful PR, let's hold off on merging in case @TomAFrench wants to take another pass next week.
⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Execution Time'. Benchmark result of this commit is worse than the previous benchmark result, exceeding threshold 1.10.

| Benchmark suite | Current: 4826682 | Previous: 7705a62 | Ratio |
|---|---|---|---|
| rollup-merge | 0.008 s | 0.007 s | 1.14 |

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench
Alright, I definitely need to make the comment trigger less sensitive...

Yeah, will take a look on Monday.
Description
Problem*
Follow-up to #7001 (comment)
Summary*
Adds an `Ssa::preprocess_fns` pass that runs a subset of SSA passes on functions in a bottom-up order, so that by the time they are inlined into other functions they have a chance to be already simplified. The changes made are cumulative, i.e. if A calls B, then B gets optimised first, and then A inlines the optimised B before it gets optimised itself.

To avoid going all the way to the top and optimising the functions near the entry point, the weight of each function (derived from its instruction count) is accumulated into its callers, and the average of the accumulated weights serves as a cutoff for running the preprocessing.
Added a `--skip-preprocess-fns` CLI option to turn this pass off.

TODO:
- `&mut` parameters #7104

Additional Context
Idea
We noticed in #7001 (comment) that `main` gets bloated to 65K blocks after the unrolling pass, because everything it calls has already been inlined into it, and then we have to simplify down this large function, which came with a big memory footprint.

@TomAFrench suggested switching to a bottom-up scenario, where we optimise functions that get called by others before inlining them, and to start as follows:
Testing
`cargo run -p nargo_cli -- --program-dir ../aztec-packages/noir-projects/noir-protocol-circuits/crates/rollup-base-public compile --silence-warnings --skip-underconstrained-check --skip-brillig-constraints-check --force --show-ssa-pass "Preprocessing"`
Result
Unfortunately this hasn't significantly changed the number of blocks after unrolling in `main`, at least not yet.

The SSA of one of the first `serialize` functions in the linked comment changed as follows:

Before Pre-processing

After Pre-processing

So this has been achieved 🎉:
Documentation*
Check one:
PR Checklist*
`cargo fmt` on default settings.