-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Make nargo::ops::transform_program
idempotent
#6695
fix: Make nargo::ops::transform_program
idempotent
#6695
Conversation
Peak Memory Sample
|
4b370d6
to
b59cb34
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we need two extra passes to stabilise the optimisation currently. An example were 1 extra pass is not enough was conditional_1.
Performing the full transformation step again is a little overkill. The only non-idempotent optimization is #6668 afaik so if anything we should run multiple instances of this but avoid the rest.
Agreed, I just tried that as a quick way to see if there is an upper bound now. |
I pushed a commit which allows multiple passes with the |
Changes to circuit sizes
🧾 Summary (10% most significant diffs)
Full diff report 👇
|
For example on the following example it looks like the MEO only runs once, which suggests there is another source of unstable output: cargo test -p nargo_cli --bins -- test_transform_program_is_idempotent to_be_bytes |
371678e
to
ce9d107
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
051be60
to
ce9d107
Compare
Description
Problem*
Followup for #6694
Summary*
Makes
nargo::ops::transform_program
idempotent by adding two loops, both up to 3 passes, exiting early if the opcode hash doesn't change:acvm::compiler::compile
that includesoptimize_internal
andtransform_internal
acvm::compiler::transformers::transform_internal
Additional Context
Here's the journey I went through investigating the source of changes across optimisation runs.
Ever increasing
current_witness_index
On the
slice_loop
example theCircuit::current_witness_index
field increased by 2 after each pass. This seems to be because:The PR adds a brute force step to visit each opcode and collect the remaining witnesses to set the next one correctly.
Alternatively we could implement
PartialEq
forCircuit
in a way that it ignorescurrent_witness_index
.After this fix we have the following tests still flagging the transformation as non-idempotent:
7_function
conditional_1
fold_fibonacci
hashmap
regression_5252
regression_6451
sha256
sha256_regression
sha256_var_size_regression
sha256_var_witness_const_regression
slices
to_be_bytes
Multiple passes
For the above I found that adding the following to the test makes all of them pass:
So two extra full passes to stabilise the optimisation currently. An example where 1 extra pass is not enough was
conditional_1
. Ideally we would find which part of the transformation needs to be repeated to avoid having to do a full pass; our initial hypothesis was to look at #6668I had to add a loop around
transform_internal
allowing 3 passes before the tests indicated more stability than before. The expectation was that we only need to loop aroundMergeExpressionOptimizer
, but that doesn't seem to be true. Even after this, the following two programs fail:7_function
slices
The difference in both cases is the removal of a range check:
For these to stabilise had to move the loop to be around the entire transformation that includes the initial backend agnostic
optimize_internal
as well:The final solution has an inner and an outer loop, treating both as black boxes.
Documentation*
Check one:
PR Checklist*
cargo fmt
on default settings.