feat(avm2): avm redesign init (#10906)

This is a redesign of the witgen/proving part of the AVM. There's still a lot of work to be done, but I have to merge at some point to let others contribute :). Most of the content is PoC, not supposed to be real.

We'll eventually have a doc explaining everything, but for now, some highlights:

**Architecture**

The proving process is now divided into 3 parts:

* Simulation (aka event generation): Intrinsically sequential. Executes bytecode and generates packed information (events) that summarize what happened. Examples would be a bytecode decomposition event, memory access event, etc. This part has no dependencies on BB or PIL beyond FF. It also has, in principle, no knowledge of the circuit or columns.
* Trace generation: This part is parallelizable. The meat of it is translating events into columns in a (sparse!) trace. It is the glue between events and the circuit. It has knowledge of the columns, but not really about any relation or constraint (**) or PIL.
* Constraining: This is parallelizable. It's the actual constraining/proving/check circuit. It's dependent on BB and the (currently) autogenerated relations from PIL. We convert the sparse trace to polynomials.

**Possible future standalone simulation**

Hints and DB accesses: The simulation/witgen process has no knowledge of hints (so far). We define a DB interface which the simulation process uses. This DB is then "seeded" with hints. This means that in the future it should be possible to switch the DB to a real DB and things should "just work™️". I think we should try to follow this philosophy as much as possible and not rely on TS hints that we can compute ourselves.

Configurability: Other aspects of simulation are configurable. E.g., we can configure a fast, simulation-only variant that does no event generation and no bytecode hashing, whereas for full proving you would do both (incurring at least 25ms for a single bytecode hashing).

**Philosophy**

Dependency injection is used everywhere (without a framework). You'll see references stored in classes and may not like it, but it's actually working well. See https://www.youtube.com/watch?v=kCYo2gJ3Y38 as well. There are lots of interfaces for mocking. Blame C++ 🤷. I'm making it a priority to have the right separation of concerns and engineering practices. There's zero tolerance on hacks. If we need a hack, we trigger a refactor.

**Testing**

Whereas before our tests required setting up everything and basically doing full proving or check circuit, now everything can be tested separately. We use a mockist approach (common in C++). Our old tests would take ~0.5s each; now they take microseconds. Simulation, tracegen, and constraining can be tested separately from each other. In particular, you can create tests for constraints at the relation or subrelation level.

**Lookups/permutations**

Not really supported yet. But you don't need to keep counts for lookups.

**TS/C++ communication**

AVM inputs are now (de)serialized with messagepack.

(**) It does require lookup/permutation settings.
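
The three-phase split and the hint-seeded DB described above can be pictured with a small standalone sketch. Every name in it (`AluEvent`, `BytecodeDb`, `HintedBytecodeDb`, `Simulator`, `SparseTrace`, `generate_trace`, `check_circuit`) is hypothetical and not taken from this commit, and plain `uint64_t` stands in for the field type. It only illustrates the shape: simulation receives its DB by reference and emits events, trace generation turns events into sparse columns, and constraining looks only at the columns.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical event: a packed record of one addition that happened.
struct AluEvent {
    uint64_t a;
    uint64_t b;
    uint64_t c;
};

// Hypothetical DB interface. Simulation cannot tell hints from a real DB.
class BytecodeDb {
  public:
    virtual ~BytecodeDb() = default;
    virtual std::vector<uint64_t> get_operands(uint32_t contract_id) = 0;
};

// A DB "seeded" with hints that were supplied up front.
class HintedBytecodeDb : public BytecodeDb {
  public:
    explicit HintedBytecodeDb(std::unordered_map<uint32_t, std::vector<uint64_t>> hints)
        : hints_(std::move(hints))
    {}
    std::vector<uint64_t> get_operands(uint32_t contract_id) override
    {
        auto it = hints_.find(contract_id);
        if (it == hints_.end()) {
            throw std::runtime_error("missing hint");
        }
        return it->second;
    }

  private:
    std::unordered_map<uint32_t, std::vector<uint64_t>> hints_;
};

// Phase 1: simulation. Sequential; the DB is injected as a stored reference.
class Simulator {
  public:
    explicit Simulator(BytecodeDb& db)
        : db_(db)
    {}
    std::vector<AluEvent> simulate(uint32_t contract_id)
    {
        std::vector<AluEvent> events;
        std::vector<uint64_t> ops = db_.get_operands(contract_id);
        // Toy "program": add consecutive operand pairs.
        for (std::size_t i = 0; i + 1 < ops.size(); i += 2) {
            events.push_back({ ops[i], ops[i + 1], ops[i] + ops[i + 1] });
        }
        return events;
    }

  private:
    BytecodeDb& db_;
};

// Phase 2: trace generation. Sparse trace: column name -> (row -> value).
using SparseTrace = std::map<std::string, std::map<uint32_t, uint64_t>>;

SparseTrace generate_trace(const std::vector<AluEvent>& events)
{
    SparseTrace trace;
    uint32_t row = 0;
    for (const AluEvent& e : events) {
        trace["sel_op_add"][row] = 1;
        trace["ia"][row] = e.a;
        trace["ib"][row] = e.b;
        trace["ic"][row] = e.c;
        ++row;
    }
    return trace;
}

// Cells that were never written read as zero.
uint64_t cell(const SparseTrace& trace, const std::string& column, uint32_t row)
{
    auto col = trace.find(column);
    if (col == trace.end()) {
        return 0;
    }
    auto it = col->second.find(row);
    return it == col->second.end() ? 0 : it->second;
}

// Phase 3: constraining. Checks a boolean selector and ia + ib = ic on each row.
bool check_circuit(const SparseTrace& trace, uint32_t num_rows)
{
    for (uint32_t row = 0; row < num_rows; ++row) {
        uint64_t sel = cell(trace, "sel_op_add", row);
        if (sel != 0 && sel != 1) {
            return false;
        }
        if (cell(trace, "ia", row) + cell(trace, "ib", row) != cell(trace, "ic", row)) {
            return false;
        }
    }
    return true;
}
```

Because `Simulator` only sees the `BytecodeDb` interface, a test can hand it a mock, and a later change could hand it a real DB, without touching the simulation code.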
AztecProtocol · AztecBot · Jan 11, 2025 · 231f017
1 parent 9189120
commit 231f017
Showing 125 changed files with 9,064 additions and 37 deletions.
diff --git a/barretenberg/cpp/pil/vm2/README.md b/barretenberg/cpp/pil/vm2/README.md
@@ -0,0 +1,7 @@
+Compile with:
+
+```
+~/aztec-packages/bb-pilcom/target/release/bb_pil pil/vm2/execution.pil --name Avm2 -y -o src/barretenberg/vm2/generated && ./format.sh changed
+```
+
+while in the `barretenberg/cpp` directory.
diff --git a/barretenberg/cpp/pil/vm2/addressing.pil b/barretenberg/cpp/pil/vm2/addressing.pil
@@ -0,0 +1,21 @@
+// This is a virtual gadget, which is part of the execution trace.
+namespace execution(256);
+
+pol commit stack_pointer_val;
+pol commit stack_pointer_tag;
+pol commit sel_addressing_error;  // true if any error type
+pol commit addressing_error_kind;  // TODO: might need to be selectors
+pol commit addressing_error_idx;  // operand index for error, if any
+
+// whether each operand is an address for the given opcode.
+// retrieved from the instruction spec.
+pol commit sel_op1_is_address;
+pol commit sel_op2_is_address;
+pol commit sel_op3_is_address;
+pol commit sel_op4_is_address;
+// operands after relative resolution
+pol commit op1_after_relative;
+pol commit op2_after_relative;
+pol commit op3_after_relative;
+pol commit op4_after_relative;
+// operands after indirect resolution are the resolved_operands rop1, ...
diff --git a/barretenberg/cpp/pil/vm2/alu.pil b/barretenberg/cpp/pil/vm2/alu.pil
@@ -0,0 +1,16 @@
+namespace alu(256);
+
+pol commit sel_op_add;
+pol commit ia;
+pol commit ib;
+pol commit ic;
+pol commit op;
+pol commit ia_addr;
+pol commit ib_addr;
+pol commit dst_addr;
+
+#[SEL_ADD_BINARY]
+sel_op_add * (1 - sel_op_add) = 0;
+
+#[ALU_ADD]
+ia + ib = ic;
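
As a rough illustration of what the two relations in `alu.pil` demand of a single row, the sketch below evaluates them as polynomials that must vanish. It is not the generated relation code: `AluRow`, `P`, and the helper functions are hypothetical, and the toy modulus `2^31 - 1` stands in for the real field, so only the structure carries over.

```cpp
#include <cstdint>

// Toy prime modulus (2^31 - 1). An assumption of this sketch, not the AVM's field.
constexpr uint64_t P = 2147483647ULL;

// Field helpers; inputs are assumed to be already reduced below P.
constexpr uint64_t f_add(uint64_t a, uint64_t b) { return (a + b) % P; }
constexpr uint64_t f_sub(uint64_t a, uint64_t b) { return (a + P - b) % P; }
constexpr uint64_t f_mul(uint64_t a, uint64_t b) { return (a * b) % P; } // a * b < 2^62, no overflow

// Hypothetical row holding the columns the two relations mention.
struct AluRow {
    uint64_t sel_op_add;
    uint64_t ia;
    uint64_t ib;
    uint64_t ic;
};

// #[SEL_ADD_BINARY]: sel_op_add * (1 - sel_op_add) = 0
constexpr uint64_t sel_add_binary(const AluRow& r)
{
    return f_mul(r.sel_op_add, f_sub(1, r.sel_op_add));
}

// #[ALU_ADD]: ia + ib = ic, rewritten as ia + ib - ic = 0
constexpr uint64_t alu_add(const AluRow& r)
{
    return f_sub(f_add(r.ia, r.ib), r.ic);
}

// A row is accepted when every relation evaluates to zero.
constexpr bool row_satisfies(const AluRow& r)
{
    return sel_add_binary(r) == 0 && alu_add(r) == 0;
}
```

Note that the addition is modular: in this toy field a row with `ia = P - 1`, `ib = 2`, `ic = 1` is accepted, which is one reason a real ALU needs range and tag handling beyond this PoC relation.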
diff --git a/barretenberg/cpp/pil/vm2/execution.pil b/barretenberg/cpp/pil/vm2/execution.pil
@@ -0,0 +1,55 @@
+include "alu.pil";
+include "addressing.pil";
+include "precomputed.pil";
+
+namespace execution(256);
+
+pol commit sel; // subtrace selector
+
+pol commit ex_opcode;
+pol commit indirect;
+// operands
+pol commit op1;
+pol commit op2;
+pol commit op3;
+pol commit op4;
+// resolved operands
+pol commit rop1;
+pol commit rop2;
+pol commit rop3;
+pol commit rop4;
+
+pol commit pc;
+pol commit clk;
+pol commit last;
+
+// Selector constraints
+sel * (1 - sel) = 0;
+last * (1 - last) = 0;
+
+// If the current row is an execution row, then either
+// the next row is an execution row, or the current row is marked as the last row.
+// sel => (sel' v last) = 1              iff
+// ¬sel v (sel' v last) = 1              iff
+// ¬(¬sel v (sel' v last)) = 0           iff
+// sel ^ (¬sel' ^ ¬last) = 0             iff
+// sel * (1 - sel') * (1 - last) = 0
+#[TRACE_CONTINUITY_1]
+sel * (1 - sel') * (1 - last) = 0;
+// If the current row is not an execution row, then there are no more execution rows after that.
+// (not enforced for the first row)
+#[TRACE_CONTINUITY_2]
+(1 - precomputed.first_row) * (1 - sel) * sel' = 0;
+// If the current row is the last row, then the next row is not an execution row.
+#[LAST_IS_LAST]
+last * sel' = 0;
+
+// These are needed to have a non-empty set of columns for each type.
+pol public input;
+#[LOOKUP_DUMMY_PRECOMPUTED]
+sel {/*will be 1=OR*/ sel, clk, clk, clk} in
+precomputed.sel_bitwise {precomputed.bitwise_op_id, precomputed.bitwise_input_a, precomputed.bitwise_input_b, precomputed.bitwise_output};
+#[LOOKUP_DUMMY_DYNAMIC]  // Just a self-lookup for now, for testing.
+sel {op1, op2, op3, op4} in sel {op1, op2, op3, op4};
+#[PERM_DUMMY_DYNAMIC]  // Just a self-permutation for now, for testing.
+sel {op1, op2, op3, op4} is sel {op1, op2, op3, op4};
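
The selector and continuity relations in `execution.pil` can be sanity-checked on a toy trace. The sketch below is a hypothetical helper, not the check-circuit code from this commit. It works over plain integers (enough for 0/1 selectors) and assumes that the "next row" of the final row reads as 0 and that row 0 is a non-execution row marked by `first_row`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical row with just the columns used by the continuity relations.
struct ExecRow {
    int64_t sel;       // execution.sel
    int64_t last;      // execution.last
    int64_t first_row; // precomputed.first_row
};

bool check_continuity(const std::vector<ExecRow>& rows)
{
    for (std::size_t i = 0; i < rows.size(); ++i) {
        const ExecRow& cur = rows[i];
        // Assumption: the shifted value past the end of the trace is 0.
        const int64_t sel_next = (i + 1 < rows.size()) ? rows[i + 1].sel : 0;

        // Selector constraints: sel and last are boolean.
        const bool booleans = cur.sel * (1 - cur.sel) == 0 && cur.last * (1 - cur.last) == 0;
        // TRACE_CONTINUITY_1: an execution row is followed by one, or is marked last.
        const bool continuity_1 = cur.sel * (1 - sel_next) * (1 - cur.last) == 0;
        // TRACE_CONTINUITY_2: after a non-execution row (other than row 0), no execution rows.
        const bool continuity_2 = (1 - cur.first_row) * (1 - cur.sel) * sel_next == 0;
        // LAST_IS_LAST: nothing executes after the row marked last.
        const bool last_is_last = cur.last * sel_next == 0;

        if (!(booleans && continuity_1 && continuity_2 && last_is_last)) {
            return false;
        }
    }
    return true;
}
```

A trace shaped as one empty first row, a contiguous block of execution rows whose final row has `last = 1`, and then only empty rows passes. A gap inside the block, a block that ends without `last`, or an execution row after `last` each violate one of the relations.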
diff --git a/barretenberg/cpp/pil/vm2/precomputed.pil b/barretenberg/cpp/pil/vm2/precomputed.pil
@@ -0,0 +1,17 @@
+// General/shared precomputed columns.
+namespace precomputed(256);
+
+// From 0 and incrementing up to the size of the circuit (2^21).
+pol constant clk;
+
+// 1 only at row 0.
+pol constant first_row;
+
+// AND/OR/XOR of all 8-bit numbers.
+// The tables are "stacked". First AND, then OR, then XOR.
+// Note: think if we can avoid the selector.
+pol constant sel_bitwise; // 1 in the first 3 * 256 rows.
+pol constant bitwise_op_id; // identifies if operation is AND/OR/XOR.
+pol constant bitwise_input_a; // column of all 8-bit numbers.
+pol constant bitwise_input_b; // column of all 8-bit numbers.
+pol constant bitwise_output; // output = a AND/OR/XOR b.
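
To make the "stacked" layout concrete, here is a hedged sketch of how such a table could be filled and queried. `BitwiseRow`, `build_bitwise_table`, and `in_bitwise_table` are hypothetical names. The op ids follow the stacking order (0 = AND, 1 = OR, 2 = XOR, consistent with the `1=OR` note in `execution.pil`). The sketch assumes the table enumerates every pair of 8-bit inputs, which gives 3 * 256 * 256 rows; the comment in the diff says 3 * 256, so treat the row count here as this sketch's assumption rather than the commit's.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical row of the stacked table: (op_id, input_a, input_b, output).
struct BitwiseRow {
    uint8_t op_id;
    uint8_t a;
    uint8_t b;
    uint8_t out;
};

// Builds the AND block first, then OR, then XOR.
std::vector<BitwiseRow> build_bitwise_table()
{
    std::vector<BitwiseRow> table;
    table.reserve(3 * 256 * 256);
    for (int op = 0; op < 3; ++op) {
        for (int a = 0; a < 256; ++a) {
            for (int b = 0; b < 256; ++b) {
                const int out = (op == 0) ? (a & b) : (op == 1) ? (a | b) : (a ^ b);
                table.push_back({ static_cast<uint8_t>(op),
                                  static_cast<uint8_t>(a),
                                  static_cast<uint8_t>(b),
                                  static_cast<uint8_t>(out) });
            }
        }
    }
    return table;
}

// Membership test for a tuple. The stacked layout allows direct indexing.
bool in_bitwise_table(const std::vector<BitwiseRow>& table, uint8_t op, uint8_t a, uint8_t b, uint8_t out)
{
    if (op >= 3) {
        return false;
    }
    const std::size_t idx = (static_cast<std::size_t>(op) * 256 + a) * 256 + b;
    return idx < table.size() && table[idx].out == out;
}
```

Under these assumptions a tuple of the form `(1, x, x, x)` is always present for 8-bit `x`, since `x OR x = x`; that is the shape the dummy precomputed lookup in `execution.pil` uses.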
diff --git a/barretenberg/cpp/src/CMakeLists.txt b/barretenberg/cpp/src/CMakeLists.txt
@@ -95,6 +95,7 @@ add_subdirectory(barretenberg/transcript)
 add_subdirectory(barretenberg/translator_vm)
 add_subdirectory(barretenberg/ultra_honk)
 add_subdirectory(barretenberg/vm)
+add_subdirectory(barretenberg/vm2)
 add_subdirectory(barretenberg/wasi)
 add_subdirectory(barretenberg/world_state)
 
@@ -171,6 +172,7 @@ set(BARRETENBERG_TARGET_OBJECTS
 if(NOT DISABLE_AZTEC_VM)
     # enable AVM
     list(APPEND BARRETENBERG_TARGET_OBJECTS $<TARGET_OBJECTS:vm_objects>)
+    list(APPEND BARRETENBERG_TARGET_OBJECTS $<TARGET_OBJECTS:vm2_objects>)
 endif()
 
 if(NOT WASM)

diff --git a/barretenberg/cpp/src/barretenberg/bb/main.cpp b/barretenberg/cpp/src/barretenberg/bb/main.cpp
@@ -3,6 +3,7 @@
 #include "barretenberg/bb/file_io.hpp"
 #include "barretenberg/client_ivc/client_ivc.hpp"
 #include "barretenberg/common/benchmark.hpp"
+#include "barretenberg/common/log.hpp"
 #include "barretenberg/common/map.hpp"
 #include "barretenberg/common/serialize.hpp"
 #include "barretenberg/common/timer.hpp"
@@ -13,7 +14,6 @@
 #include "barretenberg/dsl/acir_format/proof_surgeon.hpp"
 #include "barretenberg/dsl/acir_proofs/acir_composer.hpp"
 #include "barretenberg/dsl/acir_proofs/honk_contract.hpp"
-#include "barretenberg/flavor/flavor.hpp"
 #include "barretenberg/honk/proof_system/types/proof.hpp"
 #include "barretenberg/numeric/bitop/get_msb.hpp"
 #include "barretenberg/plonk/proof_system/proving_key/serialize.hpp"
@@ -24,15 +24,17 @@
 #include "barretenberg/stdlib_circuit_builders/ultra_flavor.hpp"
 #include "barretenberg/stdlib_circuit_builders/ultra_keccak_flavor.hpp"
 #include "barretenberg/stdlib_circuit_builders/ultra_rollup_flavor.hpp"
-#include "barretenberg/vm/avm/trace/public_inputs.hpp"
-#include <cstdint>
 
 #ifndef DISABLE_AZTEC_VM
 #include "barretenberg/vm/avm/generated/flavor.hpp"
 #include "barretenberg/vm/avm/trace/common.hpp"
 #include "barretenberg/vm/avm/trace/execution.hpp"
+#include "barretenberg/vm/avm/trace/public_inputs.hpp"
 #include "barretenberg/vm/aztec_constants.hpp"
 #include "barretenberg/vm/stats.hpp"
+#include "barretenberg/vm2/avm_api.hpp"
+#include "barretenberg/vm2/common/aztec_types.hpp"
+#include "barretenberg/vm2/common/constants.hpp"
 #endif
 
 using namespace bb;
@@ -671,6 +673,16 @@ void vk_as_fields(const std::string& vk_path, const std::string& output_path)
 }
 
 #ifndef DISABLE_AZTEC_VM
+void print_avm_stats()
+{
+#ifdef AVM_TRACK_STATS
+    info("------- STATS -------");
+    const auto& stats = avm_trace::Stats::get();
+    const int levels = std::getenv("AVM_STATS_DEPTH") != nullptr ? std::stoi(std::getenv("AVM_STATS_DEPTH")) : 2;
+    info(stats.to_string(levels));
+#endif
+}
+
 /**
  * @brief Writes an avm proof and corresponding (incomplete) verification key to files.
  *
@@ -726,12 +738,34 @@ void avm_prove(const std::filesystem::path& public_inputs_path,
     write_file(vk_fields_path, { vk_json.begin(), vk_json.end() });
     vinfo("vk as fields written to: ", vk_fields_path);
 
-#ifdef AVM_TRACK_STATS
-    info("------- STATS -------");
-    const auto& stats = avm_trace::Stats::get();
-    const int levels = std::getenv("AVM_STATS_DEPTH") != nullptr ? std::stoi(std::getenv("AVM_STATS_DEPTH")) : 2;
-    info(stats.to_string(levels));
-#endif
+    print_avm_stats();
+}
+
+void avm2_prove(const std::filesystem::path& inputs_path, const std::filesystem::path& output_path)
+{
+    avm2::AvmAPI avm;
+    auto inputs = avm2::AvmAPI::ProvingInputs::from(read_file(inputs_path));
+
+    // This is bigger than CIRCUIT_SUBGROUP_SIZE because of BB inefficiencies.
+    init_bn254_crs(avm2::CIRCUIT_SUBGROUP_SIZE * 2);
+    auto [proof, vk] = avm.prove(inputs);
+
+    // NOTE: As opposed to Avm1 and other proof systems, the public inputs are NOT part of the proof.
+    write_file(output_path / "proof", to_buffer(proof));
+    write_file(output_path / "vk", vk);
+
+    print_avm_stats();
+}
+
+void avm2_check_circuit(const std::filesystem::path& inputs_path)
+{
+    avm2::AvmAPI avm;
+    auto inputs = avm2::AvmAPI::ProvingInputs::from(read_file(inputs_path));
+
+    bool res = avm.check_circuit(inputs);
+    info("circuit check: ", res ? "success" : "failure");
+
+    print_avm_stats();
 }
 
 /**
@@ -783,8 +817,28 @@ bool avm_verify(const std::filesystem::path& proof_path, const std::filesystem::
 
     const bool verified = AVM_TRACK_TIME_V("verify/all", avm_trace::Execution::verify(vk, proof));
     vinfo("verified: ", verified);
+
+    print_avm_stats();
     return verified;
 }
+
+// NOTE: The proof should NOT include the public inputs.
+bool avm2_verify(const std::filesystem::path& proof_path,
+                 const std::filesystem::path& public_inputs_path,
+                 const std::filesystem::path& vk_path)
+{
+    const auto proof = many_from_buffer<fr>(read_file(proof_path));
+    std::vector<uint8_t> vk_bytes = read_file(vk_path);
+    auto public_inputs = avm2::PublicInputs::from(read_file(public_inputs_path));
+
+    init_bn254_crs(1);
+    avm2::AvmAPI avm;
+    bool res = avm.verify(proof, public_inputs, vk_bytes);
+    info("verification: ", res ? "success" : "failure");
+
+    print_avm_stats();
+    return res;
+}
 #endif
 
 /**
@@ -1382,6 +1436,18 @@ int main(int argc, char* argv[])
             std::string output_path = get_option(args, "-o", "./target");
             write_recursion_inputs_honk<UltraRollupFlavor>(bytecode_path, witness_path, output_path, recursive);
 #ifndef DISABLE_AZTEC_VM
+        } else if (command == "avm2_prove") {
+            std::filesystem::path inputs_path = get_option(args, "--avm-inputs", "./target/avm_inputs.bin");
+            // This outputs both files: proof and vk, under the given directory.
+            std::filesystem::path output_path = get_option(args, "-o", "./proofs");
+            avm2_prove(inputs_path, output_path);
+        } else if (command == "avm2_check_circuit") {
+            std::filesystem::path inputs_path = get_option(args, "--avm-inputs", "./target/avm_inputs.bin");
+            avm2_check_circuit(inputs_path);
+        } else if (command == "avm2_verify") {
+            std::filesystem::path public_inputs_path =
+                get_option(args, "--avm-public-inputs", "./target/avm_public_inputs.bin");
+            return avm2_verify(proof_path, public_inputs_path, vk_path) ? 0 : 1;
         } else if (command == "avm_prove") {
             std::filesystem::path avm_public_inputs_path =
                 get_option(args, "--avm-public-inputs", "./target/avm_public_inputs.bin");
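
`print_avm_stats` reads `AVM_STATS_DEPTH` with `std::stoi`, which throws on non-numeric input. A more defensive variant might look like the sketch below; `parse_stats_depth` and `stats_depth_from_env` are hypothetical helpers that are not part of this commit, and the fallback of 2 mirrors the default used in the diff.

```cpp
#include <cstddef>
#include <cstdlib>
#include <exception>
#include <string>

// Returns the parsed depth, or `fallback` when the value is missing or malformed.
int parse_stats_depth(const char* raw, int fallback = 2)
{
    if (raw == nullptr) {
        return fallback;
    }
    try {
        const std::string text(raw);
        std::size_t consumed = 0;
        const int value = std::stoi(text, &consumed);
        // Reject trailing garbage such as "4x".
        return consumed == text.size() ? value : fallback;
    } catch (const std::exception&) {
        // std::stoi throws std::invalid_argument or std::out_of_range.
        return fallback;
    }
}

// Reads the same environment variable that print_avm_stats uses.
int stats_depth_from_env()
{
    return parse_stats_depth(std::getenv("AVM_STATS_DEPTH"));
}
```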

diff --git a/barretenberg/cpp/src/barretenberg/ecc/groups/affine_element.hpp b/barretenberg/cpp/src/barretenberg/ecc/groups/affine_element.hpp
@@ -170,6 +170,9 @@ template <typename Fq_, typename Fr_, typename Params> class alignas(64) affine_
     }
     Fq x;
     Fq y;
+
+    // Note: this serialization from typescript does not support infinity.
+    MSGPACK_FIELDS(x, y);
 };
 
 template <typename B, typename Fq_, typename Fr_, typename Params>

diff --git a/barretenberg/cpp/src/barretenberg/vm2/CMakeLists.txt b/barretenberg/cpp/src/barretenberg/vm2/CMakeLists.txt
@@ -0,0 +1,3 @@
+if(NOT DISABLE_AZTEC_VM)
+  barretenberg_module(vm2 sumcheck stdlib_honk_verifier)
+endif()
diff --git a/barretenberg/cpp/src/barretenberg/vm2/avm_api.cpp b/barretenberg/cpp/src/barretenberg/vm2/avm_api.cpp
@@ -0,0 +1,58 @@
+#include "barretenberg/vm2/avm_api.hpp"
+
+#include "barretenberg/vm/stats.hpp"
+#include "barretenberg/vm2/proving_helper.hpp"
+#include "barretenberg/vm2/simulation_helper.hpp"
+#include "barretenberg/vm2/tracegen_helper.hpp"
+
+namespace bb::avm2 {
+
+using namespace bb::avm2::simulation;
+
+std::pair<AvmAPI::AvmProof, AvmAPI::AvmVerificationKey> AvmAPI::prove(const AvmAPI::ProvingInputs& inputs)
+{
+    // Simulate.
+    info("Simulating...");
+    AvmSimulationHelper simulation_helper(inputs);
+    auto events = AVM_TRACK_TIME_V("simulation/all", simulation_helper.simulate());
+
+    // Generate trace.
+    info("Generating trace...");
+    AvmTraceGenHelper tracegen_helper;
+    auto trace = AVM_TRACK_TIME_V("tracegen/all", tracegen_helper.generate_trace(std::move(events)));
+
+    // Prove.
+    info("Proving...");
+    AvmProvingHelper proving_helper;
+    auto [proof, vk] = AVM_TRACK_TIME_V("proving/all", proving_helper.prove(std::move(trace)));
+
+    info("Done!");
+    return { std::move(proof), std::move(vk) };
+}
+
+bool AvmAPI::check_circuit(const AvmAPI::ProvingInputs& inputs)
+{
+    // Simulate.
+    info("Simulating...");
+    AvmSimulationHelper simulation_helper(inputs);
+    auto events = AVM_TRACK_TIME_V("simulation/all", simulation_helper.simulate());
+
+    // Generate trace.
+    info("Generating trace...");
+    AvmTraceGenHelper tracegen_helper;
+    auto trace = AVM_TRACK_TIME_V("tracegen/all", tracegen_helper.generate_trace(std::move(events)));
+
+    // Check circuit.
+    info("Checking circuit...");
+    AvmProvingHelper proving_helper;
+    return proving_helper.check_circuit(std::move(trace));
+}
+
+bool AvmAPI::verify(const AvmProof& proof, const PublicInputs& pi, const AvmVerificationKey& vk_data)
+{
+    info("Verifying...");
+    AvmProvingHelper proving_helper;
+    return AVM_TRACK_TIME_V("verifying/all", proving_helper.verify(proof, pi, vk_data));
+}
+
+} // namespace bb::avm2
diff --git a/barretenberg/cpp/src/barretenberg/vm2/avm_api.hpp b/barretenberg/cpp/src/barretenberg/vm2/avm_api.hpp
@@ -0,0 +1,24 @@
+#pragma once
+
+#include <tuple>
+
+#include "barretenberg/vm2/common/avm_inputs.hpp"
+#include "barretenberg/vm2/proving_helper.hpp"
+
+namespace bb::avm2 {
+
+class AvmAPI {
+  public:
+    using AvmProof = AvmProvingHelper::Proof;
+    using AvmVerificationKey = std::vector<uint8_t>;
+    using ProvingInputs = AvmProvingInputs;
+
+    AvmAPI() = default;
+
+    // NOTE: The public inputs are NOT part of the proof.
+    std::pair<AvmProof, AvmVerificationKey> prove(const ProvingInputs& inputs);
+    bool check_circuit(const ProvingInputs& inputs);
+    bool verify(const AvmProof& proof, const PublicInputs& pi, const AvmVerificationKey& vk_data);
+};
+
+} // namespace bb::avm2
| Benchmark suite | Current: `231f017` | Previous: `f034e2a` | Ratio |
| --- | --- | --- | --- |
| `wasmClientIVCBench/Full/6` | `81955.39306799999` ms/iter | `76677.342763` ms/iter | `1.07` |
| `commit(t)` | `3337463545` ns/iter | `3148715909` ns/iter | `1.06` |
| `Goblin::merge(t)` | `155250859` ns/iter | `145031539` ns/iter | `1.07` |