|                      |                                           |
| -------------------- | ----------------------------------------- |
| Issue                | title                                     |
| Owners               | @LHerskind @MirandaWood                   |
| Approvers            | @just-mitch @PhilWindle @iAmMichaelConnor |
| Target Approval Date | YYYY-MM-DD                                |
We will retire calldata for block bodies and instead use EIP-4844 blobs.
Briefly describe the problem the work solves, and for whom. Include any relevant background information and the goals (and non-goals) of this implementation.
In our current system, we have an `AvailabilityOracle` contract. This contract is relatively simple: given a bunch of transactions, it computes a commitment to them by building a merkle tree from their transaction effects. The root of this tree is then stored in the contract such that we can later check for availability.
When a block is proposed to the rollup contract, it queries the `AvailabilityOracle` to check availability of the `txs_effects_hash` in the header.
Since the only way it could have been marked as available is by having been hashed at the oracle, we are sure that the data was published.
When the proof is to be verified, the `txs_effects_hash` is provided as a public input. The circuits prove that the "opening" of this commitment is indeed the transaction effects of the block's transactions.
We use just the hash as the public input instead of the transaction effects directly as a cost optimisation: an extra public input has a higher cost than the extra layer of hashing we need to perform on both L1 and L2. As the hashing is done in both places, we use `sha256`, which is relatively cheap on both sides.
It is a simple system, but using calldata and building the merkle tree on L1 is very gas intensive.
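For intuition, here is a minimal Python sketch of that current flow. The pairwise `sha256` tree and padding rule are illustrative only, not the actual tx-effects encoding:

```python
import hashlib

def sha256(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def txs_effects_hash(tx_effect_blobs: list[bytes]) -> bytes:
    # Hash each tx's serialized effects, then fold pairwise into a single root.
    layer = [sha256(b) for b in tx_effect_blobs]
    while len(layer) > 1:
        if len(layer) % 2 == 1:
            layer.append(layer[-1])  # illustrative padding for odd layers
        layer = [sha256(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

# Only this 32-byte root crosses the L1/L2 boundary as a public input;
# both the AvailabilityOracle and the circuits recompute it from the full effects.
root = txs_effects_hash([b"tx effects 0", b"tx effects 1", b"tx effects 2"])
```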
Following 4844 (blob transactions), an Ethereum transaction can have up to 6 "sidecars" of 4096 field elements.
These sidecars are called blobs, and are by themselves NOT accessible from the EVM.
However, a `VersionedHash` is exposed to the EVM; this is a hash of the version number and the KZG commitment to the sidecar.
```python
def kzg_to_versioned_hash(commitment: KZGCommitment) -> VersionedHash:
    return VERSIONED_HASH_VERSION_KZG + sha256(commitment)[1:]
```
If a `VersionedHash` is exposed to the EVM, the Ethereum network guarantees that the data behind it (its 4096 fields) has been published.
@MirandaWood note: The `VersionedHash` (or `blobhash`) is available in the EVM once a blob has been published via a tx. For example, from blob-lib:

```solidity
function submitBlobs() external {
    bytes32 blobHash;
    assembly {
        blobHash := blobhash(0)
    }
    blobHashes[txId][0] = blobHash;
    ++txId;
}
```

Like our `txs_effect_hash`, we must prove that the `blobhash` corresponds to the data in a published Aztec block (see implementation section).
As you might have noticed, the `VersionedHash` and our `AvailabilityOracle` serve a very similar purpose: if the commitment is published according to it, then the pre-image of the commitment has also been published.
Special Trivia for @iAmMichaelConnor:
The `to` field of a blob transaction cannot be `address(0)`, so it cannot be a "create" transaction, meaning that your "fresh rollup contract every block" dream has a few extra hiccups. It could still happen through a factory, but a factory makes a single known contract the deployer and kinda destroys the idea.
Update the system to publish the transaction effects using blobs instead of calldata.
We do NOT change the data that is published, i.e., we will still be publishing the transaction effects.
Who are your users, and how do they interact with this? What is the top-level interface?
Essentially, we aim to replace publishing all a tx's effects in calldata with publishing in a blob. As mentioned above, any data inside a blob is not available to the EVM so we cannot simply hash the same data on L1 and in the rollup circuits, and check the hash matches, as we do now.
Instead, publishing a blob makes the `blobhash` available:
```solidity
/**
 * blobhash(i) returns the versioned_hash of the i-th blob associated with _this_ transaction.
 * bytes[0:1]: 0x01
 * bytes[1:32]: the last 31 bytes of the sha256 hash of the kzg commitment C.
 */
bytes32 blobHash;
assembly {
    blobHash := blobhash(0)
}
```
Where the commitment $C$ is a KZG commitment to the blob data, interpreted as a polynomial $p(X)$ over the BLS12-381 scalar field.

In the background, this polynomial is found by interpolating the 4096 blob fields $d_i$ over the 4096th roots of unity $\omega^i$.

This means our blob data is a set of evaluations of that polynomial:

$$d_i = p(\omega^i),$$

where $\omega$ is a primitive 4096th root of unity, and the commitment is the sum

$$C = \sum_{i=0}^{4095} d_i \, [L_i(s)]_1,$$

where the $L_i$ are the Lagrange basis polynomials over those roots of unity and $s$ is the trusted setup point.

So to prove that we are publishing the correct tx effects, we could just do this sum in the circuit, and check the final output is the same $C$ as the one behind the `blobhash`. However, every term is a non-native BLS12-381 scalar multiplication, so doing this inside a BN254 circuit would be prohibitively expensive.

Thankfully, there is a more efficient way, already implemented by @iAmMichaelConnor in the blob-lib repo and `blob` crate in aztec-packages.

Our goal is to efficiently show that our tx effects accumulated in the rollup circuits are the same $d_i$s committed to by $C$. For this we can use the point evaluation precompile introduced by EIP-4844, which takes the following inputs:
- `versioned_hash`: The `blobhash` for this $C$
- `z`: The challenge value
- `y`: The claimed evaluation value at `z`
- `commitment`: The commitment $C$
- `proof`: The KZG proof of opening
It checks:

```python
assert kzg_to_versioned_hash(commitment) == versioned_hash
assert verify_kzg_proof(commitment, z, y, proof)
```
As long as we use our tx effect fields as the evaluations $d_i$, and $z$ is a challenge that depends on both the tx effects and $C$, a successful opening at $z$ convinces us (with overwhelming probability) that the polynomial behind $C$ matches our tx effects.

But isn't evaluating $p(z)$ inside the circuit also expensive? Not if we use the barycentric formula, which works directly from the evaluations without ever interpolating the coefficients:

$$p(z) = \frac{z^{4096} - 1}{4096} \sum_{i=0}^{4095} \frac{d_i \, \omega^i}{z - \omega^i}.$$

To evaluate this we only need field arithmetic over the $d_i$s, the challenge $z$, and the roots of unity.

What's $\omega$? It is a primitive 4096th root of unity in the BLS12-381 scalar field.

We can precompute all the $\omega^i$s and hardcode them in the circuit, so the evaluation only involves (non-native) field arithmetic rather than elliptic curve operations.
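To make the formula concrete, here is a tiny self-contained Python sketch over a toy field ($\mathbb{F}_{17}$, $N = 4$). It is purely illustrative and obviously not the circuit code:

```python
# Toy barycentric evaluation over F_17 with N = 4; real blobs use N = 4096
# over the BLS12-381 scalar field, but the formula is identical.
P, N, OMEGA = 17, 4, 4  # 4 is a primitive 4th root of unity mod 17

def inv(a: int) -> int:
    return pow(a, P - 2, P)

def barycentric_eval(d: list[int], z: int) -> int:
    # p(z) from the evaluations d_i = p(omega^i), without interpolating.
    # Assumes z is not itself a root of unity.
    acc = 0
    for i, d_i in enumerate(d):
        w_i = pow(OMEGA, i, P)
        acc = (acc + d_i * w_i * inv((z - w_i) % P)) % P
    return (pow(z, N, P) - 1) * inv(N) % P * acc % P

def poly_eval(coeffs: list[int], x: int) -> int:
    return sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P

coeffs = [3, 5, 2, 7]  # an arbitrary degree-3 polynomial
d = [poly_eval(coeffs, pow(OMEGA, i, P)) for i in range(N)]
assert barycentric_eval(d, z=9) == poly_eval(coeffs, 9)  # 9 is not a root of unity
```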
Previously, the base rollup would `sha256` hash all the tx effects into one value and pass it up through the remaining rollup circuits. It would then be recalculated on L1 as part of the `AvailabilityOracle`.
We no longer need to do this, but we do need to pass up something encompassing the tx effects to the rollup circuits, so they can later be used as the blob data. We could simply `poseidon2` hash the tx effects instead and pass those up, but that has some issues:
- If we have one hash per base rollup (i.e. per tx), we have an ever increasing list of hashes to manage.
- If we hash these in pairs, as we do now with the `tx_effects_hash`, then we need to recreate the rollup structure when we prove the blob.

The latter is doable, but means encoding some maximum number of txs, `N`, to loop over, and potentially wasting gates for blocks with fewer than `N` txs. For instance, if we chose `N = 96`, a block with only 2 txs would still have to loop 96 times.
Alvaro suggested a solution to this in the vein of `PartialStateReference`, where we provide a `start` and `end` state in each base rollup, and subsequent merge rollup circuits check that they follow on from one another. The base circuits themselves simply prove that adding the data of their tx indeed moves the state from `start` to `end`.
To encompass all the tx effects, we use a `poseidon2` sponge and absorb each field. We also track the number of fields added to ensure we don't overflow the blob (4096 BLS fields, which can fit 4112 BN254 fields, but adding the mapping between these is a TODO). Given that this struct is a sponge used for a blob, I have named it:
```rust
// Assuming the stdlib Poseidon2 sponge referenced in the IV comment below.
use std::hash::poseidon2::Poseidon2;

// FIELDS_PER_BLOB = 4096, defined in the blob crate.
// Init is given by input len * 2^64 (see noir/noir-repo/noir_stdlib/src/hash/poseidon2.nr -> hash_internal)
global IV: Field = (FIELDS_PER_BLOB as Field) * 18446744073709551616;

struct SpongeBlob {
    sponge: Poseidon2,
    fields: u32,
}

impl SpongeBlob {
    fn new() -> Self {
        Self {
            sponge: Poseidon2::new(IV),
            fields: 0,
        }
    }
    // Add fields to the sponge
    fn absorb<let N: u32>(&mut self, input: [Field; N], in_len: u32) {
        // in_len is the number of non-zero fields of input to absorb
        let mut should_add = true;
        for i in 0..input.len() {
            should_add &= i != in_len;
            if should_add {
                self.sponge.absorb(input[i]);
            }
        }
        self.fields += in_len;
    }
    // Finalise the sponge and output the poseidon2 hash of all fields absorbed
    fn squeeze(&mut self) -> Field {
        self.sponge.squeeze()
    }
}
```
To summarise: each base circuit starts with a `start` `SpongeBlob` instance, which is either blank or from the preceding circuit, then calls `.absorb()` with the tx effects as input. Just like the output `BaseOrMergeRollupPublicInputs` has a `start` and `end` `PartialStateReference`, it will also have a `start` and `end` `SpongeBlob`.
Since we are removing a very large `sha256` hash, this should considerably lower gate counts for base.
We will no longer have two `tx_effect_hash`es from a merge circuit's `left` and `right` inputs to hash together. Instead, we have a `start` and `end` `SpongeBlob` and simply check that the `left`'s `end` `SpongeBlob` == the `right`'s `start` `SpongeBlob`.
We are removing one `sha256` hash and introducing a few equality gates, so gate counts for merge should be slightly lower.
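As a sanity check of the chaining logic, here is a small Python sketch. `ToySponge` stands in for `SpongeBlob`, with `sha256` replacing the `poseidon2` sponge purely to keep the sketch self-contained:

```python
from dataclasses import dataclass
import hashlib

@dataclass
class ToySponge:
    # Stand-in for SpongeBlob: in the circuits this is a poseidon2 sponge.
    state: bytes = bytes(32)
    fields: int = 0

    def absorbed(self, effects: list[int]) -> "ToySponge":
        state = self.state
        for f in effects:
            state = hashlib.sha256(state + f.to_bytes(32, "big")).digest()
        return ToySponge(state, self.fields + len(effects))

def base(start: ToySponge, tx_effects: list[int]):
    # Base rollup: absorb this tx's effects, expose (start, end) like PartialStateReference.
    return start, start.absorbed(tx_effects)

def merge(left, right):
    # Merge rollup: no hashing, just check the sponges chain and pass the outer pair up.
    assert left[1] == right[0], "left.end must equal right.start"
    return left[0], right[1]

s0 = ToySponge()
b0 = base(s0, [1, 2, 3])
b1 = base(b0[1], [4, 5])
start, end = merge(b0, b1)  # `end` now commits to all five effect fields
```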
There have been multiple designs and discussions on where exactly the 'blob circuit', which computes the blob evaluation described above, should sit.
The current route is option 5a in the document: to inline the blob functionality inside the block root circuit. We would allow up to 3 blobs to be proven in one block root rollup. For simplicity, the explanation below will just summarise what happens for a single blob.
First, we must gather all our tx effects (`SpongeBlob`s) from the pair of `BaseOrMergeRollupPublicInputs` that we know contain all the effects in the block's txs. Like the merge circuit, the block root checks that the `left`'s `end` `SpongeBlob` == the `right`'s `start` `SpongeBlob`.
It then calls `squeeze()` on the `right`'s `end` `SpongeBlob` to produce the hash of all effects that will be in the blob. Let's call this `h`. The raw injected tx effects are `poseidon2` hashed and we check that the result matches `h`. We now have our set of blob fields $d_i$ that we know correspond to the tx effects proven by the rollup circuits.
We now need to produce a challenge point `z`. This value must encompass the two 'commitments' used to represent the blob data: `h` and $C$ (see here for more on the method). We simply provide `z = poseidon2(h, C)`.
The block root now has all the inputs required to call the blob functionality. It is already written here, the only current difference being that we provide `z` rather than calculate it.
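Putting the block-root steps together, a rough Python sketch of the checks follows. `h2` is a `sha256` stand-in for `poseidon2`, and `ChildOutputs` is a toy slice of the real public inputs struct; none of these names are real APIs:

```python
import hashlib
from dataclasses import dataclass

def h2(*parts: bytes) -> bytes:
    # Stand-in for poseidon2 over field elements, to keep the sketch self-contained.
    return hashlib.sha256(b"".join(parts)).digest()

@dataclass
class ChildOutputs:
    # Toy slice of BaseOrMergeRollupPublicInputs: just the two sponge states.
    start_sponge: bytes
    end_sponge: bytes

def block_root(left: ChildOutputs, right: ChildOutputs,
               injected_effects: list[bytes], commitment_C: bytes):
    # 1. The sponges must chain across the two child rollups.
    assert left.end_sponge == right.start_sponge
    # 2. Squeeze the right child's end sponge to get h, the hash of all effects
    #    (in this toy model the squeezed value is just the end state).
    h = right.end_sponge
    # 3. Re-hash the raw injected effects and check they match h.
    assert h2(*injected_effects) == h
    # 4. Derive the challenge point from both 'commitments' to the blob data.
    z = h2(h, commitment_C)
    # 5. In the real circuit, y = p(z) is now computed via the barycentric
    #    evaluation sketched earlier; (z, y, C) become public inputs for L1.
    return z, commitment_C
```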
Along with the usual `BlockRootOrBlockMergePublicInputs`, we would also have the blob public inputs: the challenge `z`, the claimed evaluation `y`, and the commitment `C`, which L1 needs to verify the KZG opening proof.
@MirandaWood note: This section is in the early design stages and needs more input. See TODOs at the bottom.
We are replacing publishing effects via `AvailabilityOracle.sol` with publishing a blob. At a high level, the change is to provide the blob with the tx calling `propose()` instead of calling `AvailabilityOracle.publish()`.
This does not mean actually sending the data to the EVM; it only makes the `blobhash` available. The function `propose()` should then extract and store this `blobhash` alongside the `blockHash` for future verification.
As mentioned, we now have new public inputs to use for verifying the blob. As usual, `submitBlockRootProof` verifies the Honk proof against its `BlockRootOrBlockMergePublicInputs` and performs checks on them. With the addition of blobs, we must now also input and check the KZG opening proof. Note that the below pseudocode is just an overview, as the precompile actually takes `bytes` we must encode:
```solidity
// input for the blob precompile
bytes32[] input;
// extract the blobhash from the one submitted earlier:
input[0] = blobHashes[blockHash];
// z, y, and C are already used as part of the PIs for the block root proof
input[1] = z;
input[2] = y;
input[3] = C;
// the opening proof is computed in ts and inserted here
input[4] = kzgProof;
// Staticcall the point eval precompile https://eips.ethereum.org/EIPS/eip-4844#point-evaluation-precompile :
(bool success, bytes memory data) = address(0x0a).staticcall(input);
require(success, "Point evaluation precompile failed");
```
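For reference, the real precompile input is a single 192-byte string of `versioned_hash || z || y || commitment || proof`, where the commitment and proof are 48-byte BLS12-381 G1 points. A small Python sketch of that encoding (the Solidity version would likely be an `abi.encodePacked` of the same fields):

```python
def point_evaluation_input(versioned_hash: bytes, z: bytes, y: bytes,
                           commitment: bytes, proof: bytes) -> bytes:
    """192-byte input for the EIP-4844 point evaluation precompile at 0x0a.
    On success the precompile returns FIELD_ELEMENTS_PER_BLOB and the BLS
    modulus as two 32-byte words."""
    assert len(versioned_hash) == 32 and len(z) == 32 and len(y) == 32
    assert len(commitment) == 48 and len(proof) == 48
    return versioned_hash + z + y + commitment + proof
```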
I'm also glossing over the fact that we are allowing each block to have up to 3 blobs, so we would need to ensure that the 3 KZG opening proofs and blobhashes are handled properly.
Note that we do not need to check that our `blobhash` corresponds to the commitment `C` ourselves; the precompile does this for us.
@MirandaWood note: This section is in the early design stages and needs more input. See TODOs at the bottom.
We would require:

- Updates to constructing circuit inputs, such that each base is aware of the 'current' `SpongeBlob` state, like we already do with `PartialStateReference`s, and raw tx effects are injected into the private inputs of block root.
- Adding functionality to tx sending code to construct the blob(s) and the accompanying KZG proof (see the sketch after this list).
- Include new structs and tests corresponding to rollup circuit updates.
- Add functionality to read blob information from L1.
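To make the tx-sending bullet more concrete, a hedged sketch of the shape of that work follows. The `kzg_*` and `poseidon2` callables are injected placeholders for whatever KZG and hashing libraries we end up using (e.g. c-kzg-4844 bindings and our existing poseidon2 implementation), not real APIs:

```python
FIELDS_PER_BLOB = 4096

def build_blob_for_block(tx_effect_fields: list[bytes],
                         kzg_blob_to_commitment,   # placeholder callable
                         kzg_compute_proof,        # placeholder callable
                         poseidon2):               # placeholder callable
    """Pad the block's tx effect fields into one blob and derive what L1 and the
    block root circuit expect. Single-blob case only; real code handles up to 3."""
    assert len(tx_effect_fields) <= FIELDS_PER_BLOB, "needs another blob"
    blob = tx_effect_fields + [bytes(32)] * (FIELDS_PER_BLOB - len(tx_effect_fields))

    commitment = kzg_blob_to_commitment(blob)      # C, a 48-byte G1 point
    h = poseidon2(tx_effect_fields)                # must match the circuit's squeeze
    z = poseidon2([h, commitment])                 # same challenge as in-circuit
    y, proof = kzg_compute_proof(blob, z)          # opening of C at z

    # The blob itself rides in the type-3 tx sidecar when calling propose();
    # (z, y, C) appear in the rollup public inputs; `proof` is handed to L1
    # for the point evaluation precompile call.
    return blob, commitment, z, y, proof
```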
All the above assumes verifying a final block root proof on L1, when we will actually be verifying a root proof encompassing many blocks and therefore many more blobs per verification. I'm not entirely sure how managing the L1 state will look for this change, and how best to align this with storing the information required to verify the blob via the KZG proof.
Whether verifying block root or root proofs on L1, a single call performing a Honk verification and up to 3 calls to the KZG point evaluation precompile may be too costly. It's possible to store the `blockHash` for a separate call to the precompile, but we have to consider that DA is not 'confirmed' until this call has happened.
Since all the blob circuit code above 'cares' about is an array of fields matching another array of fields, it should theoretically not affect too much. However, we should be careful to include all the new effects in the right structure to be read by clients from L1.
@MirandaWood note: I'm sure there are plenty of areas I'm not familiar in which would be affected by this. Hopefully this doc gives a decent overview of the rollup circuit changes and a bit of the maths behind blobs.
Fill in bullets for each area that will be affected by this change.
- Cryptography
- Noir
- Aztec.js
- PXE
- Aztec.nr
- Enshrined L2 Contracts
- Private Kernel Circuits
- Sequencer
- AVM
- Public Kernel Circuits
- Rollup Circuits
- L1 Contracts
- Archiver
- Prover
- Economics
- P2P Network
- DevOps
Outline what unit and e2e tests will be written. Describe the logic they cover and any mock objects used.
Forge does allow emitting a blob; however, it also allows for mocking a set of KZG hashes.
Identify changes or additions to the user documentation or protocol spec.
If the design is rejected, include a brief explanation of why.
If the design is abandoned mid-implementation, include a brief explanation of why.
If the design is implemented, include a brief explanation of deviations to the original design.
The information set out herein is for discussion purposes only and does not represent any binding indication or commitment by Aztec Labs and its employees to take any action whatsoever, including relating to the structure and/or any potential operation of the Aztec protocol or the protocol roadmap. In particular: (i) nothing in these projects, requests, or comments is intended to create any contractual or other form of legal relationship with Aztec Labs or third parties who engage with this AztecProtocol GitHub account (including, without limitation, by responding to a conversation or submitting comments) (ii) by engaging with any conversation or request, the relevant persons are consenting to Aztec Labs’ use and publication of such engagement and related information on an open-source basis (and agree that Aztec Labs will not treat such engagement and related information as confidential), and (iii) Aztec Labs is not under any duty to consider any or all engagements, and that consideration of such engagements and any decision to award grants or other rewards for any such engagement is entirely at Aztec Labs’ sole discretion. Please do not rely on any information on this account for any purpose - the development, release, and timing of any products, features, or functionality remains subject to change and is currently entirely hypothetical. Nothing on this account should be treated as an offer to sell any security or any other asset by Aztec Labs or its affiliates, and you should not rely on any content or comments for advice of any kind, including legal, investment, financial, tax, or other professional advice.