Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: blob staleness check #226

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

samlaf
Copy link
Collaborator

@samlaf samlaf commented Jan 8, 2025

We add a check to discard blobs that have been included in the rollup's batcher inbox a long time after it was included in an EigenDA batch by the eigenda disperser.

It is now expected that GET requests to the proxy can include an optional l1_inclusion_block_number query argument, which will cause a blob to be rejected if it was included onchain > RollupBlobInclusionWindow L1 blocks after the blob's batch's RBN.

The RollupBlobInclusionWindow is a flag added to the proxy binary, which is set to 0 by default, which skips the check.

Fixes Issue

Fixes #

Changes proposed

Screenshots (Optional)

Note to reviewers

Commits are:

  • 8eaef49: contains the main feature logic
  • a3b6ca4: test the query param parsing logic
  • 0cf0e78: test the verification logic (also includes refactoring the CertVerifier to an interface to be able to test other verification that doesnt need an eth rpc connection)

@samlaf samlaf marked this pull request as draft January 8, 2025 22:32
@samlaf samlaf marked this pull request as ready for review January 9, 2025 02:11
// The eigenDA batch header contains a reference block number (RBN) which is used to pin the stake of the eigenda operators at that specific blocks.
// The rollup batch containing the eigenDA cert is only valid if it was included within a certain number of blocks after the RBN.
// validity condition is: RBN < l1_inclusion_block_number < RBN + some_delta
RollupL1InclusionBlockNum int64
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be uint64?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using -1 to mean that this query param was not passed, and hence to skip the test. Could using a pointer to uint64 instead and use nil to mean this? I don't like either tbh... but go is dumb.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you use 0 to indicate "skip the test"?

RollupL1InclusionBlockNum must be strictly greater RBN, and the min value of RBN is 0, therefore RollupL1InclusionBlockNum == 0 isn't sensical, and is free realestate to have special meaning

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that's a good point! 2 questions to answer here:

  1. do we really want < or should I have been using <= ? Don't think it makes a lot of sense for RBN to be == l1_inclusion_block_number but I think its technically possible, especially on a devnet?
  2. if we have ==, perhaps its still fine to use 0 because I don't think txs are allowed in the genesis block (although apparently bitcoin allows it? so I'm not sure.... would an op L2 ever have a batcher tx in genesis block? that doesn't make a lot of sense but what if...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I think its technically possible, especially on a devnet

I don't see how this would be possible

The eigenDA cert directly contains the RBN, so you can't honestly construct the cert until the reference block exists. And if the RB already exists, how could the eigenDA cert be put on chain any earlier than RBN+1?

verify/cert.go Outdated
// to ensure disperser returned fields haven't been tampered with
type CertVerifier struct {
type CertVerification struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

knit - any reason to make this public if we're binding to an interface?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, I can change. Don't think using an interface matters here though, it's more about whether we think people will want to import and use this struct elsewhere.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// If batch.RBN + rollupBlobInclusionWindow < cert.L1InclusionBlock, the batch is considered stale and verification will fail.
// This check is optional and will be skipped when rollupBlobInclusionWindow is set to 0.
// Note: if there are more rollup related properties that we need to check in the future, then maybe create a RollupVerifier struct
rollupBlobInclusionWindow uint32
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't this need to be globally set per the rollup? Otherwise couldn't you have multiple verifiers articulating alternative views due to different values?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's a good point. I wrote the code out but now's the time to discuss how we want to handle this feature. cc @bxue-l2 for discussion. I'll at least add a bit WARNING comment to the flag, saying that the feature is experimental and not really meant to be turned on? But then what's our plan for eventually turning it on?

Copy link
Collaborator

@bxue-l2 bxue-l2 Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this require an upgrade process like what op currently has. not very good. Alternatively, we can do it inside the op-node derivation pipeline, but we have the burden to keep our fork up to date

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can put the staleness gap in the cert, signed by DA network

// 1 - verify batch in the cert is confirmed onchain
err := v.cv.verifyBatchConfirmedOnChain(ctx, cert.Proof().GetBatchId(), cert.Proof().GetBatchMetadata())
if err != nil {
return fmt.Errorf("failed to verify batch: %w", err)
}

// 2 - verify merkle inclusion proof
// 2 - verify that the blob cert was submitted to the rollup's batch inbox within the allowed RollupBlobInclusionWindow window.
// This is to prevent timing attacks where a rollup batcher could try to game the fraud proof window by including an old DA blob that is about to expire on the DA layer
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how will this verification look when one step proving the certificate? ie, will a challenger need to provide the inbox confirmation tx block #, if so then how do they do so in a way that isn't tamper resistant?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right precisely the right question to ask. Discussed this at length with Bowen. I'll need to add comments detailing this.

The 2 options for securing this check is:

  1. do it in derivation pipeline
  2. do it in preimage oracle contract (as you mention, this would require a zk proof that the inclusion block # is correct)

We are opting for 1, to do this check in the derivation pipeline. It's a simple check to make in the rust hokulea client, so we'll add it there. Ideally we'd add the same check to the golang derivation pipeline, but the problem is that we haven't had much success lately getting our PRs upstreamed. Also if ever a problem happens and we need to change the logic in the golang binary then we depend on optimism's team to review and merge our PR, which feels less than ideal (this is different from the rust code b/c in rust we have our own Hokulea library which projects using eigenda can import directly, and which we have full control over).

So the idea is that the rust stack will be a requirement to do the fraud proofs. But we still want normal validators running op-node to be able to follow the chain properly, which is why we are adding this check here in the proxy.

cc @bxue-l2 for any more thoughts.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently for op, the check is done inside the client program, and derivation pipeline provide the correct block Info. See hokulea code here.

I think it worths some discussion, and tbh didn't know sam made a lot of progress. Nice Sam. I thought you are debugging the bug all the times. I am down to have a meeting about this.

Comment on lines +3 to +7
//
// Generated by this command:
//
// mockgen -package mocks --destination ../mocks/manager.go . IManager
//
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also make go-gen-mocks

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did use make go-gen-mocks. Not sure why this repo was setup to use the deprecated go mock library instead of the new uber mock repo. Updated that dependency in go.mod.

Created this issue to make a reproducible dev env: #236

common/store.go Outdated Show resolved Hide resolved
if !(blockNumEigenDA < blockNumRollup) {
return fmt.Errorf("eigenda batch reference block number (%d) needs to be < rollup inclusion block number (%d)", blockNumEigenDA, blockNumRollup)
}
if !(blockNumRollup < blockNumEigenDA+int64(v.rollupBlobInclusionWindow)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you have an off-by-one error here

The comment in the config struct above states that the following constitutes a stale batch:

batch.RBN + RollupBlobInclusionWindow < cert.L1InclusionBlock

but it seems what's actually implemented is this:

batch.RBN + RollupBlobInclusionWindow <= cert.L1InclusionBlock

It might help to have consistent variable names between the two.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was doing < everywhere? But I agree it should be <=, so changed. Also updates comments and made naming consistent: 72451c6

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The discrepancy was in the config doc:

// If batch.RBN + RollupBlobInclusionWindow < cert.L1InclusionBlock, the batch is considered stale and verification will fail.

Since this doc is defining what constitutes staleness, the implied invariant would be the inverse

cert.L1InclusionBlock <= batch.RBN + RollupBlobInclusionWindow

Your most recent changeset has things correct, since you modified the invariant, but left the staleness definition in the Config struct unchanged.

verify/verifier.go Outdated Show resolved Hide resolved
We add a check to discard blobs that have been included in the rollup's batcher inbox a long time after it was included in an EigenDA batch by the eigenda disperser.
Made the Verifier take an interface for the certVerifier instead of the struct, to allow me to test functionality that didn't depend on having a blockchain connection (like the certVerifier does).

This way we can test the blob staleness check by using a noopCertVerifier to skip all the other verification that require an eth rpc.
@samlaf samlaf force-pushed the feat--rollup-blob-inclusion-staleness-check branch from bc6d810 to 06237d8 Compare January 10, 2025 21:23
@samlaf
Copy link
Collaborator Author

samlaf commented Jan 10, 2025

rebased on top of master to get the latest change (was one minor conflict where verifier's Verify function had change the argument name from expectedSomething to certCommitment)

@samlaf samlaf requested review from litt3 and ethenotethan January 10, 2025 23:10
}

// NoopCertVerifier is used in place of CertVerification when certificate verification is disabled
type NoopCertVerifier struct{}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason you left this public?

batchRBN := int64(cert.BlobVerificationProof.BatchMetadata.BatchHeader.ReferenceBlockNumber)
rollupInclusionBlock := args.RollupL1InclusionBlockNum
// We need batchRBN < rollupInclusionBlock <= batch.RBN + rollupBlobInclusionWindow
// TODO: should we be using <= <= instead?
Copy link
Collaborator

@litt3 litt3 Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, no. See my comment above, I don't think RBN == inclusion block is sensical

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants