Add Confirming state for blobs #466
Conversation
@@ -179,7 +179,8 @@ func (f *finalizer) updateBlobs(ctx context.Context, metadatas []*disperser.Blob
 	confirmationBlockNumber, err := f.getTransactionBlockNumber(ctx, confirmationMetadata.ConfirmationInfo.ConfirmationTxnHash)
 	if errors.Is(err, ethereum.NotFound) {
 		// The confirmed block is finalized, but the transaction is not found. It means the transaction should be considered forked/invalid and the blob should be considered as failed.
-		_, err := f.blobStore.HandleBlobFailure(ctx, m, f.maxNumRetriesPerBlob)
+		f.logger.Warn("confirmed transaction not found", "blobKey", blobKey.String(), "confirmationTxnHash", confirmationMetadata.ConfirmationInfo.ConfirmationTxnHash.Hex(), "confirmationBlockNumber", confirmationMetadata.ConfirmationInfo.ConfirmationBlockNumber)
+		err := f.blobStore.MarkBlobFailed(ctx, m.GetBlobKey())
From a conversation with @teddyknox, it seems like marking a blob as failed in the case of a reorg like this is more appropriate than retrying dispersal.
This is because the sequencer has already retrieved the blob lookup information, right?
In some ways, this makes me wonder if the finalizer is necessary at all.
There are a few cases to consider here.
- Reorg unconfirms the blob, and the blob is marked as Failed before the sequencer retrieves the blob status: the sequencer will see the blob as Failed and redisperse it.
- Reorg unconfirms the blob, but the sequencer has already retrieved the confirmed blob status and submits the inbox transaction:
  a. The inbox transaction gets reorg'd as well. The rollup redisperses the same data.
  b. (rare) The inbox transaction successfully makes it to the chain. Now it doesn't matter if the old blob gets retried or is marked as Failed. This needs DA cert verification.
I think the finalizer is still necessary to be the source of truth, though.
disperser/batcher/batcher.go (outdated)
@@ -493,9 +489,12 @@ func (b *Batcher) HandleSingleBatch(ctx context.Context) error {
 		return fmt.Errorf("HandleSingleBatch: error sending confirmBatch transaction: %w", err)
 	} else {
 		for _, metadata := range batch.BlobMetadata {
-			err = b.EncodingStreamer.MarkBlobPendingConfirmation(metadata)
+			err = b.Queue.MarkBlobConfirming(ctx, metadata.GetBlobKey())
I'm confused about the error handling. Here we check whether marking the blobs in the blob store succeeded before removing the encoded blobs. In handleFailure (e.g. L488), we remove the encoded blobs first and then update the blob store. Have we thought through the different states the system can end up in under these different failure scenarios?
This is a good point. The encoded blob should probably be removed only if the update is successful.
The difference is whether the blob gets encoded again or not, which isn't super critical but has important performance implications.
Maybe it would be cleaner to have a set of "state transition functions" which handle transitioning a blob from one state to another, named things like
- TransitionBlobProcessingToFailed
- TransitionBlobProcessingToConfirmed
or
- TransitionBlobToFailed
- TransitionBlobToConfirmed
depending on whether the actions depend on the initial state. These transition functions could handle all updates to the encodedBlobStore as well as the blobStore, plus the error handling, and we could annotate them with the set of possible states the transition could end up in.
I think handleFailure is already close to this, but we could formalize and generalize the pattern.
I think that's a good idea.
I made transitionBlobToConfirming but kept the transitions to Failed and Confirmed as they were, because those transitions only update the blob store.
looks good, a few more thoughts
Why are these changes needed?
At a high level, this PR adds a new blob status called Confirming to indicate that a blob is awaiting its ConfirmBatch transaction to be confirmed onchain. To make this update a non-breaking change for rollups, the status will remain internal, and blobs in this status will show as Processing, as before. The main motivations for this PR include allowing the EncodedSizeNotifier to properly trigger a new batch iteration.