Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Confirming state for blobs #466

Merged
merged 1 commit into from
Apr 18, 2024

Conversation

ian-shim
Copy link
Contributor

@ian-shim ian-shim commented Apr 10, 2024

Why are these changes needed?

At the high level, this PR adds a new blob status called Confirming to indicate that it's awaiting ConfirmBatch transaction to be confirmed onchain.
To make this update not a breaking change for rollups, this status will remain internal, and blobs in this status will show as Processing status as before.

The main motivations for this PR are to

  • simplify and reduce complexity for in-memory accounting for blob status logic
  • make memory usage more efficient by flushing dispersed encoded blobs immediately after transaction has been created
  • fix the accounting issue for total encoded blob size and enables the EncodedSizeNotifier to properly trigger new batch iteration.

Checks

  • I've made sure the lint is passing in this PR.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@ian-shim ian-shim force-pushed the add-confirming-status branch from 761b31f to 3c034f5 Compare April 10, 2024 17:04
@ian-shim ian-shim changed the title add confirming state Add Confirming state for blobs Apr 10, 2024
@@ -179,7 +179,8 @@ func (f *finalizer) updateBlobs(ctx context.Context, metadatas []*disperser.Blob
confirmationBlockNumber, err := f.getTransactionBlockNumber(ctx, confirmationMetadata.ConfirmationInfo.ConfirmationTxnHash)
if errors.Is(err, ethereum.NotFound) {
// The confirmed block is finalized, but the transaction is not found. It means the transaction should be considered forked/invalid and the blob should be considered as failed.
_, err := f.blobStore.HandleBlobFailure(ctx, m, f.maxNumRetriesPerBlob)
f.logger.Warn("confirmed transaction not found", "blobKey", blobKey.String(), "confirmationTxnHash", confirmationMetadata.ConfirmationInfo.ConfirmationTxnHash.Hex(), "confirmationBlockNumber", confirmationMetadata.ConfirmationInfo.ConfirmationBlockNumber)
err := f.blobStore.MarkBlobFailed(ctx, m.GetBlobKey())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From conversation with @teddyknox , it seems like marking a blob as failed in case of reorg like this is more appropriate than retrying dispersal

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because the sequencer has already retrieved the blob lookup information, right?

In some ways, this makes me wonder if the finalizer is necessary at all.

Copy link
Contributor Author

@ian-shim ian-shim Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are few cases to consider here.

  1. Reorg unconfirms the blob and is marked as Failed before the sequencer retrieves the blob status: The sequencer will see the blob as Failed and redisperse the blob.
  2. Reorg unconfirms the blob. The sequencer has already retrieved the confirmed blob status. The sequencer submits inbox transaction.
    a. The inbox transaction gets reorg'd as well. Rollup redisperses the same data.
    b. (rare) The inbox transaction successfully makes it to the chain. Now it doesn't matter if the old blob gets retried or is marked as Failed. Need DA cert verification.

I think the finalizer is still necessary to be the source of truth though.

@ian-shim ian-shim marked this pull request as ready for review April 10, 2024 17:16
@@ -493,9 +489,12 @@ func (b *Batcher) HandleSingleBatch(ctx context.Context) error {
return fmt.Errorf("HandleSingleBatch: error sending confirmBatch transaction: %w", err)
} else {
for _, metadata := range batch.BlobMetadata {
err = b.EncodingStreamer.MarkBlobPendingConfirmation(metadata)
err = b.Queue.MarkBlobConfirming(ctx, metadata.GetBlobKey())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused about the error handling. Here we check if marking the blobs in blobstore is successful before removing the encoded blobs. In handleFailure (e.g. L488), we remove the encoded blobs first and then update the blobstore. Have we thought through the different states that the system can be in depending on these different failure scenarios?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. The encoded blob should probably be removed only if the update is successful.
The difference is whether the blob gets encoded again or not, which isn't super critical but has important performance implications.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be cleaner to have a set of "state transition functions" which will handle transitioning the state of a blob from one state to another.

Named things like

  • TransitionBlobProcessingToFailed
  • TrabnitionBlobProcessingtoConfirmed

Or

  • TransitionBlobToFailed
  • TransitionBlobToConfirmed

depending on whether the actions depend on initial state. These transition functions could handle all of updates to encodedBlobStore as well as blobStore, as well as error handling, and we could annotate them with the set of possible states that the transition function could end up in.

I think handleFailure is already close to this, but we could formalize and generalize the pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a good idea.
I made transitionBlobToConfirming but kept the transitions to Failed and Confirmed as they were because those transitions only update blob store.

Copy link
Collaborator

@mooselumph mooselumph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, a few more thoughts

disperser/batcher/batcher.go Outdated Show resolved Hide resolved
@@ -493,9 +489,12 @@ func (b *Batcher) HandleSingleBatch(ctx context.Context) error {
return fmt.Errorf("HandleSingleBatch: error sending confirmBatch transaction: %w", err)
} else {
for _, metadata := range batch.BlobMetadata {
err = b.EncodingStreamer.MarkBlobPendingConfirmation(metadata)
err = b.Queue.MarkBlobConfirming(ctx, metadata.GetBlobKey())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be cleaner to have a set of "state transition functions" which will handle transitioning the state of a blob from one state to another.

Named things like

  • TransitionBlobProcessingToFailed
  • TrabnitionBlobProcessingtoConfirmed

Or

  • TransitionBlobToFailed
  • TransitionBlobToConfirmed

depending on whether the actions depend on initial state. These transition functions could handle all of updates to encodedBlobStore as well as blobStore, as well as error handling, and we could annotate them with the set of possible states that the transition function could end up in.

I think handleFailure is already close to this, but we could formalize and generalize the pattern.

@@ -179,7 +179,8 @@ func (f *finalizer) updateBlobs(ctx context.Context, metadatas []*disperser.Blob
confirmationBlockNumber, err := f.getTransactionBlockNumber(ctx, confirmationMetadata.ConfirmationInfo.ConfirmationTxnHash)
if errors.Is(err, ethereum.NotFound) {
// The confirmed block is finalized, but the transaction is not found. It means the transaction should be considered forked/invalid and the blob should be considered as failed.
_, err := f.blobStore.HandleBlobFailure(ctx, m, f.maxNumRetriesPerBlob)
f.logger.Warn("confirmed transaction not found", "blobKey", blobKey.String(), "confirmationTxnHash", confirmationMetadata.ConfirmationInfo.ConfirmationTxnHash.Hex(), "confirmationBlockNumber", confirmationMetadata.ConfirmationInfo.ConfirmationBlockNumber)
err := f.blobStore.MarkBlobFailed(ctx, m.GetBlobKey())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because the sequencer has already retrieved the blob lookup information, right?

In some ways, this makes me wonder if the finalizer is necessary at all.

@ian-shim ian-shim force-pushed the add-confirming-status branch from 3c034f5 to 98af55d Compare April 17, 2024 23:17
@ian-shim ian-shim force-pushed the add-confirming-status branch from 98af55d to 32f141b Compare April 17, 2024 23:30
@ian-shim ian-shim merged commit a0c28c7 into Layr-Labs:master Apr 18, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants