Add Confirming state for blobs #466
Conversation
@@ -179,7 +179,8 @@ func (f *finalizer) updateBlobs(ctx context.Context, metadatas []*disperser.Blob
 	confirmationBlockNumber, err := f.getTransactionBlockNumber(ctx, confirmationMetadata.ConfirmationInfo.ConfirmationTxnHash)
 	if errors.Is(err, ethereum.NotFound) {
 		// The confirmed block is finalized, but the transaction is not found. It means the transaction should be considered forked/invalid and the blob should be considered as failed.
-		_, err := f.blobStore.HandleBlobFailure(ctx, m, f.maxNumRetriesPerBlob)
+		f.logger.Warn("confirmed transaction not found", "blobKey", blobKey.String(), "confirmationTxnHash", confirmationMetadata.ConfirmationInfo.ConfirmationTxnHash.Hex(), "confirmationBlockNumber", confirmationMetadata.ConfirmationInfo.ConfirmationBlockNumber)
+		err := f.blobStore.MarkBlobFailed(ctx, m.GetBlobKey())
From a conversation with @teddyknox, it seems like marking a blob as failed in the case of a reorg like this is more appropriate than retrying dispersal.
This is because the sequencer has already retrieved the blob lookup information, right?
In some ways, this makes me wonder if the finalizer is necessary at all.
There are a few cases to consider here.
- Reorg unconfirms the blob, and the blob is marked as Failed before the sequencer retrieves the blob status: the sequencer will see the blob as Failed and redisperse it.
- Reorg unconfirms the blob, but the sequencer has already retrieved the confirmed blob status and submits the inbox transaction:
  a. The inbox transaction gets reorg'd as well. The rollup redisperses the same data.
  b. (rare) The inbox transaction successfully makes it to the chain. Now it doesn't matter if the old blob gets retried or is marked as Failed. This needs DA cert verification.
I think the finalizer is still necessary to be the source of truth, though.
disperser/batcher/batcher.go (outdated)
@@ -493,9 +489,12 @@ func (b *Batcher) HandleSingleBatch(ctx context.Context) error {
 		return fmt.Errorf("HandleSingleBatch: error sending confirmBatch transaction: %w", err)
 	} else {
 		for _, metadata := range batch.BlobMetadata {
-			err = b.EncodingStreamer.MarkBlobPendingConfirmation(metadata)
+			err = b.Queue.MarkBlobConfirming(ctx, metadata.GetBlobKey())
I'm confused about the error handling. Here we check whether marking the blobs in the blob store succeeded before removing the encoded blobs. In handleFailure (e.g. L488), we remove the encoded blobs first and then update the blob store. Have we thought through the different states the system can end up in under these different failure scenarios?
This is a good point. The encoded blob should probably be removed only if the update is successful.
The difference is whether the blob gets encoded again or not, which isn't super critical but has important performance implications.
Maybe it would be cleaner to have a set of "state transition functions" which handle transitioning a blob from one state to another, named things like
- TransitionBlobProcessingToFailed
- TransitionBlobProcessingToConfirmed
or
- TransitionBlobToFailed
- TransitionBlobToConfirmed
depending on whether the actions depend on the initial state. These transition functions could handle all updates to the encodedBlobStore as well as the blobStore, plus the error handling, and we could annotate them with the set of possible states the transition could end up in.
I think handleFailure is already close to this, but we could formalize and generalize the pattern.
I think that's a good idea.
I made transitionBlobToConfirming but kept the transitions to Failed and Confirmed as they were, because those transitions only update the blob store.
looks good, a few more thoughts
Why are these changes needed?
At a high level, this PR adds a new blob status called Confirming to indicate that a blob is awaiting its ConfirmBatch transaction to be confirmed onchain. To make this update a non-breaking change for rollups, the status will remain internal, and blobs in this status will show as Processing, as before. The main motivations for this PR include allowing the EncodedSizeNotifier to properly trigger a new batch iteration.