
feat: Dynamically adjust proposal block ranges #1139

Open · wants to merge 23 commits into master

Conversation

@pxrl (Contributor) commented Jan 5, 2024

This commit introduces dynamic adjustment of the block ranges that the proposer will propose. The intention of this change is to make the proposer more robust to incomplete or inconsistent data being supplied by one or more RPC providers.

Dynamic adjustment is enacted in the event that any of the following conditions are detected:

  • An invalid fill traces to a missing deposit (no deposit found for the specified depositId).
  • A deposit is known to be filled, without any associated FilledRelay event.

In the first case, this results in narrowing of the block range on both the origin and destination chains. For the latter, only the destination chain is narrowed.
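The two narrowing rules can be sketched as follows. This is only an illustrative sketch: the type names, fields, and `narrowBlockRanges` function here are assumptions for exposition, not the PR's actual implementation.

```typescript
type BlockRange = [number, number]; // [startBlock, endBlock]

enum InvalidFillCode {
  DepositIdNotFound, // invalid fill traces to a missing deposit
  FillEventMissing, // deposit known to be filled, but no FilledRelay event found
}

interface InvalidFillInfo {
  code: InvalidFillCode;
  originChainId: number;
  destinationChainId: number;
  originBlock: number; // block of the last known deposit preceding the missing one
  destinationBlock: number; // block of the offending fill
}

function narrowBlockRanges(
  ranges: Record<number, BlockRange>,
  invalidFills: InvalidFillInfo[]
): Record<number, BlockRange> {
  const updated: Record<number, BlockRange> = { ...ranges };
  for (const f of invalidFills) {
    // In both cases, narrow the destination chain to exclude the offending fill.
    const [dStart, dEnd] = updated[f.destinationChainId];
    updated[f.destinationChainId] = [dStart, Math.min(dEnd, f.destinationBlock - 1)];
    // When the deposit itself is missing, narrow the origin chain as well.
    if (f.code === InvalidFillCode.DepositIdNotFound) {
      const [oStart, oEnd] = updated[f.originChainId];
      updated[f.originChainId] = [oStart, Math.min(oEnd, f.originBlock)];
    }
  }
  return updated;
}
```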

Some additional comments on why this approach was chosen:

  • It's minimally invasive to the existing dataworker proposal flow, and it's easy to disable in the event that any issue arises in prod. Ideally we'd refactor in favour of progressively incrementing the block ranges as deposits and fills are validated, rather than attempting to propose over the entire range and then winding back in case of issue. That would, however, require a much larger refactor, and there's no appetite for that given the pending v3 changes.

Closes ACX-1767

@pxrl (Contributor, Author) left a comment:

Just adding some context.

linear bot commented Jan 5, 2024

@pxrl (Contributor, Author) commented Jan 5, 2024

fwiw, I'm aware of a single test failure associated with this change. It's not obvious to me why it's failing at the moment; I will dig into it shortly.

Update - resolved here: ccb5df0

@pxrl (Contributor, Author) left a comment:

nb. In general, this new feature probably warrants some additional test cases to stress the implementation. This is especially important because it's not possible to backtest the change against previous bundles.

@nicholaspai (Member) left a comment:

Left some comments on first read through

// - Narrow the destination block range to exclude the invalid fill.
allInvalidFills
  .filter(({ code }) => code === InvalidFill.DepositIdNotFound)
  .forEach(({ fill: { depositId, originChainId, destinationChainId, blockNumber } }) => {
Member commented:

Can we sort these by block number by chain to allow us to exit the following map earlier?

@pxrl (Contributor, Author) commented Jan 17, 2024:

I'm not totally confident on this, but I'm not sure that we can actually simply sort and exit earlier. This is because we need to avoid making assumptions about the ordering of fills vs. deposits.

For example, the first invalid fill on a destination chain might correspond to the last missing deposit on the origin chain. Then, some later invalid fill on the destination chain might actually fill an earlier missing deposit on the origin chain. This also seems more likely to occur in the case that the RPCs are serving inconsistent data.

If we group/sort by destinationChainId and destination block number, we might not narrow the originChainId block range correctly. If we group/sort by originChainId and origin block number, we might not narrow the destinationChainId block range correctly.

In general, even when it's really bad, the number of missing deposits we typically see is about 20-30. In order to identify the earliest block for both the origin and destination chains, we might end up looping multiple times over all of the invalid fills. In the case where we only have tens of missing events, it seems like it might be cheaper and simpler to just process them one by one.

Full disclosure: I might have overlooked something really obvious here. wdyt?
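For what it's worth, the order-independence concern discussed here can also be handled with a single pass that tracks the tightest endBlock seen per chain, with no sorting at all. The sketch below is illustrative only; the interface and function names are assumptions, not code from the PR.

```typescript
interface MissingDepositFill {
  originChainId: number;
  destinationChainId: number;
  originEndBlock: number; // last safe origin block implied by this fill
  destinationEndBlock: number; // last safe destination block implied by this fill
}

// One pass over all invalid fills: keep the minimum candidate endBlock per
// chain. The result is the same regardless of the order fills are seen in,
// which matters because the earliest fill may pair with the latest deposit.
function tightestEndBlocks(fills: MissingDepositFill[]): Map<number, number> {
  const minEnd = new Map<number, number>();
  const update = (chainId: number, endBlock: number) => {
    const current = minEnd.get(chainId);
    if (current === undefined || endBlock < current) minEnd.set(chainId, endBlock);
  };
  for (const f of fills) {
    update(f.originChainId, f.originEndBlock);
    update(f.destinationChainId, f.destinationEndBlock);
  }
  return minEnd;
}
```

The trade-off pxrl describes still applies: with only tens of events, a simple per-event loop is likely just as cheap.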

Member commented:

This is a good point: the latest deposit might match with the earliest fill, and vice versa.

Comment on lines +510 to +513
const previousDeposit = originSpokePoolClient
  .getDepositsForDestinationChain(destinationChainId)
  .filter((deposit: DepositWithBlock) => deposit.blockNumber < blockNumber)
  .at(-1);
Contributor commented:

What if the deposit is from a gap between spoke pool deployments or something? So there actually is no deposit for this id? Will this always return undefined in that case, which would mean no change?

If there are edge cases like that that are handled in a subtle way here, we may want to add a brief comment explaining them.

@pxrl (Contributor, Author) commented Jan 10, 2024:

The SpokePoolClient only has the ability to handle a single SpokePool deployment, so in the case that we don't find any preceding deposit then previousDeposit resolves to undefined. In this case, the updateEndBlock() helper detects that the updated endBlock is undefined and implements a soft-pause of the chain by proposing over the previous endBlock (as we do for Boba).

I'll update the comment on line 509 to clarify this.

There's arguably a scenario where the first deposit on a new chain goes missing. This would implicitly result in proposing over the range [0,0] for that chain because the HubPoolClient supplies previousEndBlock 0 in that scenario. I think this is an extremely remote probability, and activating a new chain requires multiple responsible adults to test the deployment and monitor the proposal, so I'm probably not inclined to handle it explicitly in the code. wdyt?
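The soft-pause behaviour described in this reply can be sketched as below. The real `updateEndBlock()` helper lives in the PR's Dataworker changes; this standalone shape is an assumption for illustration only.

```typescript
// If no updated endBlock can be computed (e.g. previousDeposit resolved to
// undefined), soft-pause the chain by proposing a zero-width range over the
// previous bundle's endBlock. Otherwise, propose from the block after the
// previous endBlock up to the updated endBlock.
function updateEndBlock(
  previousEndBlock: number,
  updatedEndBlock: number | undefined
): [number, number] {
  if (updatedEndBlock === undefined) {
    return [previousEndBlock, previousEndBlock]; // soft-pause
  }
  return [previousEndBlock + 1, updatedEndBlock];
}
```

This mirrors how hard-paused chains (disabled via the ConfigStore) are handled, which is why the invariant check later in the thread compares against `originalStartBlock - 1`.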

Member commented:

what would a bundle block range look like for a new spoke pool address? whether it be a new chain or an updated spoke pool address for an existing chain?

@pxrl (Contributor, Author) commented Jan 17, 2024:

The bundleEndBlock is sourced from HubPoolClient.getLatestBundleEndBlockForChain(), so we inherit its behaviour. For a new deployment on an existing chain, we're bound to continue from the previous bundleEndBlock. This should work as expected.

For a new chain, the chainId index isn't found in the previous bundle, so we default to proposing from 0. This was also the case for the activation of zkSync and Base, so we inherit the existing behaviour.

The only change that I can foresee here is that if that initial proposal contains invalid fills due to missing deposits on the new chain, or fills for missing deposits on another chain, then we'd revert to proposing over [0,0]. I'm not sure whether that's a problem in itself, but it's a pretty extreme edge case because the number of deposits and fills for the new chain in the initial proposal are likely to be very low, and we'd detect it immediately because activating a new chain implies manual review of the proposal block ranges.

Contributor commented:

I think the 0,0 case is remote and doesn't cause an obvious problem, so I wouldn't worry about handling it explicitly.

This all makes sense. I think a comment that says something like:

This can return undefined if there is no known preceding deposit in range.
That will result in the chain being soft-paused by setting the endBlock equal to the previous endBlock, making this bundle cover no blocks on that chain.

Would be really helpful to the reader.

Suggested by Matt.

Co-authored-by: Matt Rice <[email protected]>
if (blockNumber > updatedBlockRanges[originChainId][1]) {
return false; // Fill event is already out of scope due to previous narrowing.
}
return allValidFills.find(
Contributor commented:

Can we use a spoke pool function or map lookup to get the fill for a deposit rather than looping through all fills on each deposit?

We store by a special hash in the client to avoid this n^2 loop (which has caused speed issues in the past).

@pxrl (Contributor, Author) commented:

I have backed that entire section out (for now) per this thread with Nick: #1139 (comment).

Here's the change: 093d80b

I did consider accessing these via a map, but tentatively decided against it because I don't think this will actually use much CPU time in practice. Arriving here is conditional on at least one deposit being detected as filled despite no known FilledRelay event. Then, we iterate over the allValidFills array once per instance of a "filled-but-missing" event.

So in the normal case it costs nothing, but it does cost CPU in the bad case where a chain/RPC is misbehaving. However, because we've already filtered on the missing deposits and have narrowed the block range accordingly, we should skip over most of those missing fill events anyway. So in practice I think the impact of this would be fairly limited, even though it is technically inefficient to search the allValidFills array in this way. Does that make sense?
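For reference, the constant-time lookup alluded to in the earlier review comment might look like the sketch below. The key format and names here are hypothetical; the actual client stores fills by its own hash.

```typescript
interface ValidFill {
  depositId: number;
  originChainId: number;
  blockNumber: number;
}

// Build a lookup once, then resolve each filled-but-missing deposit in O(1)
// instead of scanning the whole allValidFills array per deposit.
function buildFillIndex(allValidFills: ValidFill[]): Map<string, ValidFill> {
  const index = new Map<string, ValidFill>();
  for (const fill of allValidFills) {
    index.set(`${fill.originChainId}-${fill.depositId}`, fill);
  }
  return index;
}
```

As the thread concludes, this optimisation only pays off when the "filled-but-missing" path is hit often, which should be rare.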

Contributor commented:

It does, it sounds like you're saying this loop will be run rarely so optimization isn't a concern. I wasn't aware of that when I initially read the code, but will take another pass.

const [originalStartBlock] = blockRanges[idx];
return (
(endBlock > startBlock && startBlock === originalStartBlock) ||
(startBlock === endBlock && startBlock === originalStartBlock - 1) // soft-pause
Contributor commented:

Why would the new start block be one behind the original start block in this case? And is it okay to leave it that way?

@pxrl (Contributor, Author) commented Jan 10, 2024:

originalStartBlock is normally previousEndBlock + 1, so if startBlock === endBlock is true then we're implementing a soft-pause on the chain, and will propose over the range of [previousEndBlock, previousEndBlock]. This is consistent with the way that chains are "hard-paused" (i.e. disabled via the ConfigStore), so this check is simply verifying the invariant.

To be more correct, rather than assuming that previousEndBlock === originalStartBlock - 1, this check should resolve the previous ending block via HubPoolClient.getLatestBundleEndBlockForChain().

Contributor commented:

Understood. That makes sense.

pxrl and others added 4 commits January 10, 2024 07:08
Proceeding with the initial "invalid fills" change alone. The subsequent
change can be re-introduced later.
@pxrl pxrl requested review from nicholaspai and mrice32 January 17, 2024 11:12
const destSpokePoolClient = spokePoolClients[destinationChainId];
[startBlock, endBlock] = updatedBlockRanges[destinationChainId];

if (blockNumber <= endBlock) {
Member commented:

nit: wouldn't blockNumbers for invalid fills always be within the bundle block range by definition when calling loadData?

@pxrl (Contributor, Author) commented:

Not necessarily, because we source endBlock from updatedBlockRanges, so it is subject to being iteratively updated within this loop. So if we are examining a fill where the blockNumber has already been excluded by narrowing then we should just skip over that.

@pxrl (Contributor, Author) commented:

Clarified here: 7559948

@nicholaspai (Member) left a comment:

I think the new implementation is much clearer, so I left more targeted questions.

pxrl and others added 3 commits January 17, 2024 19:47
@pxrl pxrl requested a review from nicholaspai January 17, 2024 19:47
@nicholaspai (Member) left a comment:

I'm on board with merging this and closely monitoring the dataworker proposals for a few days,

OR

adding some more unit tests before we activate this.

We are probably safe to launch this; in the worst case, we'll self-dispute, since the disputer does not have this narrowBlockRanges logic. And AFAICT the disputer's logic to construct the block ranges from getWidestPossibleBlockRanges hasn't changed.

@mrice32 (Contributor) left a comment:

A few more comments! Sorry, this logic is pretty nuanced.

@@ -289,7 +295,8 @@ export class BundleDataClient {
if (historicalDeposit.found) {
addRefundForValidFill(fill, historicalDeposit.deposit, blockRangeForChain);
} else {
allInvalidFills.push(fill);
assert(historicalDeposit.found === false); // Help tsc to narrow the discriminated union type.
Contributor commented:

This is annoying; TypeScript really shouldn't require runtime code to convince it that this must be that type. It won't compile without this line?

Contributor commented:

OOC @pxrl what is the error that is thrown?


// Find the previous known deposit. This may resolve a deposit before the immediately preceding depositId.
const previousDeposit = originSpokePoolClient
  .getDepositsForDestinationChain(destinationChainId)
Contributor commented:

I don't think these are guaranteed to be sorted, which might mean this deposit would be too early, right?


Comment on lines +487 to +490
assert(
endBlock < currentEndBlock,
`Invalid block range update for chain ${chainId}: block ${endBlock} >= ${currentEndBlock}`
);
Contributor commented:

Since these updates can come in out of order, I would expect this case to happen, no?

I find a missing deposit at block 7, then later I see a missing deposit at block 12.

Maybe I'm missing the logic that prevents this.

@@ -263,10 +266,10 @@ export class Dataworker {
* of log level
* @returns Array of blocks ranges to propose for next bundle.
*/
_getNextProposalBlockRanges(
async _getNextProposalBlockRanges(
Contributor commented:

It looks like this function only has one await, and it's on the very last line. Thoughts on removing the async and returning the promise generated by the call to this.narrowProposalBlockRanges(blockRangesForProposal, spokePoolClients)?
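The suggestion here is a general TypeScript idiom: when a function's only await is on its final return value, the async wrapper can be dropped and the promise returned directly. The function names below are hypothetical stand-ins for the real methods.

```typescript
// Stand-in for this.narrowProposalBlockRanges(...).
async function narrow(x: number): Promise<number> {
  return x - 1;
}

// Before: async wrapper that only awaits the last call.
async function nextRangesAwaited(x: number): Promise<number> {
  return await narrow(x);
}

// After: return the promise directly; callers still see Promise<number>.
function nextRangesDirect(x: number): Promise<number> {
  return narrow(x);
}
```

One caveat worth noting: inside a try/catch, `return await` behaves differently from `return`, since only the awaited form lets the catch block observe a rejection.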

blockRanges: number[][],
spokePoolClients: SpokePoolClientsByChain,
logData = true,
isUBA = false
Contributor commented:

Will this need to be changed for the UBA removal changes?

@james-a-morris (Contributor) left a comment:

Left a few comments
