feat: archiver to follow "fork-choice" of contract #8457

Closed
5 tasks done
Tracked by #7614
LHerskind opened this issue Sep 9, 2024 · 1 comment

LHerskind commented Sep 9, 2024

[!success] Goal
The archiver should:

  • only look for blocks if any blocks were proposed;
  • download non-pruned history only;
  • identify a prune that its view is affected by;
  • store a list of prunes that affected it.

Currently the archiver continuously looks for block proposal events emitted by the Rollup contract.
This has historically caused some pain for initial syncing, since it started at block 0 and looked for events before any even had the possibility of being emitted.
Furthermore, it means that we might search for events over a period where no blocks have been proposed.

First, we suggest using the Rollup::GENESIS_TIME as a proxy for when to start, potentially adding Rollup::GENESIS_TIME_IN_L1_BLOCK_NUMBER such that the starting point can simply be read directly from the contract. This will allow for a simpler setup and avoid searching for something in the void.
Second, if we look at the values Rollup::provenBlockCount and Rollup::pendingBlockCount and keep track of these locally, we will know when blocks have been proposed. This allows us to easily skip the event search when we know that nothing will be found.
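
A minimal TypeScript sketch of this second suggestion. `ChainCounts` and `countsChanged` are hypothetical names introduced for illustration; only `provenBlockCount` and `pendingBlockCount` come from the contract as described here, and plain numbers stand in for the on-chain `uint256` values.

```typescript
// Hypothetical container for the two counters read from the Rollup contract.
// Plain numbers keep the sketch short; the real values are uint256 on chain.
interface ChainCounts {
  provenBlockCount: number;
  pendingBlockCount: number;
}

// True when either counter moved since the last poll, i.e. when an event
// search could find something new.
function countsChanged(local: ChainCounts, onChain: ChainCounts): boolean {
  return (
    local.provenBlockCount !== onChain.provenBlockCount ||
    local.pendingBlockCount !== onChain.pendingBlockCount
  );
}
```

When `countsChanged` returns false, the archiver could skip the log query for that poll. Equal counts alone would not reveal a prune followed by new proposals back up to the same height, which is one reason the chain-tip archive check described later in this issue is still needed.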

In relation to pruned blocks, the archiver needs to handle those differently depending on how it encounters them. We will go through some cases now.

The fresh sync:
The archiver has no local state. Does it gain anything from downloading blocks that were pruned? No, these would never be applied to the world state, so we might as well discard them. Better than discarding them, we should not download them at all.
Consider the event Rollup::L2BlockProposed: it currently has just the block number, giving us very little to figure out whether it is pruned or not. Let's say that we alter it:

// Old
event L2BlockProposed(uint256 indexed blockNumber);

// New
event L2BlockProposed(uint256 indexed blockNumber, bytes32 archive);

By including the archive value, we have more information that we can use to validate that the proposal was not pruned, without needing to download it in full first.

For each event we find, we check archive == Rollup::archiveAt(blockNumber). If the check passes, the block is currently part of the chain and we can download it; otherwise it has been pruned and we just ignore it.
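
The check can be sketched in TypeScript as a filter over decoded events. `ProposedEvent` and `filterCanonical` are hypothetical names, and the `archiveAt` callback is an assumed stand-in for a read of Rollup::archiveAt (a real read would be an asynchronous RPC call).

```typescript
// Hypothetical decoded form of the extended L2BlockProposed event.
interface ProposedEvent {
  blockNumber: number;
  archive: string; // hex-encoded archive value from the event
}

// Keeps only the proposals whose archive still matches the contract's view.
// `archiveAt` stands in for a read of Rollup::archiveAt.
function filterCanonical(
  events: ProposedEvent[],
  archiveAt: (blockNumber: number) => string,
): ProposedEvent[] {
  return events.filter(
    (e) => archiveAt(e.blockNumber).toLowerCase() === e.archive.toLowerCase(),
  );
}
```

Only the events that survive the filter would then be downloaded in full.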

Syncing after shutdown:
Say that we have an archiver that has previously been running at the head of the chain, but we shut it down over the weekend and now want to start it again and catch up.

At the start of every loop, call Rollup::archiveAt() for the blockNumber of your current chain tip. If the returned value does not match the archive of your chain tip, a prune has executed while you were gone.
We need to figure out where the prune happened. At this point we know that it happened somewhere between our proven and pending blocks, so we can simply perform a binary search in that range to find the last block where Rollup::archiveAt() matches. We then delete any stored blocks with a block number higher than the one found.
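
A sketch of that binary search in TypeScript. `findLastMatchingBlock` and the `matches` callback are hypothetical; `matches(n)` is assumed to compare the locally stored archive of block `n` against Rollup::archiveAt(n). The sketch also assumes that the proven block always matches and that, once one block mismatches, every later block mismatches too.

```typescript
// Finds the highest block number in [provenBlock, localTip] whose locally
// stored archive still equals the contract's value.
function findLastMatchingBlock(
  provenBlock: number,
  localTip: number,
  matches: (blockNumber: number) => boolean,
): number {
  let lo = provenBlock; // invariant: block `lo` matches
  let hi = localTip; // invariant: the answer lies in [lo, hi]
  while (lo < hi) {
    // Take the upper middle so that `lo` always advances on a match.
    const mid = lo + Math.ceil((hi - lo) / 2);
    if (matches(mid)) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return lo;
}
```

Each probe costs one contract read, so the number of reads grows with the logarithm of the gap between the proven block and the local tip rather than with the gap itself.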

At this point, the archiver holds only non-pruned blocks. However, the world state might include changes made by the deleted blocks, and we need those undone. We therefore want to tell the world state that a prune affecting our state has happened. To ensure that it can correctly unwind state, we should tell it which block to unwind to and what we expect the state root to be at that point, e.g., {blockNumber, archive}. The world state unwinds until these values are hit.

We should insert these into a store before we delete the blocks, such that a node failure in the moment where blocks have been deleted but the world state has not yet been unwound is not a problem. I think the check should happen as part of collectAndProcessBlocks to ensure that we don't start applying a bunch of blocks before unwinding them.
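
The ordering matters more than the storage details, so here is a minimal TypeScript sketch of it. Everything named here (`PruneRecord`, `StoredBlock`, `ArchiverStore`, `recordPruneAndDeleteBlocks`, `markUnwound`) is hypothetical, and an in-memory object stands in for the archiver's persistent store.

```typescript
// Hypothetical record handed to the world state: unwind until this block,
// and expect this archive once there.
interface PruneRecord {
  blockNumber: number;
  archive: string;
}

interface StoredBlock {
  blockNumber: number;
  archive: string;
}

// Hypothetical in-memory stand-in for the archiver's persistent store.
interface ArchiverStore {
  blocks: StoredBlock[];
  pendingUnwinds: PruneRecord[]; // recorded prunes the world state has not applied yet
}

function recordPruneAndDeleteBlocks(store: ArchiverStore, lastValidBlock: number): PruneRecord {
  const target = store.blocks.filter((b) => b.blockNumber === lastValidBlock)[0];
  if (target === undefined) {
    throw new Error("no stored block " + lastValidBlock);
  }
  const record: PruneRecord = { blockNumber: target.blockNumber, archive: target.archive };
  // Step 1: record the unwind target first, so a crash after this point is recoverable.
  store.pendingUnwinds.push(record);
  // Step 2: only then drop the pruned blocks.
  store.blocks = store.blocks.filter((b) => b.blockNumber <= lastValidBlock);
  return record;
}

// Called once the world state reports that it has unwound to `record`.
function markUnwound(store: ArchiverStore, record: PruneRecord): void {
  store.pendingUnwinds = store.pendingUnwinds.filter((r) => r !== record);
}
```

On restart, a non-empty `pendingUnwinds` list would signal that an unwind still has to be delivered to the world state before any new blocks are applied.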

When the unwind has occurred, we should be able to simply progress as usual.

To optimise the number of calls needed, we can add a status() function to the Rollup; that way, we fetch several of the values we are interested in with just a single call.

function status(uint256 myHeaderBlockNumber)
	external
	view
	returns (uint256, bytes32, uint256, bytes32, bytes32)
{
	// Archive of the caller's block, or zero if that block is not in the pending chain.
	bytes32 archiveOfMyBlock = myHeaderBlockNumber < pendingBlockCount
		? archiveAt(myHeaderBlockNumber)
		: bytes32(0);

	return (
		provenBlockCount,
		blocks[provenBlockCount - 1].archive,
		pendingBlockCount,
		blocks[pendingBlockCount - 1].archive,
		archiveOfMyBlock
	);
}
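
On the archiver side, the returned tuple could be consumed roughly as in the TypeScript sketch below. `RollupStatus`, `StatusTuple`, `decodeStatus` and `nextStep` are hypothetical names, and the sketch assumes zero-indexed block numbers (the tip of the pending chain is `pendingBlockCount - 1`, mirroring the `myHeaderBlockNumber < pendingBlockCount` check) and already-normalised hex strings.

```typescript
// Named view of the tuple returned by the proposed status() function.
interface RollupStatus {
  provenBlockCount: number;
  provenArchive: string;
  pendingBlockCount: number;
  pendingArchive: string;
  archiveOfMyBlock: string;
}

// Same order as the Solidity return values.
type StatusTuple = [number, string, number, string, string];

function decodeStatus(t: StatusTuple): RollupStatus {
  return {
    provenBlockCount: t[0],
    provenArchive: t[1],
    pendingBlockCount: t[2],
    pendingArchive: t[3],
    archiveOfMyBlock: t[4],
  };
}

type NextStep = "unwind" | "sync" | "idle";

function nextStep(status: RollupStatus, localTipNumber: number, localTipArchive: string): NextStep {
  // A mismatch (including the zero value returned when our tip lies beyond
  // the pending chain) means our view was affected by a prune.
  if (status.archiveOfMyBlock !== localTipArchive) return "unwind";
  // Our tip is still valid; check whether the chain has moved past it.
  if (status.pendingBlockCount - 1 > localTipNumber) return "sync";
  return "idle";
}
```

A single status() read then decides whether the loop unwinds, searches for new proposals, or does nothing.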
@github-project-automation github-project-automation bot moved this to Todo in A3 Sep 9, 2024
@LHerskind LHerskind self-assigned this Sep 13, 2024
@LHerskind LHerskind added this to the Sequencer & Prover Testnet milestone Sep 24, 2024
LHerskind added a commit that referenced this issue Sep 26, 2024
Takes the same approach as other optimisation tasks related to
#8457: before it goes looking at the events, it checks values in
the contract to figure out if there is even anything to look for. If we
figure that there is nothing to look for, we update the L1 block number
of interest and call it a day. This allows getting rid of a bunch of
get-logs calls and replacing them with a single function call instead.
It does not impact the E2E tests significantly, because they run on
anvil with nothing but our rollup happening, so there is not a lot of
wasted effort there, but we should see meaningful changes when pointed
at larger systems.
@LHerskind

The archiver can now follow the L2 forkchoice - the state is yet to come.

@github-project-automation github-project-automation bot moved this from Todo to Done in A3 Sep 27, 2024
Rumata888 pushed a commit that referenced this issue Sep 27, 2024