Historical batches #2649
This PR, a continuation of #2428, simplifies and replaces `historical_roots` with `historical_block_roots`.

By keeping an accumulator of historical block roots in the state, it becomes possible to validate the entire block history that led up to that particular state without executing the transitions, and without checking the blocks one by one in backwards order using a parent chain.

This is interesting for archival purposes as well as when implementing sync protocols that can verify chunks of blocks quickly, meaning they can be downloaded in any order.

It's also useful as it provides a canonical hash by which such chunks of blocks can be named, with a direct reference in the state.

In this PR, `historical_roots` is frozen at its current value and `historical_batches` are computed from the merge epoch onwards.

After this PR, `block_batch_root` in the state can be used to verify an era of blocks against the state with a simple root check.

The `historical_roots` values, on the other hand, can be used to verify that a constant distributed with clients is valid for a particular state. They therefore extend block validation all the way back to genesis without backfilling `block_batch_root` and without introducing any new security assumptions in the client.

As far as naming goes, it's convenient to talk about an "era" being 8192 slots ~= 1.14 days. The 8192 number comes from the `SLOTS_PER_HISTORICAL_ROOT` constant.

With multiple easily verifiable blocks in a file, it becomes trivial to offload block history to out-of-protocol transfer methods (bittorrent / ftp / whatever) - including execution payloads, paving the way for a future in which clients purge block history in p2p.

This PR can be applied along with the merge, which simplifies payload distribution from the get-go. Both execution and consensus clients benefit because from the merge onwards, they both need to be able to supply ranges of blocks in the sync protocol from what effectively is "cold storage".

Another possibility is to include it in a future cleanup PR - this complicates the "cold storage" mode above by not covering execution payloads from the start.
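
To make the "simple root check" concrete, here is a minimal Python sketch of how a client might verify a downloaded era of block roots against a root taken from the state. It assumes the batch root is the plain binary SHA-256 Merkle root over the era's 8192 block roots, which is how SSZ merkleizes a `Vector[Root, SLOTS_PER_HISTORICAL_ROOT]`. The helper names (`merkle_root`, `verify_era`, `era_length_days`) and the 12-second slot time are illustrative assumptions, not part of this PR.

```python
import hashlib
from typing import Sequence

SLOTS_PER_HISTORICAL_ROOT = 8192  # one "era", as named in the PR description
SECONDS_PER_SLOT = 12             # assumption: mainnet slot time


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Binary Merkle root over 32-byte leaves; the leaf count must be a power of two."""
    assert len(leaves) > 0 and len(leaves) & (len(leaves) - 1) == 0
    assert all(len(leaf) == 32 for leaf in leaves)
    layer = list(leaves)
    while len(layer) > 1:
        # Hash adjacent pairs to build the next layer up.
        layer = [sha256(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


def verify_era(block_roots: Sequence[bytes], expected_batch_root: bytes) -> bool:
    """Hypothetical check: does this full era of block roots match the root in the state?"""
    if len(block_roots) != SLOTS_PER_HISTORICAL_ROOT:
        return False
    return merkle_root(block_roots) == expected_batch_root


def era_length_days() -> float:
    """Era duration in days under the assumed slot time."""
    return SLOTS_PER_HISTORICAL_ROOT * SECONDS_PER_SLOT / 86400
```

Because the check only needs the roots and one hash from the state, eras can be fetched in any order and from any source, and each one is accepted or rejected independently.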
avoids changing "header" fields in state
@@ -213,7 +227,7 @@ class BeaconState(Container):
     latest_block_header: BeaconBlockHeader
     block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
     state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
-    historical_roots: List[Root, HISTORICAL_ROOTS_LIMIT]
+    historical_roots: List[Root, HISTORICAL_ROOTS_LIMIT]  # Frozen in Merge, replaced by historical_batches
Suggested change:
-    historical_roots: List[Root, HISTORICAL_ROOTS_LIMIT]  # Frozen in Merge, replaced by historical_batches
+    historical_roots: List[Root, HISTORICAL_ROOTS_LIMIT]  # Frozen in Capella, replaced by historical_batches
While this is a strict decrease in UX (more data to wrangle and process), we could leave the consensus protocol unchanged and simply supply merkle branches along w/ the block roots in the "era" format, and still retain the verification property.
Yes, though at that point I think we seriously need to consider what the historical_roots field is doing here at all - ie the status quo is the worst of both worlds: we have an accumulator that grows forever and that can't really be used for anything useful without jumping through hoops. The question "does this block belong" is a fundamental one, and this PR brings the cost of answering it from O(N) to O(1), basically. Mixing the state in there is not fundamental because the state is a derivative of the block except when nothing happened, ie the only raison d'etre for historical roots in its current shape is to show that something did not happen at the tail of a block history (ie to prove that the empty state transition was done correctly) - everything else is already baked into the block root as far as "accumulation" goes.
Also, the "era" format doesn't actually need this PR - a design decision in "era" was to include a state for every "epoch" which lines up with a historical root. In the era file, the individual block roots of each block in that era are natively available from the state. This makes sense for era files because each one can then serve as a "starting point" to compute an arbitrary state in the next era (again bringing the cost of computing an arbitrary beacon chain state from O(n) to O(1)), but it comes at a cost: we have to store a state every day. This PR unlocks distinct use cases compared to what era files solve (in particular, a single state is enough to verify all history, instead of one per era).
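
For comparison, the merkle-branch alternative raised in the discussion above can be sketched as follows. It assumes the existing `historical_roots` entries are the hash tree root of a two-field `HistoricalBatch(block_roots, state_roots)` container, so that a block root at a given position in an era is proven by its siblings inside the `block_roots` subtree plus the root of `state_roots` as the final sibling. The function names are illustrative, and the proof-building helper exists only to make the example self-contained.

```python
import hashlib
from typing import List, Sequence


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_layers(leaves: Sequence[bytes]) -> List[List[bytes]]:
    """All layers of a binary Merkle tree, leaves first; leaf count must be a power of two."""
    assert len(leaves) > 0 and len(leaves) & (len(leaves) - 1) == 0
    layers = [list(leaves)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([sha256(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return layers


def block_root_branch(block_roots: Sequence[bytes], state_roots_root: bytes, index: int) -> List[bytes]:
    """Siblings from the leaf up to the batch root, ending with the state_roots subtree root."""
    branch = []
    for layer in merkle_layers(block_roots)[:-1]:
        branch.append(layer[index ^ 1])  # sibling at this level
        index >>= 1
    branch.append(state_roots_root)      # block_roots is the left child of the container
    return branch


def is_valid_merkle_branch(leaf: bytes, branch: Sequence[bytes], index: int, root: bytes) -> bool:
    """Recompute the root from a leaf and its siblings; bit i of index selects left or right."""
    value = leaf
    for depth, sibling in enumerate(branch):
        if (index >> depth) & 1:
            value = sha256(sibling + value)
        else:
            value = sha256(value + sibling)
    return value == root
```

This keeps the consensus protocol unchanged but, as noted in the discussion, every era file then has to carry the extra `state_roots` sibling alongside its block roots, which is the additional data the first comment refers to.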