Manual finalization endpoint new pruning #7060

michaelsproul · 2025-03-01T12:54:51Z

Proposed Changes

Spicy mashup of:

This is BACKWARDS INCOMPATIBLE due to migrating the DB schema to v23 for Lion's improvements.

We might want to just go all in on this for v7.0.0 proper.

This is broken, do not use. See:

Drop head tracker for summaries DAG #6744 (comment)

Currently we have very poor unit test coverage of range sync. With the event-driven test framework we can cover much more ground and be confident when modifying the code.

Add two basic cases:

- Happy path: complete a finalized sync for 2 epochs.
- Post-PeerDAS case where we start without enough custody peers and later find enough.

⚠️ If you have ideas for more test cases, please let me know! I'll write them.

Address miscellaneous PeerDAS TODOs that are too small to warrant a dedicated PR. I'll justify each TODO in an inline comment.

Currently we track a key metric `PEERS_PER_COLUMN_SUBNET` in a getter `good_peers_on_sampling_subnets`. Another PR, sigp#6922, deletes that function, so we have to move the metric anyway. This PR moves that metric computation to the metrics spawned task, which is refreshed every 5 seconds.

I also added a few more useful metrics. The total set and intended usage is:

- `sync_peers_per_column_subnet`: Track health of overall subnets in your node.
- `sync_peers_per_custody_column_subnet`: Track health of the subnets your node needs. We should track this metric closely in our dashboards with a heatmap and bar plot.
- ~~`sync_column_subnets_with_zero_peers`: Is equivalent to the Grafana query `count(sync_peers_per_column_subnet == 0) by (instance)`. We may prefer to skip it, but I believe it's the most important metric as if `sync_column_subnets_with_zero_peers > 0` your node stalls.~~
- ~~`sync_custody_column_subnets_with_zero_peers`: `count(sync_peers_per_custody_column_subnet == 0) by (instance)`~~
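A minimal sketch of how the per-subnet peer counts described above could be computed on each refresh. This is not the Lighthouse implementation: `SubnetId`, the function names, and the input shape (one set of subnets per connected peer) are all assumptions made for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a data column subnet identifier.
pub type SubnetId = u64;

/// Count connected peers per column subnet.
/// `peer_subnets` holds, for each connected peer, the subnets it is subscribed to.
/// Every subnet in `0..subnet_count` appears in the result, so empty subnets report 0.
pub fn peers_per_column_subnet(
    peer_subnets: &[BTreeSet<SubnetId>],
    subnet_count: u64,
) -> BTreeMap<SubnetId, usize> {
    let mut counts: BTreeMap<SubnetId, usize> = (0..subnet_count).map(|s| (s, 0)).collect();
    for subnets in peer_subnets {
        for subnet in subnets {
            // Ignore subnet ids outside the configured range.
            if let Some(count) = counts.get_mut(subnet) {
                *count += 1;
            }
        }
    }
    counts
}

/// Restrict the per-subnet counts to the subnets this node custodies.
pub fn peers_per_custody_column_subnet(
    counts: &BTreeMap<SubnetId, usize>,
    custody_subnets: &BTreeSet<SubnetId>,
) -> BTreeMap<SubnetId, usize> {
    custody_subnets
        .iter()
        .map(|s| (*s, counts.get(s).copied().unwrap_or(0)))
        .collect()
}

/// Number of subnets with no peers at all, mirroring the
/// `count(... == 0) by (instance)` Grafana query mentioned above.
pub fn subnets_with_zero_peers(counts: &BTreeMap<SubnetId, usize>) -> usize {
    counts.values().filter(|c| **c == 0).count()
}
```

Pre-filling every subnet with 0 matters here: a subnet with no peers must still be exported, otherwise the zero-peer count cannot be derived from the per-subnet series.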

- PR sigp#6497 made some consistency checks inside the batch obsolete.
- I forgot to remove the consumers of those errors.

Remove the unused batch sync error condition, which was a nested `Result<_, Result<_, E>>`.

Addresses sigp#6854.

PeerDAS requires unsubscribing from a gossip topic at a fork boundary. This is not possible with our current topic machinery. Instead of defining which topics have to be **added** at a given fork, we define the complete set of topics at a given fork. The new star of the show and key function is:

```rust
pub fn core_topics_to_subscribe<E: EthSpec>(
    fork_name: ForkName,
    opts: &TopicConfig,
    spec: &ChainSpec,
) -> Vec<GossipKind> {
    // ...
    if fork_name.deneb_enabled() && !fork_name.fulu_enabled() {
        // All of deneb blob topics are core topics
        for i in 0..spec.blob_sidecar_subnet_count(fork_name) {
            topics.push(GossipKind::BlobSidecar(i));
        }
    }
    // ...
}
```

`core_topics_to_subscribe` only returns the blob topics if `Deneb <= fork < Fulu`. Then at the fork boundary, we subscribe with the new fork digest to `core_topics_to_subscribe(next_fork)`, which excludes the blob topics.

I added `is_fork_non_core_topic` to carry the aggregator topics for attestations and sync committee messages over to the next fork. This approach is future-proof if those topics ever become fork-dependent.

Closes sigp#6854
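To illustrate the "complete set per fork" idea, here is a self-contained sketch, not the Lighthouse code. The `Fork` and `Topic` enums, `core_topics`, and `topics_dropped_at_fork` are simplified, hypothetical stand-ins for `ForkName`, `GossipKind`, and the real topic machinery. Once each fork has a complete topic set, the topics that disappear at a boundary fall out of a set difference.

```rust
use std::collections::BTreeSet;

/// Hypothetical, heavily reduced fork list.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Fork {
    Deneb,
    Electra,
    Fulu,
}

/// Hypothetical, heavily reduced set of gossip topic kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Topic {
    BeaconBlock,
    BlobSidecar(u64),
}

/// Return the complete set of core topics for `fork`,
/// rather than the topics added at that fork.
pub fn core_topics(fork: Fork, blob_subnets: u64) -> BTreeSet<Topic> {
    let mut topics = BTreeSet::from([Topic::BeaconBlock]);
    match fork {
        // Blob topics are core topics from Deneb up to, but excluding, Fulu.
        Fork::Deneb | Fork::Electra => {
            topics.extend((0..blob_subnets).map(Topic::BlobSidecar));
        }
        Fork::Fulu => {}
    }
    topics
}

/// Topic kinds present before the fork boundary but absent after it.
pub fn topics_dropped_at_fork(current: &BTreeSet<Topic>, next: &BTreeSet<Topic>) -> Vec<Topic> {
    current.difference(next).copied().collect()
}
```

With this shape no per-fork "remove" list is needed: subscribing to `core_topics(next_fork, ..)` under the new fork digest is enough, and the dropped topics are simply the ones that are not re-subscribed.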

michaelsproul · 2025-03-01T12:57:55Z

Known issues:

HTTP heads endpoint returns blocks in fork choice that are descended from finalization but which have been pruned (because they don't descend from the fake finalization checkpoint). This is probably fine for now.
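For illustration only, the distinction behind this issue can be sketched as a descent check over parent links. Nothing here is Lighthouse code: `Root`, the `parents` map, and both functions are hypothetical, and this is not a proposed fix.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a block root.
pub type Root = u64;

/// Walk parent links from `block` and report whether `ancestor` is reached.
/// The walk is bounded by the map size so a malformed map cannot loop forever.
pub fn descends_from(parents: &HashMap<Root, Root>, block: Root, ancestor: Root) -> bool {
    let mut current = block;
    for _ in 0..=parents.len() {
        if current == ancestor {
            return true;
        }
        match parents.get(&current) {
            Some(parent) => current = *parent,
            None => return false,
        }
    }
    false
}

/// Keep only the heads that descend from `checkpoint`.
pub fn heads_descending_from(
    parents: &HashMap<Root, Root>,
    heads: &[Root],
    checkpoint: Root,
) -> Vec<Root> {
    heads
        .iter()
        .copied()
        .filter(|head| descends_from(parents, *head, checkpoint))
        .collect()
}
```

A head can pass this check against the real finalized block and still fail it against the manually finalized checkpoint. Those are the heads the endpoint currently reports even though their blocks have been pruned.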

michaelsproul · 2025-03-01T13:12:15Z

Usage:

curl -X POST --data '{"state_root": "0x7c0b6538b5e0a5b47f66168d72e476c6b9bc8a9882acf407c78d8000d8eee3ba", "block_root": "0xa67f0695d5ea7e8d2bcc01b6b7fbee0178c2a16cbf97f5b65496e0518deb5baf", "epoch": "116805"}' http://localhost:5052/lighthouse/finalize
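The request body in the `curl` call above can also be built programmatically. The sketch below uses only the standard library. The struct name `ManualFinalizationRequest` and its `to_json` method are hypothetical. Only the three field names, and the fact that `epoch` is sent as a quoted string, are taken from the usage example.

```rust
/// Hypothetical request type mirroring the JSON fields in the usage example.
pub struct ManualFinalizationRequest {
    /// 0x-prefixed hex state root.
    pub state_root: String,
    /// 0x-prefixed hex block root.
    pub block_root: String,
    /// Epoch to finalize, serialized as a quoted decimal string.
    pub epoch: u64,
}

impl ManualFinalizationRequest {
    /// Render the body in the same shape as the `--data` argument above.
    /// The roots are assumed to be plain hex strings, so no JSON escaping is applied.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"state_root\": \"{}\", \"block_root\": \"{}\", \"epoch\": \"{}\"}}",
            self.state_root, self.block_root, self.epoch
        )
    }
}
```

The resulting string would be sent as the body of a `POST` to `/lighthouse/finalize`, the path used in the `curl` call.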

mergify · 2025-03-03T05:32:30Z

This pull request has merge conflicts. Could you please resolve them @michaelsproul? 🙏

michaelsproul · 2025-03-06T01:54:30Z

Closing in favour of:

Drop head tracker (Holesky rescue edition) #7080

dapplion and others added 30 commits February 3, 2025 16:04

Drop head tracker for summaries dag

be105d1

Improve state summary dag compute logic

8c15bab

Implement db schema upgrade and downgrade

8c9a1b2

Log about multiple roots in dag tree

10bbb2e

Add states descendants_of

28d7b74

Prune descendants of finalized checkpoint not finalized block

c5b4293

Prevent very long log line

066f96a

Update tests

663dfd3

Tweak logs

979e43a

Annotate SummariesDagError error

e56299c

Deprecate block DAG for pruning

91eab38

Remove some persisted head stuff

12fa5a8

Use slot clock in heads

ed97b97

Fix nodes_without_children

b9d8ae7

Merge remote-tracking branch 'origin/release-v7.0.0-beta.0' into unst…

ec2fe38

…able

Fix misc PeerDAS todos (sigp#6862)

3992d6b

Remove un-used batch sync error condition (sigp#6917)

431dd7c

Merge remote-tracking branch 'origin/release-v7.0.0-beta.0' into unst…

6ab6eae

…able

Fix compilation and remove error from heads

7033656

Merge remote-tracking branch 'origin/unstable' into drop-headtracker

1dc6d5e

Tidy and document migrate_database.

f6786eb

Tweaks in prune_hot_db.

37be9ae

Correct assert in revert_minority_fork_on_resume

cf3b776

Update consensus/proto_array/src/proto_array.rs

54010b0

Co-authored-by: Michael Sproul <[email protected]>

Use descent from finality instead of viability

7abbaeb

Clean up DB migrations

5cc266c

Prevent deletion of payloads >= split slot

abb3c3f

jimmygchen and others added 9 commits February 28, 2025 17:37

Load block roots from fork choice where possible to avoid loading sta…

2cb71e2

…te from disk when serving block by range requests.

Check if the start slot is newer than finalization (`start_slot >= fi…

bd093d9

…nalized_slot`), and use fork choice in that case.

force finalization endpoint

b03c18d

cleanup

9057a88

Merge branch 'holesky-rescue' into manual-finalization-endpoint

d43553c

Remove ds store

b567cc4

Merge branch 'manual-finalization-endpoint' of https://github.com/ese…

f397b87

…rilev/lighthouse into manual-finalization-endpoint

Don't import blocks that conflict with the split

9f4e757

Merge remote-tracking branch 'origin/drop-headtracker' into manual-fi…

25465a6

…nalization-endpoint-new-pruning

michaelsproul requested a review from jxs as a code owner March 1, 2025 12:54

michaelsproul added the backwards-incompat Backwards-incompatible API change label Mar 1, 2025

Fix descent from split check

3d2c9d5

michaelsproul mentioned this pull request Mar 2, 2025

lighthouse oom #7036

Open

jimmygchen added the do-not-merge label Mar 3, 2025

michaelsproul closed this Mar 6, 2025
