-
Notifications
You must be signed in to change notification settings - Fork 784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attempt to continue lookups after adding peers #5993
Attempt to continue lookups after adding peers #5993
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me too!
It's indeed a rare case (wasn't easy for me to reproduce) and seem to happen on only few overwhelmed node mostly.
I'm merging this into the testing branch. Please do not merge to release branch yet. |
Squashed commit of the following: commit 6d3d06b Author: dapplion <[email protected]> Date: Tue Jun 25 16:50:26 2024 +0200 Attempt to continue lookups after adding peers
@mergify queue |
✅ The pull request has been merged automaticallyThe pull request has been merged automatically at b38019c |
Issue Addressed
Testing v5.2.1 a node hit a corner case of lookup sync causing a lookup to get stuck:
The lookup was originally created from an unknown parent block event with a block as child components. Therefore the original state of each request was:
When the parent imported successfully, it got called lookup
continue_request
. For each request:lighthouse/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs
Line 202 in f1d88ba
When the block completes processing and it gets the MissingComponents,
continue_request
is called again, but the lookup still has 0 peers. So no progress is made.Then immediately after that attempt at
continue_request
the lookup gets peers. When thedrop_lookups_without_peers
function runs, the lookup has peers but nothing is trying to make progress on it.It appears the comment below is too optmistic. Yes this situation is a rare case but it results in lookups getting stuck. We should attempt to make progress on lookups when adding peers
lighthouse/beacon_node/network/src/sync/block_lookups/mod.rs
Lines 852 to 856 in f1d88ba
Proposed Changes
Attempt to make progress on existing lookups that are not awaiting parent and got new peers