-
Notifications
You must be signed in to change notification settings - Fork 996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Require fork choice to run before proposal #2878
Conversation
I suspect that some clients are already running fork choice aggressively like this, but Lighthouse is not, and anyone following the spec to the letter will not be either. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, this is an error
but we need to be careful with the fix.
The intention is "build on a block from a slot that is less than slot
but for the fork choice run at start of slot
". You want to avoid building on a block from slot
which could potentially be your head if there were divergent shufflings and some weirdness.
Prysm implements the identical behavior to the spec today. Prysm will require some implementation changes to support this as we currently only process fork choice attestations twice per slot. I think I agree that |
Teku doesn't currently run fork choice prior to proposing a block. However we do run fork choice (without apply pending attestations) when we import a new block if it isn't the child of our current head (if it is it automatically becomes the new head so we don't need to run fork choice). So a block that causes a reorg because it benefits from proposer boost would become the head in Teku as soon as we import it. It's trivial for us to run get_head prior to proposing though - it was required in an earlier version of proposer boost we implemented so we still have a |
lodestar also doesn't run update_head before assembling block although its should be trivial to call update_head to process queued attestations, but there is still an issue in post merge scenario, that such a reorg of the head at start of the slot will lead to empty execution block generation. Should the attestations of |
0e14f17
to
3f76792
Compare
Good point. This was implicit in the state transition validity before, but my new commit makes it explicit.
I think this is a good strategy, however the re-orgs we're now seeing on mainnet are on proposer boosted nodes which initially see C (the boosted block from my example above) as head, and then re-org to B when the rest of the non-boosted nodes attest to it with more weight. I think Teku would see C as head due to the boost, but wouldn't re-run fork choice before proposing D, and might incorrectly attest to D (because it's the child of current head, C?). Even before we ship this spec change the situation should improve as more users upgrade to proposer boost releases, because then re-orgs back to B that go against the boost will be less common.
I think your idea of running fork choice between the aggregation attestation deadline (8s) and the next slot is a good one. Maybe we should require fork choice to run 10s ( |
Yep, I agree with this. |
This I think is a bug with or without this PR merged right? Teku should not attest to |
Fork choice works by walking forwards along the chain, selecting between competing child blocks. If we've run fork choice and found that X is our chain head, then we must not have any children of X or they would have been our head (no other choice and fork choice always follows to the tip of the chain). So if we import a block that is the child of our current head, it must be the only child of that block and thus fork choice would select it as the new chain head. If we apply changes to weightings by processing attestations then we might reorg to a different chain, but until we do that we can safely update our chain head. Notably we ensure we have applied the new weightings from the previous slot attestations and run fork choice prior to creating attestations. The important thing to note here is that attestations aren't for your chain head. They're for they block on your canonical chain at the slot the attestation is for. Similarly when creating a block the parent is the block on the canonical chain prior to the requested block slot. Normally the VC and BN have clocks in sync so they both work out to your chain head, but the attestation or block slot is specified in the REST API request and must be respected. |
I think this is technically wrong: when you import D which continues the canonical chain and you have not yet applied weightings from attestations that appeared in the previous slot. You are saying "D is my new head for slot N+1, modulo what may have happened from attestations in slot N, I promise not to act on this D until I actually check that it deserved to be head in N+1". Either way I see contradictory statements here: either Teku attests to D because it became head only by being the child of C, which is what I read in the previous message from Michael and its reply, or Teku may actually attest for something different than D then since you apply previous attestations before acting as in your last message. I find this quite difficult even to implement: if you import D and you actually include the attestations that D included in your forkchoice, then you are counting whatever D included from the previous slot, which is biased for D's view of the chain, and not the attestations that the node gathered during that slot. |
Hmm, not really. I'm saying that based on my current world view (ie the attestations I've previously processed) C is my chain head. Then I import D and the attestations haven't changed, so D must be my chain head. I'm not running fork choice to select D, I just know that if C was my head and D is the first child then D would be the chain head based on the existing weightings I have.
In this situation there's a reorg required so every node is going to have to change their chain head when they process attestations from the prior slot. The question is just whether your changing from C to B or from D to B. That's fine, but in the normal case where there isn't a reorg Teku will already have the right head selected and has a head start in processing the various things that are triggered by updating the chain head.
We don't run fork choice to make D the head because we don't need to. But D can only include attestations from prior slots anyway, all of which are now valid to consider so there's nothing wrong with considering them at this point. We can add additional information to get a better view of the network by also considering attestations we received via gossip but that weren't in a block but there's no guarantee that different nodes have the same view of those attestations anyway. But I think we're quite off topic for this issue at this point. :) |
## Issue Addressed Upcoming spec change ethereum/consensus-specs#2878 ## Proposed Changes 1. Run fork choice at the start of every slot, and wait for this run to complete before proposing a block. 2. As an optimisation, also run fork choice 3/4 of the way through the slot (at 9s), _dequeueing attestations for the next slot_. 3. Remove the fork choice run from the state advance timer that occurred before advancing the state. ## Additional Info ### Block Proposal Accuracy This change makes us more likely to propose on top of the correct head in the presence of re-orgs with proposer boost in play. The main scenario that this change is designed to address is described in the linked spec issue. ### Attestation Accuracy This change _also_ makes us more likely to attest to the correct head. Currently in the case of a skipped slot at `slot` we only run fork choice 9s into `slot - 1`. This means the attestations from `slot - 1` aren't taken into consideration, and any boost applied to the block from `slot - 1` is not removed (it should be). In the language of the linked spec issue, this means we are liable to attest to C, even when the majority voting weight has already caused a re-org to B. ### Why remove the call before the state advance? If we've run fork choice at the start of the slot then it has already dequeued all the attestations from the previous slot, which are the only ones eligible to influence the head in the current slot. Running fork choice again is unnecessary (unless we run it for the next slot and try to pre-empt a re-org, but I don't currently think this is a great idea). ### Performance Based on Prater testing this adds about 5-25ms of runtime to block proposal times, which are 500-1000ms on average (and spike to 5s+ sometimes due to state handling issues 😢 ). I believe this is a small enough penalty to enable it by default, with the option to disable it via the new flag `--fork-choice-before-proposal-timeout 0`. Upcoming work on block packing and state representation will also reduce block production times in general, while removing the spikes. ### Implementation Fork choice gets invoked at the start of the slot via the `per_slot_task` function called from the slot timer. It then uses a condition variable to signal to block production that fork choice has been updated. This is a bit funky, but it seems to work. One downside of the timer-based approach is that it doesn't happen automatically in most of the tests. The test added by this PR has to trigger the run manually.
## Issue Addressed Upcoming spec change ethereum/consensus-specs#2878 ## Proposed Changes 1. Run fork choice at the start of every slot, and wait for this run to complete before proposing a block. 2. As an optimisation, also run fork choice 3/4 of the way through the slot (at 9s), _dequeueing attestations for the next slot_. 3. Remove the fork choice run from the state advance timer that occurred before advancing the state. ## Additional Info ### Block Proposal Accuracy This change makes us more likely to propose on top of the correct head in the presence of re-orgs with proposer boost in play. The main scenario that this change is designed to address is described in the linked spec issue. ### Attestation Accuracy This change _also_ makes us more likely to attest to the correct head. Currently in the case of a skipped slot at `slot` we only run fork choice 9s into `slot - 1`. This means the attestations from `slot - 1` aren't taken into consideration, and any boost applied to the block from `slot - 1` is not removed (it should be). In the language of the linked spec issue, this means we are liable to attest to C, even when the majority voting weight has already caused a re-org to B. ### Why remove the call before the state advance? If we've run fork choice at the start of the slot then it has already dequeued all the attestations from the previous slot, which are the only ones eligible to influence the head in the current slot. Running fork choice again is unnecessary (unless we run it for the next slot and try to pre-empt a re-org, but I don't currently think this is a great idea). ### Performance Based on Prater testing this adds about 5-25ms of runtime to block proposal times, which are 500-1000ms on average (and spike to 5s+ sometimes due to state handling issues 😢 ). I believe this is a small enough penalty to enable it by default, with the option to disable it via the new flag `--fork-choice-before-proposal-timeout 0`. Upcoming work on block packing and state representation will also reduce block production times in general, while removing the spikes. ### Implementation Fork choice gets invoked at the start of the slot via the `per_slot_task` function called from the slot timer. It then uses a condition variable to signal to block production that fork choice has been updated. This is a bit funky, but it seems to work. One downside of the timer-based approach is that it doesn't happen automatically in most of the tests. The test added by this PR has to trigger the run manually.
## Issue Addressed Upcoming spec change ethereum/consensus-specs#2878 ## Proposed Changes 1. Run fork choice at the start of every slot, and wait for this run to complete before proposing a block. 2. As an optimisation, also run fork choice 3/4 of the way through the slot (at 9s), _dequeueing attestations for the next slot_. 3. Remove the fork choice run from the state advance timer that occurred before advancing the state. ## Additional Info ### Block Proposal Accuracy This change makes us more likely to propose on top of the correct head in the presence of re-orgs with proposer boost in play. The main scenario that this change is designed to address is described in the linked spec issue. ### Attestation Accuracy This change _also_ makes us more likely to attest to the correct head. Currently in the case of a skipped slot at `slot` we only run fork choice 9s into `slot - 1`. This means the attestations from `slot - 1` aren't taken into consideration, and any boost applied to the block from `slot - 1` is not removed (it should be). In the language of the linked spec issue, this means we are liable to attest to C, even when the majority voting weight has already caused a re-org to B. ### Why remove the call before the state advance? If we've run fork choice at the start of the slot then it has already dequeued all the attestations from the previous slot, which are the only ones eligible to influence the head in the current slot. Running fork choice again is unnecessary (unless we run it for the next slot and try to pre-empt a re-org, but I don't currently think this is a great idea). ### Performance Based on Prater testing this adds about 5-25ms of runtime to block proposal times, which are 500-1000ms on average (and spike to 5s+ sometimes due to state handling issues 😢 ). I believe this is a small enough penalty to enable it by default, with the option to disable it via the new flag `--fork-choice-before-proposal-timeout 0`. Upcoming work on block packing and state representation will also reduce block production times in general, while removing the spikes. ### Implementation Fork choice gets invoked at the start of the slot via the `per_slot_task` function called from the slot timer. It then uses a condition variable to signal to block production that fork choice has been updated. This is a bit funky, but it seems to work. One downside of the timer-based approach is that it doesn't happen automatically in most of the tests. The test added by this PR has to trigger the run manually.
Presently the honest validator spec states that the proposer at
slot
should build their block upon what they think the head is duringslot - 1
:The problem with this is that the attestations from
slot - 1
are queued awaiting processing until the start ofslot
, and have the potential to cause a re-org away from what the head was inslot - 1
.The result is that a proposer's block may get immediately re-orged out and have no hope of becoming the head. Consider the following example:
There is a contest between B and C as to which block should be the head during slot C. For the sake of argument let's say that block C is the head during slot C, but that the queued attestations from slot C will tip fork choice in favour of block B. When the proposer of D makes their block following the current spec they build atop C, as it was the head during slot D - 1 = C. As soon as this block is applied to the fork choice, so are the attestations from slot C, so the head re-orgs back to B (assuming no proposer boost for D, more on that in a moment).
The role of proposer boost
If D is able to receive a proposer boost (due to being on time, and proposer boost being enabled) then it is more likely to succeed in becoming the head because its boost can counter the queued attestations. However if the weight of queued attestations exceeds the boost then the re-org will still occur immediately and D will have no hope of becoming the head.
Proposer boosting can also play a role in establishing the contest between B & C. The only way for C to be head during slot C in the scenario above is if it is receiving a proposer boost in excess of the voting weight on B (from slot B). In other words, without proposer boost there would be no reason for 0-weight C to be head, as B has strictly more weight behind it. However, it's possible that there could be a contest between two blocks over a longer timeframe: if we imagine that B and C are the heads of two longer competing chains sharing common ancestor A.
Therefore the flaw in the honest proposer logic is independent of the proposer boost feature.
Proposed fix
I think that proposers should run fork choice at the start of the slot they are due to propose in and build upon the head as it appears then.
In practical terms, this would require beacon chain clients to run
get_head
immediately before proposing a block. This likely represents <25ms of processing time at current validator counts.