-
Notifications
You must be signed in to change notification settings - Fork 179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Epoch Extensions in HotStuff Committee #6154
Support Epoch Extensions in HotStuff Committee #6154
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## feature/efm-recovery #6154 +/- ##
========================================================
+ Coverage 41.68% 41.70% +0.01%
========================================================
Files 1975 1975
Lines 139306 139258 -48
========================================================
- Hits 58072 58071 -1
+ Misses 75182 75139 -43
+ Partials 6052 6048 -4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
- adds a single events channel, containing closures - event consumer function enqueues a closure invoking the appropriate function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
started to take a look, but didn't have much time. First set of comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me, nicely done!
Co-authored-by: Alexander Hentschel <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the great work. Enjoyed reading the PR. Very clean code and detailed, clear documentation 👏. My comments are mostly suggestions for further refining code clarity, comments and documentation. Structurally, I'd like to suggest the following revisions
- reduce surface for inconsistent inputs for method
epochInfo.recomputeLeaderSelectionForExtendedViewRange
as described in my comment right below - I think adding the sanity check as described here would be good
- testing a broader range of views for the
committee.Consensus
(see comment here, here, here and here) - I have major reservations about including
EpochExtension
s in thesetupEpoch
as explained in this comment- While I think it is beyond the scope of this PR, I would like to explore if we can remove
setupEpoch
entirely (also explained in this comment). Could you please make sure an issue exists and add a links to my comments (i), (ii), (iii), (iv)
- While I think it is beyond the scope of this PR, I would like to explore if we can remove
// setupEpoch is an implementation of protocol.Epoch backed by an EpochSetup service event. | ||
// Includes any extensions which have been included as of the reference block. | ||
// This is used for converting service events to inmem.Epoch. | ||
type setupEpoch struct { | ||
// EpochSetup service event | ||
setupEvent *flow.EpochSetup | ||
extensions []flow.EpochExtension | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, setupEpoch
simply cannot be a valid representation of protocol.Epoch
as it misses a bunch of the important information. We have the convention that an epoch essentially only starts to exist after the EpochCommit Service event has been finalized. I think we had the plan to remove any exposure of uncommitted epochs from the major APIs. Though, NewSetupEpoch
is still used and returned in some of the broadly-used interfaces.
- For the scope of this PR, I would not include
extensions
in thesetupEpoch
because it further increases the inconsistency of protocol definition and implementation. Instead, I would include the extensions intocommittedEpoch
and override theFinalView()
method inherited fromsetupEpoch
. - Not sure if we have already an issue for removing
setupEpoch
as a return value from theEpochQuery
API and only return epochs once they are committed. Is such issue already exists, maybe add a reference to this comment. If not, would you mind creating such issue please? 🙏 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only committed epochs can have extensions
👍 Agree, will move the extensions to committedEpoch
.
We have the convention that an epoch essentially only starts to exist after the EpochCommit Service event has been finalized
Epoch APIs cannot entirely avoid exposing data for un-committed epochs, because the node software requires access to a subset of un-committed epoch data in order to commit that epoch, by completing the DKG and generating root cluster QCs.
There are other ways we can improve the epoch APIs, but we will need something similar to the progressive disclosure of setupEpoch
and committedEpoch
. For example, we could hide some info from the EpochSetup event until the epoch is committed (like the view range), and we could use different interface types to represent the two states.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this tech debt issue as a place to capture these ideas for improving the API: #6191
} | ||
|
||
func (eq Epochs) Next() protocol.Epoch { | ||
switch eq.entry.EpochPhase() { | ||
case flow.EpochPhaseStaking, flow.EpochPhaseFallback: | ||
return invalid.NewEpoch(protocol.ErrNextEpochNotSetup) | ||
case flow.EpochPhaseSetup: | ||
return NewSetupEpoch(eq.entry.NextEpochSetup) | ||
return NewSetupEpoch(eq.entry.NextEpochSetup, eq.entry.NextEpoch.EpochExtensions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, this should yield an error ErrNextEpochNotCommitted
return invalid.NewEpoch(protocol.ErrNextEpochNotCommitted)
Though, this would require an update of the EpochQuery
API ... so maybe add a todo to the issue is mentioned in my previous comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #6154 (comment)
// No errors are expected during normal operations. | ||
func NewSetupEpoch(setupEvent *flow.EpochSetup) protocol.Epoch { | ||
func NewSetupEpoch(setupEvent *flow.EpochSetup, extensions []flow.EpochExtension) protocol.Epoch { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, this method should be deprecated for the scope of this Pr, including a TODO to remove it entirely
// No errors are expected during normal operations. | |
func NewSetupEpoch(setupEvent *flow.EpochSetup) protocol.Epoch { | |
func NewSetupEpoch(setupEvent *flow.EpochSetup, extensions []flow.EpochExtension) protocol.Epoch { | |
// | |
// DEPRECATED and TO BE REMOVED, because setupEpoch simply cannot be a valid representation of protocol.Epoch | |
// as it misses a bunch of the important information. We have updated our convention such that an epoch | |
// essentially only starts to exist after the EpochCommit Service event has been finalized. The plan is to | |
// remove any exposure of uncommitted epochs from the `EpochQuery` API. | |
// | |
// No errors are expected during normal operations. | |
func NewSetupEpoch(setupEvent *flow.EpochSetup) protocol.Epoch { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #6154 (comment)
@@ -419,12 +419,12 @@ func (q *EpochQuery) Next() protocol.Epoch { | |||
// if we are in setup phase, return a SetupEpoch | |||
nextSetup := entry.NextEpochSetup | |||
if phase == flow.EpochPhaseSetup { | |||
return inmem.NewSetupEpoch(nextSetup) | |||
return inmem.NewSetupEpoch(nextSetup, entry.NextEpoch.EpochExtensions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would suggest adding TODO to indicate that epochs will only be exposed once they are committed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #6154 (comment)
Co-authored-by: Alexander Hentschel <[email protected]>
Co-authored-by: Alexander Hentschel <[email protected]>
This PR updates the Consensus Committee to support epoch extensions. Addresses #5730.
Changes
protocol.Epoch
FinalView
method to account for any extensions.committees.Consensus
to handleEpochExtended
protocol event.When we observe an epoch extension, we simply re-compute the leader selection for the entire epoch, including the new view range added by the extension. This is slightly wasteful, incurring a cost of around 20-30ms per extension on Mainnet, but simplifies the business logic and state quite a bit compared to tracking each extension independently.
Now that we can move into and out of EFM several times over the lifecycle of the component, it is easier to reason about correctness if we use a shared channel for events to enforce in-order processing. The event consumer function prepares a (potentially blocking) event handler function to be executed and passes the closure into the channel. The worker goroutine pulls event handler closures off the channel and executes them in order.