-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EpochRecover service event and transaction #420
EpochRecover service event and transaction #420
Conversation
@kc1116 Can you base this off the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I have limited background about the epoch smart contracts. Though, I feel there are a bunch of important details in this PR that still need to be worked out. Overall, we are filling various fields with values according to the happy path. But I miss explanations why those values we use (e.g. epoch counter, number of views for various phase, etc) are also correct for the unhappy path.
The general bar for us as BFT-engineers is that we can explain for every line of code in our PR, why it needs to be precisely like we wrote it. If that reasoning isn't absolutely trivial, a comment in the code is highly desired. There were multiple places, where I got stuck just not understanding why it is correct to chose a value according to your code.
I would suggest to go over the code again with the mindset:
-
during EFM, the Protocol State controlling the nodes has diverged from the Epoch Smart contract. Algorithmically, they are irreconcilable. The Epoch Recovery is a human intervention to reconcile both states:
- Due to the prior diversion, the happy path algorithm for moving to the next epoch doesn't apply to either states anymore. That is fine, because we are having humans force-override values via a governance transaction.
- So here, we are at the point where the Epoch Smart contract received the human override (governance transaction). The human provides some values, others are need to be computed. But how precisely? Keep in mind that the Epoch Smart contract potentially went though multiple Epochs that do not exist in reality.
- You are writing this code for processing the human override here. For every value relevant to epochs (modified or not) you need to have a argument essentially proving that (i) when the value is processed by the Protocol State, it brings the Protocol State back to the happy path and the resulting state it is compatible with all assumptions and consistency requirements of the Epoch Protocol State and (ii) when the value is processed by the Epoch Smart contract, the new value brings the Epoch Smart contract back to the happy path and the resulting state it is compatible with all assumptions and consistency requirements of the Epoch Smart contract.
Literally, I would suggest to go over every epoch-related value in this PR. Try to answer questions (i) and (ii) and modify your algorithm until you feel confident that you have a bullet-proof argument. If in doubt, talk to Jordan.
This will take time and we expect it to take time. Don't feel pressured to crank PRs out. Correctness and good documentation is by far a higher priority in our field of work than development speed.
contracts/epochs/FlowEpoch.cdc
Outdated
/// The last view (inclusive) of the staking phase. | ||
pub let stakingEndView: UInt64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
above you state that "This struct is a 1:1 copy of the event EpochRecover". Assuming you mean its cadence definition right above, there is an inconsistency because EpochRecover
does not have the field stakingEndView
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overlooked unused var removed 381ca6f
contracts/epochs/FlowEpoch.cdc
Outdated
// Compute the target end time for the next epoch | ||
let timingConfig = FlowEpoch.getEpochTimingConfig() | ||
let proposedTargetDuration = timingConfig.duration | ||
let proposedTargetEndTime = timingConfig.getTargetEndTimeForEpoch(FlowEpoch.proposedEpochCounter()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have enough context, but I am worried that we might not be able to use the happy path timing config, depending on what we want the recovery epoch to do.
Even if it is ok to use the happy path EpochTimingConfig
, I would like a comment that explains why this is safe and which assumptions we are making (i.e. under which conditions this code would need to change).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right we cannot use the EpochTimingConfig in the unhappy path (epoch recover) because we would have already added epoch extensions that will throw off the expected target end time, we may also want to tweak the duration for the recovery epoch. Once the recovery epoch is finished it is safe for subsequent epochs to use the timing config. I've remove this code and added target duration / end time as a parameter. a4943ec
contracts/epochs/FlowEpoch.cdc
Outdated
// Create new Epoch metadata for the next epoch | ||
// with the new values | ||
let newEpochMetadata = EpochMetadata( | ||
counter: FlowEpoch.currentEpochCounter + 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure this is correct. During EFM, the Protocol State formally remains in the same epoch, which is regularly being extended until we reach an EpochRecovery event. However, the Epoch Smart contract does not know this and behaves as if the epochs where just progressing on the happy path. So very likely, the Epoch smart contract and the Protocol State have a different notion of what the "next" epoch counter should be.
Same suggestion as before:
Even if it is ok to compute the epoch counter like this, I would like a comment that explains why this is safe and which assumptions we are making (i.e. under which conditions this code would need to change).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is safe to use the epoch counter in the smart contract because the counter is not incremented when we enter EFM state. Som, despite adding EpochExtensions the counter will not be incremented until the manual intervention via epoch recover transaction. Added a comment 4c0dcf7
contracts/epochs/FlowEpoch.cdc
Outdated
clusterQCs: [], | ||
dkgKeys: dkgPubKeys) | ||
|
||
FlowEpoch.saveEpochMetadata(newEpochMetadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we save the EpochMetadata
and then change its fields? Is that safe? Can we avoid such confusion by maybe moving the saveEpochMetadata
call further below, or add a comment (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
furthermore, aren't we changing parameters for all upcoming epochs (assuming no further human intervention via governance transactions) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not changing the fields after we save the epoch metadata. The epoch metadata is saved and then epoch metadata along with other data (dkg views, cluster qcs etc) is emitted in the EpochRecover event. I've moved it to the end of the func for better clarity cbb2885 . Also once we intervene via the EpochRecover transaction and override the EpochMetadata upcoming epochs will increment the epoch counter as expected and override with the epoch metadata for that epoch. New epoch metadata is created in the upcoming epochs setup phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not changing the fields after we save the epoch metadata
I think calculateAndSetRewards
will actually set the reward amount field, after saving the epoch metadata. But this is following the same pattern as existing code. Maybe there's an opportunity to improve both codepaths.
contracts/epochs/FlowEpoch.cdc
Outdated
// Start a new Epoch, which increments the current epoch counter | ||
FlowEpoch.startNewEpoch() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Protocol State is the ultimate source of truth, because it controls what the network is actually doing.
Generally, when we process the Epoch Recovery transaction (governance transaction), I am not sure it is correct to start the new epoch right away. We should only start it once we reach the specified first view of the recovery transaction (I think ... but again not sure)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Protocol State is the ultimate source of truth
I don't completely agree with that mindset. It's true that the Protocol State is the source of truth for the network, but in the steady state (ignoring sporks for a moment) its state is populated by trusted service events from the smart contracts. Ultimately the smart contracts are determining t
When the system contract emits the EpochRecover
event it can either:
- Immediately transition into the
RecoveryEpoch
- Wait until the start view, then transition into the
RecoveryEpoch
Intuitively (2) seems like the better option, but with the context that we are currently already in a bad state (EFM), I'm not fully convinced it is here. Neither are great options, but (1) is less complex and parallels how resetEpoch
works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mainly reviewed the Cadence code from a code quality perspective. I can't say anything in regards to the business logic
contracts/epochs/FlowEpoch.cdc
Outdated
DKGPhase1FinalView: proposedEpochMetadata.startView + self.configurableMetadata.numViewsInStakingAuction + self.configurableMetadata.numViewsInDKGPhase - 1 as UInt64, | ||
DKGPhase2FinalView: proposedEpochMetadata.startView + self.configurableMetadata.numViewsInStakingAuction + (2 as UInt64 * self.configurableMetadata.numViewsInDKGPhase) - 1 as UInt64, | ||
DKGPhase3FinalView: proposedEpochMetadata.startView + self.configurableMetadata.numViewsInStakingAuction + (3 as UInt64 * self.configurableMetadata.numViewsInDKGPhase) - 1 as UInt64, | ||
dkgPhase1FinalView: proposedEpochMetadata.startView + self.configurableMetadata.numViewsInStakingAuction + self.configurableMetadata.numViewsInDKGPhase - 1 as UInt64, | ||
dkgPhase2FinalView: proposedEpochMetadata.startView + self.configurableMetadata.numViewsInStakingAuction + (2 as UInt64 * self.configurableMetadata.numViewsInDKGPhase) - 1 as UInt64, | ||
dkgPhase3FinalView: proposedEpochMetadata.startView + self.configurableMetadata.numViewsInStakingAuction + (3 as UInt64 * self.configurableMetadata.numViewsInDKGPhase) - 1 as UInt64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above: Remove unnecessary casts and introduce variables for the repeated long expressions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// gets into an EFM state the recoverEpoch transaction is used to override the current epoch and recover the network without | ||
/// sporking. This service event is processed by protocol participants to update their local protocol state with the new recover | ||
/// epoch data. | ||
access(all) event EpochRecover ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a massive event. Are you sure it's OK to include so much data right into the event?
It also seems like this event has (almost?) the same contents as RecoverEpochMetadata
?
nodeIDs: [String]) | ||
{ | ||
pre { | ||
FlowEpoch.isValidPhaseConfiguration(stakingEndView-startView+1, FlowEpoch.configurableMetadata.numViewsInDKGPhase, endView-startView+1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though it is only used here and not added, but maybe improve isValidPhaseConfiguration
and add argument labels, it's very hard to see what the arguments passed to the function are, and if they are correct. They all seem to have the same type too, which makes it very easy to mix them up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that commit got lost at some point.
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
lib/go/test/epoch_test_helpers.go
Outdated
epochAddress flow.Address, | ||
expectedRecover EpochRecover) { | ||
var emittedEvent EpochRecoverEvent | ||
evtTypeStr := fmt.Sprintf("A.%s.FlowEpoch.EpochRecover", epochAddress.String()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use a common.AddressLocation
and common.Location.TypeID(...)
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First set of comments
contracts/epochs/FlowEpoch.cdc
Outdated
dkgPhase1FinalView: UInt64, | ||
dkgPhase2FinalView: UInt64, | ||
dkgPhase3FinalView: UInt64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
contracts/epochs/FlowEpoch.cdc
Outdated
dkgPhase1FinalView: UInt64, | ||
dkgPhase2FinalView: UInt64, | ||
dkgPhase3FinalView: UInt64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would name these the same as EpochSetup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
contracts/epochs/FlowEpoch.cdc
Outdated
/// Contains metadata about the recovery epoch, this data | ||
/// is stored at the storage path /storage/recoverEpochMetadata. | ||
/// This struct is a 1:1 copy of the event EpochRecover event and is | ||
/// used to populate all the fields of the event when the heartbeat | ||
/// detects a new RecoverEpochMetadata stored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove this and just immediately emit the event now, alongside onflow/flow-go#5828
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// sanity check all nodes should have a weight > 0 | ||
for nodeID in nodeIDs { | ||
assert( | ||
FlowIDTableStaking.NodeInfo(nodeID: nodeID).initialWeight > 0, | ||
message: "all nodes in node ids list for recovery epoch must have a weight > 0" | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think w need to discuss how to handle this case in more detail, related to this open question: https://www.notion.so/flowfoundation/Spork-less-Epoch-Fallback-Recovery-Design-II-Epoch-Extensions-e923ca84b4d848c5a17de4b18fb5afd5?pvs=4#7b980d86cb7546be85b61c9ce729f837.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this last week and decided to allow the consensus committee to differ from the set of DKG participants. See onflow/flow-go#6214.
This means that we can strictly require all nodes have weight > 0, as implemented here.
contracts/epochs/FlowEpoch.cdc
Outdated
clusterQCs: [], | ||
dkgKeys: dkgPubKeys) | ||
|
||
FlowEpoch.saveEpochMetadata(newEpochMetadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not changing the fields after we save the epoch metadata
I think calculateAndSetRewards
will actually set the reward amount field, after saving the epoch metadata. But this is following the same pattern as existing code. Maybe there's an opportunity to improve both codepaths.
contracts/epochs/FlowEpoch.cdc
Outdated
// Start a new Epoch, which increments the current epoch counter | ||
FlowEpoch.startNewEpoch() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Protocol State is the ultimate source of truth
I don't completely agree with that mindset. It's true that the Protocol State is the source of truth for the network, but in the steady state (ignoring sporks for a moment) its state is populated by trusted service events from the smart contracts. Ultimately the smart contracts are determining t
When the system contract emits the EpochRecover
event it can either:
- Immediately transition into the
RecoveryEpoch
- Wait until the start view, then transition into the
RecoveryEpoch
Intuitively (2) seems like the better option, but with the context that we are currently already in a bad state (EFM), I'm not fully convinced it is here. Neither are great options, but (1) is less complex and parallels how resetEpoch
works.
contracts/epochs/FlowEpoch.cdc
Outdated
/// Calculate rewards for the current epoch | ||
/// and set the payout for the next epoch | ||
FlowEpoch.calculateAndSetRewards() | ||
/// Start a new Epoch, which increments the current epoch counter | ||
FlowEpoch.startNewEpoch() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible that the EpochRecover
will be rejected by the Protocol State, in which case we would need to send a new recoverEpoch
transaction.
We should make sure this function is decoupled from reward payout so we can, if necessary, re-try this process without double-paying rewards.
@@ -258,6 +258,24 @@ access(all) contract FlowClusterQC { | |||
} | |||
} | |||
|
|||
/// Represents the quorum certificate vote data for a signer | |||
/// of the certificate. | |||
access(all) struct ClusterQCVoteData { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to match the current util
command output, I think this structure needs to have the form:
struct ClusterQCVoteData {
voteData: String // hex-encoded aggregated signature
voterIDs: [String]
}
* update to view functions for stable cadence * remove unreachable code * update templates to Stable Cadence * update to Stable Cadence preview 4 * update for stable cadence * fix parse error --------- Co-authored-by: Josh Hannan <[email protected]> Co-authored-by: Bastian Müller <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took another pass - my main change request is to make sure that the expected post-recovery epoch counter that we can compute in the efm-recover-tx-args
utility is passed explicitly to the transaction. I think this is something we overlooked previously, but including it would eliminate a class of potential problems.
nodeIDs: [String]) | ||
{ | ||
pre { | ||
FlowEpoch.isValidPhaseConfiguration(stakingEndView-startView+1, FlowEpoch.configurableMetadata.numViewsInDKGPhase, endView-startView+1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that commit got lost at some point.
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
/// sanity check all nodes should have a weight > 0 | ||
for nodeID in nodeIDs { | ||
assert( | ||
FlowIDTableStaking.NodeInfo(nodeID: nodeID).initialWeight > 0, | ||
message: "all nodes in node ids list for recovery epoch must have a weight > 0" | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this last week and decided to allow the consensus committee to differ from the set of DKG participants. See onflow/flow-go#6214.
This means that we can strictly require all nodes have weight > 0, as implemented here.
// state with the new Epoch data. | ||
// This transaction should only be used with the output of the bootstrap utility: | ||
// util epoch efm-recover-tx-args | ||
transaction(startView: UInt64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should include the epochCounter
of the RecoveryEpoch
here and use it for safety checks in the recoverNewEpoch
implementation.
In util epoch efm-recover-tx-args
, we know exactly what the RecoveryEpoch
epoch counter must be to be accepted by the Protocol State (current+1). If we don't pass that counter value here, then we're relying on the Protocol/SmartContract counters being synchronized in a particular way for correctness. In EFM, that is not guaranteed.
Currently we set the RecoveryEpoch
counter as follows:
- if
initNewEpoch
thenRecoveryEpoch.counter
= currentEpochCounter+1 - if
!initNewEpoch
thenRecoveryEpoch.counter
= currentEpochCounter
I would change this to:
- pass in
recoveryEpochCounter
as an argument - change
initNewEpoch
to beunsafeAllowOverwrite
In recoverNewEpoch
, check that recoveryEpochCounter == currentEpochCounter+1
.
- If that passes, then continue
- If that fails, then only continue if
unsafeAllowOverwrite
is set to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clusterQCs: nil, | ||
dkgKeys: dkgPubKeys, | ||
} | ||
verifyEpochMetadata(t, b, env, expectedMetadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also validate that curEpochCounter = startEpochCounter+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expectedRecoverEvent.dkgPhase1FinalView = newStartView + numStakingViews + numDKGViews - 1 | ||
expectedRecoverEvent.dkgPhase2FinalView = newStartView + numStakingViews + (2 * numDKGViews) - 1 | ||
expectedRecoverEvent.dkgPhase3FinalView = newStartView + numStakingViews + (3 * numDKGViews) - 1 | ||
verifyEpochRecover(t, adapter, idTableAddress, expectedRecoverEvent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also validate:
curEpochCounter = startEpochCounter+1
- check
epochMetadata
in addition to the EpochRecover event
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
contracts/epochs/FlowEpoch.cdc
Outdated
/// - All nodes in the node ids list have a weight > 0. | ||
/// If the configuration is a valid configuration the the staking auction, | ||
/// qc voting and dkg will be ended depending on the current epoch phase. | ||
access(all) fun recoverEpochPreChecks(startView: UInt64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would consider renaming this function (or moving the latter half that stops voting, DKG, or the staking auction to a separate function).
To me, "sanity checks" implies that we are validating inputs but not changing state. Currently the function is both validating inputs and modifying state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should be access(self) also?
@@ -1516,3 +1517,203 @@ func TestEpochReset(t *testing.T) { | |||
assertEqual(t, CadenceUFix64("249999.99300000"), result) | |||
}) | |||
} | |||
|
|||
func TestEpochRecover(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add documentation for the test here?
Currently, I think we are testing:
- execute an epoch recover where
recoveryEpochCounter == currentEpochCounter+1
- execute an epoch recover where
recoveryEpochCounter == currentEpochCounter
After we explicitly include the recoveryEpochCounter
, I would also suggest testing the case where:
recoveryEpochCounter != currentEpochCounter+1
and theunsafeAllowOverwrite
flag is not set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I follow the last point, if we have 2 separate funcs one to recover new epoch and one to recover current epoch we dont need to unsafeAllowOverwrite we dont pass unsafeAllowOverwrite to any of the FlowEpoch funcs we only use it in the transaction to decide which func recoverNewEpoch
or recoverCurrentEpoch
to call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we are passing in the recoveryEpochCounter
, we must use that value as the epoch counter for the recovery epoch 100% of the time, otherwise we will produce an incompatible recovery epoch state. recoveryEpochCounter
is the source of truth.
In the commit which was merged to the feature branch, the logic is:
- if
unsafeAllowOverwrite
is set, then userecoveryEpochCounter
as the counter for the recovery epoch - otherwise, ignore the
recoveryEpochCounter
input and assume that the the smart contract and protocol state epoch counters are synchronized
I think conditional branch 2 is unsafe and we should respect the recoveryEpochCounter
codepath in all cases. Sorry if I didn't explain this properly in the previous review. The purpose of including recoveryEpochCounter
at all is to make sure it is used as the epoch counter for the recovery epoch being constructed in the transaction, in all circumstances.
Co-authored-by: Jordan Schalm <[email protected]>
Co-authored-by: Jordan Schalm <[email protected]>
Co-authored-by: Jordan Schalm <[email protected]>
Co-authored-by: Jordan Schalm <[email protected]>
…nflow/flow-core-contracts into khalil/5639-epoch-recovery-transaction
contracts/epochs/FlowEpoch.cdc
Outdated
|
||
/// Stops epoch components. If the configuration is a valid configuration the staking auction, | ||
/// qc voting and dkg will be ended depending on the current epoch phase. | ||
access(all) fun stopEpochComponents() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
access(self)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
contracts/epochs/FlowEpoch.cdc
Outdated
/// - All nodes in the node ids list have a weight > 0. | ||
/// If the configuration is a valid configuration the the staking auction, | ||
/// qc voting and dkg will be ended depending on the current epoch phase. | ||
access(all) fun recoverEpochPreChecks(startView: UInt64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should be access(self) also?
contracts/epochs/FlowEpoch.cdc
Outdated
@@ -432,6 +497,234 @@ access(all) contract FlowEpoch { | |||
FlowEpoch.account.storage.load<Bool>(from: /storage/flowAutomaticRewardsEnabled) | |||
FlowEpoch.account.storage.save(enabled, to: /storage/flowAutomaticRewardsEnabled) | |||
} | |||
|
|||
access(all) fun emitEpochRecoverEvent(epochCounter: UInt64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
access(self)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// does not calculate and set rewards avoiding double paying rewards for the same epoch. | ||
/// This meta data will be emitted in the EpochRecover service event. This function is used | ||
/// within sporks to recover the network from Epoch Fallback Mode (EFM). | ||
access(all) fun recoverCurrentEpoch(startView: UInt64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure we shouldn't combine this and recoverNewEpoch
into a single function? Might save some lines of code and we could potentially just detect which one we need. I don't know for sure though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that how it initially was but we decided to make them 2 distinct functions so that it is more obvious the difference in behavior.
This PR updates the FlowEpoch smart contract to support recovering the network while in Epoch Fallback Mode. It adds a new service event
EpochRecover
which contains the metadata for the recovery epoch. This metadata is generated out of band using the bootstrap utilityutil epoch efm-recover-tx-args
PR and submitted to the contract with the recovery_epoch.cdc transaction. The FlowEpoch contract will end the current epoch, start the recovery epoch and store the metadata for the recovery epoch in storage. This metadata will then be emitted to the network during the next heartbeat interval.