[Dynamic Protocol State] Refactoring to support orthogonal state machine operating on sub-states #5616

durkmurder

#5556

Context

This PR implements a large refactoring that addresses a few outstanding design issues summarized in the linked issue. The answer was to change the design to one that better fits our needs.

In the proposed implementation, the design is centered around the KV Store, which stores all data related to the Dynamic Protocol State. The processing pipeline was restructured to use orthogonal state machines, each of which operates on one piece of the overall state (a sub-state) and modifies only that piece.
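
As a rough illustration of the orthogonal state machine idea, here is a minimal Go sketch. It is not the code from this PR: `KVStore`, `Update`, `SubStateMachine`, `versionMachine`, `epochMachine` and `Evolve` are hypothetical names and shapes chosen for the example. Each machine reads the same ordered updates but writes only the sub-state it owns, and the parent store is left untouched.

```go
package main

import "fmt"

// KVStore is a hypothetical, simplified stand-in for the protocol-state KV store.
// Each field is one sub-state owned by exactly one state machine.
type KVStore struct {
	ProtocolVersion uint64 // owned by versionMachine in this sketch
	EpochCounter    uint64 // owned by epochMachine in this sketch
}

// Update is a hypothetical stand-in for a sealed service event.
type Update struct {
	Kind  string // "version-upgrade" or "epoch-commit" in this sketch
	Value uint64
}

// SubStateMachine evolves a single sub-state of the KV store (assumed interface).
type SubStateMachine interface {
	// ProcessUpdate consumes the ordered updates; irrelevant or invalid ones are ignored.
	ProcessUpdate(updates []Update) error
	// Build writes the resulting sub-state, and nothing else, into the child store.
	Build(child *KVStore)
}

type versionMachine struct{ version uint64 }

func (m *versionMachine) ProcessUpdate(updates []Update) error {
	for _, u := range updates {
		// Only strictly increasing versions are accepted.
		if u.Kind == "version-upgrade" && u.Value > m.version {
			m.version = u.Value
		}
	}
	return nil
}

func (m *versionMachine) Build(child *KVStore) { child.ProtocolVersion = m.version }

type epochMachine struct{ counter uint64 }

func (m *epochMachine) ProcessUpdate(updates []Update) error {
	for _, u := range updates {
		// Only the direct successor epoch is accepted.
		if u.Kind == "epoch-commit" && u.Value == m.counter+1 {
			m.counter = u.Value
		}
	}
	return nil
}

func (m *epochMachine) Build(child *KVStore) { child.EpochCounter = m.counter }

// Evolve derives the child store from the parent store by running every
// sub-state machine over the same ordered updates.
func Evolve(parent KVStore, updates []Update) (KVStore, error) {
	child := parent // value copy, so the parent is never modified
	machines := []SubStateMachine{
		&versionMachine{version: parent.ProtocolVersion},
		&epochMachine{counter: parent.EpochCounter},
	}
	for _, m := range machines {
		if err := m.ProcessUpdate(updates); err != nil {
			return KVStore{}, err
		}
		m.Build(&child)
	}
	return child, nil
}

func main() {
	parent := KVStore{ProtocolVersion: 1, EpochCounter: 7}
	child, err := Evolve(parent, []Update{{Kind: "epoch-commit", Value: 8}})
	fmt.Println(child, err)
}
```

Because every machine writes only its own field, the machines in this sketch can be run in any order without affecting the result.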

The Epoch state machine was reworked into a hierarchical state machine and is treated as another orthogonal state machine that is part of the Dynamic Protocol State.

Support for version upgrades/replication has been added as part of this PR as well.

Conceptually, the biggest change is that the previous notion of protocol state (ProtocolStateEntry, RichProtocolStateEntry, etc.) is replaced by the KV Store, and the previous protocol state becomes part of the KV Store.

⚠️ I have disabled a large number of tests, all of which rely on the ability of Snapshot to provide the KV Store, which is not implemented yet. The tests will be re-enabled after implementing: #5316

…tocol State refactoring. Fixed tests. WIP on fixing last TODOs

@AlexHentschel AlexHentschel left a comment

The scope of the problem is becoming clearer now. Unfortunately, it is bigger than I had hoped. See specifically this comment

state/protocol/badger/mutator.go (outdated, resolved)
state/protocol/protocol_state/epochs/statemachine.go (outdated, resolved)
state/protocol/protocol_state/epochs/statemachine.go (outdated, resolved)
state/protocol/protocol_state/epochs/statemachine.go (outdated, resolved)
state/protocol/protocol_state/epochs/statemachine.go (outdated, resolved)
state/protocol/protocol_state/epochs/statemachine.go (outdated, resolved)
state/protocol/protocol_state/state/mutator.go (outdated, resolved)
durkmurder and others added 2 commits April 3, 2024 10:48
@durkmurder durkmurder requested a review from peterargue as a code owner April 3, 2024 07:50
…causality issue on block production. Implemented an alternative approach with deferred DB operations
codecov-commenter commented Apr 3, 2024

Codecov Report

Attention: Patch coverage is 70.41565% with 121 lines in your changes are missing coverage. Please review.

Project coverage is 53.58%. Comparing base (6fbc7d6) to head (40edad6).

| Files | Patch % | Lines |
|---|---|---|
| state/protocol/protocol_state/epochs/factory.go | 0.00% | 23 Missing ⚠️ |
| ...ate/protocol/protocol_state/epochs/statemachine.go | 83.19% | 15 Missing and 5 partials ⚠️ |
| state/protocol/protocol_state.go | 0.00% | 18 Missing ⚠️ |
| ...col/protocol_state/kvstore/upgrade_statemachine.go | 74.28% | 15 Missing and 3 partials ⚠️ |
| state/protocol/protocol_state/kvstore/models.go | 50.00% | 14 Missing ⚠️ |
| state/protocol/badger/state.go | 77.50% | 6 Missing and 3 partials ⚠️ |
| state/protocol/protocol_state/kvstore/factory.go | 0.00% | 6 Missing ⚠️ |
| cmd/scaffold.go | 0.00% | 5 Missing ⚠️ |
| state/protocol/badger/mutator.go | 40.00% | 2 Missing and 1 partial ⚠️ |
| ...te/protocol/protocol_state/state/protocol_state.go | 90.47% | 2 Missing ⚠️ |

... and 3 more
Additional details and impacted files
@@                        Coverage Diff                         @@
##           feature/protocol-state-kvstore    #5616      +/-   ##
==================================================================
- Coverage                           55.96%   53.58%   -2.39%     
==================================================================
  Files                                1019     1017       -2     
  Lines                               98142    97935     -207     
==================================================================
- Hits                                54926    52477    -2449     
- Misses                              39103    41525    +2422     
+ Partials                             4113     3933     -180     
Flag Coverage Δ
unittests 53.58% <70.41%> (-2.39%) ⬇️


@jordanschalm jordanschalm left a comment

First round of comments (still about half of the PR to review)

state/protocol/protocol_state.go (outdated, resolved)
state/protocol/protocol_state.go (outdated, resolved)
state/protocol/protocol_state/epochs/base_statemachine.go (outdated, resolved)
state/protocol/protocol_state/epochs/base_statemachine.go (outdated, resolved)
state/protocol/protocol_state/epochs/base_statemachine.go (outdated, resolved)
state/protocol/protocol_state.go (outdated, resolved)
state/protocol/protocol_state/state/mutator.go (outdated, resolved)
state/protocol/protocol_state/state/mutator.go (outdated, resolved)
Comment on lines 118 to 119
// ApplyServiceEventsFromValidatedSeals applies the state changes that are delivered via
// sealed service events:
Suggested change
- // ApplyServiceEventsFromValidatedSeals applies the state changes that are delivered via
- // sealed service events:
+ // ApplyServiceEventsFromValidatedSeals applies state changes associated with a candidate block.
+ // State changes are often triggered by a service event sealed by the candidate block, however they
+ // may also be triggered by entering a particular view.
+ // CAUTION: this method MUST be called for all candidates, even if `seals` is empty.
+ //

Comment on lines 202 to 203
// only exceptions should be propagated
err := stateMachine.ProcessUpdate(orderedUpdates)
Suggested change
- // only exceptions should be propagated
- err := stateMachine.ProcessUpdate(orderedUpdates)
+ // CAUTION: `ProcessUpdate` must be called, even if `orderedUpdates` is empty.
+ // only exceptions should be propagated
+ err := stateMachine.ProcessUpdate(orderedUpdates)

The more warning messages I add, the more I feel we should change the interface instead 😅

state/protocol/protocol_state/kvstore.go (outdated, resolved)
return dbUpdates
}

// ProcessUpdate applies the state changes that are delivered via sealed service events.

@AlexHentschel AlexHentschel Apr 4, 2024

Change of opinion 😅

I think the pattern of "this function must be called even if there is no input" is a very unsafe pattern, because it creates an additional edge case across multiple abstraction layers. For example, I think we would need CAUTION statements at this layer. I think we can simplify the usage patterns of our API in this regard. In other words, I think we need to fix it (TODO - not in this PR).

My opinion: we solve this problem at the lowest level. We want to guarantee to state machines that their OrthogonalStoreStateMachine.ProcessUpdate(..) method is called? I would suggest that we implement this guarantee at the StateMutator level:
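
One way such a guarantee could look is sketched below in Go. This is only an illustration under assumptions, not code from the PR or from this review: `Mutator`, `StateMachine`, `ServiceEvent`, `ApplyServiceEvents`, `Build` and `viewCounter` are hypothetical names and signatures. The idea is that callers may or may not hand over events, while the mutator itself calls `ProcessUpdate` on every state machine exactly once per candidate block when it builds the result.

```go
package main

import "fmt"

// ServiceEvent is a hypothetical stand-in for a sealed service event.
type ServiceEvent struct{ Name string }

// StateMachine captures only the ProcessUpdate idea from the discussion;
// the signature is an assumption made for this sketch.
type StateMachine interface {
	ProcessUpdate(updates []ServiceEvent) error
}

// viewCounter is a toy state machine that must observe every candidate block,
// for example because it reacts to entering a particular view.
type viewCounter struct{ calls int }

func (c *viewCounter) ProcessUpdate(updates []ServiceEvent) error {
	c.calls++
	return nil
}

// Mutator is a hypothetical stand-in for the state mutator.
type Mutator struct {
	machines []StateMachine
	pending  []ServiceEvent
}

// ApplyServiceEvents only records events. Calling it is optional, so callers
// no longer need to remember to invoke it with an empty input.
func (m *Mutator) ApplyServiceEvents(events []ServiceEvent) {
	m.pending = append(m.pending, events...)
}

// Build runs ProcessUpdate on every state machine exactly once, whether or
// not any events were recorded, and then clears the recorded events.
func (m *Mutator) Build() error {
	for i, sm := range m.machines {
		if err := sm.ProcessUpdate(m.pending); err != nil {
			return fmt.Errorf("state machine %d failed: %w", i, err)
		}
	}
	m.pending = nil
	return nil
}

func main() {
	counter := &viewCounter{}
	m := &Mutator{machines: []StateMachine{counter}}
	// No events are applied for this candidate block.
	if err := m.Build(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("ProcessUpdate calls:", counter.calls)
}
```

With this shape, the "must be called even if the input is empty" rule lives in one place (`Build`) instead of being repeated as CAUTION comments at each layer.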

@durkmurder durkmurder requested review from jordanschalm and removed request for peterargue April 4, 2024 18:03

@jordanschalm jordanschalm left a comment

Got through the remaining files. A few minor suggestions, nothing major though. Great work!


@AlexHentschel AlexHentschel left a comment

Great PR (and huge). Only minor suggestions. I have some thoughts on how to make DeferredDBUpdate a bit prettier ... but let's get this PR merged first.

state/protocol/badger/state.go (resolved)
state/protocol/protocol_state/epochs/statemachine.go (outdated, resolved)
state/protocol/protocol_state/epochs/statemachine.go (outdated, resolved)
state/protocol/protocol_state/epochs/statemachine.go (outdated, resolved)
state/protocol/protocol_state/state/mutator_test.go (outdated, resolved)
@durkmurder
Member Author

@AlexHentschel @jordanschalm FYI, I have commented out a much-needed integrity check of the sealing segment (which I had previously added at Alex's suggestion), because we cannot support it in the current state: the protocol state IDs differ between the root block and the very next one.

@durkmurder durkmurder merged commit 27a79f1 into feature/protocol-state-kvstore Apr 5, 2024
50 of 51 checks passed
@durkmurder durkmurder deleted the yurii/5556-epochs-hierarchical-state-machine branch April 5, 2024 15:17