Clear the global session cache when a node is unjailed #1529
Thanks @msmania. Logically, this makes sense to me and I've confirmed that we call My only concern is that I don't have the resources right now to validate this, and the testing coverage isn't good enough either. CCing @PoktBlade for a 2nd opinion on whether this is good to just 🚢.
Good find. You are correct that we should be clearing the session cache when unjailing, and this could cause consensus issues because caching makes the behavior nondeterministic.

Most validators and servicers should not have the session cached until the claim/proof lifecycle. They would only if the validator/servicer is in the same app session. Furthermore, sessions are cleared in general based on certain transactions.

TLDR - we're riding on luck with nondeterministic behavior.

Where does the risk lie? Even without LP, the probability of it happening is non-zero.

I am unsure whether we should merge this change as-is or make it a consensus-breaking change. Otherwise, we will have some validators clearing the cache and some validators not clearing it, which could cause more issues.
If we want to play it 100% safe, then an upgrade height would be ideal, and release it ASAP.
I don't mean to distract this conversation, but I added a sample implementation here: It compiles, that's as far as I've tested it.
I added a few comments to the PR in #1532 that has the flag. Do you mind copying those changes into this PR?
AI-Generated Pull Request Summary: This pull request includes two patches:

"github.com/pokt-network/pocket-core/crypto"
exported2 "github.com/pokt-network/pocket-core/x/apps/exported"
"github.com/pokt-network/pocket-core/x/auth"
"github.com/pokt-network/pocket-core/x/gov"
Why do we need this change?
x/gov/module.go importing x/pocketcore/types caused a cyclic import in the x/pocketcore/types package because this file imported x/gov.
It seems we don't need makeTestCodec() at all. I verified that all tests under x/pocketcore/types and x/pocketcore/keeper pass.
Sounds reasonable to me
Done!
This is a fix for a consensus failure with wrong Block.Header.AppHash, which is probably the same issue as #1478.

Issue description:
For a node to be selected for a session, it must be valid not only at the start of the session but also at its end. If a node is edit-staked, jailed, or unjailed in the middle of a session, the session's servicer set changes.
When a node serves relays, on the other hand, it stores session information in the global session cache. If the session's servicer set changes, the node needs to update the cached session. Otherwise the node accepts a claim that should be rejected, or rejects a claim that should be accepted. As a result, the node ends up with a consensus failure with wrong Block.Header.AppHash.

Root cause and Fix:
The root cause is simple. We have a call to ClearSessionCache in EditStakeValidator, LegacyForceValidatorUnstake, ForceValidatorUnstake, and JailValidator, but not in UnjailValidator. The proposed fix is to add the call in UnjailValidator as well.
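
To make the failure mode concrete, here is a minimal toy model in Go. It is not the pocket-core code: the `chain` type, its fields, and the `scenario` helper are all hypothetical, and session validity is reduced to "not jailed when the servicer set is computed". It only illustrates how a node that cached the session before an unjail can reach a different verdict on the same claim than a node with a cold cache, and how clearing the cache on unjail removes that difference.

```go
package main

import "fmt"

// chain is a hypothetical, heavily simplified stand-in for a node's state.
type chain struct {
	nodes         []string           // staked node addresses, in a fixed order
	jailed        map[string]bool    // address -> currently jailed?
	cache         map[int64][]string // session height -> cached servicer set
	clearOnUnjail bool               // true models the proposed fix
}

func newChain(clearOnUnjail bool, nodes ...string) *chain {
	return &chain{
		nodes:         nodes,
		jailed:        map[string]bool{},
		cache:         map[int64][]string{},
		clearOnUnjail: clearOnUnjail,
	}
}

// servicers returns the servicer set for a session, using the cache if present.
func (c *chain) servicers(sessionHeight int64) []string {
	if set, ok := c.cache[sessionHeight]; ok {
		return set
	}
	var set []string
	for _, n := range c.nodes {
		if !c.jailed[n] {
			set = append(set, n)
		}
	}
	c.cache[sessionHeight] = set
	return set
}

func (c *chain) clearSessionCache() { c.cache = map[int64][]string{} }

// jail already clears the cache, as JailValidator is described to do.
func (c *chain) jail(n string) {
	c.jailed[n] = true
	c.clearSessionCache()
}

// unjail clears the cache only when the fix is enabled.
func (c *chain) unjail(n string) {
	c.jailed[n] = false
	if c.clearOnUnjail {
		c.clearSessionCache()
	}
}

// acceptsClaim reports whether a claim from node is accepted for the session.
func (c *chain) acceptsClaim(sessionHeight int64, node string) bool {
	for _, s := range c.servicers(sessionHeight) {
		if s == node {
			return true
		}
	}
	return false
}

// scenario jails B, optionally warms the cache (as a node serving relays
// would), unjails B, and then evaluates a claim from B.
func scenario(clearOnUnjail, warmCache bool) bool {
	c := newChain(clearOnUnjail, "A", "B")
	c.jail("B")
	if warmCache {
		c.servicers(1)
	}
	c.unjail("B")
	return c.acceptsClaim(1, "B")
}

func main() {
	fmt.Println("no clear, cold cache:", scenario(false, false))
	fmt.Println("no clear, warm cache:", scenario(false, true))
	fmt.Println("clear,    cold cache:", scenario(true, false))
	fmt.Println("clear,    warm cache:", scenario(true, true))
}
```

In this model, the two "no clear" runs disagree on the same claim, which is the kind of divergence that would surface as a mismatched AppHash between nodes. With the cache cleared on unjail, the verdict no longer depends on whether the node happened to cache the session.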