Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Order of EndBlock() #51

Closed
mpoke opened this issue Mar 10, 2022 · 4 comments · Fixed by #221
Closed

Order of EndBlock() #51

mpoke opened this issue Mar 10, 2022 · 4 comments · Fixed by #221
Assignees
Labels
scope: docs Improvements or additions to documentation

Comments

@mpoke
Copy link
Contributor

mpoke commented Mar 10, 2022

Update: see #51 (comment)

The EndBlock() of the provider CCV module gets validator updates from the staking module. Therefore, the staking.EndBlock() must be executed before ccv.EndBlock().

Let h be the height at which the provider creates a client for a new consumer chain, i.e.,

func (k Keeper) CreateChildClient(ctx sdk.Context, chainID string, initialHeight clienttypes.Height) error {

Then, the initial validator set on the consumer chain is the next validator set (i.e., the validator set that validates at height h+1). This validator set was updated by executing staking.EndBlock() at height h-1. Thus, the LastValidatorPower index contains this validator set until executing staking.EndBlock() at height h. Since MakeChildGenesis() (called from CreateChildClient()) needs LastValidatorPower to contain this initial validator set, it must be called before staking.EndBlock().
Moreover, since staking.EndBlock() at height h may update the validator set of height h+1 (i.e., the initial val set), these updates must be sent to the consumer chain. Thus, the consumer chain must be registered (i.e., have a client created) before ccv.EndBlock() is called. Consequently, CCV requires the following order: CreateChildClient(), staking.EndBlock(), ccv.EndBlock().

CreateChildClient() is called from two places, i.e.,

return k.CreateChildClient(ctx, p.ChainId, p.InitialHeight)
and
k.CreateChildClient(ctx, splitKey[1], initialHeight)

The former is called from gov.EndBlock(), which would entail the following order of executing the EndBlock(): gov.EndBlock(), staking.EndBlock(), ccv.EndBlock().

The issue is with the latter which is called from ccv.EndBlock() and, thus, would entail the following order: ccv.EndBlock(), staking.EndBlock(), ccv.EndBlock(), which is not possible. As a solution, we should call CreateChildClient() for pending clients from ccv.BeginBlock().

@mpoke
Copy link
Contributor Author

mpoke commented Mar 10, 2022

Test scenario 1:

  • at height h receive proposal to spawn new chain and also remove one validator from the validator set on the provider chain;
  • after the CCV channel is established, check if the validator set on the consumer is updated accordingly, i.e., the result should be the same as applying all the updates that happen on the consumer in the mean time.

Test scenario 2:

  • same as above but the proposal to spawn a consumer chain is in the future, i.e., SpawnTime > BlockTime();
  • in the same block when the client to the consumer chain is created, remove one validator from the validator set on the provider chain.

@jtremback
Copy link
Contributor

jtremback commented Mar 14, 2022

It should be pretty easy to make everything work by moving IteratePendingClientInfo into BeginBlock. This will be correct because in our (and many other chain's?) app.go, the order of the Governance endblocker is set to be before Staking, which is what we need. However, this creates a brittle situation where the order matters but it is very non obvious. This creates the potential for a very hard to diagnose bug that can arise unexpectedly.

Possible solutions:

  • Do nothing. Maybe this is just the way things are in Cosmos. Maybe there is already other stuff that depends on this order.
  • Create some way for modules to query the ModuleManager.OrderEndBlockers to check that the order is correct for them.
  • Keep the last two validator sets in storage, and query them by block number. This way the order doesn't matter.

@AdityaSripal would like to get your opinion on this because I'm sure it has been an issue for other modules.

@mpoke mpoke moved this to In Progress in Replicated Security Apr 11, 2022
@mpoke mpoke added question scope: cosmos-sdk Integration with Cosmos SDK labels Apr 26, 2022
@mpoke mpoke added the product label May 10, 2022
@mpoke mpoke assigned mpoke and unassigned jtremback May 11, 2022
@AdityaSripal
Copy link
Member

Looks like we already do IteratePendingClientInfo in BeginBlock which is correct.

This creates the potential for a very hard to diagnose bug that can arise unexpectedly.

I don't think atm there's a better solution than to document this clearly. Plenty of ways to shoot yourself in the foot by wiring app.go incorrectly. I think so far, we've relied on clear documentation and code examples people can copy/paste from

@mpoke
Copy link
Contributor Author

mpoke commented Jun 8, 2022

Add comment in provider/app.go addressing the order of EndBlock.

@mpoke mpoke added mvcc and removed mvcc labels Jun 8, 2022
@mpoke mpoke moved this from In Progress to Todo in Replicated Security Jun 20, 2022
@mpoke mpoke added the scope: docs Improvements or additions to documentation label Jun 29, 2022
@mpoke mpoke moved this from Todo to Waiting for review in Replicated Security Jul 11, 2022
Repository owner moved this from Waiting for review to Done in Replicated Security Jul 11, 2022
mpoke added a commit that referenced this issue Jul 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scope: docs Improvements or additions to documentation
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants