Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EPIC: make guarantees around query responses #13041

Closed
8 tasks done
Tracked by #13085
tac0turtle opened this issue Aug 25, 2022 · 22 comments
Closed
8 tasks done
Tracked by #13085

EPIC: make guarantees around query responses #13041

tac0turtle opened this issue Aug 25, 2022 · 22 comments
Assignees
Labels
T: Dev UX UX for SDK developers (i.e. how to call our code) T:Epic Epics

Comments

@tac0turtle
Copy link
Member

tac0turtle commented Aug 25, 2022

Summary

As mentioned in the issue from cosmwasm there is a need to have data response guarantees around queries and messages. This issue focuses on the former, queries.

Problem Definition

Within minor releases responses of queries could change, potentially causing issues to downstream users.

Proposal

This could be done in coordination with auto generated queries

@tac0turtle tac0turtle added T: Dev UX UX for SDK developers (i.e. how to call our code) T:Epic Epics labels Aug 25, 2022
@aaronc
Copy link
Member

aaronc commented Aug 25, 2022

I believe there should be some annotation in the .proto files to indicate which queries are deterministic and can be used inside the state machine.

There may always be some cases where people want to use queries in some non-deterministic or very inefficient way. Those queries shouldn't be exposed to the state machine and modules like cosmwasm.

Probably by default we should assume all queries are unsafe, and one by one start marking the safe ones (which have proper regression tests, gas consumption, etc.) with an annotation.

@amaury1093
Copy link
Contributor

It's also probably safer to disable all historical queries, since they depend on each node's pruning settings.

@tac0turtle
Copy link
Member Author

It's also probably safer to disable all historical queries, since they depend on each node's pruning settings.

could you elaborate

@amaury1093
Copy link
Contributor

amaury1093 commented Aug 25, 2022

If your state machine code makes a historical query for height say N-1000, then:

  • if the validator node prunes state older than 1000 blocks, than the query will error
  • if the validtor node doesn't prune, then the query will succeed

causing a consensus error.

My safe solution above is to only allow state machine queries to query the latest state.

@alexanderbez
Copy link
Contributor

I'm not sure I understand @AmauryM. AFAIK, CW would not make any historical queries, but rather assumes the latest/current state.

I like @aaronc's proposal of a Proto annotation. This way devs/clients can inspect docs and know what queries are safe to use.

So I say we:

  1. Introduce and add annotations
  2. Write tests for determinisms for said queries (i.e. loop N times and ensure output is canonical each time)

@tac0turtle tac0turtle mentioned this issue Aug 30, 2022
18 tasks
@tac0turtle tac0turtle moved this to 📝 Todo in Cosmos-SDK Aug 31, 2022
@amaury1093 amaury1093 moved this from 📝 Todo to 💪 In Progress in Cosmos-SDK Sep 7, 2022
@amaury1093
Copy link
Contributor

I creatd a first PR to make sure we agree on the proto annotation, and which queries will have the guarantees on determinism. #13174

@amaury1093
Copy link
Contributor

Write tests for determinisms for said queries (i.e. loop N times and ensure output is canonical each time)

How about using property-based testing with rapid, which the SDK already uses? So basically something like:

// Check GetBalance determinism
rapid.Check(func(t *testing.T) {
  addr := AddressGenerator().Draw(t) // where AddressGenerator returns a rapid.Generator[sdk.AccAddress]
  amt := rapid.Uint64().Draw(t)
  bankKeeper.SetBalance(ctx, addr, sdk.NewCoin(amt, "stake"))
  for i := 0; i < 1000; i++ {
    // make sure bankKeeper.GetBalance always returns the same response
  }
})

So there are 1000*N iterations, where N = our rapid's configuration. This way we don't hardcode the response.

Do we also agree these tests belong inside test/integration ? I.e not suited to be tested using mocks.

@alexanderbez
Copy link
Contributor

I've never seen or used Rapid, but the approach seems reasonable to me.

@tac0turtle
Copy link
Member Author

with rapid how do you verify that the response didnt change in minor releases?

@amaury1093
Copy link
Contributor

amaury1093 commented Sep 8, 2022

with rapid how do you verify that the response didnt change in minor releases?

Good point. So maybe we do one or three hardcoded responses (to check backwards-determinism) plus rapid (to check the property "query is deterministic"). I feel both are useful here since they test different things.

@amaury1093
Copy link
Contributor

amaury1093 commented Oct 10, 2022

FYI, the way we currently test determinism/regression across minor releases is weak. We basically test queries against hardcoded values. E.g. if we fund acct1 with balance2, then the query always return the same balance, see example.

With the above method, we might catch changes in the query response. But we wouldn't catch stuff like:

  • server-side logic change (e.g. additional server-side validation of incoming request on the query handler)
  • gas change

These are not trivial to catch though. Are people fine, for this epic, to do the following:

  1. keep the current hardcoded tests for regression
  2. add clear documentation for module developers that they need to be careful about those state-machine breaking changes on each PR, when they add module_query_safe proto annotation, it can easily be a footgun.

Also happy to hear other ideas on how to test state-machine breaking changes in queries.

@alexanderbez
Copy link
Contributor

@AmauryM I'm a bit lost as to how your points relate to non-determinism re queries:

server-side logic change (e.g. additional server-side validation of incoming request on the query handler)

How is validation breaking determinism? If some query previous didn't error on input but with a new change now does, is this what you mean? If so, tests should easily catch this (e.g. Error vs NoError assertions).

gas change

Why is this difficult? Can't we read the gas consumed off of the context?

@amaury1093
Copy link
Contributor

How is validation breaking determinism? If some query previous didn't error on input but with a new change now does, is this what you mean?

Yes. Imagine you (or any module developer in the future) create a PR on a deterministic query which:

  • adds a new field in the proto request struct
  • does some validation with this new field

Then our existing deterministic tests won't fail. Even though this is a state-machine breaking change. We would need to write new tests.

gas change

So you're proposing to hardcode the gas consumed by each query inside the test itself? Not sure how flakey this would be with our test setup, but we can try.

@alexanderbez
Copy link
Contributor

Then our existing deterministic tests won't fail. Even though this is a state-machine breaking change. We would need to write new tests.

Yeah, makes sense. I suppose there isn't much we can do -- we just have to ensure we right good tests for new fields.

So you're proposing to hardcode the gas consumed by each query inside the test itself? Not sure how flakey this would be with our test setup, but we can try.

Yes, exactly! With queries, the gas should not be flakey since we're not simulating any state! In fact, we can improve these tests by also ensuring the gas consumed never changes per iteration.

@amaury1093
Copy link
Contributor

amaury1093 commented Nov 2, 2022

This Epic is complete 🎉.

As a summary, we added the (cosmos.query.v1.module_query_safe) = true protobuf annotation to Auth,bank, staking grpc queries that are allowed to be called from the state machine. Here's some documentation too.

Repository owner moved this from 📝 Todo to 👏 Done in Cosmos-SDK Nov 2, 2022
@tac0turtle
Copy link
Member Author

amazing, thank you!!! Nice job!!

Should we open a new epic to slowly migrate other modules as well?

@amaury1093
Copy link
Contributor

amaury1093 commented Nov 2, 2022

Not sure about all modules, there are still caveats like #13041 (comment), so a consensus error probability is never 0.

Are there queries/modules that people particularly want? Let's do those first.

@tac0turtle
Copy link
Member Author

@ValarDragon @ethanfrey @assafmo do you have queries you need to be deterministic from the core sdk modules

@ethanfrey
Copy link
Contributor

ethanfrey commented Nov 2, 2022

Bank, Staking and Auth is a huge step forward 💪
Thank you for getting this in 🚀

I don't have any more pressing needs. I think gov would be interesting to some users, but I think it is also in a process of change and not in a place to produce such guarantees.

Maybe adding such deterministic query checks for x/distribution, which is often used with staking?

NB: We just merged support to let chains configure their own such whitelist. Will play together well with this work CosmWasm/wasmd#1069

@assafmo
Copy link
Contributor

assafmo commented Nov 3, 2022

We went over all of these queries and found them to be deterministic: https://github.com/scrtlabs/SecretNetwork/blob/4397f93b2/x/compute/internal/keeper/query_plugins.go#L163-L199

Actually we didn't find a query in the standard modules that isn't deterministic, but we excluded the ones with iterations (E.g. /cosmos.auth.v1beta1.Query/Accounts) to prevent spamming nodes with huge queries.

@amaury1093
Copy link
Contributor

We went over all of these queries and found them to be deterministic

Thanks for the list! @assafmo How did you check they were deterministic?

@assafmo
Copy link
Contributor

assafmo commented Nov 6, 2022

By eye, I went over the code and made sure there's just storage access and no maps serializations

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T: Dev UX UX for SDK developers (i.e. how to call our code) T:Epic Epics
Projects
None yet
Development

No branches or pull requests

7 participants