docs: adr-40: reduce multistore and make it atomic #9355

robert-zaremba · 2021-05-18T22:47:04Z

Description

This ADR-40 update discusses the MultiStore removal.

Closes: #10013
Related discussion: #8297 (comment)

Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

Targeted PR against correct branch (see CONTRIBUTING.md)
Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
Code follows the module structure standards.
Wrote unit and integration tests
Updated relevant documentation (docs/) or specification (x/<module>/spec/)
Added relevant godoc comments.
Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
Re-reviewed Files changed in the Github PR explorer
Review Codecov Report in the comment section below once CI passes

robert-zaremba · 2021-05-18T23:03:31Z

An alternative to the prefix-store-based design is a concept proposed by AiB (@jgimeno, @fdymylja):
We require that all objects stored in store will need to implement the following interface:

type Record interface {
   proto.Message
   ID() []byte
}

Here, the ID method returns a unique key for each proto.Message stored in the store. It uses the proto type URL as a prefix to avoid conflicts between object classes, for example:

coin.ID() == "cosmos/base/coin/<holder_address>/<denom>"

NOTE: AiB proposed a different interface, but the concept is similar.
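
To make the Record idea concrete, here is a minimal sketch of a type that satisfies it. To keep the example free of a protobuf dependency, `Message` is a hypothetical stand-in for proto.Message, and the `Coin` fields and the `ProtoTypeURL` helper are assumptions made for illustration only.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for proto.Message so that the sketch
// compiles without a protobuf dependency.
type Message interface {
	ProtoTypeURL() string
}

// Record mirrors the interface from the comment above.
type Record interface {
	Message
	ID() []byte
}

// Coin is an illustrative object; its fields are assumptions.
type Coin struct {
	Holder string
	Denom  string
	Amount uint64
}

// ProtoTypeURL returns the type URL used as the key prefix.
func (c Coin) ProtoTypeURL() string { return "cosmos/base/coin" }

// ID prefixes the object key with the type URL so that records of
// different types cannot collide.
func (c Coin) ID() []byte {
	return []byte(c.ProtoTypeURL() + "/" + c.Holder + "/" + c.Denom)
}

func main() {
	var r Record = Coin{Holder: "cosmos1abc", Denom: "uatom", Amount: 10}
	fmt.Printf("%s\n", r.ID())
}
```

Every stored type has to carry such a method, which is the boilerplate cost this design implies.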

robert-zaremba · 2021-05-18T23:06:28Z

@aaronc, @alexanderbez, @alessio, @adlerjohn, @i-norden - let me know what you think about this update. Shall we do more research before creating a PR?

tac0turtle · 2021-05-19T08:22:52Z

We require that all objects stored in store will need to implement the following interface:

This seems like it can lead to easy user errors. I could be wrong, but it adds more boilerplate IMO.

robert-zaremba · 2021-05-19T08:25:54Z

For better organization, let's continue the multistore discussion in "Replace IAVL and decouple state commitment from storage" (#8297).

fdymylja · 2021-05-19T08:35:43Z

Just as a short update on the work being done, we're moving away from the GetID interface.

The user does not need to implement GetID anymore. Instead, they mark the PrimaryKey field during the StateObject registration, and the store implementation already knows how to encode that field (no matter what type it is) into a valid byte key.

For example: https://github.com/allinbits/cosmos-sdk-poc/blob/frojdi/crisis/runtime/orm/schema/schema.go#L71

This removes the need to implement any kind of interface: given a protobuf message, the store is now smart enough to recognize which field is its primary key, how to get it, and how to encode it.
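
A rough sketch of that registration flow is shown below. It is not the PoC's API: `Schema`, `Register`, `Key`, and `Account` are hypothetical names, plain Go structs with `reflect` stand in for protobuf reflection, and only string and uint64 primary keys are handled.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"reflect"
)

// Schema records which field of a registered type is its primary key.
// All names here are hypothetical and are not the PoC's API.
type Schema struct {
	primaryKey map[reflect.Type]string
}

func NewSchema() *Schema {
	return &Schema{primaryKey: make(map[reflect.Type]string)}
}

// Register marks field as the primary key of obj's type.
func (s *Schema) Register(obj interface{}, field string) error {
	t := reflect.TypeOf(obj)
	if t == nil || t.Kind() != reflect.Struct {
		return fmt.Errorf("%v is not a struct", t)
	}
	if _, ok := t.FieldByName(field); !ok {
		return fmt.Errorf("%s has no field %q", t, field)
	}
	s.primaryKey[t] = field
	return nil
}

// Key encodes the primary key of obj as "<type name>/<encoded field>".
func (s *Schema) Key(obj interface{}) ([]byte, error) {
	t := reflect.TypeOf(obj)
	field, ok := s.primaryKey[t]
	if !ok {
		return nil, fmt.Errorf("%v is not registered", t)
	}
	v := reflect.ValueOf(obj).FieldByName(field)
	key := []byte(t.Name() + "/")
	switch v.Kind() {
	case reflect.String:
		key = append(key, v.String()...)
	case reflect.Uint64:
		// Big-endian keeps numeric keys in sorted order for iteration.
		var buf [8]byte
		binary.BigEndian.PutUint64(buf[:], v.Uint())
		key = append(key, buf[:]...)
	default:
		return nil, fmt.Errorf("unsupported primary key kind %s", v.Kind())
	}
	return key, nil
}

// Account is an illustrative state object.
type Account struct {
	Address string
	Nonce   uint64
}

func main() {
	s := NewSchema()
	if err := s.Register(Account{}, "Address"); err != nil {
		panic(err)
	}
	key, err := s.Key(Account{Address: "cosmos1abc", Nonce: 7})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", key)
}
```

The state object itself stays free of store-specific methods; the key logic lives once in the store, which is the main difference from the Record interface.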

alexanderbez · 2021-05-19T12:24:43Z

@aaronc, @alexanderbez, @alessio, @adlerjohn, @i-norden - let me know what you think about this update. Shall we do more research before creating a PR?

I like it, but it requires more lift from devs, which is probably OK? It seems like the updated proposal from @fdymylja is a bit cleaner.

tac0turtle · 2021-05-27T15:37:58Z

@ValarDragon brought up a use case for multistores. If I as an app developer want to use a different state commitment for my module, multistores allow this. Is there a way to integrate such a feature into this design?

Maybe @ValarDragon can also say a bit more about the use case.

robert-zaremba · 2021-05-28T13:43:36Z

If I as an app developer want to use a different state commitment for my module

I see the motivation. I don't think it justifies the complexity and other problems listed in the Multistore section of this ADR.

Moreover, the multistore currently uses the same DB, so if we were to allow different mechanisms for SC we would need to separate everything and add one more Merkle tree layer to commit to all stores.

robert-zaremba · 2021-05-28T13:44:40Z

Updates

Added section about multistore removal
Added requirement about accessing old state of SC
Added more details about the database setup.

robert-zaremba · 2021-05-28T13:52:05Z

docs/architecture/adr-040-storage-and-smt-state-commitments.md

+```
+
+Where `store.Code(prefix)` is a Huffman Code of `prefix` in the given `store`.
+


@aaronc originally suggested to map a module "verbose" key to a 2 byte sequence prefix for key prefix compression, eg:

bank -> 0x0001
staking -> 0x0002
...

In both mechanisms we will need to assure that the created map will be stable (we will always construct the same mapping pairs).

NOTE: Both Huffman Codes and module key map compress the keys only for the SS. It's not needed for SC because in SC we always operate on a hash of a key.
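
The difference between the two key spaces can be sketched as follows. This is only an illustration under assumptions: the fixed `prefixes` map, the `/` separator, and the use of SHA-256 are hypothetical choices, not something the ADR prescribes.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// prefixes is a hypothetical, fixed module-key map in the spirit of
// "bank -> 0x0001, staking -> 0x0002".
var prefixes = map[string][2]byte{
	"bank":    {0x00, 0x01},
	"staking": {0x00, 0x02},
}

// ssKey builds the State Storage key: the compressed prefix followed by
// the record key, so the verbose module name is never stored.
func ssKey(module string, key []byte) ([]byte, error) {
	p, ok := prefixes[module]
	if !ok {
		return nil, fmt.Errorf("unknown module %q", module)
	}
	return append(p[:], key...), nil
}

// scKey builds the State Commitment key: a fixed-size hash of the verbose
// key, so the length of the module name has no effect on what is stored.
func scKey(module string, key []byte) [32]byte {
	return sha256.Sum256(append([]byte(module+"/"), key...))
}

func main() {
	k, err := ssKey("bank", []byte("balance/alice"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("SS key: %x\n", k)
	fmt.Printf("SC key: %x\n", scKey("bank", []byte("balance/alice")))
}
```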

I think Aaron's idea is easier to manage. The limitation is that it bounds the number of modules to 65536 (2^16), which is large enough not to worry about for now.

To make this prefix model work effectively, I think there should be some stateful prefix map - possibly stored under some special restricted schema prefix (probably 0) on the RootStore. We can use varint encoding and not require a restriction to 2 bytes, but also 2 bytes is probably fine.

In this restricted schema key-space, we would have a mapping from key name -> compressed prefix, which is itself prefixed to allow for other schema use cases.

How does that sound?

How important is it to have this optimization at all?

String prefixes could potentially be rather long (maybe full proto msg names eventually as discussed in #9156). I think it's relatively important.

If we want to enable variable prefix lengths, then Huffman coding is better, because it takes frequency into account and also protects against common prefixes.

NOTE: we decided to use varint.

ValarDragon · 2021-06-03T00:35:06Z

IMO there are important use cases that need sub-trees to have a different merkleization / proof format.

E.g. for a multi-asset shielded pool that reuses existing SNARK circuits, we need the Merkle tree to be something other than IAVL or our SMT.

For making SNARK proofs of merkle tree auth paths, you want different merkle tree structures (e.g. often non-binary) & different hash functions.

This was previously pretty doable by using a different store within the multistore.


robert-zaremba · 2021-10-06T15:33:35Z

Taking #9156 into account, a few things I think are important:

- being able to generically compress any proto message typeURL to a 1-2 byte prefix for SS that is stable (I suggest a varint managed by a schema sub-store)
- we probably can get in-memory caching for free because the ORM layer can handle that and even skip decoding for frequently accessed objects, so memory and transient stores may be less needed...

@aaronc I think this is out of the scope of this PR. That being said - it's a good update and I think we should design for key compression. I will add that note as a TODO in this PR and let's specify it in a new PR.

robert-zaremba · 2021-10-06T17:12:41Z

Updates

I rephrased the paragraph about using two separate DB instances
Added a note about private stores and use of the low level interface
The RootStore interface is a work in progress and will be updated as the implementation progresses.
Added more proposals for key compression and left a TODO to decide on one. Let's iterate on it later.
Removed the proposal to extend KVStore with WithPrefix

Let's move forward and merge this PR. I left a few TODOs that are not directly related to this PR. They will be resolved in the next iteration; the ADR is still in DRAFT.

robert-zaremba · 2021-10-06T17:17:56Z

@alexanderbez, @aaronc - let's finalize this PR and merge it as an iterative update. For the unresolved parts I've added TODOs.

roysc · 2021-10-07T06:45:21Z

docs/architecture/adr-040-storage-and-smt-state-commitments.md

+    assert( !k1.hasPrefix(k2) )
+```
+
+NOTE: We need to assure that the codes won't change. Huffman Coding depends on the keys and its frequency - so we would need to generate the codes and then fix the mapping in a static variable.


I don't see why this couldn't change when stores are added/renamed, since it will only be used in the SS implementation. The current mapping could just be stored in a separate namespace in the DB.

Yes, makes sense.
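
The assertion in the quoted hunk, that no store code may be a prefix of another, can be checked with a small helper like the one below. This is a sketch only; `prefixFree` and the sample codes are hypothetical and do not come from the ADR.

```go
package main

import (
	"bytes"
	"fmt"
)

// prefixFree reports whether no code in codes is a prefix of another code.
// Duplicate codes also count as violations.
func prefixFree(codes [][]byte) bool {
	for i, a := range codes {
		for j, b := range codes {
			if i != j && bytes.HasPrefix(a, b) {
				return false
			}
		}
	}
	return true
}

func main() {
	ok := [][]byte{{0x01}, {0x02}, {0x80, 0x01}}
	bad := [][]byte{{0x01}, {0x01, 0x02}}
	fmt.Println(prefixFree(ok), prefixFree(bad))
}
```

If the mapping is allowed to change when stores are added or renamed, a check of this kind would have to hold for the whole set after every change.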

aaronc

Most of this looks good (pending @roysc's suggestions getting merged).

I don't agree with the Huffman Coding part and suggest we delete it. We don't know all prefixes a priori and this set can change. Instead, as I proposed before, we can keep a simple mapping of prefixes and use varint encoding. Specifically, how about this:

- we define the prefix 0x0 in SS as the prefix for special schema sub-store to be used internally by RootStore
- in the schema sub-store we store three things:
  - map of store key -> prefix
  - map of prefix -> store key
  - auto-incrementing sequence for the next prefix

We can just use auto-incrementing varints which ensure no collisions and are short and easy to assign.
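
The proposal above can be sketched as an in-memory model. The names `schemaStore`, `Prefix`, and `StoreKey` are hypothetical, and a real implementation would persist the two maps and the sequence under the reserved 0x0 prefix in the SS database rather than in Go maps. The varint helpers come from Go's standard `encoding/binary` package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// schemaStore is a hypothetical in-memory model of the schema sub-store
// kept under the reserved 0x0 prefix.
type schemaStore struct {
	keyToPrefix map[string]uint64
	prefixToKey map[uint64]string
	next        uint64 // auto-incrementing sequence for the next prefix
}

func newSchemaStore() *schemaStore {
	return &schemaStore{
		keyToPrefix: make(map[string]uint64),
		prefixToKey: make(map[uint64]string),
		next:        1, // 0 is reserved for the schema sub-store itself
	}
}

// Prefix returns the varint-encoded prefix for storeKey, assigning the next
// sequence number the first time the key is seen.
func (s *schemaStore) Prefix(storeKey string) []byte {
	id, ok := s.keyToPrefix[storeKey]
	if !ok {
		id = s.next
		s.next++
		s.keyToPrefix[storeKey] = id
		s.prefixToKey[id] = storeKey
	}
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, id)
	return buf[:n]
}

// StoreKey resolves a varint-encoded prefix back to its store key.
func (s *schemaStore) StoreKey(prefix []byte) (string, bool) {
	id, n := binary.Uvarint(prefix)
	if n <= 0 {
		return "", false
	}
	key, ok := s.prefixToKey[id]
	return key, ok
}

func main() {
	s := newSchemaStore()
	fmt.Printf("bank    -> %x\n", s.Prefix("bank"))
	fmt.Printf("staking -> %x\n", s.Prefix("staking"))
	fmt.Printf("bank    -> %x (stable on repeat)\n", s.Prefix("bank"))
	key, _ := s.StoreKey([]byte{0x02})
	fmt.Println(key)
}
```

Because the sequence only ever increases and existing entries are never reassigned, a store keeps its prefix when other stores are added, and the first 127 stores get a one-byte prefix.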

tac0turtle · 2021-10-14T10:41:58Z

@robert-zaremba can you address Aaron's comment? Let's merge this. It's been open for 6 months.

* adr-40: use prefix store instead of multistore
* add note about prefix.Store
* Update SC and SS setup information and historical versions sepc
* add note about key prefix optimization
* rephrased the changes related to multistore
* Apply suggestions from code review
  Co-authored-by: Ryan Christoffersen <[email protected]>
* Update docs/architecture/adr-040-storage-and-smt-state-commitments.md
* Update docs/architecture/adr-040-storage-and-smt-state-commitments.md
* Update docs/architecture/adr-040-storage-and-smt-state-commitments.md
  Co-authored-by: Aleksandr Bezobchuk <[email protected]>
* design update
* update merkle proofs
* Apply suggestions from code review
  Co-authored-by: Aleksandr Bezobchuk <[email protected]>
* reword huffman compression paragraph
* ADR-40: update on multi-store refactor and IBC proofs (#10191)
  * Update on multistore refactor and IBC proof
  * cleanup whitespace
  * Update docs/architecture/adr-040-storage-and-smt-state-commitments.md
    Co-authored-by: Robert Zaremba <[email protected]>
  * revise for PR
  * add todo
  * Update docs/architecture/adr-040-storage-and-smt-state-commitments.md
    Co-authored-by: Robert Zaremba <[email protected]>
  Co-authored-by: Robert Zaremba <[email protected]>
* review updates
* add todo for protobuf message type compression
* add link to a discussion
* guarantee atomic commit with IBC workaround proposal
* adding more links to references
* Apply suggestions from code review
  Co-authored-by: Roy Crihfield <[email protected]>
* reword the module key compression part

Co-authored-by: Ryan Christoffersen <[email protected]>
Co-authored-by: Federico Kunze <[email protected]>
Co-authored-by: Aleksandr Bezobchuk <[email protected]>
Co-authored-by: Roy Crihfield <[email protected]>

adr-40: use prefix store instead of multistore

201c73e

robert-zaremba added C:Store T: ADR An issue or PR relating to an architectural decision record labels May 18, 2021

github-actions bot added the T:Docs Changes and features related to documentation. label May 18, 2021

add note about prefix.Store

c432f5e

robert-zaremba mentioned this pull request May 18, 2021

ADR-040: Storage and SMT State Commitments #8430

Merged

9 tasks

robert-zaremba mentioned this pull request May 21, 2021

Logical Commit(s) are not Atomic #6370

Closed

4 tasks

robert-zaremba added 2 commits May 28, 2021 12:13

Merge branch 'master' into robert/adr-40-update

7c4e308

Update SC and SS setup information and historical versions sepc

bbc692f

robert-zaremba marked this pull request as ready for review May 28, 2021 13:37

robert-zaremba requested review from aaronc and alexanderbez as code owners May 28, 2021 13:37

robert-zaremba requested review from tac0turtle, fdymylja and liamsi May 28, 2021 13:38

robert-zaremba commented May 28, 2021

robert-zaremba added 2 commits May 28, 2021 16:04

add note about key prefix optimization

0612088

rephrased the changes related to multistore

c12c5ec

ryanchristo reviewed Jun 3, 2021

Apply suggestions from code review

60af1b0

Co-authored-by: Ryan Christoffersen <[email protected]>

robert-zaremba and others added 2 commits September 22, 2021 18:15

Apply suggestions from code review

1781fb5

Co-authored-by: Aleksandr Bezobchuk <[email protected]>

reword huffman compression paragraph

eaa80de

liamsi mentioned this pull request Sep 27, 2021

Tree sharding to process updates in parallel celestiaorg/smt#62

Open

i-norden approved these changes Oct 6, 2021

robert-zaremba added 2 commits October 6, 2021 19:14

review updates

2a7b1aa

Merge remote-tracking branch 'origin/master' into robert/adr-40-update

50cf72a

robert-zaremba added 4 commits October 6, 2021 19:21

add todo for protobuf message type compression

fbe4676

add link to a discussion

5dad4ef

guarantee atomic commit with IBC workaround proposal

f9c77ae

adding more links to references

1fec94c

roysc reviewed Oct 7, 2021

aaronc reviewed Oct 7, 2021

robert-zaremba mentioned this pull request Oct 13, 2021

ADR-40 Implementation #10360

Closed

20 tasks

robert-zaremba and others added 3 commits October 21, 2021 14:55

Apply suggestions from code review

74193a9

Co-authored-by: Roy Crihfield <[email protected]>

reword the module key compression part

2390332

Merge branch 'master' into robert/adr-40-update

aa9fd58

robert-zaremba added the A:automerge Automatically merge PR once all prerequisites pass. label Oct 21, 2021

robert-zaremba changed the title ~~adr-40: reduce multistore and make it atomic~~ doc: adr-40: reduce multistore and make it atomic Oct 21, 2021

Merge branch 'master' into robert/adr-40-update

6206ce1

robert-zaremba changed the title ~~doc: adr-40: reduce multistore and make it atomic~~ docs: adr-40: reduce multistore and make it atomic Oct 22, 2021

robert-zaremba merged commit f3ffb33 into master Oct 22, 2021

robert-zaremba deleted the robert/adr-40-update branch October 22, 2021 10:45

robert-zaremba mentioned this pull request Oct 26, 2021

SMT <> IBC tracking issue #10433

Closed

6 tasks

robert-zaremba mentioned this pull request Nov 10, 2021

feat: ADR-040: Add RootStore implementation #10430

Merged

19 tasks
