Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(grandfather): Snapshotted trees #3207

Closed
Tracked by #849 ...
LHerskind opened this issue Nov 2, 2023 · 8 comments · Fixed by #3468
Closed
Tracked by #849 ...

feat(grandfather): Snapshotted trees #3207

LHerskind opened this issue Nov 2, 2023 · 8 comments · Fixed by #3468
Assignees
Labels
S-blocked Status: Blocked S-needs-discussion Status: Still needs more discussion before work can start.

Comments

@LHerskind
Copy link
Contributor

LHerskind commented Nov 2, 2023

The power of the grandfather lies in reading from many different blocks in the same transaction. However, this is a true pain if the trees are not snapshotted, since you effectively need what behave as an Ethereum archive node where you can look up paths of a specific version of the note-hash and other trees.

Without snapshotting you will practically just be using the last block always. Need to figure out how state snapshotting can happen, and make it simple devex, similar to including a param with the block you want to query at.

@benesjan
Copy link
Contributor

benesjan commented Nov 3, 2023

Here is how we could do this:

1. Snapshot approach
Snapshot the whole DB at each n-th block and then replay the state changes to get to the desired block.

E.g. every 5 blocks we would copy the DB and then we would replay the changes if we need to. So if I requested a sibling path of a leaf at block 8, I would get the db at block 5 and apply changes to it from block 6, 7 and 8. These changes are stored in the archival node so it shouldn't be that hard. Advantage of this approach is that it's quite simple but a big disadvantage is that it results in a lot of storage.

2. Storing-changes approach
With each new block, store only the changed nodes of the tree. Currently we store the trees in a flat structure (key-value map). In this structure each node of the db is under a key which consists of (tree_name, level, index).

What we could do is expand this key with block number so (tree_name, level, index) would become (tree_name, block_number, level, index).

But this begs a question of how would we figure out at would blocks each node was changed upon sibling path retrieval. That is, if I want to fetch a leaf at index 6 and its sibling path at block 10 how would I determine which keys to use?

2.1 Append-only tree
For the append-only tree this can be done very efficiently. This is especially true when we are inserting blocks of fixed sizes - fixed amount of new leafs. In a scenario of the fixed block size I can determine at what block the particular leaf was inserted just from its index. Having this information I can quite easily determine at what block all the nodes in the leaf's sibling path have changed as well. This achieves optimal results when it comes to storage requirements.

2.2 Sparse tree
For the sparse tree this is way more tricky because what nodes have changed at what block is essentially random. For this reason, along with the node value information stored under the (tree_name, block_number, level, index) key, we would also need to store the information at what blocks each leaf and node has changed. I would use bitmap to store this information and a positive bit would represent a change at the given block. In each bitmap I would store changes corresponding to 256 blocks (32 bytes of info). Given that in the sparse tree most nodes are empty I would also store a [(tree_name, level, index) --> bool] map indicating whether the node was ever set.

Bellow is an example flow of how fetching a leaf and a sibling path would work when trying to get a leaf at index 9 at block 259.

  1. is non_empty_node_map(sparse_tree_1, bottom_level, index_9) empty? - NO
  2. get change_bitmap for blocks from range 257 - 512 (both indices inclusive, first block with block number 1)
  3. search back from block 259 and find at which block the node was last changed - leaf changed at block 259? - NO, 258? - NO, 257? - NO
  4. get change_bitmap for blocks from range 1 - 256 and search back from block 256.
  5. once you find the block in which the leaf changed get the leaf value at (sparse_tree_1, change_block_number, bottom_level, index_9)
  6. Perform steps 1-5 for each node in the sibling path.

It's quite likely that the search back could be performed very efficiently using bitwise operations.

Overall I think it doesn't make sense to go with approach 1. because it's too inefficient, replaying can be tricky, and approach 2. is not that hard to implement. I think for the append-only tree the approach is quite elegant but I have doubts about the sparse tree. If someone is aware of a better way how to do this LMK.

I looked at what Erigon is doing and their optimization seem to be too low-level and hence irrelevant for sandbox. Here are their docs for anyone curious.

@iAmMichaelConnor we have a nice problem here which is blocking quite a lot of stuff 😆

@benesjan benesjan added the S-needs-discussion Status: Still needs more discussion before work can start. label Nov 3, 2023
@benesjan benesjan moved this from Todo to Blocked in A3 Nov 3, 2023
@benesjan benesjan added the S-blocked Status: Blocked label Nov 3, 2023
@LHerskind
Copy link
Contributor Author

Not sure we actually need a lot of input from @iAmMichaelConnor since this is more a implementation "detail" than something that changes the protocol. Essentially how do we store it efficiently for our case, but it don't change much in the protocol itself.

I think going with solution 1 might be fine for the sandbox, where the lifetime of the chain is somewhat limited, so the data that we need to store for each snapshot is not huge. When we are talking kb/mb of storage, it seems like a no-brainer to me as it should be possible to do quite quickly (I say naivly).

Since our nullifier updates are also quite funky from the tree view, it seems like spending a lot of engineering power here now is over-engineering 🤷.

For a production note, it would be horrible to do this, but I think it is workable here.

For solution 2, it sounds like you essentially end up with something that is like a dynamic predecessor problem? (Predecessor problem, finding the largest element $S$ that is $\le x$ where $S$ is a subset of the universe of elements $U$) In our case, find the newest set of notes $S$ that is the subset of the state $U$ with the freshest elements at time $x$. Might make sense to look at translating it if what I say make a bit of sense, then should be possible to use Y-fast tries or van Emda Boas 🤔 @benesjan

@iAmMichaelConnor
Copy link
Contributor

Fun problem! I love implementing tree stuff!

An interesting future enhancement would be to enable a pxe to subscribe to hash path changes (each block) for the notes in the user's db. E.g. if a user owned a note, they could subscribe to the hash paths of the "low nullifier(s)" which 'jump over' that note's nullifier. This would allow a user to prove "this note wasn't nullified yet, in historic block b", without having to run an archive node.
Maybe it's too niche a use case, but would be cool.

@alexghr
Copy link
Contributor

alexghr commented Nov 17, 2023

I think an alternative would be to store the tree nodes in the database as content-addressed entries (e.g. like pointers in memory) where the key is the actual value (hash) of that node and the value of the entry is a tuple of the two children that make up the node.

This would allow us to structurally share subtrees between different blocks (e.g. if something changes only the right subtree then the new root can point to the old left-subtree and the new right-subtree).

Getting the value of a leaf at a particular index at a particular block number comes down to traversing the tree down from the root (conveniently this would also give us the merkle path). To get the root of a subtree we could hold known references, e.g. "contracts_tree:block_5:root" => 0x123. These references could be updated in place if a block reorganisation happens.

Example with an append tree

Initially the tree and database are both empty (don't store null hashes)

initial_state

Appending a to the tree would insert 4 new nodes and a reference for the current block

append_a

Appending b would against insert 4 new nodes but that's because the tree for the previous block still has references to the old values. The a entry in the database would be shared would be shared by both trees.

append_b

This becomes much more efficient with batched inserts because in the database we only need to keep the end state of the tree for each block.

Sparse tree

This would work the same way for sparse trees.

sparse

I'll have a think about how much storage this approach would take in the database.

WDYT?

@LHerskind
Copy link
Contributor Author

WDYT?

I quite like it. Think it have similar performance to the ideas from @benesjan with append-only, but for sparse and nullifier trees it might go crazy with nodes to be updated.

I think we can could use it as the base idea of the trees, but would probably still need something like the store only every n to keep the disk usage down, and then replaying the changes of a few blocks if needing to compute something that is not in store already.

@alexghr
Copy link
Contributor

alexghr commented Nov 18, 2023

we'll prune the branches old trees

giphy

@alexghr
Copy link
Contributor

alexghr commented Nov 28, 2023

I think for an append-only tree the only thing that needs to be stored per snapshot is up to which index we've written leaves to. Since the tree only ever grows by adding leaves and historic values never change, we just have to check if a leaf's index was set at a particular block number (ie leaf index < max leaf index at block) and maybe the tree root for good measure. Am I missing something? This sounds really cheap and efficient.

Merkle proof would be a bit more involved though since we can't compute it on the most recent tree.

@LHerskind
Copy link
Contributor Author

Merkle proof would be a bit more involved though since we can't compute it on the most recent tree.

The reason we want the snapshots is to get the sibling paths for merle inclusion proofs such that we can prove that they were part of historic state.

PhilWindle added a commit that referenced this issue Dec 1, 2023
This PR adds two different algorithms to create snapshots of merkle
trees.

## Notation

N - number of non-zero nodes
H - height of tree
M - number of snapshots

## ~Incremental~ Full snapshots

This algorithm stores the trees in a database instance in the same way
they would be stored using pointers in memory. Each node has two
key-value entries in the database: one for its left child and one for
its right child.

Taking a snapshot means just walking the tree (BF order), storing any
new nodes. If we hit a node that already exists in the database, that
means the subtree rooted at that node has not changed and so we can skip
checking it.

Building a sibling path is just a traversal of the tree from historic
root to historic leaf.

Pros:
1. its generic enough that it works with any type of merkle tree (one
caveat: we'll need an extension to store historic leaf values for
indexed-trees)
2. it shares structure with other versions of the tree: if some subtree
hasn't changed between two snapshots then it will be reused for both
snapshots (would work well for append-only trees)
3. getting a historic sibling path is extremely cheap only requiring
O(H) database reads

Cons:
1. it takes up a space to store every snapshot. Worst case scenario it
would take up an extra O(N * M) space (ie. the tree changes entirely
after every snapshot). For append-only trees it would be more
space-efficient (e.g. once we start filling the right subtree, the left
subtree will be shared by every future snapshot).

More details in this comment
#3207 (comment).

## Append-only snapshots

For append-only trees we can have an infinite number of snapshots with
just O(N + M) extra space (N - number of nodes in the tree, M - number
of snapshots) at the cost of sibling paths possibly requiring O(H)
hashes.

This way of storing snapshots only works for `AppendOnlyTree` and
exploits two properties of this type of tree:

1. once a leaf is set, it can never be changed (i.e. the tree is
append-only). This property can be generalised to: once a subtree is
completely filled then none of its nodes can ever change.
2. there are no gaps in the leaves: at any moment in time T we can say
that the first K leaves of the tree are filled and the last 2^H-K leaves
are zeroes.

The algorithm stores a copy of the tree in the database:
- the last snapshot of that node (ie v1, v2, etc)
- the node value at that snapshot

Taking the snapshot is also a BF traversal of the tree, comparing the
current value of nodes with its previously stored value. If the value is
different we update both entries. If the values are the same then the
node hasn't changed and by the first property, none of the nodes in its
subtree have changed so it returns early. For each snapshot we also
store how many leaves have been filled (e.g. at v1 we had 10 leaves, at
v2 33 leaves, etc)

Building a sibling path is more involved and could potentially require
O(H) hashes:
1. walk the historic tree from leaf to root
2. check the sibling's last snapshot
- if it's before the target snapshot version then we can safely use the
node (e.g. node A was last changed at time T3 and we're building a proof
for the tree at time T10, then by property 1 we can conclude that
neither node A nor the subtree rooted at A will ever change at any
moment in time Tn where n > 3)
- if the node has changed then we have to rebuild its historic value at
the snapshot we're interested in
- to do this we check how "wide" the subtree rooted at that node is
(e.g. what's the first leaf index of the subtree) and if it intersects
with the filled leaf set at that snapshot. If it doesn't intersect at
all (e.g. my subtree's first leaf is 11 but only 5 leaves were filled at
the time) then we can safely conclude that the whole subtree has some
hash of zeros
- if it does intersect go down one level and level apply step 2 again.
- the invariant is: we will either reach a leaf and that leaf was
changed in the version we're interested in, or we reach a node that was
changed before the version we're interested in (and we return early) or
we reach a node that was historically a hash of zero

Two (big) SVGs showing the sibling path algorithm step-by-step

[Average
case](https://github.com/AztecProtocol/aztec-packages/assets/3816165/b87fa6eb-bcf4-42ca-879d-173a76d802bb)


[Drawing of sibling path algorithm in the worst
case](https://github.com/AztecProtocol/aztec-packages/assets/3816165/dd3788ec-6357-4fab-bf78-3496a2948040)


Pros:
- low space requirements only needing O(N+M) extra space.

Cons:
- generating a sibling path involves mixed workload: partly reading from
the database and partly hashing. Worst case scenario O(H) db reads +
O(H) hashing
- doesn't work for `IndexedTree`s because even though it only supports
appending new leaves, internally it updates its leaf values.

Fix #3207

---------

Co-authored-by: PhilWindle <[email protected]>
@github-project-automation github-project-automation bot moved this from Blocked to Done in A3 Dec 1, 2023
@alexghr alexghr self-assigned this Dec 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-blocked Status: Blocked S-needs-discussion Status: Still needs more discussion before work can start.
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants