Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

suggestion: use a merkle skip list to quickly calculate rolling consensus hashes and support quick querying #146

Closed
shea256 opened this issue Oct 1, 2015 · 6 comments
Labels

Comments

@shea256
Copy link
Contributor

shea256 commented Oct 1, 2015

The goal here is to calculate consensus hashes from all previous consensus hashes in O(log(T) + n) instead of O(T) where T is the number of blocks (and thus consensus hashes) since the beginning of time and n is the number of name operations in the current block.

Saving consensus hashes at all points in time is valuable so that one can easily validate all of the operations that were seen at time t. This is really nice for lightweight node support. Plus we've already established from previous discussions that saving all name operations at every point in time is valuable for verifying a database one gets from a peer for bootstrapping purposes.

Here's how we can calculate the consensus hashes...

I define a merkle skip list as a rolling merkle root calculation, where at time t the merkle tree is built from the data added at time t, plus a sublinear set of all previous merkle roots. Then the newly calculated merkle root is saved for future calculations.

As an example, if we define the first seen blockstore block as t=0, then the consensus hashes for t=0 to t=7 can be saved, and the hash at t=7 could be calculated by combining all of the name operations at time t=7 with the consensus hashes at times t=6, t=4, and t=0.

Here is a full table of all the merkle tree calculations at every step through t=7:

Block # Merkle Tree Contents
0 block 0 ops
1 block 1 ops, hash from block 0
2 block 2 ops, hashes from block 1
3 block 3 ops, hashes from blocks 2 + 0
4 block 4 ops, hashes from blocks 3 + 1
5 block 5 ops, hashes from blocks 4 + 2
6 block 6 ops, hashes from blocks 5 + 3
7 block 7 ops, hashes from blocks 6 + 4 + 0
15 block 15 ops, hashes from blocks 14 + 12 + 8 + 0
31 block 31 ops, hashes from blocks 30 + 28 + 24 + 16 + 0

Note that the merkle root for any given block validates all merkle roots from the beginning of time, but only requires O(log(T) + n) calculations, as mentioned above. We can assume that T will grow into the hundreds of thousands while n will likely remain in the hundreds or thousands (given limits on block size).

Also note that if we have a trusted consensus hash for block 7 and ask an untrusted node for the proof that a given operation in block 5, we only need to be served the block ops from blocks 7 and 5. We don't need to see all previous block ops.

I'm not sure what to call this data structure but I'll call it a merkle skip list for now. Other thoughts on this are welcome.

@jcnelson
Copy link
Member

jcnelson commented Oct 2, 2015

Given the consensus hash at 7 (call it ch[7]), the client wants to verify that a given operation is in block 5. Wouldn't the client need to do the following?

  1. Fetch ops[7], ch[6], ch[4], ch[0].
  2. Verify (1) with ch[7], so ch[6] is trusted.
  3. Fetch ops[6], ch[5], ch[3]
  4. Verify (3) with ch[6], so ch[5] is trusted.
  5. Fetch ops[5], ch[2]
  6. Verify (5) and ch[4] from (1) with ch[5], so ops[5] is trusted
  7. Verify op in ops[5]

I don't see how the client can skip downloading ops[6], since the client's penultimate goal is to verify ch[5](so it can verify ops[5]). But in order to verify ch[5], the client needs to execute steps 3 and 4.

Let me see if I can work through another example here. Let's say the client has ch[15] but wants to verify that an untrusted op is in ops[3]. To do so, the client would:

  1. Fetch ops[15], ch[14], ch[12], ch[8], ch[0]
  2. Verify (1) with ch[15], so ch[8] is trusted.
  3. Fetch ops[8], ch[7], ch[5], ch[1]
  4. Verify (3) with ch[8], so ch[5] is trusted
  5. Fetch ops[5], ch[3]
  6. Verify (5) and ch[1] from (3) with ch[5], so ch[3] is trusted
  7. Fetch ops[3], ch[2]
  8. Verify (7) and ch[0] from (1) with ch[3], so ops[3] is trusted
  9. Verify op in ops[3]

Does this look correct to you?

@shea256
Copy link
Contributor Author

shea256 commented Oct 4, 2015

@jcnelson It just hit me that we can just use the merkle root of the ops for a given block. That means that we only need to be served a bunch of hashes and the single set of ops for the block in question.

@muneeb-ali muneeb-ali added this to the Blockstore v0.1.0 milestone Oct 5, 2015
@jcnelson
Copy link
Member

jcnelson commented Nov 9, 2015

This is implemented in my patchset, in virtualchain. The consensus hash is generated by taking the root of a Merkle tree constructed from the sorted list of previous consensus hashes concatenated with the root of another Merkle tree constructed from the tx-ordered serialized record hashes discovered in the underlying blockchain. Basically, the algorithm is:

  1. get the tx-ordered list of a block's operations
  2. serialize them all, hash them, sort the hashes, make a Merkle tree, and take the root
  3. get the previous consensus hashes in the skip list
  4. append the record's Merkle tree root to the hashes, sort them, and make another Merkle tree
  5. the consensus hash for this block is the root of this new Merkle tree

Then, I only need to fetch actual operations when looking up the name's data. In all other cases, I only need to fetch the Merkle root over all of a block's operations.

@jcnelson
Copy link
Member

Added

@shea256
Copy link
Contributor Author

shea256 commented Nov 10, 2015

Hell yeah! Light nodes FTW!

@blockstack-devops
Copy link
Contributor

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@stacks-network stacks-network locked as resolved and limited conversation to collaborators Nov 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants