CommitmentContainer: compatibility of KZG commitments and other commitment types in SSZ #2585

Closed
wants to merge 1 commit into from


protolambda
Contributor

This introduces a new type to SSZ, that is very similar to the existing Container, but allows two new things:

  1. Override the hash_tree_root computation
  2. Verify embedded commitments

It still encodes/decodes the same.

With this type of container, we can implement two SSZ types that have the same output for hash_tree_root:

  • A bare commitment (e.g. an encoded G1 point: a KZG commitment)
  • A commitment-container that encodes the full data that was committed to, and optionally embeds the commitment to avoid repeated computation.
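
As a rough illustration of the two types sharing a root, here is a minimal sketch. It is not the PR's implementation: `toy_commit` is a hypothetical stand-in (SHA-384, chosen only because it yields 48 bytes like a compressed G1 point) and is NOT a KZG commitment, and the class names are made up for this example.

```python
import hashlib

ZERO_CHUNK = b"\x00" * 32


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkleize(chunks):
    """Merkleize 32-byte chunks, zero-padding to the next power of two."""
    layer = list(chunks) or [ZERO_CHUNK]
    while len(layer) & (len(layer) - 1):  # length is not a power of two yet
        layer.append(ZERO_CHUNK)
    while len(layer) > 1:
        layer = [sha256(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


def htr_byte_vector(value: bytes) -> bytes:
    """hash_tree_root of a fixed-length byte vector (e.g. Bytes48)."""
    padded = value + b"\x00" * (-len(value) % 32)
    return merkleize([padded[i:i + 32] for i in range(0, len(padded), 32)])


def toy_commit(data: bytes) -> bytes:
    """ASSUMPTION: placeholder for a real KZG commitment; returns 48 bytes."""
    return hashlib.sha384(data).digest()


class BareCommitment:
    """Holds only the 48-byte commitment."""

    def __init__(self, point: bytes):
        self.point = point

    def hash_tree_root(self) -> bytes:
        return htr_byte_vector(self.point)


class DataBackedCommitment:
    """Holds the full data; hash_tree_root is overridden to commit to it."""

    def __init__(self, data: bytes):
        self.data = data

    def hash_tree_root(self) -> bytes:
        # Override: root of the commitment, not of the Merkleized data.
        return htr_byte_vector(toy_commit(self.data))


blob = b"shard data blob"
bare = BareCommitment(toy_commit(blob))
full = DataBackedCommitment(blob)
```

Under these assumptions, `bare.hash_tree_root()` and `full.hash_tree_root()` are the same 32 bytes, so either object can stand in for the other wherever only the root is used.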

The feature to embed the commitment may also be useful for samples, and for other future commitment types that are cheaper to verify than to produce. This is not like a regular cache: it is encoded, and can then be quickly verified by another actor on the network instead of recomputing the commitment.

When the bare commitment and data-backed commitment have the same hash_tree_root, this also means that the signing-root matches, and signatures over these two are the same. This enables the sharding spec to sign/verify headers everywhere, and use the signature for the full blob of shard-data. E.g. a data-tx (small header) is created based on the full data, then selected by a proposer (adds their signature), and that signature can then apply to the full blob of data, published by the builder.
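
To make the signing-root step concrete: in the consensus spec, the signing root is the hash_tree_root of a SigningData container with two 32-byte fields (object_root, domain), which reduces to a single SHA-256 over the two chunks. The sketch below assumes that layout; the root and domain values are placeholders invented for the example.

```python
import hashlib


def compute_signing_root(object_root: bytes, domain: bytes) -> bytes:
    """Root of SigningData(object_root, domain): two 32-byte leaves, one hash."""
    assert len(object_root) == 32 and len(domain) == 32
    return hashlib.sha256(object_root + domain).digest()


# Placeholder values: in practice both roots come from hash_tree_root.
header_root = hashlib.sha256(b"example commitment").digest()
blob_root = header_root  # the data-backed type is designed to share this root
example_domain = b"\x01" * 32  # hypothetical domain, for illustration only

header_signing_root = compute_signing_root(header_root, example_domain)
blob_signing_root = compute_signing_root(blob_root, example_domain)
```

Because the signing root depends on the object only through its hash_tree_root, a signature that verifies against `header_signing_root` also verifies against `blob_signing_root`.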

With sharding specifically, this reduces the contents of the shard header and removes the trust assumption that the SSZ data-root matches the KZG data-root. I.e. data-sampling will be enough to ensure that the commitment is good, which allows us to confirm data even with < 2/3 attestations at the end of the 2-epoch attestation inclusion period (if the other header data does not have more votes, and the data sampling shows it is available, which is part of the forkchoice). Thanks to @vbuterin for finding this.

This is a draft, feedback welcome. After ACK of general direction, I'll implement this functionality in remerkleable to enable the pyspec to utilize it.

@protolambda protolambda added scope:SSZ Simple Serialize scope:sharding Sharding labels Sep 3, 2021
@protolambda protolambda force-pushed the commitment-containers branch from 4b6ef0f to d0e2049 Compare September 3, 2021 16:28
@protolambda protolambda force-pushed the commitment-containers branch from d0e2049 to 2f0ff0d Compare September 3, 2021 16:29
@protolambda
Contributor Author

*Fixed the TOC in the docs

@Nashatyrev
Member

Looks like a good solution to me!

Just a spontaneous idea: maybe it could be done in a more generic way, i.e. by attributing any SSZ member/structure with a custom hash_tree_root function. That way:

# Some Python 'pseudo-code'
class DataCommitmentPoints(Container):
    # Proposed (hypothetical) field attribute: hash_fun overrides hash_tree_root for this field
    points: List[BLSPoint, POINTS_PER_SAMPLE * MAX_SAMPLES_PER_BLOB]  # (hash_fun = bls_commit_to_data)
    samples_count: uint64

def bls_commit_to_data(points: List[BLSPoint]) -> BLSCommitment:
    ...

I.e. hash_tree_root(points) would be just its BLS commitment
Thus hash_tree_root(DataCommitment) == hash_tree_root(DataCommitmentPoints)
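
A small sketch of how that equality could fall out of a per-field override. Everything here is an assumption made for illustration: `toy_commit_to_points` is a SHA-384 placeholder rather than a BLS/KZG commitment, DataCommitment is taken to have the fields (point, samples_count), and the overridden root of `points` is taken to be the hash_tree_root of the 48-byte commitment.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def htr_uint64(value: int) -> bytes:
    """hash_tree_root of a uint64: little-endian, padded to one chunk."""
    return value.to_bytes(8, "little") + b"\x00" * 24


def htr_bytes48(value: bytes) -> bytes:
    """hash_tree_root of a 48-byte vector: two chunks, second zero-padded."""
    assert len(value) == 48
    return sha256(value + b"\x00" * 16)


def toy_commit_to_points(points: list[int]) -> bytes:
    """ASSUMPTION: stand-in for bls_commit_to_data; returns 48 bytes."""
    encoded = b"".join(p.to_bytes(32, "little") for p in points)
    return hashlib.sha384(encoded).digest()


def htr_data_commitment(point: bytes, samples_count: int) -> bytes:
    """Two-field container: root is the hash of the two field roots."""
    return sha256(htr_bytes48(point) + htr_uint64(samples_count))


def htr_data_commitment_points(points: list[int], samples_count: int) -> bytes:
    # Field-level override: the root of `points` is the root of its commitment.
    points_root = htr_bytes48(toy_commit_to_points(points))
    return sha256(points_root + htr_uint64(samples_count))


points = [1, 2, 3, 4]
header_root = htr_data_commitment(toy_commit_to_points(points), 1)
body_root = htr_data_commitment_points(points, 1)
```

Both containers have two fields whose roots agree, so `header_root` and `body_root` come out equal without any container-level override.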

Verification of the commitment could be a part of a higher-level spec (not SSZ spec). E.g. attesters from a shard committee should match ShardBlobHeader against ShardBlob and verify ShardBlobHeader.body_summary.commitment.point against ShardBlob.body.data.points

Not at all sure whether this approach is feasible or any better than the original. Just a slightly different view.

@protolambda
Contributor Author

@Nashatyrev Thank you for the suggestion, but I think containers alone should be sufficient. We can add it for other types later if necessary (better than removing it later if unused/unwanted). Also, I opted for CommitmentContainer instead of Container so the difference is clear. Having to check for an override in an existing SSZ type only makes things more complicated.

djrtwo
djrtwo previously approved these changes Sep 7, 2021
Contributor

@djrtwo djrtwo left a comment

interesting!

do we have any other types of commitments that we might use this on? just want to think through if the scheme will satisfy alternative use cases sufficiently

Contributor

@djrtwo djrtwo left a comment

didn't mean to "approve" on the last comment

@djrtwo djrtwo dismissed their stale review September 7, 2021 14:01

accidental approval

@protolambda
Contributor Author

do we have any other types of commitments that we might use this on?

@djrtwo

Not that I am aware of, but some examples of popular ones:

  • Simple hash functions (keccak256, sha3-256, sha-256 etc.): take the data, hash it to 32 bytes, and return that as output. The commitment type is Bytes32 and matches the same hash_tree_root (HTR)
  • Hash functions with smaller/bigger outputs (e.g. sha-512): take the data, hash it to (e.g. 64 bytes), then represent that as the ideal SSZ representation of the commitment (e.g. Bytes64), then return the HTR of that. The commitment HTR will match the commitment-container HTR.
  • Crypto commitments over large data that are expensive to compute, but not to verify (maybe verkle trees): define the commitment-container type to contain both the data and the commitment. The commitment can be verified to match the data, instead of being recomputed.
  • Crypto commitments that cannot perfectly be described with SSZ (e.g. a BLS point has deserialization conditions): where suitable, the verification function can be used to detect the contained data (e.g. byte vector) as invalid. Generally I prefer to just default to HTR of the serialized contents (e.g. we do this for serialized pubkeys), but if we have to, we can abort before computing a HTR.
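
The first two cases can be checked with plain hashing. In SSZ, a Bytes32 fits in one chunk, so its HTR is the value itself; a Bytes64 spans two chunks, so its HTR is the SHA-256 of those 64 bytes. The helper names below are made up for this sketch.

```python
import hashlib


def htr_bytes32(value: bytes) -> bytes:
    """A single 32-byte chunk is its own Merkle root."""
    assert len(value) == 32
    return value


def htr_bytes64(value: bytes) -> bytes:
    """Two 32-byte chunks: the root is the hash of their concatenation."""
    assert len(value) == 64
    return hashlib.sha256(value).digest()


data = b"some committed data"

# Case 1: 32-byte hash. The container's overridden HTR is just the digest.
commitment32 = hashlib.sha256(data).digest()
container_root_32 = hashlib.sha256(data).digest()

# Case 2: 64-byte hash. Both sides take the HTR of the Bytes64 digest.
commitment64 = hashlib.sha512(data).digest()
container_root_64 = htr_bytes64(hashlib.sha512(data).digest())
```

In both cases the root of the bare commitment type (`htr_bytes32(commitment32)`, `htr_bytes64(commitment64)`) matches the overridden container root.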

And more abstractly:

  • CommitmentContainer, like a Container, can contain any data, to represent any serialized input, and optionally the corresponding serialized commitment. Wrapping the data with a container does NOT mean the commitment has to be a container type.
  • The HTR override of CommitmentContainer can take this data, convert it to the corresponding SSZ type of the commitment (and maybe encode some of the data properties, such as the length, as in this shard data case), and take the SSZ HTR of that. This makes the binary tree have the exact same shape right up to the commitment contents.
  • It supports verifying an embedded commitment separately, so we don't require a commitment to be recomputed on every peer of the network. HTR has no return/exception that indicates failure, thus the separate verification function. It can be used for extended data validation of the container contents as well.
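
The split between the HTR override and the verification function might look roughly like the sketch below. The class and method names are hypothetical, and `toy_commit` is a SHA-256 placeholder: unlike the schemes this is aimed at, verifying it costs as much as computing it, so the example only shows the control flow.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def toy_commit(data: bytes) -> bytes:
    """ASSUMPTION: placeholder commitment (32 bytes, so its HTR is itself)."""
    return hashlib.sha256(data).digest()


@dataclass
class ToyCommitmentContainer:
    data: bytes
    commitment: Optional[bytes] = None  # optionally embedded by the producer

    def hash_tree_root(self) -> bytes:
        # Cannot signal failure: it trusts an embedded commitment if present,
        # and falls back to computing one otherwise.
        if self.commitment is not None:
            return self.commitment
        return toy_commit(self.data)

    def verify_commitment(self) -> bool:
        # Separate check, run by the receiver before relying on the root.
        if self.commitment is None:
            return True
        return self.commitment == toy_commit(self.data)


honest = ToyCommitmentContainer(b"blob", toy_commit(b"blob"))
forged = ToyCommitmentContainer(b"blob", b"\xff" * 32)
unembedded = ToyCommitmentContainer(b"blob")
```

Here `forged.hash_tree_root()` still returns a value, which is why the receiver has to call `verify_commitment()` and reject the object when it returns False.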

@hwwhww
Contributor

hwwhww commented Jan 9, 2025

closing this PR because it seems outdated.

@hwwhww hwwhww closed this Jan 9, 2025
@jtraglia jtraglia deleted the commitment-containers branch January 22, 2025 20:01
Labels
scope:sharding Sharding scope:SSZ Simple Serialize
4 participants