Add ecdsa-sd-2023 cryptosuite to ECDSA Cryptosuites specification #20

Merged
merged 49 commits into from
Aug 3, 2023


msporny
Member

@msporny msporny commented Jul 21, 2023

This PR adds the ecdsa-sd-2023 cryptosuite to the ECDSA Cryptosuites specification. This cryptographic suite was introduced to the W3C CCG and VCWG here:

https://lists.w3.org/Archives/Public/public-credentials/2023May/0104.html

A presentation providing a background on the cryptosuite is available here:

https://docs.google.com/presentation/d/1d-04kIWhPuNscsAyUuRH3pduqrNerhigCWahKe6SNos/edit#

A complete open source implementation, with complete API documentation, exists here (with more implementations on the way):

https://github.com/digitalbazaar/ecdsa-sd-2023-cryptosuite

This PR is raised as a result of stated support by W3C Members in this group, three global standards organizations (two of whom are members of the VCWG), and a number of implementers: https://lists.w3.org/Archives/Public/public-vc-wg/2023Jul/0015.html

The cryptosuite is marked as "at risk" and may be removed from the specification if there are not enough implementations demonstrated during the Candidate Recommendation phase or if there is consensus to remove it from the specification for technical reasons before the Recommendation phase.


Collaborator

@Wind4Greg Wind4Greg left a comment

While the technical details appear sound after an in-depth reading, the presentation order of the SD materials should be changed to enhance readability and understanding. Currently we have:

  1. Selective Disclosure Functions: this includes some reasonable-sounding items such as labelReplacementCanonize, but also the completely undecipherable skolemize and deskolemize. Looking at Wikipedia: Skolem normal form isn't very enlightening. These seem to be the lowest-level functions involved in the SD stack and mostly don't need to be looked at until implementation.
  2. ecdsa-sd-2023 Functions: these seem to be a combination of small helper functions such as serializeSignData and major functions such as createDisclosureData. Some of these functions use the previously defined functions.
  3. ecdsa-sd-2023: this is finally where the overall procedures are described, including most of the cryptographic operations. We need to lead with this and give it a better name!

We really need to reverse the order of these descriptions and give these sections better names to enhance readability and cryptographic review.

It seems like we would want some high-level text describing the various cryptographic operations to make cryptographic review more straightforward, i.e., stuff like:

  1. We are using HMAC as a PRF to avoid blank node identifiers from RDF canonicalization leaking information. The HMAC key is not kept secret. Does it have other requirements, i.e., cannot be repeated?
  2. Mandatory disclosure information is signed as a whole along with other information (proof options hash, ephemeral public key)
  3. An ephemeral key pair is used in the signing and verification of all optionally disclosed statements. The private key is destroyed and the ephemeral public key is included as part of the signature.
  4. The actual proofValue is a serialization of a number of important items besides the baseSignature and signatures, i.e., (ephemeral)publicKey, hmacKey, and mandatoryPointers
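
For readers following along, item 1 above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the specification's algorithm: the `c14n` label format, the `u` plus base64url (no padding) output format, the helper names `hmac_label` and `relabel_statements`, and the choice of a fresh random 32-byte key are all assumptions made for the example.

```python
import base64
import hashlib
import hmac
import re
import secrets


def hmac_label(key: bytes, label: str) -> str:
    """Map a canonical blank node label to a pseudo-random one (hypothetical helper)."""
    digest = hmac.new(key, label.encode("utf-8"), hashlib.sha256).digest()
    # Assumed output format: "u" followed by unpadded base64url of the digest.
    return "u" + base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def relabel_statements(key: bytes, statements: list[str]) -> list[str]:
    """Replace every `_:c14nN` label and re-sort the statements.

    Simplification: the regex is applied to the whole line, so a literal that
    happened to contain `_:c14n0` would also be rewritten. A real
    implementation would parse the N-Quads instead.
    """
    pattern = re.compile(r"_:(c14n\d+)")
    relabeled = [
        pattern.sub(lambda m: "_:" + hmac_label(key, m.group(1)), s)
        for s in statements
    ]
    # Sorting after relabeling means the order no longer follows c14n0, c14n1, ...
    return sorted(relabeled)


if __name__ == "__main__":
    hmac_key = secrets.token_bytes(32)  # assumption: fresh random key per proof
    example = [
        '_:c14n0 <https://example.org/price> "10" .',
        '_:c14n1 <https://example.org/price> "25" .',
    ]
    for line in relabel_statements(hmac_key, example):
        print(line)
```

The point of the sketch is only that the visible labels and the statement order stop depending on the canonicalization counter once the HMAC is applied.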

As you can tell I'm focusing on the cryptographic aspects. I want to make sure folks see that this is not a Merkle tree approach but a "relatively straightforward" signing of individual statements. However, breaking things into statements (in a relatively secure manner) and supporting mandatory disclosure requires some machinery (regardless of signature approach). I would be happy to contribute text as this moves forward.

@msporny
Member Author

msporny commented Jul 25, 2023

@Wind4Greg wrote:

I would be happy to contribute text as this moves forward.

Yes, please. Your analysis is spot on and I agree that we'll need to refactor the way this is presented to be more readable (and the way you suggest makes sense to me). This stuff was originally authored to build up from the basic functions to the final/complete ones... but yes, going top-down instead of bottom-up might be easier.

Your summary of how it works is also good, we should highlight that. I struggled with where to put that information because it's fairly general to just about any suite that takes this approach (BBS being the other one that could benefit from the low level functions). Yes, it would be good to point out that this is a fairly boring / straightforward approach vs. a Merkle-based approach (which we should also support in time as it has a different set of advantages).

skolemize is a term of art in the RDF community (not that that is an excuse, we should pick more accessible terminology if possible):

In any case, thanks for the review and for volunteering to make it better. We'll need that help to make the algorithm more accessible.

@Wind4Greg
Collaborator

Wind4Greg commented Jul 25, 2023

Went through the functionality again trying to extract the key mechanisms with a bit of a cryptographic/security bias. This could lead to informative overview text that would also help in cryptographic review. The first four groupings of mechanisms could also apply to BBS.

  1. Mechanisms for specifying statements to be disclosed. For the issuer this is optionally the setting of "mandatory disclosed" statements. For the holder this is the mechanism for selecting which statements to reveal to the verifier (besides those that are "mandatory" to disclose). This is based on the JavaScript Object Notation (JSON) Pointer RFC6901, "JSON Pointer defines a string syntax for identifying a specific value within a JavaScript Object Notation (JSON) document."
  2. Mechanisms for breaking up a JSON-LD document into lists of statements that can be disclosed or must be disclosed. This is based on the RDF canonicalization of a JSON-LD document. Graph language is used to describe the result of this process with each statement making an assertion about a node. Such nodes are either given ids as part of the JSON-LD document or assigned "blank node ids". Such node ids prevent taking "properties" from one node and using them in another, i.e., given a JSON list representing items in a store each with a corresponding price making sure that the prices cannot be switched between items.
  3. Mechanism to avoid data leakage from "blank node ids". The blank node ids described above are generated in an ordered fashion according to the RDF algorithm (right?). This can inadvertently reveal information about statements that the holder may not choose to disclose. To prevent this, an HMAC is used to pseudo-randomize the id values and hence obscure their original order. The HMAC key should be unique per signature and is included as part of the base signature, i.e., is not kept private from the holder, but is not revealed to the verifier (not included with the derived proof).
  4. Mechanisms to group above statements, i.e., into those that are mandatory or selectively disclosed. Here JSON-LD framing procedures are combined with the issuer or holder JSON Pointer information (arrays of JSON pointer values).
  5. Mechanisms for signatures for optionally disclosed statements. To prevent the holder from combining optionally disclosed statements across separate issuer-generated signatures, optionally disclosed statements are individually signed with an ephemeral key pair that exists just for generating this "base proof". Hence two key pairs are involved: the issuer's (long-term) key pair and the per-signature ephemeral key pair.
  6. Mechanisms for protecting/signing additional information. Using the issuer's (long-term) key pair, information such as the HMAC key, ephemeral public key, mandatory disclosed statements, mandatory disclosure pointers, configuration options, etc. is protected. Note some of this information is concatenated and hashed.
  7. Mechanisms for serialization and deserialization of information needed for derived proofs and verification.
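
As a rough illustration of item 1 in the list above, JSON Pointer (RFC 6901) resolution can be sketched in a few lines. The credential contents, the pointer arrays, and the `resolve_pointer` helper are hypothetical and exist only for this example. The sketch does not handle the special `-` array token.

```python
def resolve_pointer(document, pointer: str):
    """Return the value that an RFC 6901 JSON Pointer identifies in `document`."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw_token in pointer[1:].split("/"):
        # RFC 6901 escaping: "~1" means "/", "~0" means "~" (decode in that order).
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


# Hypothetical credential, loosely following the "items in a store" example.
credential = {
    "issuer": "https://issuer.example",
    "credentialSubject": {
        "items": [
            {"name": "widget", "price": 10},
            {"name": "gadget", "price": 25},
        ]
    },
}

mandatory_pointers = ["/issuer"]                      # chosen by the issuer
selective_pointers = ["/credentialSubject/items/1"]   # chosen by the holder

for pointer in mandatory_pointers + selective_pointers:
    print(pointer, "->", resolve_pointer(credential, pointer))
```

In the cryptosuite the pointers select statements rather than raw JSON values, so this shows only the addressing step, not the grouping into mandatory and selectively disclosed statements.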

@dlongley
Contributor

dlongley commented Jul 25, 2023

The HMAC key should be unique per signature and is included as part of the signature, i.e., is not kept private.

Note: It is not kept private from the holder, but it is kept private from the verifier, so that the verifier cannot use it to brute-force the original order (which becomes feasible when the set of possible elided (not disclosed) values is small enough).
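
A hedged sketch of the first step of such an attack, to make the concern concrete. It assumes canonical labels of the form `c14nN` and HMAC-SHA-256 output encoded as `u` plus unpadded base64url. Both are assumptions of this illustration, as are the helper names. A full brute force would additionally have to re-run canonicalization over guessed values, which is not shown.

```python
import base64
import hashlib
import hmac


def hmac_label(key: bytes, label: str) -> str:
    """Assumed label format: 'u' + unpadded base64url(HMAC-SHA-256(key, label))."""
    digest = hmac.new(key, label.encode("utf-8"), hashlib.sha256).digest()
    return "u" + base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def recover_canonical_labels(key: bytes, disclosed: list[str], max_nodes: int = 1000) -> dict[str, str]:
    """With the key, enumerate c14n0..c14nN and invert the label mapping."""
    wanted = set(disclosed)
    recovered = {}
    for index in range(max_nodes):
        canonical = f"c14n{index}"
        candidate = hmac_label(key, canonical)
        if candidate in wanted:
            recovered[candidate] = canonical
    return recovered


# Hypothetical scenario: three blank nodes exist, the holder hides the middle one.
leaked_key = bytes(range(32))
disclosed_labels = [hmac_label(leaked_key, "c14n0"), hmac_label(leaked_key, "c14n2")]

recovered = recover_canonical_labels(leaked_key, disclosed_labels)
# The gap between c14n0 and c14n2 shows that an undisclosed node exists.
print(sorted(recovered.values()))
```

Without the key, the verifier sees only the pseudo-random labels and cannot perform this inversion.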

@Wind4Greg
Collaborator

Thanks Dave. I missed that important item!

Contributor

@yamdan yamdan left a comment

I've not finished reading yet, but here's a suggestion for a minor nit.

@msporny msporny requested a review from seabass-labrax as a code owner August 3, 2023 12:23
@msporny msporny changed the base branch from sd-functions to main August 3, 2023 12:33
msporny and others added 26 commits August 3, 2023 08:39
@msporny
Member Author

msporny commented Aug 3, 2023

Normative, multiple reviews, changes requested and made, no objections, merging.


4 participants