Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PSA Encodings for EdDSA #4174

Merged

Conversation

gilles-peskine-arm
Copy link
Contributor

Define encodings for EdDSA in the PSA API.

This is a proposed addition to the PSA Cryptography API specification, which we intend to implement in Mbed TLS ahead of its official adoption by PSA.

Resolves #4147.

@gilles-peskine-arm gilles-peskine-arm added enhancement mbed TLS team needs-review Every commit must be reviewed by at least two team members, component-crypto Crypto primitives and low-level interfaces component-psa PSA keystore/dispatch layer (storage, drivers, …) labels Feb 24, 2021
*
* PureEdDSA requires an elliptic curve key on a twisted Edwards curve.
* In this specification, the following curves are supported:
* - #PSA_ECC_FAMILY_TWISTED_EDWARDS, 255-bit: Ed25519 as specified
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key type says 256-bit. I'm not sure which one is right.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, weird question. One could argue that the key is really only 251 bits. People seem to call it 256-bits though, even though, during key generation, 5 bits are always set to specific values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One could argue the curve size is 255, since operations on the curve use modulo p which is 2^255 - 19. But it's a discussion about semantics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to be pedantic here. For some users reading 255 or 251 might be alarming, raising questions. I think that this time we could be nice to users and say 256. This means 2-3 bit differences in implied bit security, but that shouldn't matter in the overwhelming majority of use cases. (I know, this is a complete opposite what I was saying 2 years ago.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing bit size in PSA for Curve25519 is 255: https://github.com/ARMmbed/mbedtls/blob/7f3d10de02567954077095f28211d8f0aa53c96f/library/psa_crypto.c#L457

Since Curve25519 and Ed25519 are equivalent, can we be consistent and say 255 for Ed25519 as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generic definition in the PSA specification is “the bit size of q for a curve over F_q” for Weierstrass curves. There's a different definition for Montgomery curves, but it gives the same results for Curve25519 and Ed25519. I don't see a reason to use different sizes for Curve25519 and Ed25519. So I've aligned the size to 255.

* in RFC 8032.
* The curve is Edwards25519.
* The hash function used internally is SHA-512.
* - #PSA_ECC_FAMILY_TWISTED_EDWARDS, 448-bit: Ed448 as specified
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 448-bit consistent with the way we define private key sizes, given that the private key is a 57-byte string?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is. If we go with @stevew817 's suggestion and use 255 for Ed25519, then we are signalling the bitlength of the prime defining the underlying field and not some key length.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PSA Crypto API spec does describe the [exported] private key size under psa_export_key() as the format for exported key pairs is the private key.

For Weierstrass curves it ties the output size to the key bit size:

For Weierstrass curve families PSA_ECC_FAMILY_SECT_XX, PSA_ECC_FAMILY_SECP_XX, PSA_ECC_FAMILY_FRP and PSA_ECC_FAMILY_BRAINPOOL_P_R1, the content of the privateKey field of the ECPrivateKey format defined by Elliptic Curve Private Key Structure [RFC5915].

This is a ceiling(m/8)-byte string in big-endian order where m is the key size in bits.

For Montgomery curves it does provide the computed value, but is explicit about the byte size of the exported key:

For curve family PSA_ECC_FAMILY_MONTGOMERY, the scalar value of the ‘private key’ in little-endian order as defined by Elliptic Curves for Security [RFC7748] §6. The value must have the forced bits set to zero or one as specified by decodeScalar25519() and decodeScalar448() in [RFC7748] §5.

This is a ceiling(m/8)-byte string where m is the key size in bits. This is 32 bytes for Curve25519, and 56 bytes for Curve448.

For the Edwards curves we could drop the computational definition and just provide the byte-size of the exported keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to psa_export_key, there's information about what is considered the “size” of an elliptic curve key in the specification of ECDH (probably not the right place for that!):

The shared secret produced by key agreement is the x-coordinate of the shared secret point. It is always ceiling(m / 8) bytes long where m is the bit size associated with the curve, i.e. the bit size of the order of the curve’s coordinate field. When m is not a multiple of 8, the byte containing the most significant bit of the shared secret is padded with zero bits. The byte order is either little-endian or big-endian depending on the curve type.

  • For Montgomery curves (curve family PSA_ECC_FAMILY_MONTGOMERY), the shared secret is the x-coordinate of Z = d_A Q_B = d_B Q_A in little-endian byte order.

    • For Curve25519, this is the X25519 function defined in Curve25519: new Diffie-Hellman speed records [Curve25519]. The bit size m is 255.
    • For Curve448, this is the X448 function defined in Ed448-Goldilocks, a new elliptic curve [Curve448]. The bit size m is 448.
  • For Weierstrass curves (curve families PSA_ECC_FAMILY_SECP_XX, PSA_ECC_FAMILY_SECT_XX, PSA_ECC_FAMILY_BRAINPOOL_P_R1 and PSA_ECC_FAMILY_FRP) the shared secret is the x-coordinate of Z = h d_A Q_B = h d_B Q_A in big-endian byte order. This is the Elliptic Curve Cryptography Cofactor Diffie-Hellman primitive defined by SEC 1: Elliptic Curve Cryptography [SEC1] §3.3.2 as, and also as ECC CDH by NIST Special Publication 800-56A: Recommendation for Pair-Wise Key-Establishment Schemes Using Discrete Logarithm Cryptography [SP800-56A] §5.7.1.2.

    • Over prime fields (curve families PSA_ECC_FAMILY_SECP_XX, PSA_ECC_FAMILY_BRAINPOOL_P_R1 and PSA_ECC_FAMILY_FRP), the bit size is m = ceiling(log_2(p)) for the field F_p.

    • Over binary fields (curve families PSA_ECC_FAMILY_SECT_XX), the bit size is m for the field F_{2^m}.

Let's try to find a better compromise between consistency, correctness, comprehensibility and backward compatibility.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Steven's comment on 255 also applies here: in the Montgomery family we use 448, so let's be consistent and use 448 here too.

(Also, as Janos said, let's be nice to users: 448 is what's in the common name, require people to use another number would be confusing.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yanesca - we are already inconsistent with a "what's in the common name" formula - secp224k1 is defined to use key_bits = 225.

However, [I think] we are currently consistent with the rule "for ECC keys, key_bits is the bit size of the order of the curve’s coordinate field". I.e. for prime fields over p, it is ceiling(log2(p)). (secp224k1 uses a 225 bit prime number.)

If we use the latter rule, then Ed25519 would have key_bits = 255 and Ed448 has key_bits = 448.

The ECDH description can be simplified by removing some of the formulae, as the normative references provide the detail of the algorithm implementation. Comparing the description with that for psa_export_public_key(), we can make the key bit-size -> output relationship specific to the curve type, instead of trying to provide a one size fits all definition for the shared secret size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the various reasons exposed in this thread, I've kept 448.

(((alg) & ~PSA_ALG_HASH_MASK) == PSA_ALG_HASH_EDDSA_BASE)

/** Edwards-curve digital signature algorithm with prehashing (HashEdDSA),
* using SHAKE256 and the Edwards448 curve.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presume copypasta here - should be SHA-512 and Edwards25519 curve?

* The hash function used internally is SHA-512, with
* `dom2(0, "") = ASCII("SigEd25519 no Ed25519 collisions") || 0x00 0x00`
* prepended to the input.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be used with sign/verify message or just sign/verify hash?

*
* This algorithm is Ed25519 as specified in RFC 8032.
* The curve is Edwards25519.
* The input is first hashed with SHA-512.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the hashing to be done by the caller, or is this done within the operation? - the wording here is unclear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically the information was already present in the description of the signature function, but since EdDSA is the only algorithm family that has a non-hash-and-sign variant, I've added an explicit explanation to the algorithm description.

*
* This algorithm is Ed448 as specified in RFC 8032.
* The curve is Edwards448.
* The input is first hashed by taking the first 64 bytes of the SHAKE256
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above - should the caller do the hashing, or does the implementation? - or does this depend on whether sign_message or sign_hash is called?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a user, I would assume that:

  • sign_message is the all-in-one operation for message m
  • sign_hash on an EdxxxPH algorithm takes in the prehashed message, i.e. PH(m) as defined in RFC 8032

But I can see an argument for making sign_hash take the fully-hashed input as well (in this case this would be SHA512(dom2(F, C) || R || A || PH(M))). Defining the API that way would probably make a driver implementation underneath easier, but is not as intuitive for a user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with other hash-and-sign algorithms, you can either hash the message first and call {sign,verify}_hash, or let the all-in-one signature take care of that with {sign,verify}_message. What PSA calls the hashing step is what EdDSA calls the prehash. The hashing step is independent of the key. The hash calculation used internally as part of the EdDSA calculation is irrelevant to the user, so it isn't exposed through the API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it our place (at least in these docs) to discuss why this is done (at least the possible security implications) of the difference between directly signing and the hash and sign? At least the authors argue of protecting against possible weaknesses in the hash function with their construct, and those protections only work by signing the entire message.

Somewhat related, would we want to have an API that would allow the entire message to be signed, but allow the internal hashing to be done in pieces? Or would users just be expected to hash the message first themselves in this case? Although, if someone wished for the possible security benefit of signing the entire message, they would need to have internal access to how the hash that is happening under the hood happens.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of limiting HashEdDSA to psa_(sign|verify)_hash(), on the justification that an application which can call psa_(sign|verify)_message() (i.e. it has the entire message in a single buffer) should use PureEdDSA because it has better security properties. Interoperability may require the use of HashEdDSA, and we should not make it harder for an application than necessary by denying the use of psa_(sign|verify)_message().

If/when we provide a multi-part asymmetric signature operation API, there is a strong case to make it not support algorithms which require two passes of the input. Although that would make for an interesting debate about whether to support verification of PureEdDSA (only needs one hash computation on the message data), but not signature (which requires two sequentially dependent passes)?

* scenarios where a hash function based on SHA3/SHAKE is desired, SHA3-512
* has the same output size and a (theoretically) higher security strength.
*/
#define PSA_ALG_SHAKE256_64 ((psa_algorithm_t)0x02000014)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0x02000014 is already allocated to SM3. Please use 0x02000015 instead.

@@ -787,6 +803,13 @@
#define PSA_ALG_SHA3_384 ((psa_algorithm_t)0x02000012)
/** SHA3-512 */
#define PSA_ALG_SHA3_512 ((psa_algorithm_t)0x02000013)
/** The first 64 bytes of the SHAKE256 output.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a difference in how SHAKE256 is defined in FIPS202 and in RFC8032: in the former this algorithm is parameterized by the bit-size of the output length, the definition used in RFC8032 is parameterized by the octet-length of the output.

Thus PSA_ALG_SHAKE256_64 maps easily onto the normative Ed448ph definition, but is not consistent with the FIPS202 normative definition for SHA3 and SHAKE functions, nor with the use of a bit-length suffix for the other SHA hash algorithms.

Would PSA_ALG_SHAKE256_512 be more appropriate for PSA Crypto?.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I have no strong opinion here. It could even be PSA_ALG_ED448_PREHASH!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. That would inspire some confusion with the proposed PSA_ALG_ED448PH signature algorithm - so it might be better to name it after the underlying hash algorithm rather than the purpose to which it is put?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think naming the hash for its intended purpose would make sense, but I don't really think PSA_ALG_ED448_PREHASH is a good name for this reason.

@athoelke
Copy link
Contributor

athoelke commented Mar 3, 2021

A comment on the commit message in github.com//pull/4174/commits/41bfc0e1fb5eca6e735140c30fb9956d2a52f7ab:

SM3 was added to v1.0.1, which is a published specification, so the collision does not merely occur in the next draft.

Note: once we have the encoding details published in an appendix (coming for vNext of the spec) it will be easier to identify an appropriate encoding for a new algorithm.

@gilles-peskine-arm
Copy link
Contributor Author

I've corrected the commit message.

#define PSA_ALG_PURE_EDDSA ((psa_algorithm_t)0x06000800)

#define PSA_ALG_HASH_EDDSA_BASE ((psa_algorithm_t)0x06000900)
#define PSA_ALG_IS_HASH_EDDSA(alg) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The equivalent helper functions for RSA and ECDSA algorithms are part of the PSA spec, which argues that this should also be documented in the Crypto spec.

Note: they are not actually used/referenced in the spec, other than in the example definition of PSA_ALG_IS_HASH_AND_SIGN(). So we could probably have removed them from the spec before v1.0.0 - but trickier now, and the example definition is useful to illustrate the encoding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Helper functions such as PSA_ALG_IS_RSA are useful in libraries for protocols that are parametrized by asymmetric algorithms such as X.509 and TLS. E.g. if PSA_ALG_IS_RSA(signature_algorithm) then pick an RSA cipher suite, if PSA_ALG_IS_ECDSASA(signature_algorithm) then pick an ECDSA cipher suite.

* The hash function used internally is the first 114 bytes of the
* SHAKE256 output, with
* `dom4(1, "") = ASCII("SigEd448") || 0x01 0x00`
* prepended to the input.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is misleading (based on my reading of RFC8032):

The signature algorithm uses this prefix in the input to the two invocations of SHAKE256 (step 2 and step 4 in §5.2.6), with the message data as the suffix of the SHAKE input - but there is additional data computed during the algorithm that is between the prefix and the message in each invocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed all references to dom2 and dom4.

Copy link
Contributor

@athoelke athoelke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors in the presentation of the RFC8032 internal functions?

* The curve is Edwards448.
* The hash function used internally is the first 114 bytes of the
* SHAKE256 output, with
* `dom4(1, "") = ASCII("SigEd448") || 0x01 0x00`
Copy link
Contributor

@athoelke athoelke Mar 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Ed448, flag F is 0 (it is 1 for Ed448ph) - so the correct parameterization is dom4(0, "") which is ASCII("SigEd448") || 0x00 0x00.

But I am not sure this is necessary to specify in the description - the RFC is clear, it is just necessary to specify that that context is an empty string for all of the currently defined EdDSA algorithms.

* The curve is Edwards25519.
* The prehash is SHA-512.
* The hash function used internally is SHA-512, with
* `dom2(0, "") = ASCII("SigEd25519 no Ed25519 collisions") || 0x00 0x00`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be dom2(1, "") - but also see comments above about correctness and relevance of this statement.

* The prehash is the first 64 bytes of the SHAKE256 output.
* The hash function used internally is the first 114 bytes of the
* SHAKE256 output, with
* `dom4(0, "") = ASCII("SigEd448") || 0x00 0x00`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be dom4(1, "") - but also see comments above about correctness and relevance of this statement.

@gilles-peskine-arm
Copy link
Contributor Author

I force-pushed to amend the last commit “Use a bit-size in the algorithm name ” which was incomplete and broke the build.

I added new commits to address the remaining comments about bit sizes and dom functions.

athoelke
athoelke previously approved these changes Mar 23, 2021
Copy link
Contributor

@athoelke athoelke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good to me.

[I have also reflected the new updates in PRs against the specification]

stevew817
stevew817 previously approved these changes Mar 23, 2021
Copy link
Contributor

@stevew817 stevew817 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Implementers and users would have to refer to the RFC for the detailed
specification of the algorithm anyway.

Keep a mention of the curves and hashes involved for avoidance of doubt.

Signed-off-by: Gilles Peskine <[email protected]>
@gilles-peskine-arm
Copy link
Contributor Author

I force-pushed a rebased version of this PR to resolve the conflict in the systematically generated test data. In the commit “Add key material for twisted Edwards curves”, in addition to re-running generate_psa_tests.py, I fixed a mistake where I'd accidentally swapped the test data for Montgomery and Edwards curves. This wasn't caught before the rebase because that test data was only used for not-supported tests, but after the rebase, the storage format tests caught that the purported Curve25519 private key was not in canonical form.

stevew817
stevew817 previously approved these changes Mar 29, 2021
Copy link
Contributor

@mpg mpg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very good to me! I just have two questions, which are probably just me missing some context, but I'd like to make sure before I approve.

I also left one comment about an issue that's pre-existing and should not block this PR.

include/psa/crypto.h Show resolved Hide resolved
tests/suites/test_suite_psa_crypto_metadata.data Outdated Show resolved Hide resolved
d3zd3z
d3zd3z previously approved these changes Apr 2, 2021
Copy link
Contributor

@d3zd3z d3zd3z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than a possible comment about make/cmake, this looks good to me.

@@ -239,7 +239,7 @@ psa/key_ladder_demo$(EXEXT): psa/key_ladder_demo.c $(DEP)
echo " CC psa/key_ladder_demo.c"
$(CC) $(LOCAL_CFLAGS) $(CFLAGS) psa/key_ladder_demo.c $(LOCAL_LDFLAGS) $(LDFLAGS) -o $@

psa/psa_constant_names$(EXEXT): psa/psa_constant_names.c $(DEP)
psa/psa_constant_names$(EXEXT): psa/psa_constant_names.c psa/psa_constant_names_generated.c $(DEP)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if this needs to be done for cmake? I suspect that cmake's include dependency detection will catch this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess automatic dependency detection picks it up but I haven't checked.

This was a mistake, there's no reason for the dependencies to be
commented out. The dependencies on PSA_WANT_ALG_EDDSA aren't actually
necessary at the moment, but they might be in certain configurations
if some macros are simplified to save code size.

Signed-off-by: Gilles Peskine <[email protected]>
@gilles-peskine-arm gilles-peskine-arm dismissed stale reviews from d3zd3z and stevew817 via 5ef0b97 April 6, 2021 10:51
Copy link
Contributor

@mpg mpg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for answering my questions and fixing the commented-out dependencies. It all looks good to me now.

Copy link
Contributor

@stevew817 stevew817 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving based on my previous approval and after reviewing the latest commit.

@gilles-peskine-arm gilles-peskine-arm merged commit b420259 into Mbed-TLS:development Apr 7, 2021
bensze01 added a commit to bensze01/mbedtls that referenced this pull request Apr 8, 2021
The reference was introduced in Mbed-TLS#4174

Signed-off-by: Bence Szépkúti <[email protected]>
bensze01 added a commit to bensze01/mbedtls that referenced this pull request Apr 8, 2021
The reference was introduced in Mbed-TLS#4174.

Signed-off-by: Bence Szépkúti <[email protected]>
daverodgman pushed a commit that referenced this pull request Apr 23, 2021
daverodgman pushed a commit that referenced this pull request Apr 23, 2021
The reference was introduced in #4174.

Signed-off-by: Bence Szépkúti <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component-crypto Crypto primitives and low-level interfaces component-psa PSA keystore/dispatch layer (storage, drivers, …) enhancement needs-review Every commit must be reviewed by at least two team members,
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Publish the PSA encodings for EdDSA
6 participants