-
Notifications
You must be signed in to change notification settings - Fork 574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: x448 and Ed448 #3933
Feature: x448 and Ed448 #3933
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass. Didn't look at the actual implementation of Ed/X448, yet. Mostly, higher-level API and design nits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more remarks mostly focussing on the utility classes.
* the value might be larger than the prime modulus. When calling the to_bytes() method, the | ||
* canonical representation is returned. | ||
*/ | ||
class Gf448Elem { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps its worthwhile defining some commonly used types:
using byte_span = std::span<const uint8_t, to_bytes(448)>;
using word_span = std::span<const uint64_t, to_words(448)>;
// same for arrays, perhaps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't use such aliases because they hide the actual type from the user. Also, I would need one variant with and without const. However, I think using constants for 56 (bytes per 448) and 7 (words per 448) are sensible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments regarding the actual implementation of x448. Looks really concise to me! 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments regarding Ed448.
Generally, the implementation is well-structured and thought out. 😃 Don't be put off by the number of comments, most address small nits or potential memory inefficiencies. Some minor architectural things. One thing, I'd really appreciate in general: Let's try to consolidate magic strings and magic numbers. First and foremost, the lengths of key material arrays (both in word length and byte length). But also the hash functions used. Please try to have as little "atomic" constants as possible and calculate dependent values were reasonably possible. As mentioned before somewhere, sooner or later we should establish a standard interface to be able to access such constants statically from within the library. |
/// @return a word array for c = 0x8335dc163bb124b65129c96fde933d8d723a70aadc873d6d54a7bb0d | ||
consteval std::array<word, WORDS_C> c_words() { | ||
// Currently load_le does not work with constexpr. Therefore, we have to use this workaround. | ||
const std::array<uint8_t, WORDS_C * sizeof(word)> c_bytes{0x0d, 0xbb, 0xa7, 0x54, 0x6d, 0x3d, 0x87, 0xdc, 0xaa, 0x70, | ||
0x3a, 0x72, 0x8d, 0x3d, 0x93, 0xde, 0x6f, 0xc9, 0x29, 0x51, | ||
0xb6, 0x24, 0xb1, 0x3b, 0x16, 0xdc, 0x35, 0x83}; | ||
return load_le<std::array<word, WORDS_C>>(c_bytes); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a blunt nerd-sniping attempt: if we would manage to make Botan::hex_decode()
constexpr, this could be:
/// @return a word array for c = 0x8335dc163bb124b65129c96fde933d8d723a70aadc873d6d54a7bb0d | |
consteval std::array<word, WORDS_C> c_words() { | |
// Currently load_le does not work with constexpr. Therefore, we have to use this workaround. | |
const std::array<uint8_t, WORDS_C * sizeof(word)> c_bytes{0x0d, 0xbb, 0xa7, 0x54, 0x6d, 0x3d, 0x87, 0xdc, 0xaa, 0x70, | |
0x3a, 0x72, 0x8d, 0x3d, 0x93, 0xde, 0x6f, 0xc9, 0x29, 0x51, | |
0xb6, 0x24, 0xb1, 0x3b, 0x16, 0xdc, 0x35, 0x83}; | |
return load_le<std::array<word, WORDS_C>>(c_bytes); | |
} | |
consteval std::array<word, WORDS_C> c_words() { | |
return load_le<std::array<word, WORDS_C>>(Botan::hex_decode( | |
"0dbba7546d3d87dcaa703a728d3d93de6fc92951b624b13b16dc3583")); | |
} |
... which would be quite fantastic.
Thanks for your re-review, @reneme! However, it seems that you somehow managed to review the files that were moved (i.e., the files in Regarding:
No, I measured the default static build. |
Mhm, I replied to discussions that were already open, while having started a pending review. Apparently GitHub treats those comments in a funny way; half reply, half free-standing but unanswerable new threads. >.< |
Mhh, on Windows the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One little nit, otherwise I'm happy. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very good! Didn’t have time to finish review but here are some initial comments.
Ed448Point Ed448Point::scalar_mul(const Scalar448& s) const { | ||
Ed448Point res(0, 1); | ||
|
||
// Square and multiply (double and add) in constant time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of room for optimization here if performance becomes an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have any optimizations in mind, let me know 👍 Otherwise, I think the performance is currently acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just that double-and-always-add is (literally) the worst case scenario for a multiplication algorithm since for scalar length l
it performs l
doublings and l
additions. Easy optimization would be a fixed window with a constant time table lookup, eg a 4 bit window you instead do l
doublings and l/4
additions.
Fine to use this if you consider current performance ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds promising! I will leave it with the simple variant for now since the PR is already complex enough. Maybe we can speed it up in a follow-up PR.
Thanks for your comments, @randombit. Any suggestions are welcome :) |
@FAlbertDev thanks. This needs a rebase post #3869 and possibly some history cleanup. I'll do a final review after. |
Done :) |
I'm guessing this caused |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very good, thanks
Done. Should be ready to merge. |
Co-authored-by: René Meusel <[email protected]>
The test can be re-enabled when we support Botan 3 due to: randombit/botan#3933 Signed-off-by: Björn Svensson <[email protected]>
The test can be re-enabled when we support Botan 3 due to: randombit/botan#3933 Signed-off-by: Björn Svensson <[email protected]>
The test can be re-enabled when we support Botan 3 due to: randombit/botan#3933 Signed-off-by: Björn Svensson <[email protected]>
The test can be re-enabled when we support Botan 3 due to: randombit/botan#3933 Signed-off-by: Björn Svensson <[email protected]>
This pull request adds support for the x448 key agreement and the ed448 signature scheme as defined in RFC 7748 and RFC 8032. Requested in #3895.
Pull Request Dependencies
Features
Internals
The algorithms used for modular arithmetic in$GF(p)$ , where $p = 2^{448} - 2^{224} - 1$ , are based on the paper "Reduction Modulo 2^448 - 2^224 - 1" from Nath and Sarkar. These algorithms operate on 64-bit limbs and are implemented in the
Gf448Elem
class.In addition, the Ed448 algorithm requires modular arithmetic on the group order of Curve448. To handle this, a
Scalar448
class has been implemented, providing the necessary operations modulo the group order (L in RFC 8032).Both implementations are designed to be constant-time. As the current big integer implementation is not constant-time for all required operations (especially reduction), the
Gf448Elem
andScalar448
classes are independent of the big integer implementation and directly utilize the mp_core operations. This design may have future potential for an abstractGfElem
class that can be used for other prime fields as well.The test cases for these implementations are based on wycheproof.
Performance