Kyber optimizations #3387
Codecov Report
@@ Coverage Diff @@
## master #3387 +/- ##
=======================================
Coverage 88.12% 88.13%
=======================================
Files 617 616 -1
Lines 70331 70303 -28
Branches 6985 6985
=======================================
- Hits 61978 61960 -18
+ Misses 5424 5401 -23
- Partials 2929 2942 +13
... and 14 files with indirect coverage changes
Nice! Thanks for taking this on. I have added a few minor suggestions.
Out of curiosity: Did you go through the code looking for optimization opportunities or was this the result of some profiling?
m_polynomials(std::move(polynomials)),
m_seed(std::move(seed)),
m_public_key_bits_raw(concat(m_polynomials.to_bytes<std::vector<uint8_t>>(), m_seed)),
m_H_public_key_bits_raw(unlock(m_mode.H()->process(m_public_key_bits_raw)))
Side note: Maybe a Hash_Function::process() with a templated output container would be great to avoid the copy, similar to the new RandomNumberGenerator::random_vec():
Lines 202 to 212 in c810e6c
template<typename T = secure_vector<uint8_t>>
   requires(concepts::contiguous_container<T> &&
            concepts::resizable_container<T> &&
            concepts::default_initializable<T> &&
            std::same_as<typename T::value_type, uint8_t>)
T random_vec(size_t bytes)
   {
   T result;
   random_vec(result, bytes);
   return result;
   }
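A minimal sketch of what such a templated `process()` could look like, written in the same shape as `random_vec()` above. `ToyHash`, `process_into()` and `output_length` are hypothetical names invented for this illustration, and the FNV-1a "digest" is only a stand-in so the example stays short; none of this is Botan's actual API. The `requires` clause is left out so the sketch also compiles before C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a hash object; NOT Botan's API.
// The "digest" is an 8-byte FNV-1a value, chosen only to keep the sketch short.
class ToyHash {
public:
    static constexpr std::size_t output_length = 8;

    // Writes the digest of `in` into a caller-provided buffer.
    void process_into(uint8_t* out, const uint8_t* in, std::size_t len) const {
        uint64_t h = 0xcbf29ce484222325ULL;  // FNV-1a offset basis
        for (std::size_t i = 0; i < len; ++i) {
            h ^= in[i];
            h *= 0x100000001b3ULL;           // FNV-1a prime
        }
        for (std::size_t i = 0; i < output_length; ++i) {
            out[i] = static_cast<uint8_t>(h >> (8 * i));
        }
    }

    // Templated on the output container: the caller picks the type and the
    // digest is written straight into it, so no conversion copy is needed.
    template <typename T = std::vector<uint8_t>>
    T process(const std::vector<uint8_t>& in) const {
        T result;
        result.resize(output_length);
        process_into(result.data(), in.data(), in.size());
        return result;
    }
};
```

With something along these lines, the caller would name the container it wants as the template argument, and a conversion step such as the `unlock()` call in the initializer quoted earlier would presumably no longer be needed.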
Yes, I definitely had that in mind for a follow-up.
This was sparked by seeing this PR adding X25519+Kyber key exchange for TLS, ziglang/zig#14920, where the author quotes numbers for Kyber which were much better than reported by

Using his X25519 numbers versus what I see locally as a scale, Kyber encryption is now as fast as the Zig implementation. However, decryption is still significantly slower. I'm still confused on that point, because that PR quotes decryption as faster than encryption. But

Edit: I realized I did not actually answer your question. This work was all based on profiling with
Encapsulation recomputes H(pk) whereas decapsulation doesn't. Yeah, Kyber is so fast that such a short hash makes a difference.
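The point about H(pk) can be illustrated with a small sketch: hash the encoded public key once, when the key object is built, and let every later encapsulation read the stored value. All names here (`ToyPublicKey`, `toy_hash`, `encapsulate_step`, `g_hash_calls`) are hypothetical, and the 64-bit FNV-1a function is only a placeholder for the real H. The call counter exists purely to make the saving visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Counts hash invocations so the saving is observable.
std::size_t g_hash_calls = 0;

// Toy 64-bit FNV-1a standing in for the real H (an assumption for brevity).
uint64_t toy_hash(const std::vector<uint8_t>& data) {
    ++g_hash_calls;
    uint64_t h = 0xcbf29ce484222325ULL;
    for (uint8_t b : data) {
        h ^= b;
        h *= 0x100000001b3ULL;
    }
    return h;
}

// Hypothetical public key type: hashes its encoding once, at construction.
class ToyPublicKey {
public:
    explicit ToyPublicKey(std::vector<uint8_t> raw)
        : m_raw(std::move(raw)), m_hash(toy_hash(m_raw)) {}

    const std::vector<uint8_t>& raw() const { return m_raw; }
    uint64_t cached_hash() const { return m_hash; }

private:
    std::vector<uint8_t> m_raw;  // declared first, so it is initialized before m_hash
    uint64_t m_hash;
};

// Stand-in for the part of encapsulation that needs H(pk): it reads the
// cached value instead of hashing the key bytes again.
uint64_t encapsulate_step(const ToyPublicKey& pk, uint64_t msg) {
    return pk.cached_hash() ^ msg;
}
```

The trade-off is a few extra bytes per key object in exchange for skipping one hash per encapsulation, which only pays off when a key is used more than once.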
You're missing this trick which the Zig implementation uses.
b90c027 to 1123636
This amortizes the overhead of the virtual call and the stream cipher's buffering logic.
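As a rough illustration of the amortization this commit message describes, the sketch below compares fetching keystream one byte per virtual call with fetching it in blocks and consuming from a local buffer. `KeyStream`, `CounterStream`, `sample_bytewise` and `sample_buffered` are hypothetical names, the counter "cipher" is a toy, and the 64-byte block size is an arbitrary choice; the actual change in the PR may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical keystream interface; the virtual call is the cost being amortized.
class KeyStream {
public:
    virtual ~KeyStream() = default;
    virtual void write_keystream(uint8_t* out, std::size_t len) = 0;
};

// Toy generator (a wrapping byte counter), only so the sketch runs. Not a real cipher.
class CounterStream final : public KeyStream {
public:
    void write_keystream(uint8_t* out, std::size_t len) override {
        ++calls;
        for (std::size_t i = 0; i < len; ++i) {
            out[i] = m_next++;
        }
    }
    std::size_t calls = 0;  // number of virtual calls made so far

private:
    uint8_t m_next = 0;
};

// Rejection-samples `count` values below `bound`, one virtual call per byte.
std::vector<uint8_t> sample_bytewise(KeyStream& ks, std::size_t count, uint8_t bound) {
    std::vector<uint8_t> out;
    while (out.size() < count) {
        uint8_t b = 0;
        ks.write_keystream(&b, 1);
        if (b < bound) out.push_back(b);
    }
    return out;
}

// Same sampling, but pulls keystream in 64-byte blocks into a local buffer.
std::vector<uint8_t> sample_buffered(KeyStream& ks, std::size_t count, uint8_t bound) {
    std::vector<uint8_t> out;
    uint8_t buf[64];
    while (out.size() < count) {
        ks.write_keystream(buf, sizeof(buf));
        for (std::size_t i = 0; i < sizeof(buf) && out.size() < count; ++i) {
            if (buf[i] < bound) out.push_back(buf[i]);
        }
    }
    return out;
}
```

Both functions accept the same bytes in the same order, so their results match for a fresh stream. The buffered variant throws away any unread bytes of its last block, which is only acceptable when the stream is not reused afterwards.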
Co-authored-by: René Meusel <[email protected]>
@randombit I took the liberty of adapting this to the now-merged #3297 and #3294. Due to merge conflicts, I needed to force-push.
In aggregate, these changes improve Kyber performance by between 1.5x and 2.5x.
cc @boricm @reneme