-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Apply Soter KDF after PBKDF2 for passphrases #640
Conversation
Export a bunch of functions from sym_enc_message.c which are used by the symmetric key API to perform ZRTP-style Soter KDF based on Secure Cell message length and user-provided associated context. This API is not very convenient due to compatibility reasons, but it's okay for an internal API to force its users to deal with themis_auth_sym_kdf_context() and its friends. Note that secure_cell_seal_passphrase.c already has a function named themis_auth_sym_derive_encryption_key() which has a similar goal, but uses passphrase-based derivation algorithm. Rename it with a customary "passphrase" suffix to avoid ambiguity and naming conflicts.
Since we are going to use passphrase to derive a prekey, we should not depend on the default encryption algorithm choice. Instead, always derive full 32-byte prekey from passphrase with PBKDF2. After that we will use Soter KDF to truncate the key to the length suitable for the AES flavor in use. soter_alg_key_length() verification will also be performed at that point to ensure that Secure Cell header is not corrupted.
Instead of using PBKDF2 output directly, pass the resulting prekey through an additional round of Soter KDF to associate the encryption key with user-provided associated context as well as the encrypted message length, similar to how symmetric key API does this. This changes the encryption algorith, but it's fine to do this before public relese. This is the only chance we have to do this update without it becoming a massive backward-compatibility pain. The change slightly improves security of the encryption and makes the algorithms a bit more compatible and similar in structure. However, this does not change the API or data format in the slightest.
Some of the tests process preencrypted data to make sure we can decrypt something encrypted earlier, possibly by a differen wrapper. Update the hardcoded values since the encryption algorithm has changed a bit. Themis Core tests check that we can handle different key lengths. Other language test verify mostly handling of the passphrase encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's weird to see such huge performance drop after adding one Soter KDF call 🤔 We assume that it has only one round, right?
Alrighty. I realized that I do suck at benchmarks: I've been running them on a laptop with CPU frequency scaling enabled. However... After having the CPUs run at their full 3.4 GHz glory and collecting 50 samples instead of 10 (urgh, 15 minutes for one benchmark run) — I still see consistent 6-8% drop in performance. Benchmark outputBefore:
After:
No serious ideas at the moment:
It would be nice if someone could reproduce the experiment. How to benchmark# Prepare repo
git clone https://github.com/cossacklabs/themis.git
cd themis
git remote add ilammy https://github.com/ilammy/themis.git
git fetch ilammy
# Build and install master
git checkout master
make
sudo make install
# Build benchmarks
cd benches/themis
cargo bench --no-run
# Wait for Rust to build stuff...
# Run benchmarks (master)
cargo bench -- --sample-size 10 passphrase
# Wait for benchmarks to complete...
# Build and install this branch
git checkout zrtp-goes-in-every-field
cd ../..
make
sudo make install
cd -
# Run benchmarks again
cargo bench -- --sample-size 10 passphrase
# Wait for benchmarks to complete...
Yeah, it should. There are basically I haven't done any profiling yet so I have no idea where it spends time. |
Yknow, maybe that's soter_wipe(). Here are top ten functions that we're spending time in when running benchmarks, based on
OPENSSL_cleanse() is the one used by soter_wipe(). What are others, I don't know yet. But EVP_MD_CTX_copy_ex() looks suspicious. |
In progress.. |
Benchmarks on my machine (openssl 1.1.1g, 10 passphrases):
Benchmark outputBefore
After
|
Can confirm on my macOS laptop there's also no statistically significant change in performance either. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Okay, I've done a couple more attempts at measuring with clean rebuilds. This time I've been getting inconclusive results which are fluctuating around. So I guess it is fine, that first time was a spike. Unless @Lagovas comes out with some surprises, I'd be merging this tomorrow morning. |
Ubuntu
Ubuntu 18.04
Archlinux didn't stored openssl/linux versions, look tomorrow Archlinux
|
Alrighty. There seems to be some unexplained performance drop, but it’s not clear how exactly it’s linked to the implementation. Since this is a new feature, it won’t be a regression so I guess it’s okay to merge this as is, then do detailed profiling and optimizations later. I’ve noted this in out internal task tracker, I hope we’ll get back to this issue at some point. |
Instead of using PBKDF2 output directly, pass the resulting prekey through an additional round of Soter KDF to associate the encryption key with user-provided associated context as well as the encrypted message length, similar to how symmetric key API does this. Now if different associated context is used for each individual Secure Cell, the encryption key will be different even if PBKDF2 salt somehow ends up the same.
Compatibility considerations
This changes the encryption algorithm, but it's fine to do this before public release. This is the only chance we have to do this update without it becoming a massive backward-compatibility pain.
The change slightly improves security of the encryption and makes the algorithms a bit more compatible and similar in structure. However, this does not change the API or data format in the slightest.
Benchmarks
PBKDF2 is the main time sink in passphrase API. It is expected that 1 additional round of Soter KDF (basically, one HMAC-SHA-256 computation) will not have any noticeable effect on the performance of passphrase encryption and decryption, given that we already do 200,000 rounds for PBKDF2.
Here are measurements before and after this change, on my machine with 4 x Intel Core i5-8250U, using Linux with generic OpenSSL 1.1.1g, running only the benchmarks:
Benchmark output
Before:
After:
Well, this is intredesting 🤔 But not very relevant, given the noise brackets.
It doesn't seem like that much. I'd expect some variation, but not a consistent performance drop across all input sizes. I guess I suck at microbenchmarks and by environment is noisy. I'll keep looking into it.
Nevertheless, it's great that we have benchmarks at all.
Checklist
Changelog is updated(nothing notable happens here)Original passphrase encryption implemented in Passphrase-based API of Secure Cell #577, this PR cross-reference is enough.