Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: low entropy mac passphrase in cipher seed (see issue #4182) #4296

Conversation

jorgeantonio21
Copy link
Contributor

Description
---The following is an attempt to improve CipherSeed mnemonic generation by tackling MAC being keyed directly with a low entropy passphrase. We use proper domain separation to attain this.

Motivation and Context
--- The generation of MAC, within the context of a CipherSeed instance, is obtained through keying a (low) entropy passphrase. In order to reduce the chances of success of an attack involving offline key pre-computation, it is desirable to hash the passphrase, before MAC keying, using proper domain separation. The current PR is an attempt in this direction.

How Has This Been Tested?
--- With previous unit tests.

@jorgeantonio21
Copy link
Contributor Author

Currently, we use a Blake256 to generate domain separation. It might be desirable to use some other hashing algorithm, for purposes I don't understand.

Also feel free to comment below, as many of these concepts are not completely understood by the creator of the current PR.

Copy link
Collaborator

@SWvheerden SWvheerden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change a bunch of &[T] to Vec?

dan_layer/core/src/services/checkpoint_manager.rs Outdated Show resolved Hide resolved
dan_layer/core/src/services/checkpoint_manager.rs Outdated Show resolved Hide resolved
@hansieodendaal
Copy link
Contributor

Hi @jorgeantonio21, maybe you can squash all commits and rebase? There seems to be many unrelated changes in the code changes.

@jorgeantonio21 jorgeantonio21 force-pushed the ja-mac-keyed-directly-with-low-entropy-passphrase branch from d9b4d3b to ab486ad Compare July 11, 2022 12:48
@jorgeantonio21
Copy link
Contributor Author

Hi @jorgeantonio21, maybe you can squash all commits and rebase? There seems to be many unrelated changes in the code changes.

Hi @hansieodendaal, thanks for the advice. I did rebase and the merge conflicts seem to be tackled.

Comment on lines 277 to 313
if salt.len() > 16 || persona.len() > 16 {
return Err(KeyManagerError::DecryptionFailed);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This limitation is not required anymore with the use of the hashing API

error::KeyManagerError,
mnemonic::{from_bytes, to_bytes, to_bytes_with_language, Mnemonic, MnemonicLanguage},
};

const CIPHER_SEED_VERSION: u8 = 0u8;
const CIPHER_SEED_PERSONA: &str = "CIPHER_PERSONA";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const CIPHER_SEED_PERSONA: &str = "CIPHER_PERSONA";
const CIPHER_SEED_PERSONA: &str = "com.tari.key_manager.cipher_seed";

Copy link
Collaborator

@CjS77 CjS77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed Birthday module primarily

let days = u64::try_from(chrono::Utc::now().timestamp()).unwrap() / SECONDS_PER_DAY;
let birthday = u16::try_from(days).unwrap_or(0u16);
CipherSeed::new_with_birthday(birthday)
let birthday_data = Birthday::default();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use Birthday instead of tearing it apart?

Comment on lines 123 to 125
let version = birthday_data.version();
let birthday = birthday_data.birthday();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of all this, just store a birthday instance

pub struct Birthday {
zero_point_time: u64,
birthday: u16,
version: u8,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way version is calculated seems very weird. Why is the version dependent on the current time??

There are no docs to explain this.

edit: Ok, it's just named in a very misleading way with no docs. See comments below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a more detailed comment below.

Comment on lines 37 to 40
pub fn default() -> Self {
let network = Network::Dibbler;
Self::new(network)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

impl Default!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Self::new_from_current_time(network, current_time)
}

fn new_from_current_time(network: Network, current_time: u64) -> Self {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn new_from_current_time(network: Network, current_time: u64) -> Self {
fn new_from_time(network: Network, time: u64) -> Self {

if it was the current time you were using, you wouldn't need to pass the time in

Copy link
Contributor Author

@jorgeantonio21 jorgeantonio21 Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I only implemented this way for testing purposes. I am going to refactor it conveniently.


let mut zero_point_time = Self::get_network_genesis_time(network);

let days = (current_time - zero_point_time) / SECONDS_PER_DAY;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would. be more idiomatic to use Duration and Instant


let days = (current_time - zero_point_time) / SECONDS_PER_DAY;
let birthday = u16::try_from(days % PERIOD_LENGTH).unwrap();
let version = u8::try_from(days / PERIOD_LENGTH).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

????

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhh! This is an epoch counter.

Please rename it some other than version (epoch / period for example), and provide a doc string explaining what it is for.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then wouldn't it be simpler to make the birthday a u32/u64 than manually reconstruct what is effectively a u24?

Copy link
Contributor Author

@jorgeantonio21 jorgeantonio21 Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to add docs. When I implemented the code, I had in mind the third suggestion described here.

From what I understood, the version field in Birthday should really work as an epoch counter. This would be used later on to index the version in CipherSeed. Following this approach, versioning is epoch-dependent.

Do you think this approach makes sense ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If one decides to keep the versioning for CipherSeed to be dependent on the epoch. Another approach would be to migrate the version computation to the CipherSeed impl, thus make it more explicit the use case we have in mind.

Copy link
Collaborator

@AaronFeickert AaronFeickert Jul 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent of the overall version field should be an unambiguous identification of all parts of the protocol that all software agrees on. So if you were to change Argon2 parameters or migrate to a different MAC algorithm, you'd define a new version; any software that knows about it will do the right thing, and any software that doesn't will simply produce an error.

I don't, however, think it's a good idea to attempt to compute a dynamic version number from the birthday. The intent of the comment in this issue was that once enough time passes such that a given birthday could potentially overflow, then you (or whoever is doing development that far into the future!) would manually define a new version that accounts for this. Doing it dynamically feels like a risky approach that could cause conflicts with any non-birthday protocol changes requiring new versions.

let mut zero_point_time = Self::get_network_genesis_time(network);

let days = (current_time - zero_point_time) / SECONDS_PER_DAY;
let birthday = u16::try_from(days % PERIOD_LENGTH).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason it's not (days % PERIOD_LENGTH) as u16?

Copy link
Contributor Author

@jorgeantonio21 jorgeantonio21 Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much more readable this way, I am going to change it.

let birthday = u16::try_from(days % PERIOD_LENGTH).unwrap();
let version = u8::try_from(days / PERIOD_LENGTH).unwrap();

zero_point_time += PERIOD_LENGTH * u64::from(version);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer PERIOD_LENGTH * (version as u64) but simply stylistic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely so ! I am going to change it :)

Comment on lines 82 to 84
pub fn get_network_genesis_time(network: Network) -> u64 {
get_genesis_block(network).block().header.timestamp.as_u64()
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to not have the dependency on tari_core::blocks::genesis_block at all.
Better to pass in the genesis time as a parameter to the constructor. Then this module is more independent / decoupled. Then you ALSO can drop the dependence on Network it looks like. Bonus!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is definitely a possibility. However, I don't feel confident that the birthday genesis time might depend on the API use. Since we are primarily interested in using Birthday to set up a timestamp to scan the blockchain, it might be good to have a consistent genesis time for birthday. I am possibly missing something in your commentary.

Moreover, following issue, it is referred that one should have an hard-coded birthday genesis time, and it is suggested to use the timestamp of the genesis network. I also really want to remove the dependency on the Network, as well. One could just hard-code a recent date (like now) and set it for the future use of the Birthday type.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind hardcoding all parameters is to ensure that all implementations and installations agree on them unambiguously. Then you don't need to worry about differences in defaults or other computed parameters; everyone is working from the same playbook.

@jorgeantonio21 jorgeantonio21 force-pushed the ja-mac-keyed-directly-with-low-entropy-passphrase branch 2 times, most recently from 59c1d3a to 11ffe02 Compare July 15, 2022 08:14
Copy link
Contributor

@hansieodendaal hansieodendaal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just the failing "test key manager wasm" tests left to fix

@jorgeantonio21 jorgeantonio21 force-pushed the ja-mac-keyed-directly-with-low-entropy-passphrase branch from b26b7f9 to b842bdc Compare July 15, 2022 12:22
@@ -65,7 +72,15 @@ pub const CIPHER_SEED_MAC_BYTES: usize = 5;
/// checksum 4 bytes
///
/// In its enciphered form we will use the MAC-the-Encrypt pattern of AE so that the birthday and entropy will be
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, MAC-then-encrypt is in general not a safe approach to authenticated encryption. It can be shown to be safe under certain designs, like a stream cipher with a strong MAC, but it isn't clear that the truncated hash function approach here strictly meets this requirement.

Comment on lines 75 to 76
/// encrypted. Moreover, instead of generating the MAC directly using the user provided low entropy passphrase, we
/// shall instead produce a 32-byte hashing of it. In reality, such 32-byte hashing is obtained by applying Argon2,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's much more than just a hash of the password, which would not be safe for this purpose. The intent is to use a password-based key derivation function to generate a derived key of higher effective entropy through the use of a carefully-designed function like Argon2 that's built for this purpose. Then we use separate portions of that derived key for the MAC and for ChaCha20 encryption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the clarification here, Aaron. I reused part of your explanation in the documentation, I hope it makes more clear the goal of the current refactor.

.chain(passphrase.as_bytes())
.finalize_variable(|res| hmac.copy_from_slice(res));
// generate the hash MAC
let mut hmac = Self::generate_hmac(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity I would rename this, since HMAC has a specific definition that isn't being used here. In this design the MAC is generated from a hash function, but it's not strictly HMAC.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rename it simply to mac.

let blake2_nonce_hasher: VarBlake2b = VarBlake2b::new(size_of::<Nonce>())
.expect("Should be able to create blake2 hasher; will only panic if output size is 0 or greater than 64");

let mut encryption_nonce = [0u8; size_of::<Nonce>()];

// TODO: use hashing API to generate suitable domain separation instead
let label = b"chacha20_salt";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this domain separation label too generic? Other separators elsewhere are more detailed to avoid any accidental reuse elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely agree. I changed this to, following the current convention, com.tari.key_manager.cipher_seed.chacha20_enc. Similarly for the MAC generation label.

blake2_nonce_hasher
.chain(label)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A safer approach (and one used by the hashing API) is to ensure that any variable-length input to the underlying hash function be prepended with a fixed-length encoding of the input's length. This avoids accidental or malicious collisions. If you're taking a belt-and-suspenders approach, it's best to do this even for fixed-length inputs, in case they change later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored the code to use the hashing API and take the first n byte elements, depending on the use case.

Comment on lines 261 to 266
let derived_encryption_key: Vec<u8> = Self::generate_domain_separated_passphrase_hash(passphrase, salt)?
.into_iter()
.rev()
.take(32)
.rev()
.collect();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a more straightforward and/or more idiomatic way to do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored it to use slicing, it makes the code cleaner and simpler to read. Thanks !

passphrase: &str,
) -> Result<Vec<u8>, KeyManagerError> {
// birthday should be 2 bytes long
if birthday.len() != 2 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to avoid the use of a magic number and move this length to a constant, in case it changes in a future version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch here :) !

Comment on lines 312 to 318
// we take the first 32 bytes of the generated derived encryption key for MAC generation, see documentation
let passphrase_key: Vec<u8> = Self::generate_domain_separated_passphrase_hash(passphrase, salt)?
.into_iter()
.rev()
.take(32)
.rev()
.collect();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this take the last 32 bytes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, definitely a bug. Corrected !

let mut hmac = [0u8; CIPHER_SEED_MAC_BYTES];

blake2_mac_hasher
.chain(DOMAIN_SEPARATION_LABEL.as_bytes())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted elsewhere, ideally this should prepend the length of the label using a fixed-length encoding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored the code to use the hashing API, this safety check should be taken care by the API.

@jorgeantonio21 jorgeantonio21 force-pushed the ja-mac-keyed-directly-with-low-entropy-passphrase branch 2 times, most recently from 44ec4da to 8ad0b78 Compare July 18, 2022 07:42
Copy link
Contributor

@hansieodendaal hansieodendaal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, looking better and better. I have added a suggestion for using the MacDomain when generating the mac

base_layer/key_manager/src/cipher_seed.rs Show resolved Hide resolved
// we take the first 32 bytes of the generated derived encryption key for MAC generation, see documentation
let passphrase_key = Self::generate_domain_separated_passphrase_hash(passphrase, salt)?[..32].to_vec();

let mac = DomainSeparatedHasher::<Blake256, GenericHashDomain>::new(DOMAIN_SEPARATION_LABEL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the MacDomain from the hasing API:

Suggested change
let mac = DomainSeparatedHasher::<Blake256, GenericHashDomain>::new(DOMAIN_SEPARATION_LABEL)
// TODO: Replace with generic `HashingDomain::mac_hasher` implementation
let mac = Self::mac_hasher::<Blake256>(DOMAIN_SEPARATION_LABEL)

@jorgeantonio21 jorgeantonio21 force-pushed the ja-mac-keyed-directly-with-low-entropy-passphrase branch from 3e35799 to d6bbab4 Compare July 18, 2022 12:52
Comment on lines 324 to 331
let mac = Self::mac_hasher::<Blake256>(DOMAIN_SEPARATION_LABEL)
.chain(birthday)
.chain(entropy)
.chain(cipher_seed_version)
.chain(passphrase_key.as_slice())
.finalize()
.into_vec();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing the salt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally my fault and a problematic bugs. Thanks for taking notice of it, I corrected accordingly!

@jorgeantonio21 jorgeantonio21 force-pushed the ja-mac-keyed-directly-with-low-entropy-passphrase branch from 2f2d280 to b623f6c Compare July 18, 2022 22:41
hansieodendaal
hansieodendaal previously approved these changes Jul 19, 2022
Copy link
Contributor

@hansieodendaal hansieodendaal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

…C. Add domain separation for the generation of salt's and keys and applied the principle of not reusing the same key twice. Also add cipher attack tests, in which attacker changes either salt, entropy or birthday along with appropriate checksum. Use of hashing API for domain separation
hansieodendaal
hansieodendaal previously approved these changes Jul 26, 2022
Copy link
Contributor

@hansieodendaal hansieodendaal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - one naming conflict

/// Usage:
/// let hash = comms_dht_hash_domain().digest::<Blake256>(b"my secret");
/// etc.
pub fn comms_dht_hash_domain() -> HashingDomain {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn comms_dht_hash_domain() -> HashingDomain {
pub fn base_layer_key_manager() -> HashingDomain {

@@ -11,3 +11,29 @@ pub mod mnemonic_wordlists;
#[allow(clippy::unused_unit)]
#[cfg(feature = "wasm")]
pub mod wasm;

use tari_common::hashing_domain::*;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer something more like this:

Suggested change
use tari_common::hashing_domain::*;
hash_domain!(KeyManagerHashDomain, "com.tari.keymanager");
pub fn create_hasher<D:Digest>(label: &'static str) -> DomainSeparatedHasher<....> {
DomainSeparatedHasher::new::<D>(label)
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or using the create_hasher!() macro directly

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or finally, one method for each use case...

Suggested change
use tari_common::hashing_domain::*;
hash_domain!(KeyManagerHashDomain, "com.tari.keymanager");
pub fn create_passphrase_hasher() -> DomainSeparatedHasher<VarBlake, KeyManagerHashDomain> {}
pub fn create_cipher_seed_mac_hasher() -> Domain....

Copy link
Collaborator

@stringhandler stringhandler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy, but will wait for confirmation from @AaronFeickert before merging

@stringhandler stringhandler merged commit 1c5ec0d into tari-project:development Aug 2, 2022
@AaronFeickert
Copy link
Collaborator

Happy, but will wait for confirmation from @AaronFeickert before merging

A bit delayed (my apologies), but utACK.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants