Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate some design tweaks over Crypto Implementation #1252

Closed
wants to merge 45 commits into from

Conversation

Farhad-Shabani
Copy link
Member

As a reference for suggestion made in #1238

  • Referenced an issue explaining the need for the change
  • Updated all relevant documentation in docs
  • Updated all code comments where relevant
  • Wrote tests
  • Added entry in .changelog/

blasrodri and others added 30 commits November 17, 2022 15:36
It has been deprecated for some time.
The API needs to be abstract, so make p256k1 signature
an associated type of CryptoProvider.
Guarded by the rust-crypto feature, this is an implementation of
CryptoProvider with pure Rust crates.
Move CryptoProvider definition out of the crypto module into
the provider sub-module. Rename default_provider to default.
Add generic methods Header::hash_with and ValidatorSet::hash_with
enabling a custom CryptoProvider to be plugged in for calculating
hashes.
The .hash() methods, implemented with the DefaultCryptoProvider,
are feature-gated behind "rust-crypto".
2usize.next_power_of_two() / 1 == 1, and we have eliminated the other
two explicitly matched cases at the call site, so the non-catchall
match branches and the panics are dead code.
The Hasher trait is obviated by CryptoProvider.
Also, we could do with less dynamic dispatching.
Instead of a super-trait whose sole purpose is to bind down some
associated types that provide the actual functionality, provide:
- Sha256, a purpose-specific trait for SHA256 hashing that has
  a more human-friendly interface than rust-crypto.
- Nothing else for signing and verifying! These are covered by the
  signature framework, and it's easy to plug into that as the
  alt_crypto test demonstrates.

The crypto::default module, gated by the "rust-crypto" feature,
provides aliases for pure Rust implementations.
An alt implementation would not be able to reuse the signature
type from k256.
Remove the only purpose for using a hasher in the implementation
by returning the set of validators in
VerificationErrorDetail::FaultySigner.

CommitValidator can now revert to being a (weird) default-implemented
trait, with a ProdCommitValidator to anchor the implementation.
A critical code block was hastily commented out in this
"unstable" part of code.
Use the Sha2 alias from crypto::default instead of the direct
references to sha2 crate. All code that used Sha2 non-generically
is feature-gated behind "rust-crypto".
The host API for obtaining SHA256 digests on Substrate is a simple
function, it cannot work incrementally. Change the Sha256 trait to
match this, but provide a MerkleHash trait to retain incremental
Merkle hashing with Rust-Crypto conformant digest implementations.
The merkle::NonIncremental adapter type is provided to fit the lowest common
denominator Sha256 API to a Merkle hash implementation, at the
cost of some allocations and extra copying.
Also rename crypto::default::ecdsa_secp256 to ecdsa_secp256k1,
to harmonize the naming with the feature that gates this module.
@Wizdave97
Copy link

This is a much better implementation for substrate chains

Copy link
Contributor

@mzabaluev mzabaluev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments, mostly flak. I think we'll need to come up with a minimal set of changes that make sense and reduce coupling between implementations and specific hash functions.

@@ -50,16 +50,15 @@ tendermint-proto = { version = "0.28.0", default-features = false, path = "../pr
time = { version = "0.3", default-features = false, features = ["macros", "parsing"] }
zeroize = { version = "1.1", default-features = false, features = ["zeroize_derive", "alloc"] }
flex-error = { version = "0.4.4", default-features = false }
sha2 = { version = "0.10", optional = true, default-features = false }
sha2 = { version = "0.10", default-features = false }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This effectively reverts part of the changes in #1238.

One of the objectives of the whole exercise, as I understood them, is to make the pure Rust crypto implementations replaceable and be able to exclude them from the end artefact (e.g. the wasm package).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, comparatively, it doesn't buy much. I looked it as a side objective

/// Implementation of Merkle tree hashing for Tendermint.
pub trait MerkleHash {
pub trait MerkleHash: DigestSha256 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why make Merkle hashing dependent on the particular digest algorithm?
It can be independent, and you will be able to say "I want a Merkle hasher that uses the SHA2 256 digest algorithm" with the MerkleHash + DigestSha256 bounds.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It already do so. With the DigestSha256 trait (included types from higher-level HashType trait), basically you can introduce any type of hashing functions. but I caught the issue. It should be renamed to make more sense (e.g.as DigestHash)

Comment on lines 30 to 32
let mut buf = Vec::with_capacity(1 + bytes.len());
buf.push(0x00);
buf.extend_from_slice(bytes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This default implementation allocates memory to compute interim hashes.
The blanket implementation in the #1238 base does not. This is by design: we don't need to cut to the dumbest common denominator hashing API if we can get a better performing implementation from anything that can work incrementally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did so, looking for a minimal set of changes compared to the orig files and hoping this be covered in a separate PR for this purpose. but Right, if there is a demand and a performance matter, would be better to get it back and also have it here.


// tmhash(0x01 || left || right)
// Pre and post-conditions: the hasher is in the reset state
// before and after calling this function.
fn inner_hash(&mut self, left: Hash, right: Hash) -> Hash;
fn inner_hash(&mut self, left: Hash, right: Hash) -> Hash {
// This is why non-incremental digest APIs are daft.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also why there should not be a default implementation doing this.

/// Compute a simple Merkle root from vectors of arbitrary byte vectors.
/// The leaves of the tree are the bytes of the given byte vectors in
/// the given order.
fn simple_hash_from_byte_vectors(&mut self, byte_vecs: &[Vec<u8>]) -> Hash {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be converted to a method?
With the freestanding function, we can't misuse the API by e.g. calling it on a digest object with non-reset digest state.
simple_hash_from_byte_vectors is a one-stop-shop function that creates the digest state internally, hashes the input and returns the result.

Copy link
Member Author

@Farhad-Shabani Farhad-Shabani Dec 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, I had looked at it from the opposite angle, but with the same concern. Having a bare function, so Merkle hasher could be triggered by a fake similar one. I encapsulated it to operate only on that data. Btw, I dropped it, since I found out MerkleHash can also be invoked through hash_byte_vectors() method w/o needing simple_hash_from_byte_vectors

pub struct NonIncremental<H>(PhantomData<H>);

impl<H> Default for NonIncremental<H> {
pub struct DefaultMerkleHash;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is wrong, we specifically don't want a stateless function to be the basis for default hashing implementation, when in most programming environments we can use the sha2 crate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t know if I catch you correctly, but as it stands here, it is bounded by sha2::Sha256

Comment on lines 21 to 24
fn empty_hash(&mut self) -> Hash {
// Get the output of an empty digest state.
Self::digest_sha256(&Vec::with_capacity(0))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why bake the implementation into the trait?
This pattern is used elsewhere, for example, in VerificationPredicates, but I don't think it's needed here.

@@ -154,7 +97,7 @@ mod tests {
let empty_tree_root = &hex::decode(empty_tree_root_hex).unwrap();
let empty_tree: Vec<Vec<u8>> = vec![vec![]; 0];

let root = simple_hash_from_byte_vectors::<Sha256>(&empty_tree);
let root = DefaultMerkleHash.simple_hash_from_byte_vectors(&empty_tree);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks non-idiomatic, the object instance has no purpose here, as explained above.

Comment on lines 84 to 86
// Feed the data to the hasher, prepended with 0x00.
Digest::update(self, [0x00]);
Digest::update(self, bytes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note how this does not allocate any memory temporarily or copy data around. This is what we want most of our users to actually use, while the few platforms where the NonIncremental wrapper is needed could be kept as integration freak cases.

@Farhad-Shabani
Copy link
Member Author

In light of the need Mikhail aptly raised for incremental hashing and the comments mostly arose because of losing it, I've reverted that part.
Here, just remains the suggestions you may want to consider, including the refactored parts for verifying signatures verification through an interface.

@tony-iqlusion
Copy link
Collaborator

@Farhad-Shabani I don't understand what advantage that signature verification interface is supposed to provide over signature::Verifier and similar other signature verification traits in the signature crate.

It seems needlessly convoluted, and also has the disadvantage that it isn't natively supported by commonly used Rust ecosystem cryptography crates.

@Farhad-Shabani
Copy link
Member Author

@tony-iqlusion agree with you on this. Actually, I've tried it first, but having PublicKey as enum, supporting different key/signature variants made it tricky to refactor and provide the needed interface for validator/Info.
But, thanks bringing this up, made me rethought and came up with a new idea. Check that please. Now the validator/Info can provide the generic interface through verify_with method. if it works overall, it can be fastened

Base automatically changed from mikhail/add-crypto-provider to main January 31, 2023 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants