Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from SHA-512 to BLAKE3 #646

Merged
merged 4 commits into from
Mar 9, 2020
Merged

Conversation

GeorgeHahn
Copy link
Contributor

BLAKE3 is designed to be a very high performance cryptographic hash. The BLAKE3 team has shown significantly higher single-thread performance than SHA-512 on modern server hardware [1].

This change resulted in a minor improvement to observed local build times.

Comparison

cargo check on sccache repo (at commit 32e40bb) with a warm cache for each hash. Ten samples were collected for each hash.

SHA512 mean: 60.06 seconds
BLAKE3 mean: 57.77 seconds

This data shows a mean improvement of 2.29 seconds (about 4%). My statistics skills are rough, but this appears to be a very statistically significant improvement.

This test was performed on an Intel i5-7600K and a fairly ancient SATA3 SSD. It would be interesting to see a test result from a higher performance machine. The BLAKE3 team's performance analysis shows 8.5x faster single threaded performance vs SHA-512 on an AWS c5d.metal instance.

Raw data sha512 initial cold build: 1m 47s
blake3 initial cold build: 1m 52s
sha512 build times: 57.02s, 1m 03s, 1m 02s, 58.72s, 1m 00s, 59.48s, 1m 02s, 1m 01s, 58.54s, 58.85s
blake3 build times: 54.87s, 59.82s, 56.72s, 56.91s, 1m 00s, 56.46s, 57.85s, 57.13s, 59.35s, 58.57s

Related issues & PRs:

References:

  1. Blake3 repo
  2. Blake3 paper

BLAKE3 is designed to be a very high performance cryptographic hash. The
BLAKE3 team has shown 8.5x higher single-thread performance than SHA-512
on modern server hardware (AWS `c5.metal`). This change did not result
in a significant improvement to my observed local build times, but
newer hardware may see a meaningful improvement.

Signed-off-by: George Hahn <[email protected]>
@luser
Copy link
Contributor

luser commented Jan 17, 2020

Per the other discussions I'm generally supportive of this. Two minor notes:

  1. You should be able to make ring an optional dependency with this change since only the gcs feature will rely on it. That would fix some of the issues that motivated other people to look into replacing ring (as mentioned in Use RustCrypto crates #310).
  2. For completeness' sake you probably ought to bump the instances of CACHE_VERSION to make it very clear that the keys are incompatible, although given that the digest length will change maybe this is overkill:
    pub const CACHE_VERSION: &[u8] = b"7";

    const CACHE_VERSION: &[u8] = b"4";

@GeorgeHahn
Copy link
Contributor Author

Done & done, thanks for the comments.

@chmanchester chmanchester self-requested a review January 18, 2020 00:09
@froydnj
Copy link
Contributor

froydnj commented Jan 22, 2020

An actual speedup from this switch sounds fantastic!

Just to play devil's advocate for a bit: my understanding of why SHA-512 was used is that we wanted something with actual collision resistance, and so we used a cryptographically secure hash instead of something like MurmurHash, xxhash, etc. etc.

But now we're going to replace SHA-512 with a hash that was published ~months ago and hasn't been subjected to analysis outside of the designers (AFAICT; I see the paper does include some results for BLAKE2 and why the designers felt comfortable with differing parameters for BLAKE3 based on the BLAKE2 results, but this is not quite the same thing as analysis of BLAKE3 itself)? So we're replacing a known-good hash function with one that hasn't been subjected to nearly as much scrutiny?

I don't think this means that we'd reject this pull request, but I'd like us to be clear on the tradeoffs we're making.

@jrmuizel
Copy link
Contributor

For my curiosity, what's the threat model for a collision attack? A shared untrusted object cache? Do we include a hash of the entire toolchain in our hashes?

@luser
Copy link
Contributor

luser commented Jan 22, 2020

Do we include a hash of the entire toolchain in our hashes?

For C compiles we only include a hash of the compiler binary but for Rust compiles we include a hash of all the shared libraries under $SYSROOT/lib (since the rustc binary is a shim in the rustup case, and a tiny wrapper around the implementation in the shared libraries in any case).

@GeorgeHahn
Copy link
Contributor Author

@froydnj That's a great outline of the hash change, thanks.

BLAKE3 aims to be secure against collision attacks and should offer excellent collision resistance for cached objects. However, there are currently no third-party assurances that BLAKE3's collision resistance will hold up to malicious actors.

Is a 'shared untrusted cache' scenario is within the planned threat model? If so, it may be desirable to wait for a cryptanalysis of BLAKE3.

@luser
Copy link
Contributor

luser commented Jan 23, 2020

Is a 'shared untrusted cache' scenario is within the planned threat model? If so, it may be desirable to wait for a cryptanalysis of BLAKE3.

I really don't think so. sccache fully trusts the cache contents and does no verification of them. A malicious actor with write access to the cache can already easily poison the cache using known hashes without needing to find a hash collision.

@drahnr
Copy link
Collaborator

drahnr commented Feb 21, 2020

is there any work item that needs to be done for this to be merged?

@chmanchester chmanchester requested a review from froydnj February 28, 2020 23:07
Copy link
Contributor

@chmanchester chmanchester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the reasons mentioned in the thread I don't think any decreased collision resistance has security implications for us. I'll leave the final sign off on this to froydnj, though.

@froydnj froydnj merged commit ed8b7e1 into mozilla:master Mar 9, 2020
@froydnj
Copy link
Contributor

froydnj commented Mar 9, 2020

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants