Replace FNV with a faster hash function. #37229
r? @Aatch (rust_highfive has picked a reviewer for you, use r? to override) |
Do we have any backing data for this algorithm? Maybe from the Firefox development process/source? Smhasher run? |
I forgot to mention that there is something of an explanation about this hash function in the Firefox source: https://dxr.mozilla.org/mozilla-central/source/mfbt/HashFunctions.h#74-117. I modified it from 32-bits to 64-bits by changing the multiplication factor from 0x9E3779B9 (the golden ratio in fixed point) to 0x517cc1b727220a95 (pi in fixed point). I changed it from the golden ratio to pi because the golden ratio in 64-bit fixed point is even -- see http://stackoverflow.com/questions/5889238/why-is-xor-the-default-way-to-combine-hashes#comment54810251_27952689 This hash function was introduced into Firefox in https://bugzilla.mozilla.org/show_bug.cgi?id=729940. There's very little discussion in that bug report about how it was derived. I'm happy to try Smhasher on it. But the ultimate workload for the hash function used within rustc is rustc itself, and it's clearly working well there. |
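For readers who don't want to follow the link, here is a minimal sketch of the rotate-xor-multiply scheme described above, widened to 64 bits with the multiplier quoted in the comment. The type name `FxSketch` and the helper `hash_u64` are hypothetical, chosen only for this illustration; it is not necessarily the exact code in the PR.

```rust
use std::hash::Hasher;

/// 64-bit multiplier quoted in the comment above. It is odd, so
/// multiplication modulo 2^64 is invertible and no input bits are
/// discarded by the multiply itself.
const SEED: u64 = 0x517cc1b727220a95;

/// Hypothetical name for this sketch.
#[derive(Default)]
pub struct FxSketch {
    hash: u64,
}

impl FxSketch {
    /// Mix one word into the state: rotate left by 5, xor, multiply.
    #[inline]
    fn add_to_hash(&mut self, word: u64) {
        self.hash = (self.hash.rotate_left(5) ^ word).wrapping_mul(SEED);
    }
}

impl Hasher for FxSketch {
    /// Fallback path: one mixing step per byte.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.add_to_hash(b as u64);
        }
    }

    /// Integral fast paths: one mixing step per value.
    fn write_u32(&mut self, i: u32) {
        self.add_to_hash(i as u64);
    }

    fn write_u64(&mut self, i: u64) {
        self.add_to_hash(i);
    }

    fn write_usize(&mut self, i: usize) {
        self.add_to_hash(i as u64);
    }

    fn finish(&self) -> u64 {
        self.hash
    }
}

/// Hash a single u64, starting from the zero initial state.
pub fn hash_u64(x: u64) -> u64 {
    let mut h = FxSketch::default();
    h.write_u64(x);
    h.finish()
}
```

Because the initial state is zero, hashing a single integer in this sketch reduces to one wrapping multiplication by `SEED`.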
I think it's worth discussing a couple more things while we're at it, so we nail this for good.
|
I think you miswrote your second dot point... but I did some ad hoc profiling and found that the vast majority of occurrences are |
Did I? I can't find it. I guess I need more coffee. Interesting, but I'm almost sure it'll show up when we eventually move everything away from siphasher (string interner for example). Siphasher is still even higher in the profiles. |
Fnv and SipHasher both have the property that the stream of bytes to hash is "untyped": a u16 fed as u16 or its byte representation is hashed the same way. But I don't think that the But what I do think is that a well behaved hasher must hash a slice of bytes the same way, regardless of how you split it into subslices (as long as the order is the same). That means that any whole-word optimization for |
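The "untyped stream" property can be illustrated with a byte-at-a-time hasher. The sketch below uses FNV-1a with the standard 64-bit offset basis and prime; the names `Fnv1a`, `hash_parts` and `hash_u16_typed` are hypothetical helpers for this example, not rustc's `FnvHasher`.

```rust
use std::hash::Hasher;

/// Byte-at-a-time FNV-1a, 64-bit variant.
pub struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= b as u64;
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Feed the parts to one hasher, in order, as separate `write` calls.
pub fn hash_parts(parts: &[&str]) -> u64 {
    let mut h = Fnv1a::default();
    for p in parts {
        h.write(p.as_bytes());
    }
    h.finish()
}

/// Hash a u16 through the typed `write_u16` entry point. The default
/// `write_u16` forwards the value's native-endian bytes to `write`.
pub fn hash_u16_typed(x: u16) -> u64 {
    let mut h = Fnv1a::default();
    h.write_u16(x);
    h.finish()
}
```

Since the state is updated one byte at a time, the result depends only on the concatenated byte sequence and not on where the `write` calls are split. A hasher that consumes whole words inside `write` would generally lose this property unless it buffers leftover bytes between calls.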
Yeah, luckily the Hash trait doesn't impose any special streaming requirement. |
I got curious, so I ran smhasher on the 64-bit (PR) and original 32-bit hashes. I had to include two variants of each to see how both modes of the hasher behave (integral and byte-by-byte). See the gist for results: https://gist.github.com/arthurprs/5e57cd59586acd8c52dbb02b55711096

A few comments on the code in the PR.

Hashing integral types (write_...)
The quality is really bad, but it's so cheap to calculate for integral types (what rustc seems to be using fnv for) that it's still a win for the combination of the workload + hashmap implementation. I'm fairly sure that the compiler sees the 0 seed and the hash boils down to a single IMUL instruction.

Hashing slices (write_usize() + write())
The write_usize(slice.len()) will be faster and the write() slower compared to fnv, so it could potentially regress those cases.

I think the right way forward is to have two hashers in the rustc codebase, one general purpose-ish and another for integral types. This PR has potential for the latter. |
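To make the "write_usize() + write()" pattern concrete, here is a small sketch of a hasher that only records the calls it receives. `Recorder` and `record_slice` are hypothetical names for this illustration, and the observed call sequence is an assumption about how the standard library's `Hash` impl for byte slices behaves (a length prefix followed by the raw bytes), not something the `Hash` trait guarantees.

```rust
use std::hash::{Hash, Hasher};

/// Records what the `Hash` impl feeds to the hasher instead of hashing it.
#[derive(Default)]
pub struct Recorder {
    /// Values received through `write_usize` (the length prefix).
    pub lengths: Vec<usize>,
    /// All bytes received through `write`, concatenated.
    pub bytes: Vec<u8>,
}

impl Hasher for Recorder {
    fn write(&mut self, bytes: &[u8]) {
        self.bytes.extend_from_slice(bytes);
    }

    fn write_usize(&mut self, i: usize) {
        self.lengths.push(i);
    }

    fn finish(&self) -> u64 {
        0 // not a real hash; this type only observes the calls
    }
}

/// Hash a byte slice through the standard `Hash` impl and return the record.
pub fn record_slice(data: &[u8]) -> Recorder {
    let mut r = Recorder::default();
    data.hash(&mut r);
    r
}
```

With a hasher like the one in this PR, the length prefix would take the cheap integral path while the payload would go through the byte-oriented `write`, which is where the comparison with FNV above applies.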
@arthurprs: Thank you for running these! I was about to do it myself but you've saved me the trouble. Looking at the results... whelp, there are a lot of numbers there that I don't know how to interpret, though the "FAIL" results sound bad.
Why will |
It's a 15% difference in my Intel Skylake processor, 690MB/s vs 800MB/s. You can see some rough numbers in the gist. |
Are you sure? Where does that requirement come from? I was thinking about changing |
e8ac705 to 7be7488
New version. I've made the following changes.
r? @arthurprs: what do you think? |
@nnethercote I'm not sure; it's something that needs to be discussed and put into the documentation. I think it's the logical rule by the construction of |
To make a concrete example, imagine |
(New version removes the |
Looks good to me. Somebody from the core team should weigh in on how to move this forward. I wouldn't be worried about the Hasher having the "strict streaming" characteristic, as the Hash trait is "strongly typed" and will make the same writes to the hasher every time. |
☔ The latest upstream changes (presumably #37292) made this pull request unmergeable. Please resolve the merge conflicts. |
cc @rust-lang/compiler |
☔ The latest upstream changes (presumably #37270) made this pull request unmergeable. Please resolve the merge conflicts. |
With the notable exception of @arthurprs, this is being ignored. It's a big compile speed win, the biggest one I know of, but I fear that concerns about theoretical worst cases will overwhelm the benefit that's been demonstrated widely in practice. How can we move this forward? |
(Nominated for discussion amongst compiler team; hopefully that will help it move forward...) |
I think the problem is that @Aatch hasn't been too active of late, so the PR went unnoticed. I have no strong opinion about what hash function we use --- basically, if it's faster, I'm for it. I'm curious if anyone has any objections. |
You mean this?
Sure. I'll wait until I get full approval from the compiler team, because I have some other conflicts that I need to fix and I might as well do them later to reduce the likelihood of more conflicts afterwards. |
How about defining type alias |
Although there's no one size fits all for hashers I think it's easier to opt-out of it if necessary than the other way around. So +1 for the DefaultMap/Set. |
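A sketch of what such aliases could look like, assuming a hasher type that implements `Default`. The names `FxSketch`, `FxHashMap`, `FxHashSet`, `demo_map` and `demo_set` are hypothetical and used only for this illustration; the aliases actually proposed in the thread are not visible in this copy.

```rust
use std::collections::{HashMap, HashSet};
use std::hash::{BuildHasherDefault, Hasher};

/// Stand-in hasher for the example (rotate-xor-multiply, one step per byte).
#[derive(Default)]
pub struct FxSketch {
    hash: u64,
}

impl Hasher for FxSketch {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.hash = (self.hash.rotate_left(5) ^ b as u64)
                .wrapping_mul(0x517cc1b727220a95);
        }
    }

    fn finish(&self) -> u64 {
        self.hash
    }
}

/// Aliases that fix the hasher parameter, so call sites only name the
/// key and value types.
pub type FxHashMap<K, V> = HashMap<K, V, BuildHasherDefault<FxSketch>>;
pub type FxHashSet<T> = HashSet<T, BuildHasherDefault<FxSketch>>;

/// Build a small map. `HashMap::new` is only defined for the default
/// `RandomState` hasher, so the aliases are constructed with `default()`.
pub fn demo_map() -> FxHashMap<u32, &'static str> {
    let mut m = FxHashMap::default();
    m.insert(1, "one");
    m.insert(2, "two");
    m
}

/// Build a small set the same way.
pub fn demo_set() -> FxHashSet<u32> {
    let mut s = FxHashSet::default();
    s.insert(7);
    s.insert(7);
    s
}
```

With aliases like these, switching the hasher later means changing one definition rather than every use site, which matches the "easier to opt out" argument above.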
@nnethercote everybody is in favor! |
I rebased and split the PR into two commits: one adding FxHasher, and one converting all FnvHash instances to FxHash instances. I also remeasured and the results are similar to before.
(reddit-stress and ostn15_phf are a couple of programs that aren't in rust-benchmarks that I've been measuring.) |
This speeds up compilation by 3--6% across most of rustc-benchmarks.
Ugh, this PR is so conflict-prone. |
@bors r+ |
📌 Commit 00e48af has been approved by |
Replace FNV with a faster hash function.

Hash table lookups are very hot in rustc profiles and the time taken within `FnvHash` itself is a big part of that. Although FNV is a simple hash, it processes its input one byte at a time. In contrast, Firefox has a homespun hash function that is also simple but works on multiple bytes at a time. So I tried it out and the results are compelling:

```
futures-rs-test  4.326s vs  4.212s --> 1.027x faster (variance: 1.001x, 1.007x)
helloworld       0.233s vs  0.232s --> 1.004x faster (variance: 1.037x, 1.016x)
html5ever-2016-  5.397s vs  5.210s --> 1.036x faster (variance: 1.009x, 1.006x)
hyper.0.5.0      5.018s vs  4.905s --> 1.023x faster (variance: 1.007x, 1.006x)
inflate-0.1.0    4.889s vs  4.872s --> 1.004x faster (variance: 1.012x, 1.007x)
issue-32062-equ  0.347s vs  0.335s --> 1.035x faster (variance: 1.033x, 1.019x)
issue-32278-big  1.717s vs  1.622s --> 1.059x faster (variance: 1.027x, 1.028x)
jld-day15-parse  1.537s vs  1.459s --> 1.054x faster (variance: 1.005x, 1.003x)
piston-image-0. 11.863s vs 11.482s --> 1.033x faster (variance: 1.060x, 1.002x)
regex.0.1.30     2.517s vs  2.453s --> 1.026x faster (variance: 1.011x, 1.013x)
rust-encoding-0  2.080s vs  2.047s --> 1.016x faster (variance: 1.005x, 1.005x)
syntex-0.42.2   32.268s vs 31.275s --> 1.032x faster (variance: 1.014x, 1.022x)
syntex-0.42.2-i 17.629s vs 16.559s --> 1.065x faster (variance: 1.013x, 1.021x)
```

(That's a stage1 compiler doing debug builds. Results for a stage2 compiler are similar.)

The attached commit is not in a state suitable for landing because I changed the implementation of FnvHasher without changing its name (because that would have required touching many lines in the compiler). Nonetheless, it is a good place to start discussions.

Profiles show very clearly that this new hash function is a lot faster to compute than FNV. The quality of the new hash function is less clear -- it seems to do better in some cases and worse in others (judging by the number of instructions executed in `Hash{Map,Set}::get`).

CC @brson, @arthurprs
@nnethercote this hash is super fast on my dataset. Here are my tests for this hash on a personal round-robin hashset implementation for about 4500 u32 (unicode):
The |
@cbreeden I'm happy if you want to make a crate out of it. Make sure you observe the rustc license (of course) and you should probably make it clear in the docs that it's not a "well-designed" hash and so may not be suitable in all situations. Thanks. |
sounds good. Yeah, I got pretty lucky there, I'd say. |
I went ahead and decided to modify the

    // Requires `use byteorder::{NativeEndian, ReadBytesExt};` at module level.
    fn write(&mut self, bytes: &[u8]) {
        let mut buf = bytes;
        // Consume the input four bytes at a time; `read_u32` advances `buf`.
        while buf.len() >= 4 {
            let n = buf.read_u32::<NativeEndian>().unwrap();
            self.write_u32(n);
        }
        // Hash the remaining 0-3 bytes one at a time.
        for byte in buf {
            let i = *byte;
            self.add_to_hash(i as usize);
        }
    }

Testing this with a few ASCII byte slices yields these results:

    name           old ns/iter  chunks ns/iter  diff ns/iter  diff %   speedup
    bench_3chars   2            3                          1   50.00%  x 0.67
    bench_4chars   3            2                         -1  -33.33%  x 1.50
    bench_11chars  8            5                         -3  -37.50%  x 1.60
    bench_12chars  9            3                         -6  -66.67%  x 3.00
    bench_23chars  21           8                        -13  -61.90%  x 2.62
    bench_24chars  24           6                        -18  -75.00%  x 4.00

It appears that there is a clear win for hashing any byte slice with length > 3, which I believe is the common case. For some reason there is a regression when hashing in chunks of u64. (x64 Intel i7-6600U @ 2.6 GHz, Windows 10). @nnethercote I know that you said |
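The same chunking idea can be sketched with only the standard library, using `chunks_exact` instead of the `byteorder` crate. `ChunkedFx`, `hash_bytes` and `hash_word_then_tail` are hypothetical names for this illustration, and the 64-bit state and multiplier are assumptions carried over from earlier in the thread rather than the code that was benchmarked above.

```rust
use std::hash::Hasher;

const SEED: u64 = 0x517cc1b727220a95;

/// Hypothetical hasher whose `write` consumes four bytes per mixing step.
#[derive(Default)]
pub struct ChunkedFx {
    hash: u64,
}

impl ChunkedFx {
    #[inline]
    fn add_to_hash(&mut self, word: u64) {
        self.hash = (self.hash.rotate_left(5) ^ word).wrapping_mul(SEED);
    }
}

impl Hasher for ChunkedFx {
    fn write(&mut self, bytes: &[u8]) {
        let mut chunks = bytes.chunks_exact(4);
        // Whole 4-byte chunks: one mixing step each, read in native endianness.
        for chunk in &mut chunks {
            let n = u32::from_ne_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
            self.write_u32(n);
        }
        // Remaining 0-3 bytes: one mixing step per byte.
        for &b in chunks.remainder() {
            self.add_to_hash(b as u64);
        }
    }

    fn write_u32(&mut self, i: u32) {
        self.add_to_hash(i as u64);
    }

    fn finish(&self) -> u64 {
        self.hash
    }
}

/// Hash a byte slice with a single `write` call.
pub fn hash_bytes(bytes: &[u8]) -> u64 {
    let mut h = ChunkedFx::default();
    h.write(bytes);
    h.finish()
}

/// Reference path: one explicit u32 word followed by single tail bytes.
pub fn hash_word_then_tail(word: u32, tail: &[u8]) -> u64 {
    let mut h = ChunkedFx::default();
    h.write_u32(word);
    for &b in tail {
        h.add_to_hash(b as u64);
    }
    h.finish()
}
```

Note that a chunked `write` like this makes the result depend, in general, on how the input is split across `write` calls, which is the streaming concern raised earlier in the conversation. Reading in native endianness also means the hash values differ between little-endian and big-endian targets.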
@nnethercote nevermind, sorry for the spam. I think you were using https://github.com/rust-lang-nursery/rustc-benchmarks. I'll try it out when I get back home on a computer that can compile rustc in a reasonable amount of time. |
Doc change: Remove mention of `fnv` in HashMap

Disclaimer: I am the author of [aHash](https://github.com/tkaitchuck/aHash).

This changes the Rustdoc in `HashMap` from mentioning the `fnv` crate to mentioning the `aHash` crate, as an alternative `Hasher` implementation.

### Why

Fnv [has poor hash quality](https://github.com/rurban/smhasher), is [slow for larger keys](https://github.com/tkaitchuck/aHash/blob/master/compare/readme.md#speed), and does not provide dos resistance, because it is unkeyed (this can also cause [other problems](https://accidentallyquadratic.tumblr.com/post/153545455987/rust-hash-iteration-reinsertion)).

Fnv has acceptable performance for integers and has very poor performance with keys >32 bytes. This is the reason it was removed from the standard library in rust-lang#37229 .

Because regardless of which dimension you value, there are better alternatives, it does not make sense for anyone to consider using `fnv`.

The text mentioning `fnv` in the standard library continues to create confusion: rust-lang/hashbrown#153 rust-lang/hashbrown#9 . There are also a number of [crates using it](https://crates.io/crates/fnv/reverse_dependencies) a great many of which are hashing strings (Which is when Fnv is the [worst](https://github.com/cbreeden/fxhash#benchmarks), [possible](https://github.com/tkaitchuck/aHash#speed), [choice](http://cglab.ca/~abeinges/blah/hash-rs/).)

I think aHash makes the most sense to mention as an alternative because it is the most credible option (in my obviously biased opinion). It offers [good performance on numbers and strings](https://github.com/tkaitchuck/aHash/blob/master/compare/readme.md#speed), is [of high quality](https://github.com/tkaitchuck/aHash#hash-quality), and [provides dos resistance](https://github.com/tkaitchuck/aHash/wiki/How-aHash-is-resists-DOS-attacks). It is popular (see [stats](https://crates.io/crates/ahash)) and is the default hasher for [hashbrown](https://crates.io/crates/hashbrown) and [dashmap](https://crates.io/crates/dashmap) which are the most popular alternative hashmaps. Finally it does not have any of the [`gotcha` cases](https://github.com/tkaitchuck/aHash#fxhash) that `FxHash` suffers from. (Which is the other popular hashing option when DOS attacks are not a concern)

Signed-off-by: Tom Kaitchuck <[email protected]>