Rewrite for version 2 #101

Merged
merged 167 commits into from
Oct 18, 2024
95b7080
Initial commit
shepmaster Jun 23, 2024
ccb5723
Add comparison to native implementation
shepmaster Jun 23, 2024
e61bdb7
Add benchmarks
shepmaster Jun 23, 2024
c7eed0e
Add commandline sum tool
shepmaster Jun 24, 2024
c5150c9
Move to a workspace
shepmaster Jun 24, 2024
e9bc136
Adjust submodule
shepmaster Jun 24, 2024
198fe0b
Extract a helper
shepmaster Jun 24, 2024
f9ff61c
Use threads for the CLI
shepmaster Jun 24, 2024
7fa13e6
Parameterize and tune the buffer size and count
shepmaster Jun 24, 2024
51beb3f
Reduce allocation count
shepmaster Jun 24, 2024
41b899c
twox-hash bench
shepmaster Jun 24, 2024
19848d0
const it more
shepmaster Jun 24, 2024
c4fdb7f
Document unchecked decision
shepmaster Jun 24, 2024
ebebd1b
cleaning
shepmaster Jun 24, 2024
bd9f192
Proptest oneshot methods
shepmaster Jun 24, 2024
addb9ac
moar tests
shepmaster Jun 25, 2024
537f5f8
Simplify oneshot
shepmaster Jun 25, 2024
8012bef
inline it
shepmaster Jun 25, 2024
8233b36
add little endian
shepmaster Jun 25, 2024
542a9cf
std and serialize impls
shepmaster Jun 25, 2024
678f579
moar
shepmaster Jun 25, 2024
006bd68
tweaks
shepmaster Jun 26, 2024
efd13bd
simpelr
shepmaster Jun 26, 2024
3f722df
to-test
shepmaster Jun 26, 2024
ae7b388
moar
shepmaster Jun 26, 2024
5d455ff
move to new file
shepmaster Jun 26, 2024
6e3961c
32-bit too
shepmaster Jun 27, 2024
a4eb4bd
align
shepmaster Jun 27, 2024
f6156b3
more
shepmaster Jun 27, 2024
c0fdd56
benchmark
shepmaster Jun 27, 2024
c1fc63c
faster
shepmaster Jun 27, 2024
926f257
more
shepmaster Jun 27, 2024
e568a2e
inline
shepmaster Jun 28, 2024
8422030
offset
shepmaster Jun 28, 2024
59836ed
simpla
shepmaster Jun 28, 2024
3e30866
inline
shepmaster Jun 28, 2024
5c1f977
moar tests
shepmaster Jun 28, 2024
63b1799
dox
shepmaster Jun 28, 2024
3dd6526
dox
shepmaster Jun 28, 2024
424f847
rename
shepmaster Jun 29, 2024
7c0f281
error check
shepmaster Jul 4, 2024
b784c61
rename random
shepmaster Jul 4, 2024
16a5f73
junk
shepmaster Jul 4, 2024
7cf0826
xxh3
shepmaster Jul 4, 2024
67eb967
xxh3
shepmaster Jul 5, 2024
56a91e7
xxh3
shepmaster Jul 5, 2024
4c8b99e
xxh3
shepmaster Jul 5, 2024
48683ca
chunks
shepmaster Jul 5, 2024
2934e73
moar
shepmaster Jul 9, 2024
7b233ea
fmt
shepmaster Jul 9, 2024
4116f6b
recover
shepmaster Jul 9, 2024
03683aa
little helper
shepmaster Jul 9, 2024
a245546
bencha
shepmaster Jul 9, 2024
460904a
moar
shepmaster Jul 9, 2024
59d71b4
moresafe
shepmaster Jul 9, 2024
73ad587
reorg
shepmaster Jul 10, 2024
b26aebc
reorg
shepmaster Jul 10, 2024
4854283
faster
shepmaster Jul 10, 2024
273e81f
faster
shepmaster Jul 10, 2024
010cc96
asm compare for C
shepmaster Jul 10, 2024
baa9799
clean
shepmaster Jul 10, 2024
d680cd9
doc cfgs
shepmaster Jul 11, 2024
fbe55dd
flag it
shepmaster Jul 11, 2024
8833e84
seed interface
shepmaster Jul 11, 2024
5ebb61e
secret interface
shepmaster Jul 11, 2024
215d2c3
build scalar and optimized and compare head-to-head
shepmaster Jul 11, 2024
c6c9a12
simd some
shepmaster Jul 14, 2024
843e762
checkpoint simd scramble
shepmaster Jul 14, 2024
39ec48a
checkpoint simd scramble
shepmaster Jul 15, 2024
4ca9573
organize
shepmaster Jul 15, 2024
feb485b
more link
shepmaster Jul 21, 2024
5835fdf
NEON performance parity
shepmaster Jul 21, 2024
938d94e
organize simd
shepmaster Jul 21, 2024
de5b5d7
bench simd on off
shepmaster Jul 22, 2024
50da623
Simplify the control flow
shepmaster Jul 22, 2024
3408fa7
cleanup
shepmaster Jul 22, 2024
087edbf
stub out x64 simd
shepmaster Jul 22, 2024
51ded36
hack in one simd
shepmaster Jul 23, 2024
2c7b465
simd cleanup
shepmaster Jul 23, 2024
be7325c
use cc for builds and a forced avx2 variant
shepmaster Jul 23, 2024
f7ec3bc
better choosin
shepmaster Jul 23, 2024
ffb2e32
Add detect
shepmaster Jul 23, 2024
c00c286
Move neon to trait impl
shepmaster Jul 23, 2024
a37289a
avx cleanup
shepmaster Jul 24, 2024
8e2a359
add sse2 implementation
seritools Jul 24, 2024
816e8ce
Add SSE2 C code variant
shepmaster Jul 24, 2024
f0b3ad4
A few more inlines for good measure
shepmaster Jul 24, 2024
8b5b56d
Simplify and cross-pollinate the AVX2 and SSE2 implementations
shepmaster Jul 24, 2024
3ff0716
Oops this is aarch64 only
shepmaster Jul 24, 2024
e5f1779
flag no go on msvc
shepmaster Jul 24, 2024
3b812a5
Add cfg flags to select the implementation
shepmaster Jul 25, 2024
113e848
Add benchmark for small data
shepmaster Jul 25, 2024
cc1fc5a
Force inling of the xxhash3_64 implementation
shepmaster Jul 25, 2024
3ed1516
format
shepmaster Jul 25, 2024
f0b2cc2
manual unroll
shepmaster Jul 25, 2024
1a73b59
retarget benches
shepmaster Jul 26, 2024
6c8e6de
extra
shepmaster Jul 26, 2024
5f4a1d8
reorder match
shepmaster Jul 26, 2024
90060cc
use array mix_step everywhere
shepmaster Jul 26, 2024
88a738e
Add a streaming implementation for XxHash3_64
shepmaster Aug 12, 2024
2485882
Testing on x86
shepmaster Aug 13, 2024
79d3d99
Use paste to reduce duplication of C wrappers
shepmaster Aug 13, 2024
694b946
Use forcing cfgs for streaming functions too
shepmaster Aug 14, 2024
9ef64a7
Lift computing the secret end up a function call
shepmaster Aug 14, 2024
6e2d440
checkpoint rewrite smaller buffer
shepmaster Aug 14, 2024
0f1980f
speeeeds
shepmaster Aug 14, 2024
7c3a8ed
this is actually neon oops
shepmaster Aug 16, 2024
3a41326
push it
shepmaster Aug 16, 2024
9657b00
Address some todos
shepmaster Aug 17, 2024
9566f32
Use unified dispatch mechanism
shepmaster Aug 17, 2024
e9d17b9
re-inline helpers
shepmaster Aug 17, 2024
935e701
improve names
shepmaster Aug 17, 2024
e7662a4
reduce unsafe
shepmaster Aug 17, 2024
ab749c2
reduce unsafe
shepmaster Aug 17, 2024
7c0cc3f
unsafe-op-in-fn
shepmaster Aug 17, 2024
dde22b4
extract secret start
shepmaster Aug 17, 2024
849434a
oneshot asmasm
shepmaster Aug 18, 2024
4b181d9
keep order consistent for now
shepmaster Aug 18, 2024
eee337e
tweak inlines
shepmaster Aug 18, 2024
019ef11
tweak bench
shepmaster Aug 18, 2024
a2c946a
disable lib bench
shepmaster Aug 18, 2024
0c0597d
Revert "keep order consistent for now"
shepmaster Aug 18, 2024
3591d7b
One category for each range
shepmaster Aug 18, 2024
5a6f1a4
asmasm and inline never
shepmaster Aug 18, 2024
e0f4466
asserts and unsafe
shepmaster Aug 19, 2024
1e6f66d
sum with new
shepmaster Aug 20, 2024
b9fccc5
categ
shepmaster Aug 20, 2024
88ecd94
document LLVM missed optimization
shepmaster Aug 20, 2024
80db4d9
safety
shepmaster Aug 20, 2024
bb68484
simpler blackbox
shepmaster Aug 21, 2024
dc2ba20
Inline the sys crate functions
shepmaster Aug 21, 2024
902a0a7
x86fixin
shepmaster Aug 21, 2024
31a751c
x86
shepmaster Aug 21, 2024
2391766
safety
shepmaster Aug 21, 2024
b6825e6
safety
shepmaster Aug 21, 2024
c955390
Move details to separate modules
shepmaster Aug 21, 2024
f7c221e
Return errors
shepmaster Aug 21, 2024
9329ff8
Add basic docs
shepmaster Aug 21, 2024
4bebba7
error
shepmaster Aug 21, 2024
dec4046
Rename
shepmaster Aug 21, 2024
00c5464
Generate comparison graphs from benchmarks
shepmaster Aug 22, 2024
b0b21c7
test secret and seed
shepmaster Aug 27, 2024
7dfcb11
Merge refreshed implementations from out-of-tree
shepmaster Oct 7, 2024
1a2b658
Rename xx-renu as twox-hash
shepmaster Oct 7, 2024
58c0566
Format code that slipped through
shepmaster Oct 7, 2024
1497bfc
Set minimum Rust version
shepmaster Oct 7, 2024
9e3510b
Avoid inline assembly when testing with Miri
shepmaster Oct 7, 2024
c7e348c
Feature `std` implies `alloc`
shepmaster Oct 7, 2024
b664dd0
Use correct functions for big- and little-endian
shepmaster Oct 7, 2024
39d1b39
Overwrite original version with refresh
shepmaster Oct 9, 2024
19b73de
Restore tests to working condition when serialization is enabled
shepmaster Oct 9, 2024
f480537
Move Box-specific trait impls behind the feature gate
shepmaster Oct 9, 2024
c0b5ca5
Don't warn when we use one of the implementation forcing cfgs
shepmaster Oct 9, 2024
98e3aa6
Update CI configuration
shepmaster Oct 9, 2024
b4ee2a5
Ignore dead-code warnings for our integer conversion traits
shepmaster Oct 16, 2024
cfab4eb
Place SIMD code behind the `std` feature
shepmaster Oct 17, 2024
36975a0
Implement `Clone` for the hashers and states
shepmaster Oct 16, 2024
e039f13
CI: features
shepmaster Oct 16, 2024
0d1105f
Pin to xxHash 0.8.2
shepmaster Oct 17, 2024
28f0d83
Unify the README and crate documentation
shepmaster Oct 17, 2024
45f21b0
Document the feature flags
shepmaster Oct 17, 2024
4c577a3
Test for minimal dependency versions
shepmaster Oct 17, 2024
9bd1943
Don't create empty ranges in proptests
shepmaster Oct 17, 2024
5ce7f4b
Introduce a changelog
shepmaster Oct 17, 2024
a635afe
Tweaks to get benchmarking running again after renaming
shepmaster Oct 17, 2024
979e71b
Update benchmarks for Rust 1.81 / xxHash 0.8.2
shepmaster Oct 18, 2024
6d4ffd4
Remove vestigial comment
shepmaster Oct 18, 2024
185 changes: 126 additions & 59 deletions .github/workflows/ci.yml
@@ -2,106 +2,173 @@ on: push

name: Continuous integration

env:
RUSTFLAGS: -D warnings
RUSTDOCFLAGS: -D warnings

jobs:
library:
runs-on: ubuntu-latest
strategy:
matrix:
platform:
- ubuntu-latest

rust:
- stable
- beta
- nightly
- 1.37.0 # MSRV
- 1.81.0 # MSRV

include:
- platform: macos-latest # This serves as our aarch64 / arm64 runner
rust: stable

- platform: windows-latest
rust: stable

runs-on: ${{ matrix.platform }}

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4

- uses: actions-rs/toolchain@v1
- run: git submodule update --init --recursive

- uses: dtolnay/rust-toolchain@master
with:
profile: minimal
toolchain: ${{ matrix.rust }}
override: true
components: rustfmt, clippy

- uses: actions-rs/cargo@v1
with:
command: build
- name: Unit Tests
run: cargo test --all-features

- uses: actions-rs/cargo@v1
with:
command: test
- name: Property Tests
run: cargo test -p comparison --all-features

- uses: actions-rs/cargo@v1
with:
command: test
args: --all-features
miri:
runs-on: ubuntu-latest
env:
MIRIFLAGS: --cfg _internal_xxhash3_force_scalar

steps:
- uses: actions/checkout@v4

- uses: actions-rs/cargo@v1
- uses: dtolnay/rust-toolchain@master
with:
command: fmt
args: --all -- --check
if: ${{ matrix.rust == 'stable' }}
toolchain: nightly
components: miri

- uses: actions-rs/cargo@v1
- name: Unsafe Code
run: cargo miri test --all-features

- name: Big Endian Platform
run: cargo miri test --all-features --target s390x-unknown-linux-gnu

lints:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v4

- run: git submodule update --init --recursive

- uses: dtolnay/rust-toolchain@master
with:
command: clippy
args: --all-features -- -D warnings
if: ${{ matrix.rust == 'stable' }}
toolchain: stable
components: rustfmt, clippy

- run: cargo fmt --all

- run: cargo clippy --all --all-targets --all-features

- run: cargo doc --all-features

no-std:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4

- uses: actions-rs/toolchain@v1
- uses: dtolnay/rust-toolchain@master
with:
profile: minimal
toolchain: stable
target: thumbv6m-none-eabi
override: true
targets: thumbv6m-none-eabi

- uses: actions-rs/cargo@v1
with:
command: build
args: --no-default-features --target thumbv6m-none-eabi --lib
- run: >
cargo build
--no-default-features
--features=xxhash32,xxhash64,xxhash3_64
--target thumbv6m-none-eabi

compatibility-tests:
features:
runs-on: ubuntu-latest
strategy:
matrix:
test:
- digest_0_8
- digest_0_9

env:
IMPLEMENTATIONS: xxhash32 xxhash64 xxhash3_64
FEATURE_SET: random serialize std alloc

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4

- run: git submodule update --init --recursive

- uses: actions-rs/toolchain@v1
- uses: dtolnay/rust-toolchain@master
with:
profile: minimal
toolchain: stable
override: true

- uses: actions-rs/cargo@v1
with:
command: test
args: --manifest-path "compatibility-tests/${{ matrix.test }}/Cargo.toml"
- name: Compute Powerset
shell: "ruby {0}"
run: |
features = ENV['FEATURE_SET']
.split(' ')
.reduce([[]]) { |ps, i| ps + ps.map { |e| e + [i] } }
.map { |s| s.join(',') }
.join(" ")

File.open(ENV['GITHUB_ENV'], 'a') { |f| f.write("FEATURES=#{features}") }

big_endian:
- name: Check implementations with features
run: |
for impl in ${IMPLEMENTATIONS}; do
echo "::group::Implementation ${impl}"

# Check the implementation by itself
cargo check --no-default-features --features="${impl}"

# And with extra features
for feature in ${FEATURES}; do
echo "::group::Features ${feature}"
cargo check --no-default-features --features="${impl},${feature}"
echo "::endgroup::"
done

echo ::endgroup::
done

minimal-versions:
runs-on: ubuntu-latest

steps:
- name: Checkout code
uses: actions/checkout@v3
- uses: actions/checkout@v4

- name: Set up QEMU
uses: docker/setup-qemu-action@v1
- run: git submodule update --init --recursive

- uses: dtolnay/rust-toolchain@master
with:
platforms: s390x
toolchain: 1.81.0 # MSRV

- name: Cross test
uses: actions-rs/cargo@v1
- uses: dtolnay/rust-toolchain@master
with:
use-cross: true
command: test
args: --target s390x-unknown-linux-gnu
toolchain: nightly

- name: Remove non-essential dependencies
run: |
# Remove workspace dependencies
sed -i '/\[workspace]/,/#END-\[workspace]/d' Cargo.toml

# Remove dev-dependencies
sed -i '/\[dev-dependencies]/,/#END-\[dev-dependencies]/d' Cargo.toml

- name: Downgrade to minimal dependencies
run: |
cargo +nightly -Z minimal-versions update

- run: cargo +1.81.0 build --all-features
4 changes: 2 additions & 2 deletions .gitignore
@@ -1,2 +1,2 @@
target
Cargo.lock
/Cargo.lock
/target
6 changes: 3 additions & 3 deletions .gitmodules
@@ -1,3 +1,3 @@
[submodule "comparison/xxHash"]
path = comparison/xxHash
url = https://github.com/Cyan4973/xxHash.git
[submodule "xxHash"]
path = xx_hash-sys/xxHash
url = https://github.com/Cyan4973/xxHash.git
85 changes: 85 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,85 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [2.0.0] - Unreleased

[2.0.0]: https://github.com/shepmaster/twox-hash/tree/v2.0.0

This release is a complete rewrite of the crate, including
reorganization of the code. The XXH3 algorithm now matches the 0.8
release of the reference C xxHash implementation.

### Added

- `XxHash32::oneshot` and `XxHash64::oneshot` can perform hashing with
zero allocation and generally improved performance. If you have code
that creates a hasher and hashes a slice of bytes exactly once, you
are strongly encouraged to use the new functions. This might look
like:

```rust
// Before
let mut hasher = XxHash64::new(); // or XxHash32, or with seeds
some_bytes.hash(&mut hasher);
let hash = hasher.finish();

// After
let hash = XxHash64::oneshot(some_bytes);
```

- There is a feature flag for each hashing implementation. To improve
compile times, it is recommended that you opt out of the crate's
default features and select only the implementations you need.
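
When moving from the `Hash`-trait pattern in the "Before" snippet to a function that takes raw bytes, it may be worth checking whether previously stored hash values are expected to stay the same. In current versions of the Rust standard library, the `Hash` implementation for slices feeds a length prefix to the hasher before the elements, so `some_bytes.hash(&mut hasher)` and `hasher.write(some_bytes)` generally see different input. The sketch below is not part of the crate: `RecordingHasher` is a hypothetical stand-in that only records the bytes it receives, which makes the difference visible without depending on twox-hash. It assumes that a raw-bytes entry point hashes exactly the bytes it is given.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical hasher (not part of twox-hash) that records every byte it is fed.
#[derive(Default)]
struct RecordingHasher {
    seen: Vec<u8>,
}

impl Hasher for RecordingHasher {
    fn write(&mut self, bytes: &[u8]) {
        self.seen.extend_from_slice(bytes);
    }

    fn finish(&self) -> u64 {
        // The "hash" value is irrelevant here; only `seen` is inspected.
        self.seen.len() as u64
    }
}

/// Bytes the hasher receives when a slice is hashed through the `Hash` trait.
fn fed_via_hash_trait(data: &[u8]) -> Vec<u8> {
    let mut hasher = RecordingHasher::default();
    data.hash(&mut hasher);
    hasher.seen
}

/// Bytes the hasher receives when the raw bytes are written directly.
fn fed_via_write(data: &[u8]) -> Vec<u8> {
    let mut hasher = RecordingHasher::default();
    hasher.write(data);
    hasher.seen
}

fn main() {
    let data = b"hello";
    println!("via Hash trait: {} bytes", fed_via_hash_trait(data).len());
    println!("via write:      {} bytes", fed_via_write(data).len());
}
```

If the two byte counts differ on your toolchain, hashes produced through the `Hash` trait and hashes produced from the raw bytes should not be expected to match, regardless of which hasher is used.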

### Changed

- The crate's minimum supported Rust version (MSRV) is now 1.81.

- Functional and performance comparisons are made against the
reference C xxHash library version 0.8.2, which includes a stable
XXH3 algorithm.

- Support for randomly-generated hasher instances is now behind the
`random` feature flag. It was previously combined with the `std`
feature flag.

### Removed

- The deprecated type aliases `XxHash` and `RandomXxHashBuilder` have
been removed. Replace them with `XxHash64` and
`xxhash64::RandomState` respectively.

- `RandomXxHashBuilder32` and `RandomXxHashBuilder64` are no longer
available at the top-level of the crate. Replace them with
`xxhash32::RandomState` and `xxhash64::RandomState` respectively.

- `Xxh3Hash64` and `xxh3::Hash64` have been renamed to `XxHash3_64` and
`xxhash3_64::Hasher` respectively.

- The free functions `xxh3::hash64`, `xxh3::hash64_with_seed`, and
`xxh3::hash64_with_secret` are now associated functions of
`xxhash3_64::Hasher`: `oneshot`, `oneshot_with_seed` and
`oneshot_with_secret`. Note that the argument order has changed.

- Support for the [digest][] crate has been removed. The digest crate
is for **cryptographic** hash functions and xxHash is
**non-cryptographic**.

- `XxHash32` and `XxHash64` no longer implement `Copy`. This prevents
accidentally mutating a duplicate instance of the state instead of
the original state. `Clone` is still implemented so you can make
deliberate duplicates.

- The XXH3 128-bit variant is not yet re-written. Work is in progress
for this.

- We no longer provide support for randomly-generated instances of the
XXH3 64-bit variant. The XXH3 algorithm takes both a seed and a
secret as input and deciding what to randomize is non-trivial and
can have negative impacts on performance.
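
The rationale for dropping `Copy` from `XxHash32` and `XxHash64` can be illustrated with a self-contained sketch. `CopyState` and `CloneState` below are hypothetical stand-ins, not the crate's types. They show how an implicit copy can absorb updates that were meant for the original, and how a `Clone`-only type forces the duplication to be written out.

```rust
/// Hypothetical state that is `Copy`, as the v1 hashers were.
#[derive(Clone, Copy, Default)]
struct CopyState {
    total: u64,
}

impl CopyState {
    fn update(&mut self, value: u64) {
        self.total = self.total.wrapping_add(value);
    }
}

/// Hypothetical state that is `Clone` but not `Copy`.
#[derive(Clone, Default)]
struct CloneState {
    total: u64,
}

impl CloneState {
    fn update(&mut self, value: u64) {
        self.total = self.total.wrapping_add(value);
    }
}

/// Takes the state by value; with a `Copy` type the caller keeps
/// using its own, unmodified instance.
fn feed_by_value(mut state: CopyState, value: u64) {
    state.update(value);
}

fn copy_pitfall() -> u64 {
    let state = CopyState::default();
    // Compiles without complaint, but only a silent copy is updated.
    feed_by_value(state, 42);
    state.total
}

fn explicit_clone() -> (u64, u64) {
    let mut original = CloneState::default();
    original.update(1);
    // Duplication must be spelled out, so it is visible in review.
    let mut branch = original.clone();
    branch.update(10);
    (original.total, branch.total)
}

fn main() {
    println!("{}", copy_pitfall());
    println!("{:?}", explicit_clone());
}
```

Had `CloneState` been passed by value to a function like `feed_by_value`, the later read of the original would be rejected by the compiler as a use after move, which turns the silent bug into a build error.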

[digest]: https://docs.rs/digest/latest/digest/
53 changes: 40 additions & 13 deletions Cargo.toml
@@ -2,7 +2,8 @@
name = "twox-hash"
version = "1.6.3"
authors = ["Jake Goulding <[email protected]>"]
edition = "2018"
edition = "2021"
rust-version = "1.81"

description = "A Rust implementation of the XXHash and XXH3 algorithms"
readme = "README.md"
@@ -14,19 +15,45 @@ documentation = "https://docs.rs/twox-hash/"

license = "MIT"

[workspace]
members = [
"asmasm",
"comparison",
"twox-hash-sum",
"xx_hash-sys",
]
#END-[workspace]

[features]
default = ["random", "xxhash32", "xxhash64", "xxhash3_64", "std"]

random = ["dep:rand"]

serialize = ["dep:serde"]

xxhash32 = []
xxhash64 = []
xxhash3_64 = []

std = ["alloc"]
alloc = []

[lints.rust.unexpected_cfgs]
level = "warn"
check-cfg = [
'cfg(_internal_xxhash3_force_scalar)',
'cfg(_internal_xxhash3_force_neon)',
'cfg(_internal_xxhash3_force_sse2)',
'cfg(_internal_xxhash3_force_avx2)',
]

[dependencies]
cfg-if = { version = ">= 0.1, < 2", default-features = false }
static_assertions = { version = "1.0", default-features = false }
rand = { version = ">= 0.3.10, < 0.9", optional = true }
serde = { version = "1.0", features = ["derive"], optional = true}
digest = { package = "digest", version = "0.8", default-features = false, optional = true }
digest_0_9 = { package = "digest", version = "0.9", default-features = false, optional = true }
digest_0_10 = { package = "digest", version = "0.10", default-features = false, optional = true }
rand = { version = "0.8.0", optional = true, default-features = false, features = ["std", "std_rng"] }
serde = { version = "1.0.0", optional = true, default-features = false, features = ["derive"] }

[dev-dependencies]
serde_json = "1.0"
serde_json = "1.0.117"
#END-[dev-dependencies]

[features]
default = ["std"]
serialize = ["serde"]
std = ["rand"]
[package.metadata.docs.rs]
all-features = true