forked from anza-xyz/agave
implements weighted shuffle using N-ary tree (anza-xyz#259)
This is a port of firedancer's implementation of weighted shuffle:
https://github.com/firedancer-io/firedancer/blob/3401bfc26/src/ballet/wsample/fd_wsample.c

anza-xyz#185 implemented weighted shuffle using a binary tree. Although a binary tree has better asymptotic performance than a Fenwick tree, it has worse cache locality, which results in smaller improvements overall and, in particular, a slower WeightedShuffle::new.

To improve cache locality and reduce the overhead of traversing the tree, this commit instead uses a generalized N-ary tree with a fanout of 16, which shows significant improvements in both WeightedShuffle::new and WeightedShuffle::shuffle.

With 4000 weights:

N-ary tree (fanout 16):

    test bench_weighted_shuffle_new     ... bench:      36,244 ns/iter (+/- 243)
    test bench_weighted_shuffle_shuffle ... bench:     149,082 ns/iter (+/- 1,474)

Binary tree:

    test bench_weighted_shuffle_new     ... bench:      58,514 ns/iter (+/- 229)
    test bench_weighted_shuffle_shuffle ... bench:     269,961 ns/iter (+/- 16,446)

Fenwick tree:

    test bench_weighted_shuffle_new     ... bench:      39,413 ns/iter (+/- 179)
    test bench_weighted_shuffle_shuffle ... bench:     364,771 ns/iter (+/- 2,078)

The improvements become even more significant as the number of items to shuffle grows. With 20_000 weights:

N-ary tree (fanout 16):

    test bench_weighted_shuffle_new     ... bench:     200,659 ns/iter (+/- 4,395)
    test bench_weighted_shuffle_shuffle ... bench:     941,928 ns/iter (+/- 26,492)

Binary tree:

    test bench_weighted_shuffle_new     ... bench:     881,114 ns/iter (+/- 12,343)
    test bench_weighted_shuffle_shuffle ... bench:   1,822,257 ns/iter (+/- 12,772)

Fenwick tree:

    test bench_weighted_shuffle_new     ... bench:     276,936 ns/iter (+/- 14,692)
    test bench_weighted_shuffle_shuffle ... bench:   2,644,713 ns/iter (+/- 49,252)
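For readers unfamiliar with the technique, here is a minimal sketch of a weighted shuffle over a sum tree with a fanout of 16. It is not the code from this commit or from firedancer: `SumTree`, `weighted_shuffle`, and the `next_u64` closure are hypothetical names, the tree is stored as one vector per level for readability rather than in a cache-tuned layout, zero-weight items are simply omitted from the output, and the total weight is assumed to fit in a `u64`.

```rust
// Illustrative sketch only; names and layout are assumptions, not this commit's code.
const FANOUT: usize = 16;

/// Sum tree with up to FANOUT children per node, stored as one Vec per level.
/// levels[0] holds the item weights; levels[k][i] is the sum of
/// levels[k - 1][i * FANOUT..(i + 1) * FANOUT] (the last chunk may be shorter).
struct SumTree {
    levels: Vec<Vec<u64>>,
}

impl SumTree {
    fn new(weights: &[u64]) -> Self {
        let mut levels = vec![weights.to_vec()];
        // Keep collapsing groups of FANOUT sums until a single root remains.
        while levels.last().unwrap().len() > 1 {
            let next: Vec<u64> = levels
                .last()
                .unwrap()
                .chunks(FANOUT)
                .map(|chunk| chunk.iter().sum::<u64>())
                .collect();
            levels.push(next);
        }
        Self { levels }
    }

    /// Total weight of the items that have not been removed yet.
    fn total(&self) -> u64 {
        self.levels
            .last()
            .and_then(|level| level.first())
            .copied()
            .unwrap_or(0)
    }

    /// Finds the leaf whose cumulative weight range contains `r`.
    /// Requires r < self.total().
    fn search(&self, mut r: u64) -> usize {
        let mut index = 0;
        // Walk from the level just below the root down to the leaves.
        for level in self.levels[..self.levels.len() - 1].iter().rev() {
            let mut child = index * FANOUT;
            // Linear scan over at most FANOUT adjacent (cache-friendly) sums.
            while r >= level[child] {
                r -= level[child];
                child += 1;
            }
            index = child;
        }
        index
    }

    /// Removes a leaf by subtracting its weight from it and every ancestor.
    fn remove(&mut self, mut index: usize) {
        let weight = self.levels[0][index];
        for level in self.levels.iter_mut() {
            level[index] -= weight;
            index /= FANOUT;
        }
    }
}

/// Returns the indices of all positive-weight items in weighted random order.
/// `next_u64` is a caller-supplied source of random u64 values.
fn weighted_shuffle(weights: &[u64], mut next_u64: impl FnMut() -> u64) -> Vec<usize> {
    let mut tree = SumTree::new(weights);
    let mut order = Vec::with_capacity(weights.len());
    while tree.total() > 0 {
        // Modulo reduction is slightly biased; acceptable for a sketch.
        let r = next_u64() % tree.total();
        let index = tree.search(r);
        tree.remove(index);
        order.push(index);
    }
    order
}
```

Each draw descends roughly log16(n) levels and touches at most 16 adjacent sums per level, which is the cache-locality argument made above. A caller would pass any random source as the closure, for example `weighted_shuffle(&weights, || rng.next_u64())` with a generator of their choice.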
1 parent b01d792, commit 30eecd6
Showing 1 changed file with 77 additions and 55 deletions.