Improve sift_down performance in BinaryHeap #81127
r? @shepmaster (rust-highfive has picked a reviewer for you, use r? to override)
…e, r=Mark-Simulacrum Document BinaryHeap unsafe functions `BinaryHeap` contains some private functions that are declared safe but are actually unsafe to call. This PR marks them `unsafe` and documents all the `unsafe` function calls inside them. While doing this I might also have found a bug: some "SAFETY" comments in `sift_down_range` and `sift_down_to_bottom` are valid only if you assume that `child` doesn't overflow. However, it may overflow if `end > isize::MAX`, which can be true for ZSTs (but I think only for them). I guess the easiest fix would be to skip any sifting if `mem::size_of::<T>() == 0`. This probably conflicts with rust-lang#81127, but resolving the eventual merge conflict should be pretty easy.
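The overflow concern raised in that commit message can be illustrated with plain index arithmetic. The following is only a sketch: `left_child` is a hypothetical helper, not a function from `BinaryHeap`, and it uses checked arithmetic to show where `2 * pos + 1` stops fitting in a `usize`.

```rust
/// Hypothetical helper (not part of `BinaryHeap`): index of the left child
/// of `pos`, or `None` if `2 * pos + 1` does not fit in a `usize`.
fn left_child(pos: usize) -> Option<usize> {
    pos.checked_mul(2)?.checked_add(1)
}

fn main() {
    // Ordinary positions behave as expected.
    assert_eq!(left_child(0), Some(1));
    assert_eq!(left_child(3), Some(7));

    // A position just above `isize::MAX` is only reachable for zero-sized
    // element types, since other allocations are capped at `isize::MAX` bytes.
    let big = (isize::MAX as usize) + 1;
    assert_eq!(left_child(big), None);

    // With wrapping arithmetic the child index would silently become 1.
    assert_eq!(big.wrapping_mul(2).wrapping_add(1), 1);
}
```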
☔ The latest upstream changes (presumably #82359) made this pull request unmergeable. Please resolve the merge conflicts.
Because child > 0, the two statements are equivalent, but using saturating_sub and <= yields faster code. This is most notable in the binary_heap::bench_into_sorted_vec benchmark, which uses sift_down_range internally and shows a speedup of 1.26x. The speedup of pop (which uses sift_down_to_bottom internally) is much less significant because the sifting method is not called in a loop.
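To make the changed loop condition concrete, here is a simplified, safe sketch of a sift-down over a slice. It is not the standard library implementation (which works through an unsafe `Hole` type); the slice-based signature and the `heap_sort` helper are assumptions made for illustration, with `heap_sort` standing in for what `into_sorted_vec` does conceptually.

```rust
/// Simplified sketch: sift the element at `pos` down within `data[..end]`
/// (max-heap order). Assumes `pos < end <= data.len()`.
fn sift_down_range<T: Ord>(data: &mut [T], mut pos: usize, end: usize) {
    let mut child = 2 * pos + 1;
    // Same meaning as `child < end - 1` because `child > 0`:
    // the loop runs only while both children are inside the range.
    while child <= end.saturating_sub(2) {
        // Pick the greater of the two children.
        if data[child] <= data[child + 1] {
            child += 1;
        }
        // Stop once the heap property holds at `pos`.
        if data[pos] >= data[child] {
            return;
        }
        data.swap(pos, child);
        pos = child;
        child = 2 * pos + 1;
    }
    // Handle a lone left child at the very end of the range.
    if child == end - 1 && data[pos] < data[child] {
        data.swap(pos, child);
    }
}

/// Hypothetical heapsort that calls `sift_down_range` in a loop.
fn heap_sort<T: Ord>(data: &mut [T]) {
    let len = data.len();
    // Build a max-heap bottom-up.
    for pos in (0..len / 2).rev() {
        sift_down_range(data, pos, len);
    }
    // Move the maximum to the end, then restore the heap on the prefix.
    for end in (1..len).rev() {
        data.swap(0, end);
        sift_down_range(data, 0, end);
    }
}

fn main() {
    let mut values = vec![5, 1, 4, 2, 3];
    heap_sort(&mut values);
    assert_eq!(values, [1, 2, 3, 4, 5]);
}
```

Because the sort calls the sift once per element, any per-iteration saving in the loop condition is multiplied, which is consistent with the benchmark difference described above.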
bdf2962 to 095bf01
r? @dtolnay
Thanks @hanmertens. I am not familiar with the BinaryHeap implementation but I am prepared to accept this on the basis of the into_sorted_vec benchmark. I confirmed that the behavior is logically identical to before in both places.
- If `end >= 2` then `child < end - 1` is equivalent to `child <= end - 2` is equivalent to `child <= end.saturating_sub(2)`.
- If `end == 1` then `child < end - 1` is false while `child <= end.saturating_sub(2)` is equivalent to `child == 0`. However it's a loop invariant that `child == 2 * hole.pos() + 1 > 0` so `child != 0` and `child <= end.saturating_sub(2)` is also false.
- If `end == 0` then contradiction because it's a precondition of the function that `0 <= pos < end`, and `end` is not mutated.
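The case analysis above can also be checked by brute force over small values. This is a sketch only; `old_cond` and `new_cond` are hypothetical names for the two loop conditions, and the ranges are arbitrary.

```rust
/// The original loop condition. Requires `end >= 1` to avoid underflow.
fn old_cond(child: usize, end: usize) -> bool {
    child < end - 1
}

/// The replacement loop condition.
fn new_cond(child: usize, end: usize) -> bool {
    child <= end.saturating_sub(2)
}

fn main() {
    // Under the precondition `end >= 1` and the invariant `child >= 1`,
    // the two conditions agree.
    for end in 1usize..=64 {
        for child in 1usize..=130 {
            assert_eq!(old_cond(child, end), new_cond(child, end));
        }
    }

    // Without the invariant they differ: `child == 0`, `end == 1`.
    assert!(!old_cond(0, 1));
    assert!(new_cond(0, 1));
}
```

The final two assertions show why the `child > 0` invariant matters: it rules out the one input on which the conditions disagree.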
@bors r+
📌 Commit 095bf01 has been approved by |
…rf, r=dtolnay

Improve sift_down performance in BinaryHeap

Replacing `child < end - 1` with `child <= end.saturating_sub(2)` in `BinaryHeap::sift_down_range` (surprisingly) results in a significant speedup of `BinaryHeap::into_sorted_vec`. The same substitution can be done for `BinaryHeap::sift_down_to_bottom`, which causes a slight but probably statistically insignificant speedup for `BinaryHeap::pop`. It's interesting that benchmarks aside from `bench_into_sorted_vec` are barely affected, even those that do use `sift_down_*` methods internally.

| Benchmark | Before (ns/iter) | After (ns/iter) | Speedup |
|--------------------------|------------------|-----------------|---------|
| bench_find_smallest_1000<sup>1</sup> | 392,617 | 385,200 | 1.02 |
| bench_from_vec<sup>1</sup> | 506,016 | 504,444 | 1.00 |
| bench_into_sorted_vec<sup>1</sup> | 476,869 | 384,458 | 1.24 |
| bench_peek_mut_deref_mut<sup>3</sup> | 518,753 | 519,792 | 1.00 |
| bench_pop<sup>2</sup> | 446,718 | 444,409 | 1.01 |
| bench_push<sup>3</sup> | 772,481 | 770,208 | 1.00 |

<sup>1</sup>: internally calls `sift_down_range`
<sup>2</sup>: internally calls `sift_down_to_bottom`
<sup>3</sup>: should not be affected
Rollup of 8 pull requests

Successful merges:

- rust-lang#81127 (Improve sift_down performance in BinaryHeap)
- rust-lang#81879 (Added #[repr(transparent)] to core::cmp::Reverse)
- rust-lang#82048 (or-patterns: disallow in `let` bindings)
- rust-lang#82731 (Bump libc dependency of std to 0.2.88.)
- rust-lang#82799 (Add regression test for rust-lang#75525)
- rust-lang#82841 (Change x64 size checks to not apply to x32.)
- rust-lang#82883 (Update Cargo)
- rust-lang#82887 (Update CONTRIBUTING.md)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
Replacing `child < end - 1` with `child <= end.saturating_sub(2)` in `BinaryHeap::sift_down_range` (surprisingly) results in a significant speedup of `BinaryHeap::into_sorted_vec`. The same substitution can be done for `BinaryHeap::sift_down_to_bottom`, which causes a slight but probably statistically insignificant speedup for `BinaryHeap::pop`. It's interesting that benchmarks aside from `bench_into_sorted_vec` are barely affected, even those that do use `sift_down_*` methods internally.

1: internally calls `sift_down_range`
2: internally calls `sift_down_to_bottom`
3: should not be affected