Tweak LEB128 reading some more. #69157
Conversation
@bors try @rust-timer queue
Awaiting bors try build completion
⌛ Trying commit 6cc131cd42435647d537bf73c1b1b5e16d54f14e with merge 3c3657919929d11ff9535e70167778577db99f0a...
☀️ Try build successful - checks-azure
Queued 3c3657919929d11ff9535e70167778577db99f0a with parent 21ed505, future comparison URL.
PR rust-lang#69050 changed LEB128 reading and writing. After it landed I did some double-checking and found that the writing changes were universally a speed-up, but the reading changes were not. I'm not exactly sure why; perhaps there was a quirk of inlining in the particular revision I was originally working from. This commit reverts some of the reading changes, while still avoiding `unsafe` code. I have checked it on multiple revisions and the speed-ups seem to be robust.
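For context, the reading being tuned here is unsigned LEB128 decoding, where each byte carries 7 payload bits and the high bit marks continuation. A minimal loop-based sketch (for illustration only; the compiler's actual implementation lives in `rustc_serialize` and differs in structure):

```rust
// Illustrative unsigned LEB128 decoder, NOT the compiler's actual code.
// Returns the decoded value and the number of bytes consumed, or None
// on truncated input or an encoding too long for u64.
fn read_u64_leb128(data: &[u8]) -> Option<(u64, usize)> {
    let mut result: u64 = 0;
    let mut shift: u32 = 0;
    for (i, &byte) in data.iter().enumerate() {
        // Low 7 bits are payload, placed at the current shift position.
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            // High bit clear: this was the final byte.
            return Some((result, i + 1));
        }
        shift += 7;
        if shift >= 64 {
            return None; // overlong encoding for a u64
        }
    }
    None // input ended mid-value
}

fn main() {
    // 624485 is the classic LEB128 example: encodes as [0xe5, 0x8e, 0x26].
    assert_eq!(read_u64_leb128(&[0xe5, 0x8e, 0x26]), Some((624485, 3)));
    assert_eq!(read_u64_leb128(&[0x7f]), Some((127, 1)));
    println!("ok");
}
```

The performance sensitivity discussed in this PR comes from how the compiler structures this loop (e.g. unrolling the common one-byte case and inlining decisions), not from the encoding itself.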
Force-pushed from 6cc131c to e25bd1f
The CI results show a clear regression, in contrast to my local results. Hmm. I have rebased against a more recent revision. Let's try another perf CI run, just for interest's sake. @bors try @rust-timer queue
Awaiting bors try build completion
Tweak LEB128 reading some more. PR #69050 changed LEB128 reading and writing. After it landed I did some double-checking and found that the writing changes were universally a speed-up, but the reading changes were not. I'm not exactly sure why; perhaps there was a quirk of inlining in the particular revision I was originally working from. This commit reverts some of the reading changes, while still avoiding `unsafe` code. I have checked it on multiple revisions and the speed-ups seem to be robust. r? @michaelwoerister
☀️ Try build successful - checks-azure
Queued 921540b with parent 5e7af46, future comparison URL.
So, we have two quite different sets of results when the same change is measured on top of different revisions. I'm going to abandon this PR for the following reasons.
As it turns out, the Rust compiler uses variable-length LEB128-encoded integers internally. It so happens that they spent a fair amount of effort micro-optimizing the decoding functionality [0] [1], as it's in the hot path. With this change we replace our decoding routines with these optimized ones. To make that happen more easily (and to gain some baseline speed-up), also remove the "shift" return from the respective methods. As a result of these changes, we see a respectable speed-up:

Before:
test util::tests::bench_u64_leb128_reading ... bench: 128 ns/iter (+/- 10)
After:
test util::tests::bench_u64_leb128_reading ... bench: 103 ns/iter (+/- 5)

Gsym decoding, which uses these routines, improved as follows:

main/symbolize_gsym_multi_no_setup
time: [146.26 µs 146.69 µs 147.18 µs]
change: [−7.2075% −5.7106% −4.4870%] (p = 0.00 < 0.02)
Performance has improved.

[0] rust-lang/rust#69050
[1] rust-lang/rust#69157

Signed-off-by: Daniel Müller <[email protected]>
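The "shift" return mentioned in the commit message above can be illustrated with a before/after sketch. The function names and signatures here are invented for illustration and do not match blazesym's actual API:

```rust
// Hypothetical "before" shape: decoding returned the decoded value plus
// the bit shift reached, which most call sites ignored or immediately
// converted into a byte count.
fn decode_leb128_with_shift(data: &[u8]) -> Option<(u64, u8)> {
    let mut value = 0u64;
    let mut shift = 0u8;
    for &byte in data {
        value |= u64::from(byte & 0x7f) << shift;
        shift += 7;
        if byte & 0x80 == 0 {
            return Some((value, shift));
        }
        if shift >= 64 {
            return None; // overlong encoding for a u64
        }
    }
    None
}

// Hypothetical "after" shape: return only the value and the bytes
// consumed, which is what callers actually need to advance a cursor.
// Dropping the extra return value also simplifies the hot loop.
fn decode_leb128(data: &[u8]) -> Option<(u64, usize)> {
    decode_leb128_with_shift(data).map(|(v, shift)| (v, usize::from(shift) / 7))
}

fn main() {
    assert_eq!(decode_leb128(&[0xe5, 0x8e, 0x26]), Some((624485, 3)));
    assert_eq!(decode_leb128_with_shift(&[0x7f]), Some((127, 7)));
    println!("ok");
}
```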