Optimize string handling in lit_token(). #50525
r? @pnkfelix (rust_highfive has picked a reviewer for you, use r? to override)
src/libsyntax/parse/mod.rs
Outdated
// new symbol because the string in the LitKind is different to the
// string in the Token.
let s = &sym.as_str();
if s.contains('\\') || s.contains('\r') {
Since we are already micro-optimizing: This will iterate the string twice. Something like the following might be faster:
if s.as_bytes().iter().any(|&c| c == b'\\' || c == b'\r') {
Looks good to me.
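For comparison, here is a minimal sketch of the two checks discussed in the review above. The function names are hypothetical and chosen only for illustration; they are not part of libsyntax. Both functions should return the same answer for any input, since `\` and `\r` are ASCII and their byte values never appear inside a multi-byte UTF-8 sequence.

```rust
/// Check as written in the original patch: each `contains` call scans the
/// string, so a string with neither character is traversed twice.
/// (Hypothetical helper name, for illustration only.)
fn needs_unescape_two_pass(s: &str) -> bool {
    s.contains('\\') || s.contains('\r')
}

/// Check as suggested in review: a single scan over the raw bytes.
/// Comparing bytes is safe because `\` (0x5C) and `\r` (0x0D) are ASCII,
/// and ASCII values never occur as part of a multi-byte UTF-8 sequence.
/// (Hypothetical helper name, for illustration only.)
fn needs_unescape_one_pass(s: &str) -> bool {
    s.as_bytes().iter().any(|&c| c == b'\\' || c == b'\r')
}
```

Whether the single-pass version is measurably faster would need benchmarking; the review comment only says it "might be".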
In the common case, the string value in a string literal Token is the same as the string value in a string literal LitKind. (The exception is when escapes or \r are involved.) This patch takes advantage of that to avoid calling str_lit() and re-interning the string in that case. This speeds up incremental builds for a few of the rustc-benchmarks, the best by 3%.
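The fast path described above can be sketched roughly as follows. This is not the rustc code: `unescape_simplified` is a hypothetical, heavily simplified stand-in for `str_lit()`, and `Cow` stands in for the choice between reusing the existing interned symbol and interning a new one.

```rust
use std::borrow::Cow;

/// Hypothetical, heavily simplified stand-in for `str_lit()`: handles only a
/// few escapes and normalizes "\r\n" to "\n". Not the real rustc logic.
fn unescape_simplified(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\\' => match chars.next() {
                Some('n') => out.push('\n'),
                Some('t') => out.push('\t'),
                Some('\\') => out.push('\\'),
                Some('"') => out.push('"'),
                // Unknown escape: keep it verbatim in this sketch.
                Some(other) => {
                    out.push('\\');
                    out.push(other);
                }
                None => out.push('\\'),
            },
            '\r' if chars.peek() == Some(&'\n') => {
                // Drop the '\r'; the '\n' is pushed on the next iteration.
            }
            other => out.push(other),
        }
    }
    out
}

/// Returns the literal's value, borrowing the input unchanged in the common
/// case (no backslash, no '\r') and allocating only when processing is needed.
fn literal_value(s: &str) -> Cow<'_, str> {
    if s.as_bytes().iter().any(|&c| c == b'\\' || c == b'\r') {
        Cow::Owned(unescape_simplified(s))
    } else {
        Cow::Borrowed(s)
    }
}
```

In this sketch, `literal_value("plain")` borrows its input without allocating, while `literal_value("a\\nb")` takes the slow path and builds a new string.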
I updated to include @michaelwoerister's suggestion.
@bors r+ Thanks, @nnethercote!
📌 Commit 65ea0ff has been approved by |
…rister Optimize string handling in lit_token().
In the common case, the string value in a string literal Token is the same as the string value in a string literal LitKind. (The exception is when escapes or \r are involved.) This patch takes advantage of that to avoid calling str_lit() and re-interning the string in that case. This speeds up incremental builds for a few of the rustc-benchmarks, the best by 3%.
Benchmarks that got a speedup of 1% or more:
```
coercions      avg: -1.1%  min: -3.5%  max:  0.4%
regex-check    avg: -1.2%  min: -1.5%  max: -0.6%
futures-check  avg: -0.9%  min: -1.4%  max: -0.3%
futures        avg: -0.8%  min: -1.3%  max: -0.3%
futures-opt    avg: -0.7%  min: -1.2%  max: -0.1%
regex          avg: -0.5%  min: -1.2%  max: -0.1%
regex-opt      avg: -0.5%  min: -1.1%  max: -0.1%
hyper-check    avg: -0.7%  min: -1.0%  max: -0.3%
```
Rollup of 11 pull requests
Successful merges:
- #49988 (Mention Result<!, E> in never docs.)
- #50148 (turn `ManuallyDrop::new` into a constant function)
- #50456 (Update the Cargo submodule)
- #50460 (Make `String::new()` const)
- #50464 (Remove some transmutes)
- #50505 (Added regression function match value test)
- #50511 (Add some explanations for #[must_use])
- #50525 (Optimize string handling in lit_token().)
- #50527 (Cleanup a `use` in a raw_vec test)
- #50539 (Add more logarithm constants)
- #49523 (Update RELEASES.md for 1.26.0)
Failed merges: