Vec::clone and String::clone are very slow #17844
Comments
Can't reproduce on x86_64 using either
Which rustc version are you using? Also I'm curious why your output says that it's running 7 tests but shows only 6 results.
I removed an irrelevant test, but the results themselves are correct. I'm using Windows nightlies downloaded a few hours before filing the issue, namely
I just realized that So there are two problems:
Bizarrely, if you duplicate Minimal test case:
LLVM generates a non-inlined call to Perhaps this sort of thing is why f39ba69 claimed that "LLVM is easily confused".
I found that String::push_str and Vec::push_all didn't compile to memcpy either. Following up on your comment, @rusty-nail2: could it be that in a bigger program, some instances of them will compile correctly?
Maybe #31414 will help with this.
This seems to be fixed? String::clone could still be improved to unconditionally use memcpy (which would benefit debug-build performance).
String::clone is not slow (it is not generic and is supplied by libstd). Vec::<T>::clone uses .extend_from_slice(), which looks good except in cases where #33518 hits, so it will be fixed with that issue. Updated benchmark shows no problem. https://gist.github.com/bluss/3a0afb7345000ec76bb2baeadf10f921
…d, r=alexcrichton

Work around pointer aliasing issue in Vec::extend_from_slice, extend_with_element

Due to missing noalias annotations for &mut T in general (issue #31681), in larger programs extend_from_slice and extend_with_element may both compile very poorly. What is observed is that the .set_len() calls are not lifted out of the loop, even for `Vec<u8>`.

Use a local length variable for the Vec length instead, and use a scope guard to write this value back to self.len when the scope ends or on panic. Then the alias analysis is easy.

This affects extend_from_slice, extend_with_element, the vec![x; n] macro, Write impls for Vec<u8>, BufWriter, etc (but may / may not have triggered since inlining can be enough for the compiler to get it right).

Fixes #32155
Fixes #33518
Closes #17844
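The workaround described in that commit message (a local length variable plus a scope guard that writes it back) can be sketched roughly as follows. This is only an illustration, not the code from the actual change: `LenGuard` and `ToyBuf` are hypothetical names, and a fixed-size array stands in for the real `Vec<u8>` storage so the sketch needs no unsafe code.

```rust
/// Hypothetical scope guard: keeps a local copy of the length and writes it
/// back to the real length field when dropped (normal exit or unwinding).
struct LenGuard<'a> {
    len: &'a mut usize,
    local_len: usize,
}

impl<'a> LenGuard<'a> {
    fn new(len: &'a mut usize) -> Self {
        let local_len = *len;
        LenGuard { len, local_len }
    }

    /// Bump only the local copy; the real field is untouched inside the loop.
    fn increment(&mut self) {
        self.local_len += 1;
    }
}

impl<'a> Drop for LenGuard<'a> {
    fn drop(&mut self) {
        // Single write-back of the length, performed once per scope.
        *self.len = self.local_len;
    }
}

/// Hypothetical fixed-capacity buffer standing in for Vec<u8>.
struct ToyBuf {
    data: [u8; 16],
    len: usize,
}

impl ToyBuf {
    fn new() -> Self {
        ToyBuf { data: [0; 16], len: 0 }
    }

    fn extend_from_slice(&mut self, src: &[u8]) {
        assert!(self.len + src.len() <= self.data.len(), "capacity exceeded");
        let start = self.len;
        // The guard borrows only `self.len`; `self.data` stays usable below.
        let mut guard = LenGuard::new(&mut self.len);
        for (i, &byte) in src.iter().enumerate() {
            self.data[start + i] = byte;
            guard.increment();
        }
        // `guard` is dropped here and stores the final length into `self.len`.
    }

    fn as_slice(&self) -> &[u8] {
        &self.data[..self.len]
    }
}
```

The point of the pattern is that the loop body only touches a local counter, so the optimizer does not have to prove that stores through the data pointer cannot alias the length field. The `Drop` impl still leaves the length consistent if the loop panics partway through.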
Platform: Windows, rust nightly.
The optimizations made to fix #13472 appear to have been undone by #15471 and 3316b1e
In 32-bit mode, `rustc -O --test test.rs && ./test --bench` gives:
With SSE enabled (`-C target-cpu=x86-64 -C target-feature=+sse2`), things look better but `String::clone` still stands out:
Performance seems very dependent on minor details: `clone_vec_from_incremental` is much slower than `clone_vec_from_fn` in 32-bit non-SSE mode.
Benchmark program: