Large memory usage and long time on compiling large number of println #86244
Comments
Another test:
echo 'fn main() {' > hw.rs
for i in {0..16384}
do
echo ' dbg!("Hello, world!");' >> hw.rs
done
echo '}' >> hw.rs
/usr/bin/time -f "%Uuser %Ssystem %Eelapsed %PCPU (%Xtext+%Ddata %Mmax)k\n%Iinputs+%Ooutputs (%Fmajor+%Rminor)pagefaults %Wswaps\nUsed %MKbytes Max Resident MEM" rustc hw.rs
log:
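A rough, portable way to get the wall-clock part of that measurement is a small Rust wrapper around std::process::Command. This is only a sketch and not a replacement for /usr/bin/time: it does not report peak memory, and the helper name time_command is made up for this illustration.

```rust
use std::process::Command;
use std::time::{Duration, Instant};

/// Run `program` with `args` and return the elapsed wall-clock time
/// together with whether the process exited successfully.
/// (Hypothetical helper, not part of any existing tool.)
fn time_command(program: &str, args: &[&str]) -> std::io::Result<(Duration, bool)> {
    let start = Instant::now();
    let status = Command::new(program).args(args).status()?;
    Ok((start.elapsed(), status.success()))
}

fn main() {
    // Assumes `rustc` is on PATH and `hw.rs` was generated by the script above.
    match time_command("rustc", &["hw.rs"]) {
        Ok((elapsed, ok)) => println!("rustc finished in {:.2?} (success: {})", elapsed, ok),
        Err(e) => eprintln!("could not run rustc: {}", e),
    }
}
```

Running it for a few different statement counts gives a quick picture of how compile time grows with input size.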
I simplified the code to tons of reference expressions. Generator script:
#!/usr/bin/env bash
exec >test.rs
echo 'fn main() {'
for i in $(seq 16384); do
echo '&"foo";'
done
echo '}'
It takes 2.6GB of memory and 50s to compile. While adding one more reference ( For comparison,
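For anyone who wants to vary the statement or the count while reproducing this, the same generator can be sketched in Rust. The function name generate_source and the choice to print to stdout (mirroring the script's redirect into test.rs) are assumptions made for this illustration.

```rust
use std::fmt::Write;

/// Build the source of a `main` function whose body is `count` copies of `stmt`.
/// (Hypothetical helper that mirrors the bash generator above.)
fn generate_source(stmt: &str, count: usize) -> String {
    let mut src = String::with_capacity(16 + count * (stmt.len() + 5));
    src.push_str("fn main() {\n");
    for _ in 0..count {
        // Writing into a String cannot fail, so unwrap is safe here.
        writeln!(src, "    {}", stmt).unwrap();
    }
    src.push_str("}\n");
    src
}

fn main() {
    // 16384 matches the count used in the script; redirect stdout to a file,
    // e.g. `./generator > test.rs`.
    print!("{}", generate_source("&\"foo\";", 16384));
}
```

Swapping the statement for `println!("Hello, world!");` or `dbg!("Hello, world!");` should reproduce the other variants discussed in this thread.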
@rustbot label I-compilemem I-compiletime T-compiler
Profiling shows that 97% of the time is spent in mir_borrowck.
Did some further analysis with
Majority of the time is spent in
I don't understand the code well enough to draw any conclusions or make any changes, so perhaps someone more familiar with this code could give some hints.
#50994 uses 5,000
The example by @oxalica shows that this issue has nothing to do with #50994: that one is related to an LLVM performance issue, while this one is related to borrow checking. Note that when that issue was posted, Rust did not yet have stable NLL, so it is certainly a different issue from this one.
Both issues were opened with the premise of benchmarking the compiler with the same
perf: Don't track specific live points for promoteds
We don't query this information out of the promoted (it's basically a single "unit" regardless of the complexity within it), and this saves on re-initializing the SparseIntervalMatrix's backing IndexVec with mostly empty rows for all of the leading regions in the function. Typical promoteds will only contain a few regions that need to be uplifted, while the parent function can have thousands.
For a simple function repeating println!("Hello world"); 50,000 times, this reduces compile times from 90 to 15 seconds in debug mode. The previous implementation's re-initialization led to an overall roughly n^2 runtime, as each promoted initialized slots for ~n regions; now we scale closer to linearly (5000 hello worlds takes 1.1 seconds).
cc rust-lang#50994, rust-lang#86244
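The scaling argument in that commit message can be illustrated with a toy cost model. This is not rustc's code: the function name, the assumption that a function with n statements has about n promoteds and about n regions, and the figure of 3 regions per promoted are all made up here to show why initializing ~n rows per promoted is roughly quadratic while initializing only the few needed rows is roughly linear.

```rust
/// Toy model: count how many matrix rows get initialized while processing
/// `promoteds` promoted constants in a function with `parent_regions` regions.
/// With `track_all` set, each promoted pays for every region of the parent
/// (the old behaviour described above); otherwise it pays only for the few
/// regions it actually contains. All names and numbers are illustrative.
fn rows_initialized(
    promoteds: u64,
    parent_regions: u64,
    regions_per_promoted: u64,
    track_all: bool,
) -> u64 {
    let mut rows = 0;
    for _ in 0..promoteds {
        rows += if track_all { parent_regions } else { regions_per_promoted };
    }
    rows
}

fn main() {
    for n in [5_000u64, 50_000] {
        // Assumption: about one promoted and one region per repeated statement.
        let old = rows_initialized(n, n, 3, true);
        let new = rows_initialized(n, n, 3, false);
        println!("n = {}: old ~ {} rows, new ~ {} rows", n, old, new);
    }
}
```

In this model, growing n by 10x grows the old cost by 100x and the new cost by 10x, which matches the shape (not the exact numbers) of the timings reported in the commit message.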
Not sure whether it's a known bug.
I tried this code:
(To save you some time, you can download the file here directly rather than generating it yourself.)
I expected to see this happen: it should take a reasonable amount of memory and time to compile.
Instead, this happened: on my machine, it took up to 34GB of memory and 2.5 minutes just to compile this.
Meta
rustc --version --verbose