-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The compiler looses vector capacity information #82801
Comments
Don't know what role the MIR opts play here, but on the LLVM side this looks like a phase ordering issue. The final IR has:
The load can be load store forwarded to 3 and everything else folds from there. A sequence of |
Took a while to root cause this, but should be fixed by llvm/llvm-project@3fedaf2. The issue was that GVN did not fully fold the function because available loads were materialized suboptimally, leading to unnecessary poisoning of MDA cache entries. https://reviews.llvm.org/D98114 would also fix fix this case, though it's arguably more of a workaround. |
I'm not completely sure what I'm supposed to be looking for here, but I think this has not been fixed by the LLVM 13 upgrade. It looks like the IR substantially changed in 1.53 and the goal now is presumably to eliminate the branch that contains the |
It wasn't fixed by LLVM 13, tested locally as the compiler on godbolt is outdated. It fails to see that if vec.len() == vec.capacity() {
unsafe { std::hint::unreachable_unchecked() }
} in the beginning of the loop removes the |
Principally, a combination of loop load PRE and SCCP should be able to handle this. However, loop load PRE will currently fail for two reasons: First, the clobbering instruction (do_reserve_and_handle) is an invoke terminator, which would need edge splitting support in loop load PRE. Second, we don't have a guarantee that the call won't free the vec. The second issue is the more problematic one. |
The example now produces good assembly if the loop gets unrolled (possibly improved by #91352), no change otherwise. |
Probably going to be fixed by #114370 |
A very similar case generates 3 separate calls to reserve_for_push: pub fn test(vec: &mut Vec<u32>) {
vec.reserve(3);
let mut i = 0;
while i < 3 {
vec.push(32);
i += 1;
}
} |
Unfortunately, I don't think that neither this issue nor #105156 will be fixed by either:
The issue is actually pretty hard to fix. One needs the compiler to deduce that given an assume that after the reserve the, capacity - len >= 3 (easy), there is at each point capacity > len. For now, I haven't even got LLVM to deduce that capacity > len for the first push (which should come from the fact that 3 > 0 and no overflow occurs). The diff I have tried variations of: diff --git a/library/alloc/src/vec/mod.rs b/library/alloc/src/vec/mod.rs
index 35015238e6e..acf18bb6dc9 100644
--- a/library/alloc/src/vec/mod.rs
+++ b/library/alloc/src/vec/mod.rs
@@ -907,6 +907,10 @@ pub fn capacity(&self) -> usize {
#[stable(feature = "rust1", since = "1.0.0")]
pub fn reserve(&mut self, additional: usize) {
self.buf.reserve(self.len, additional);
+ unsafe {
+ core::intrinsics::assume(self.buf.capacity() >= self.len.unchecked_add(additional));
+ // core::intrinsics::assume(additional == 0 || self.buf.capacity() > self.len);
+ }
}
/// Reserves the minimum capacity for at least `additional` more elements to
@@ -1878,6 +1882,7 @@ fn drop(&mut self) {
pub fn push(&mut self, value: T) {
// This will panic or abort if we would allocate > isize::MAX bytes
// or if the length increment would overflow for zero-sized types.
+ let old_delta = unsafe {self.buf.capacity().unchecked_sub(self.len)};
if self.len == self.buf.capacity() {
self.buf.reserve_for_push(self.len);
}
@@ -1885,6 +1890,8 @@ pub fn push(&mut self, value: T) {
let end = self.as_mut_ptr().add(self.len);
ptr::write(end, value);
self.len += 1;
+ let new_delta = self.buf.capacity().unchecked_sub(self.len);
+ core::intrinsics::assume(old_delta == 0 || old_delta == new_delta.unchecked_add(1));
}
} Commenting the commented assert allow to remove the first push but not subsequent ones, so I'm not sure it is really worth a PR... |
Hint optimizer about reserved capacity The fact that `Vec` had capacity increased is not known to the optimizer, because functions like `do_reserve_and_handle` are not inlined. (rust-lang#82801) Because of that, code such as: ```rust vec.try_reserve(123)?; vec.extend_from_slice(&s[..123]); ``` Tries to reserve the capacity **twice**. This is unnecessary code bloat. https://rust.godbolt.org/z/YWY16Ezej Adding a hint after reserve optimizes out the next check of `self.needs_to_grow(len, additional)`.
Hint optimizer about reserved capacity The fact that `Vec` had capacity increased is not known to the optimizer, because functions like `do_reserve_and_handle` are not inlined. (rust-lang#82801) Because of that, code such as: ```rust vec.try_reserve(123)?; vec.extend_from_slice(&s[..123]); ``` Tries to reserve the capacity **twice**. This is unnecessary code bloat. https://rust.godbolt.org/z/YWY16Ezej Adding a hint after reserve optimizes out the next check of `self.needs_to_grow(len, additional)`.
llvm/llvm-project#75524 has been merged today. It makes it so that LLVM understands that "x + y <= z" implies "x <= z" when y is non-negative. In our case it means that "len + additional <= capacity" implies "len <= capacity". Once rustc's LLVM is updated to take it into account this is likely to change what rustc can optimize or not regarding this issue. |
Any news? |
In this example:
the compiler produces sub-optimal assembly.
https://rust.godbolt.org/z/1T3Knodn8
The text was updated successfully, but these errors were encountered: