-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Regression in consteval: error[E0080]: could not evaluate static initializer (unable to turn pointer into raw bytes) #99923
Comments
You should've never needed to transmute the pointer side to pub const unsafe fn from_bytes_unchecked<T>(bytes: &[u8]) -> &[T] {
let (data, mut metadata): (*const u8, usize) = core::mem::transmute(bytes);
metadata /= core::mem::size_of::<T>();
core::mem::transmute((data, metadata))
} |
Yeah, I managed to make it work with this instead:
|
That PR improved our ability to detect UB in const code. Transmuting a ptr (that was not created via int2ptr cast) to a usize in const is UB. So I am not sure this is a regression. Code with UB is not subject to stability guarantees.
|
I understand, but I will point out that currently what is and isn't UB is in flux (and almost completely undocumented), and also a lot of the APIs to do this safely are not yet available (const I don't think "this is UB" is sufficient to ignore breaking changes when there is no documentation on what Rust considers UB, and Rust is still pinning down what it considers UB. This is not to say that we shouldn't make changes like this, but I think it would be good for things like this to go through a period of warning with better diagnostics, where possible. I had to bisect the compiler to understand what went wrong here, which is fine in my case since I'm used to it, but I don't really think it's acceptable for us to decide that this is an okay user experience for others. (To be clear, my reaction here is to the statement that this isn't a regression, not to the breakage happening in the first place. Breakages happen, it's hard to predict it all, but I would prefer breakages like this to be handled more gentler once discovered) Footnotes
|
At minimum we need to start emitting a lot better diagnostics about how And if people start hearing about how We do have more latitude to make something stop compiling if it's UB, of course. Rust programmers will probably be unsurprised if they hear "well, that |
Keep in mind the provenance changes pretty much only affect runtime execution. The irony here is that where before miri was (wrongly?) doing nothing (resulting in an Now, why did CTFE miri not complain about ptr2int transmutes before? Is it only UB now? union Transmute<T: Copy, U: Copy> {from:T,to:U}
pub static PTR_TO_USIZE: usize = unsafe { Transmute{from:&0}.to }; (using an Since 1.33, and still the case now, in 1.62 (wow, half the lifetime of Rust stable), it errors with:
(the error has gotten more detailed since, but this is what it used to look like) Okay, but why does the "more local" one not error? Is there somehow "less UB" deeper in CTFE? Nah: rust/compiler/rustc_const_eval/src/interpret/machine.rs Lines 452 to 455 in bd84c73
It's just... turned off. It has been turned off ever since the functionality was added in: It's only enabled by But for Presumably it would be very expensive for large arrays, but it seems like an increasing hazard (as more people are building all sorts of abstractions using @RalfJung did we ever do perf/crater runs with |
I do agree with the concern that our UB rules can be hard to follow. However, note that the UB involved here is much more clearly documented than most other kinds of UB that we have. Quoting from the
So, if this is not enough, then I don't know what is. We also have an RFC saying that the compiler is not required to detect all UB. (I was arguing for mandating some amount UB detection, but the consensus was we shouldn't do that.) Put together, I am not quite sure what we could have done differently here. It is very easy for interpreter changes to "accidentally" improve or regress UB detection, and we need the ability to continue evolving that interpreter without going through a month-or-year-long deprecation process each time. Do you have concrete suggestions for something that should have been done differently when landing #97684? Or is your concern that we need more docs elsewhere (where, and which kinds of docs?), and perhaps some way for crate authors to opt-in to stricter UB checks for CTFE? The irony is that for runtime UB we have Miri, but there is no way to have compile-time UB checks that are as strict as what Miri is doing... Transmutation is a hornet's nest, and it is super hard to even talk about it precisely without first giving an entire lecture on Rust operational semantics (the details of which do not have consensus yet). This is true for runtime transmute and even more true for CTFE transmute.
Yeah, this is only tangentially related to the strict provenance discussion. (The relationship is that as an outcome of that discussion I was able to make the handling of pointers in Miri a lot more coherent, and one side-effect of that refactor is improved UB detection even for CTFE.) The rules for what you are allowed to do with pointers during CTFE have never changed, we are now just able to detect rule violations a bit better.
There recently was an experiment to selectively enable it for some types/situations, and even that was too expensive: #95377. |
Yeah, I'm aware (since opening this issue and digging further) that this is documented. Thing is, it's on a function that has existed for ages; users aren't going to recheck that documentation each time they use it once they're familiar with it. If Rust had a complete story for UB it would be totally worth it to go and read through the UCG docs, the reference, the nomicon, and the stdlib docs to learn what is and isn't UB for once and for all. While stuff is in flux and largely undocumented it is not practical to do such a literature survey each time, for the slim chance you actually find an answer for your specific kind of UB. I don't know if it's possible for it to ever be "enough" until the work of the UCG group is finished. I absolutely would appreciate more docs like this, redundantly placed on the stdlib and UCG docs, but the problem here is a more general one. And it's a completely understandable problem — of course UCG can't wave a magic wand and be done in a week — I'd just like more understanding extended towards lib authors dealing with it. I think of this Douglas Adams quote when thinking about how we document UB today:
In a situation where most UB is not documented, users do not expect UB to be documented, and there are four places to go look for UB documentation (hell, the transmute docs link to two of them), a documentation change on a function that has existed for ages and hasn't really had changes to UB-ness is not actually that prominent IMO.
I'm aware. My point is based on what we ought to do, not what we can do.
As I said earlier my pushback is more about the reaction here rather than #97684 itself, since mistakes happen, but here you go: If it had been known at the time of #97684 that it would break code, I would like to have seen some discussion into whether or not a future incompat warning could have been introduced, with better diagnostics. I get that this is not always possible, but it should have been discussed. Furthermore as @workingjubilee said, given that const is spicy when it comes to UB we should probably be adding better diagnostics in general, but especially around any kind of error. More docs are always good. However, while UCG is still figuring stuff out, I personally have almost zero expectation of seeing more docs on everything y'all figure out. I'm happy to let y'all figure stuff out first and doc it later, my pushback is against UB being exploited or being errory till we're in that final stage. It would also be nice if there were generally more communication on "UB in const is spicy". Stuff helping unsafe code writers learn that UB in const is different and how to reason about UB in const would be great (perhaps a blog post?). Which I don't necessarily expect, I know y'all have a lot on your plate I just think it would be nice to have, and I would kinda consider that having been adequately communicated as a prerequisite to const code breaking. But basically, I would like to see some thought given to the user experience of unsafe writers around stuff like this. I'm more plugged in than most to this stuff work; I've been paying some attention to the UCG discussions and I've had chats with various friends who are involved or knowledgeable. If this bit me, this kind of thing will quite likely happen again with more unsuspecting users, and I'd rather they had ample warning and a good user experience for things like this. To give an idea of why I'm pushing back here; the status quo of writing unsafe Rust is to try and avoid UB as much as possible based on what you know is definitely UB, what you have seen documented, running Let me repeat that: we are all assuming we have UB, and that assumption is probably correct. Which means when you say "UB fixes are not regressions" with no caveats, in our case, just means "you don't get stability at all". It doesn't mean "be careful and you can avoid breakages". It doesn't mean "be careful there may be breakages around weird niche stuff". It just means "no stability for you". Of course we cannot expect long term stability on this, rustc should not be prevented from eventually being stricter, none of us expect that it will. However the "ample warning" is something we can expect. It's the only way to make a situation like this work at all. In general we have been operating under the assumption that UB , especially kinds of UB not found in C, will start breaking once UCG has actually reached a point where there's documentation that covers most UB that we can basically read a couple times, internalize, and apply. We have not been expecting these things to break until then; we have been expecting that one day we will sit down with the Complete Docs and go through our unsafe code and fix it alll, and later the old code will start breaking. And until then, we pay some attention and if Exciting New kinds of UB are announced, we go fix those; we can expect some time pass before that UB starts breaking. Again, as far as I can tell, this is the only practical way to write unsafe code right now. So a position that ample warning is not necessary because it's a soundness change and thus allowed to be breaking is very scary, since I have a hard time reading that as anything but "if you write unsafe code, all bets are off on the stability of your software and it might break tomorrow". That's a bit of a rug pull for a language where the fact that you can write While the assumption of ample warning is the only practical way to make this situation work, a lot of this assumption also comes from the fact that for Exciting New kinds of Rust UB, there's basically no chance of LLVM suddenly exploiting it, and it's extra work for Rust folks to write optimizations that depend on that UB, so it won't happen until after there's a full story.
I do think it's worth spending some time communicating the way UB in const is different (and worth making the diagnostics for it better). Pairing that with an understanding that UB-breakages are more likely in const (since it's a compile time error) than in actual compiled code could work. I would prefer ample warning for const code as well; and given that it's compile time the "warning" can actually come in the form of a future incompat warning, but "if you're writing unsafe in const, expect breakages" is actually workable (unlike "if you're writing unsafe, expect breakages", which is extremely not workable). I'd expect it to be clearly communicated before the breakages start, but it's a workable position to have. Ultimately, a compile time error in const with no warning is far less scary than subtle "miscompiles" suddenly happening due to UB, so I'm kinda glad we're having this discussion here instead of later 😄 |
Oh, also, like I said before, I totally understand stuff like this slipping; there are many times when it's too subtle to realize and this case might be one of them. But, in general, future incompat cycles aren't that long and it would be nice to at least consider them on a case by case basis. |
The largest regression in the entire rustc-perf benchmark suite was 1.94% on |
@saethlin Probably, yeah, I'd even be happy to run a build under some rustflag to opt in if I knew about it (but stuff here changes so often it's hard for me to know about these things) However in this case it's unclear to me why there would have to be a perf regression for this. We currently already detect this UB, what I would have preferred is if we threw a future incompat warning for a stable release cycle and then moved on. It's understandable if that's not compatible with the architecture of miri, but so far I haven't seen any discussion of this. |
What @saethlin did was try out how expensive adding any checking would be without changing the miri logic. What Ralf did was a cleanup that accidentally opted into some of that checking by just not having a code path for the UB case anymore. So yes, for this specific case we could have written a future incompat lint had we known about it. Had we known about it we'd have cratered it and discussed it, too. The change just seemed innocent to me as it didn't actually make things simpler, but just avoided some relaxed logic that I though only miri ever used. Turns out CTFE also went through that path in the presence of UB. I guess it all boils down to the fact that we don't even have full branch coverage via out test suite. I guess the only answer I have here is that i'll be more careful in future reviews. Not sure what else to do with the current capacity we have for CTFE work. |
@oli-obk to try and be abundantly clear, I do not think that this was a problem at review time. Mistakes happen, it's fine, and I've stated that multiple times before above. Even with more stringent reviews it's hard to catch all of this. It's going to happen. I've worked on rustc before, I understand how hard it is to catch all the things. My reaction here is entirely against the response to this bug that "it's not a regression because it's UB". I'm okay with stuff breaking by accident. The accident is not the problem. Stating that such things are not accidents in the first place is what I'm bristling against. If your position is that if y'all had known then y'all would have considered a future incompat lint I'm quite happy with that. That was not the impression I got from Ralf's original comments, however, hence my reaction. |
To be fair, the comment has been there ever since the function was callable on CTFE in stable. The problem is, nobody expects there to be a difference between CTFE UB and run-time UB, and so nobody thinks of re-checking the docs when calling a function in CTFE for the first time. CTFE and run-time are two pretty fundamentally different ways of running code, bu we make them mostly the same, and that seems to work well enough that most people are not even aware of the distinction. And for safe code that's just fine, but in unsafe the seams do show (and I don't think we can entirely avoid that, we can just work harder on making this less jarring).
Yeah, I don't disagree, and I appreciate the feedback! I just wonder what the concrete next steps are that we could do here, given our resource constraints. I can't currently think of a feasible way that would make better CTFE UB detection into future-incompat lints. So on the technical level the best I can think of is to have a flag for extra CTFE UB detection. But that won't be nearly enough. As you said we need docs to ensure const unsafe code authors know that const unsafe has stricter rules than regular unsafe, and I don't know where to put those docs.
Thanks for sharing this; I was not aware that this is a common sentiment. I think it reflects well how we have been treating runtime UB, but not CTFE UB, as this issue shows.
Yeah that's kind of where we are at, I think.
I'm happy we agree on that. :) This is why we are less careful with CTFE UB raising more errors than with LLVM exploiting more runtime UB. But I think I failed to appreciate just how much of a black box CTFE UB is for many people.
(Also replying to @oli-obk's reply to this) I don't think a future incompat lint would have been feasible. Currently a CTFE query can either succeed or fail, there's no infrastructure for "succeed with a result but show a warning". Worse, the codepath that would produce the successful result doesn't even exist any more, and getting rid of that codepath was the entire point of the PR that caused this problem. (And I have at least one more similar PR planned. I guess we'll crater that one.^^) My understanding is that RFC 3016 was passed with the explicit knowledge that when CTFE UB detection becomes better, that will cause some code with CTFE UB to fail to compile. Had I known there is an expectation that each time we do this, we put up a future incompat lint, I would have pushed back a lot more -- this will seriously impact our ability to evolve the CTFE & Miri shared interpreter. So I am fully onboard with working on better docs, opting-in to more checks, all of that, but if the expectation is a future incompat warning then I am concerned. |
EDIT: please ignore, see #99923 (comment) for why I was wrong(click to open me being wrong)
1. Undoing #97684's "
|
let read_provenance = |s: abi::Primitive, size| { | |
// Should be just `s.is_ptr()`, but we support a Miri flag that accepts more | |
// questionable ptr-int transmutes. | |
let number_may_have_provenance = !M::enforce_number_no_provenance(self); | |
s.is_ptr() || (number_may_have_provenance && size == self.pointer_size()) | |
}; |
We can make it always true
in CTFE for beta, effectively reverting #97684 for CTFE, but not e.g. cargo miri
.
The code looks a bit different currently, but the fix should identical (always true
in CTFE):
.read_scalar(alloc_range(Size::ZERO, size), /*read_provenance*/ s.is_ptr())?; |
/*read_provenance*/ a.is_ptr(), |
.read_scalar(alloc_range(b_offset, b_size), /*read_provenance*/ b.is_ptr())?; |
To get CTFE diagnostics here when lossiness would occur (either an error or a future-incompat warning), I believe we would have to pick a different type than bool
here, with a third case of "diagnose lossiness".
I haven't seen the discussion so far about ptr::addr
in CTFE but I would expect we would never expose the intra-CTFE-allocation offset of a CTFE pointer (if possible within reason).
2. Future-incompat warnings for CTFE validation
Note that the "lossiness diagnostics" discussion above is superseded by simply always doing lossless reads in CTFE, and then always validating the read value, which should catch misuses with greater accuracy.
We should start doing CTFE validation in as many cases as possible (and ideally e.g. crater the remaining ones and provide a -Z
flag that aggressively checks as much as possible), but we transform the errors produced by the additional validation, into future-incompat warnings, in CTFE.
I think we can move pretty fast with the warnings, but it may depend on crater results and/or user feedback.
re: "do we need future incompat lints for every such change?" Yes, I think that's the expectation in the absence of thorough expectation management, which is why my earliest remarks were "we need to be doing a lot more expectation management". If it was thoroughly understood by Rust programmers that UB in |
With regards to backports, the release is August 11th I believe, which is a little under 2 weeks out. Backports over the next week (before August 5th) are fine, past that please communicate with t-release (pinging me on Zulip in #t-release is best) to make sure the release includes the relevant changes. |
Nah, the first transmute is UB. We just only detect this when its result is used in the 2nd transmute. Here is similar code that errors already on stable: pub static FOO: usize = unsafe {
let r: &char = &'x';
let r2i: usize = std::mem::transmute(r);
r2i + 0
}; And no it doesn't extract the offset, it just returns the pointer value at integer type. The 2nd transmute errors because what would happen here at runtime is that it'd strip provenance, but that is not possible in CTFE (because we only have an offset, not an absolute address), so we have to error. The change in behavior is due to the fact that loads at type
Indeed it would be a critical bug if that offset was ever observable by CTFE code, including unsafe CTFE code.
Allowing values with provenance at integer type in the interpreter means we have to consider what happens to them everywhere, which is a test matrix explosion and source of subtle bugs that we should avoid. And what you say about accuracy is backwards; it is the check-on-creating-a-
Comments like this make me feel like the discussion for RFC 3016 was lead without stating all the assumptions people were making. I would not have agreed to the RFC in that form with the expectation of future incompat lints for every case of better CTFE UB detection. I don't see how that can be workable, with the way the interpreter works, for cases of UB like this. This is not UB because some check says "false", it is UB because the interpreter simply doesn't have a (feasible and tested and conforming to interpreter invariants) codepath to continue executing the program. In fact the RFC explicitly states
So parts of this discussion feel to me like an attempt to alter the RFC decision. So, @eddyb no I don't think I agree to that plan. I don't see how it can work. We can add some kind of temporary work-around and hope that having integers with provenance will still work for other parts of the interpreter, but we need to get rid of that codepath ASAP so that this does not become a burden for the rest of the interpreter. And I don't see how we can have a future incompat warning here.
Yeah I guess that is where things went wrong. |
Unnominating... (and edited the original comment). Alright, that was really pointless on my part, I thought I understood what effect the PR change had, but I missed the interaction between Well, I don't personally have any stake in this (and would prefer if CTFE UB was maximally diagnosed sooner than later), I just got simultaneously excited by "hey this is a simple change" and jump-scared by "oh no this will hit stable any day now", sorry for wasting your time, should've PM'd on Zulip instead. |
I am working on a patch that should defuse the problem at least for beta, hopefully giving us the time to discuss this a bit more. Your suggestion for mitigation made sense @eddyb, I just don't think this is a long-term solution. That's why |
Hm, right now I don't think we'd want to put Stacked Borrows into rustc. But once we have a better idea for the final aliasing model, that might be worth doing. However note that doing this will have a performance cost even when the flag is disabled -- right now we have a lot of careful engineering ensure that the representation of an So, definitely not part of the short-term plan. :) @workingjubilee do you have concrete cases, besides aliasing violations, that you think should error in paranoid/pedantic mode but not in strict mode? A lot of the ambiguity is around the exact contract of library functions, which Miri is not able to check (Miri detects language UB, not library UB). There are some lingering questions around validity invariants but Miri already takes almost the strictest possible interpretation here (except for recursive validation of references, but that seems unlikely to be our final choice, and if it is we can just make the flag stricter long before any change that would affect CTFE without the flag). There are some things around padding that we cannot detect yet (padding being unstable on copies); once I implement that for Miri, the CTFE flag will also get those checks. |
I agree that it might be more useful from the library dev (hello, I am a contributor to the library) perspective. Would it be possible to have that flag allow a |
Hm, that doesn't seem entirely obvious but we should definitely explore that possibility. It has the usual problem that debug assertions in the standard library have -- the stdlib that people actually download is built without debug assertions... How close is |
@RalfJung the proposed plan sounds pretty great! |
#100229 implements the first step of that plan. |
Re-tagging to 1.64; I believe we reverted the behavior change that prompted this in the 1.63 release (#99965). |
…lcnr add -Zextra-const-ub-checks to enable more UB checking in const-eval Cc rust-lang#99923 r? `@oli-obk`
From the crater run, regression on Also, the crater run didn't bring up any other cases of this in particular, and |
It seemed to me to be the same category of issue, where we break backwards compatibility in const unsafe code that currently compiles. Whatever decision is made about how to deal with such cases applies for that case. Am I wrong in my reading? |
#100545 is more about runtime than const UB I think? Certainly it also applies at runtime, which this one here does not. And anyway the question in this issue currently isn't whether we'd back out #97684, we already pretty much decided against that. The question is what else we can do to help people. And we have a plan for that. |
#101101 implements the next step in the plan I outlined -- giving more context for "read pointer as bytes" errors. |
…oli-obk interpret: make read-pointer-as-bytes a CTFE-only error with extra information Next step in the reaction to rust-lang#99923. Also teaches Miri to implicitly strip provenance in more situations when transmuting pointers to integers, which fixes rust-lang/miri#2456. Pointer-to-int transmutation during CTFE now produces a message like this: ``` = help: this code performed an operation that depends on the underlying bytes representing a pointer = help: the absolute address of a pointer is not known at compile-time, so such operations are not supported ``` r? `@oli-obk`
…oli-obk interpret: make read-pointer-as-bytes a CTFE-only error with extra information Next step in the reaction to rust-lang#99923. Also teaches Miri to implicitly strip provenance in more situations when transmuting pointers to integers, which fixes rust-lang/miri#2456. Pointer-to-int transmutation during CTFE now produces a message like this: ``` = help: this code performed an operation that depends on the underlying bytes representing a pointer = help: the absolute address of a pointer is not known at compile-time, so such operations are not supported ``` r? ``@oli-obk``
interpret: make read-pointer-as-bytes a CTFE-only error with extra information Next step in the reaction to rust-lang/rust#99923. Also teaches Miri to implicitly strip provenance in more situations when transmuting pointers to integers, which fixes rust-lang/miri#2456. Pointer-to-int transmutation during CTFE now produces a message like this: ``` = help: this code performed an operation that depends on the underlying bytes representing a pointer = help: the absolute address of a pointer is not known at compile-time, so such operations are not supported ``` r? ``@oli-obk``
…=pnkfelix beta-backport of provenance-related CTFE changes This is all part of dealing with rust-lang#99923. The first commit backports the effects of rust-lang#101101. `@pnkfelix` asked for this and it turned out to be easy, so I think this is uncontroversial. The second commit effectively repeats rust-lang#99965, which un-does the effects of rust-lang#97684 and therefore means rust-lang#99923 does not apply to the beta branch. I honestly don't think we should do this; the sentiment in rust-lang#99923 was that we should go ahead with the change but improve diagnostics. But `@pnkfelix` seemed to request such a change so I figured I would offer the option. I'll be on vacation soon, so if you all decide to take the first commit only, then someone please just force-push to this branch and remove the 2nd commit.
All steps in the plan laid out above have been taken, hence I'll close this issue. |
interpret: make read-pointer-as-bytes a CTFE-only error with extra information Next step in the reaction to rust-lang/rust#99923. Also teaches Miri to implicitly strip provenance in more situations when transmuting pointers to integers, which fixes rust-lang/miri#2456. Pointer-to-int transmutation during CTFE now produces a message like this: ``` = help: this code performed an operation that depends on the underlying bytes representing a pointer = help: the absolute address of a pointer is not known at compile-time, so such operations are not supported ``` r? ``@oli-obk``
Unfortunately I'm not able to come up with a minimal working example for this, but running
cargo check -p icu_datetime --examples
on https://github.com/unicode-org/icu4x (commit unicode-org/icu4x@021a92e should be fine)ends up throwing a ton of errors that look like this:
(full errors; they're long since the "baked" stuff is a bunch of internationalization data that has been "serialized" to Rust code using https://docs.rs/databake)
We have a lot of consteval using unsafe code, which is what's tripping up here. The code that's breaking is code that's currently manually constructing a DST of
&[T::ULE]
from&[u8]
, and in aconst
environment that requires some pointer manipulation:Bisecting the regression points to #97684 (cc @oli-obk, @RalfJung), which seems to be about integer provenance. The code here isn't ideal; there are slightly better ways to do this but nothing is
const
, so this is what we've got.Either way, this is a regression.
Bisection:
searched nightlies: from nightly-2022-06-01 to nightly-2022-06-30
regressed nightly: nightly-2022-06-07
searched commit range: fee3a45...50b0025
regressed commit: 9d20fd1
bisected with cargo-bisect-rustc v0.6.3
Host triple: x86_64-unknown-linux-gnu
Reproduce with:
The text was updated successfully, but these errors were encountered: