-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Infinite loop detection for const evaluation #51702
Infinite loop detection for const evaluation #51702
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
r? @oli-obk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already pretty solid for the current state of const eval. Seems very hard to test though ;)
@@ -33,11 +33,8 @@ pub struct EvalContext<'a, 'mir, 'tcx: 'a + 'mir, M: Machine<'mir, 'tcx>> { | |||
/// Bounds in scope for polymorphic evaluations. | |||
pub param_env: ty::ParamEnv<'tcx>, | |||
|
|||
/// The virtual memory system. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The machine
field also needs to go into the EvalState
. It's unused in the compiler, but miri
the tool needs it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, when I first started on this, I assumed that I'd need to provide some custom implementations of common methods on EvalState
, as well as pass it by reference to something like HashSet::get
. However, since I implemented custom Eq/Hash on Frame
and Memory
, observe
could simply accept references to each individual field. Right now, whenever I look up an EvalState
in a hash table, I'm either inserting it right after, or aborting const eval due to an infinite loop, so there's no need to get
through a reference.
This means I could revert my first commit and not break everything, but if someday we need to look up an EvalState
in a hash table without inserting it, we'd have to clone it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I could use #[derive] more as well as remove a lot of where bounds if I add Eq + Hash + Clone
to the supertraits of Machine
. What's rustc's general philosophy on this? I try to avoid needlessly adding supertraits, but here it would save a lot of boilerplate.
src/librustc_mir/interpret/memory.rs
Outdated
@@ -992,12 +992,12 @@ impl<'a, 'mir, 'tcx, M: Machine<'mir, 'tcx>> HasMemory<'a, 'mir, 'tcx, M> for Me | |||
impl<'a, 'mir, 'tcx, M: Machine<'mir, 'tcx>> HasMemory<'a, 'mir, 'tcx, M> for EvalContext<'a, 'mir, 'tcx, M> { | |||
#[inline] | |||
fn memory_mut(&mut self) -> &mut Memory<'a, 'mir, 'tcx, M> { | |||
&mut self.memory | |||
self.memory_mut() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels very infinite recursion dangerous. Or at least surprising to read. Can we just get rid of the memory_mut
and memory
inherent methods in favour of this trait impl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HasMemory
is private to this crate, and tools/miri
would need to call these methods.
The solution above would obviate this problem.
stmt, | ||
} = self; | ||
|
||
ptr::eq(mir, &other.mir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to check the instance
for equality, because it contains the substs
, which make up the generic arguments of the current frame's function. It's probably not problematic due to the upper frame's stmt
field, but we should still have this PartialEq
impl be correct.
This means you also don't have to check the Mir, because the Mir depends directly on the DefId
inside the Instance
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, I've read the rustc book, but some of the finer points eluded me.
} = self; | ||
|
||
ptr::eq(mir, &other.mir) | ||
&& *return_to_block == other.return_to_block // TODO: Are these two necessary? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not in a global way (due to stmt
on the upper stack frame), but still nice to have this impl be very correct.
stmt, | ||
} = self; | ||
|
||
(mir as *const _ as usize).hash(state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, hash the instance, not the mir.
src/librustc_mir/interpret/memory.rs
Outdated
alloc_kind: self.alloc_kind.clone(), | ||
alloc_map: self.alloc_map.clone(), | ||
uninitialized_statics: self.uninitialized_statics.clone(), | ||
cur_frame: self.cur_frame.clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for .clone()
here.
src/librustc_mir/interpret/memory.rs
Outdated
&& *alloc_map == other.alloc_map | ||
&& *uninitialized_statics == other.uninitialized_statics | ||
&& *cur_frame == other.cur_frame | ||
&& ptr::eq(tcx, &other.tcx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to compare the tcx
, there is only ever one, and the Span
changing is not something we care about.
src/librustc_mir/interpret/memory.rs
Outdated
|
||
data.hash(state); | ||
cur_frame.hash(state); | ||
(tcx as *const _ as usize).hash(state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also no need to hash the tcx.
src/librustc_mir/interpret/memory.rs
Outdated
cur_frame.hash(state); | ||
(tcx as *const _ as usize).hash(state); | ||
|
||
// We ignore some fields which don't change between evaluation steps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You aren't really ignoring them, hashing the allocations
iterator should get you most of the info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ignores uninitialized_statics
at the moment, which actually can change during execution, but I think will not affect infinite loop detection, since modifying an uninitialized static would cause it to no longer be uninitialized. I could be way off here though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's dangerous. Equality and hashing need to be in sync. I don't remember the exact rules, but you can read up on that in the hash docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uninitialized_statics
(along with another field) is ignored in the hash, but not in the equality check. This maintains the invariant required by Hash
: that two objects with different hashes never compare as equal.
This should be fixed anyways, but I need to know how to compare FxHashMap<AllocId, Allocation>
s against one another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. That makes sense
self.allocations() | ||
.map(|allocs| { | ||
let mut h = FxHasher::default(); | ||
allocs.hash(&mut h); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is problematic. Hashing an AllocId
(inside a relocation of the alloc) will hash the actual Id
. This means that if we ever call a function in a loop, since we keep getting new AllocId
s for the function's locals, we'll never realize we're in a loop (ok we will outside the function, but imagine the function returning a raw pointer to its local, and we keep updating a field in the outer function).
I think it's fine for now, and will probably go away once we get around to implementing local AllocId
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So right now, this hashes both the AllocId
and the Allocation
(which can be crazy slow btw). Would simply hashing Allocation
be correct?
I could maybe improve performance by memoizing the hash of an Allocation
. This would involve setting a Cell<Option<u64>>
to None
whenever an Allocation
is mutated, even when the loop detector isn't running. This might not be too bad, since there's no immediate data dependency on the cached value, and it will often be None
during regular execution so the branch predictor will be on our side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could benchmark it. For now let's do the slow version. We are already in "this might take a while territory"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might not even have to hash the memory, if we transitively hash through AllocIds by hashing their allocation instead of the id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I'm a sucker for a premature optimization :).
transitively hash through AllocIds by hashing their allocation instead of the id
Can you elaborate on this? I don't quite understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm caught up now; I'll repeat what you said for my own clarity.
alloc_map
contains all Allocation
s which are visible to a given EvalContext
indexed by AllocId
. The naive approach, comparing the alloc_map
s with eq
, will correctly identify some infinite loops. However, if an object is reallocated, and all pointers to it are updated to point to the new value, the program states are equivalent but our equality function will fail.
Instead, we can look at the program heap as a graph of allocations whose roots are Value::ByRef
s on the stack and whose edges are the AllocId
s in Relocations
. Equality over EvalSnaphost
Frame
s is the same except when it comes to these ByRef
variants. We would push them onto a separate list for each frame, then do a simultaneous BFS/DFS, comparing each Allocation
along the way. This would make equality invariant over changing AllocId
s.
It is much easier to implement Hash
than PartialEq
with this approach: we simply omit any AllocId
s--either on the stack in a Value::ByRef
or the heap in a Relocations
--from the hash.
I think I will implement this naively at first (not handling object reallocation), then try to come up with a test case where the more robust case would be necessary to detect infinite loops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I belive that the result of casting a reference to usize
during CTFE is still up in the air. If the AllocId
is used for that, the more complex strategy will no longer be correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following test case fails as you predicted.
#![feature(const_fn, const_let)]
const fn churn() -> *const u8 {
let x = 0u8;
&x
}
fn main() {
let _ = [(); {
let mut y = 0 as *const u8;
loop {
y = churn();
}
y as usize
}];
}
There must be a subtler way of triggering this, but I can't come up with one at the moment. I don't think this class of programs is worth implementing a custom equality function which ignores AllocId
s. Let me know if you disagree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can at least check for each AllocId that we hash, whether it is dangling, and then just hash some default value (e.g. u64::max_value
), unfortunately that requires the same changes as walking down AllocIds...
If you open an issue about the not working example and the explanation of the full fix, we can merge this PR as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll open an issue. I plan to keep working on this btw.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you replace stack()
with stack_mut()
and memory()
with memory_mut()
everywhere?
src/librustc_mir/interpret/memory.rs
Outdated
@@ -53,6 +54,85 @@ pub struct Memory<'a, 'mir, 'tcx: 'a + 'mir, M: Machine<'mir, 'tcx>> { | |||
pub tcx: TyCtxtAt<'a, 'tcx, 'tcx>, | |||
} | |||
|
|||
impl<'a, 'mir, 'tcx, M> Clone for Memory<'a, 'mir, 'tcx, M> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be derived?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be possible after the changes I suggested where the machine
field of the EvalContext
becomes part of the state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bjorn3 Because I chose to refactor the fields of EvalContext
into a separate struct so that it could have a custom hash impl and be looked up by reference in a hash table. See my comment for why this might be overkill. Once you do that, everything that was referring to ctxt.stack
had to be changed to either ctxt.stack{,_mut}()
or ctxt.state.stack
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this explicitly because deriving Clone
for Memory
would add a where M: Clone
bound to the implementation, even though only where M::MemoryData: Clone
is needed. As @oli-obk said, any part of the evaluation state must be Clone
, so I should be able to cut down on the verbosity.
I reverted the This comment still needs to be resolved regarding what parts of My next steps are to: |
The ADT value increase is fine as it is. You will have a gap in your step IDs, but that should be fine |
So I've implemented, enabled and added a (successful) test for a version of the loop detector without any of the performance improvements mentioned above. I can't tell if the Also, since the loop counter is sometimes incremented by more than one, it's possible for the loop detector to fail to when Finally, as @oli-obk mentioned, it's really hard to test for false positives. It's possible that I'm not taking some part of the execution state into account, and two snapshots which compare as equal will not actually cause an infinite loop. I guess the way you usually deal with is a crater run? Not sure if that would be appropriate here... |
@@ -46,7 +46,7 @@ pub struct EvalContext<'a, 'mir, 'tcx: 'a + 'mir, M: Machine<'mir, 'tcx>> { | |||
|
|||
/// The number of terminators to be evaluated before enabling the infinite | |||
/// loop detector. | |||
pub(crate) steps_until_detector_enabled: usize, | |||
pub(crate) steps_until_detector_enabled: isize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please document that negative values represent the amount of steps after the detector is enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this could be clearer.
My plan is to invert this counter (e.g. steps_since_detector_enabled
), so I can use an add, then mask off the loop detector period if steps is greater than 0. The code in inc_step_counter
currently only works with a detector period of exactly 1 due to the saturating sub. I was waiting for some clarity on the steps
parameter to find the best way to do this.
I'm trying to make the point at which the detector comes on a comparison with 0
since it saves a cycle or two. This is probably pretty dumb :) See previous comments regarding my affinity for premature optimization.
LL | | let mut n = 113383; // #20 in A006884 | ||
LL | | while n != 0 { //~ ERROR constant contains unimplemented expression type | ||
LL | | n = if n % 2 == 0 { n/2 } else { 3*n + 1 }; | ||
| | ---------- program will never terminate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be prudent to add more info here. Like that the highlighted expression is simply the first expression found where the constant evaluator could with certainty say that the evaluation will not terminate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this is a bit obtuse. What about expanding the span of the error to be the entire block? I'll need to do some more reading into the diagnostics parts of rustc to implement that. I could also replace it with something like "Infinite loop detected here, const evaluation will never terminate." But it's hard to make clear why this particular statement is being singled out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't really figure out which part is at fault. I think there's no nice solution except explain more things in the message.
Well... the environment variables can be changed. So if you're writing a loop by increasing the value inside an environment variable, then you can have a loop where all state in the main memory is the same.
Uhm.... because that's a refactoring mess-up (by me) ;) Just remove that call in aggregate creation and the |
@oli-obk You got it! I saw the comment: /// Miri does not expose env vars from the host to the emulated program And thought that meant that no environment variables would be exposed to the emulated program, but I see now that MIRI emulates environment variables. I'm not sure what all the cases are for MIRI, but it seems like infinite loop detection is only really desirable in CTFE. Do you think I should add a way for MIRI to enable/disable this feature? |
You mean because it might end up being slow? Miri can just set the counter to |
f2b2fad
to
54a9e3a
Compare
LL | | let mut n = 113383; // #20 in A006884 | ||
LL | | while n != 0 { //~ ERROR constant contains unimplemented expression type | ||
LL | | n = if n % 2 == 0 { n/2 } else { 3*n + 1 }; | ||
| | ---------- duplicate interpreter state observed while executing this expression, const evaluation will never terminate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this label is too long.
Maybe s/while executing this expression/here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, that's better
let _ = [(); { | ||
//~^ WARNING Constant evaluating a complex constant, this might take some time | ||
//~| ERROR could not evaluate repeat length | ||
let mut n = 113383; // #20 in A006884 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is A006884
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://oeis.org/A006884. Is this too cute? I wanted something that was more complex than just loop{}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion this is a fine example. Could you add the link to the comment though?
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
0bcba33
to
82b704b
Compare
Okay, I implemented the required traits in Lemme know what you think! |
) -> EvalResult<'tcx> { | ||
) -> EvalResult<'tcx> | ||
where M: Clone + Eq + Hash, | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add this to Machine
itself. Eg:
trait Machine: Clone + Eq + Hash {
[...]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems fine, Machine
isn't really used for anything but this.
☔ The latest upstream changes (presumably #51428) made this pull request unmergeable. Please resolve the merge conflicts. |
However, it is possible to diverge without ever repeating a state -- if the program just has larger and larger states. So you are not reliably detecting all cases of programs never terminating, you just detect some cases. (Which you may be aware of, but the way you wrote this ["we can detect if a program will never terminate"] made it sound like you are claiming more than this can possibly do. :D ) |
We do detect nontermination. It might take us forever or the system ooms before we do, but in all other cases, at some point we've seen all possible States. Oom also happens faster due to the hashset building up. On a more serious note: as long as we detect simple nontermination it's fine. The complex cases will still produce a warning, so users know something's off. Rustc already isn't guaranteed to terminate in it's typesystem and macro system, so... |
49910b6
to
a183a35
Compare
I changed the PR description to "we can detect if a program is in a state where it will never terminate" to avoid accusations that I've claimed to solve the halting problem. |
@rust-timer build cf47daa968f12c4f95111ce03aaf02b734cc78d7 Also added you to permissions. |
Success: Queued cf47daa968f12c4f95111ce03aaf02b734cc78d7 with parent 4af9132, comparison URL. |
Perf is now ready. The timing changes are... mixed 😕 (inflate clean-opt @ +2.4% and syn clean-opt @ -1.0% are the two extremes). |
Probably just noise, those seem to be fairly normal measurements for most noop builds. |
We can always bump the threshold to cause the loop checker to get triggered later |
I recorded the benchmark locally with |
@bors r+ We do emit a warning if things start going slow, so at least ppl will know what's going on. |
📌 Commit cf5eaa7 has been approved by |
⌛ Testing commit cf5eaa7 with merge d6364f49d822ba1c5076ac649327c048bc34a96e... |
💔 Test failed - status-travis |
This comment has been minimized.
This comment has been minimized.
1 similar comment
This comment has been minimized.
This comment has been minimized.
@bors retry |
Infinite loop detection for const evaluation Resolves #50637. An `EvalContext` stores the transient state (stack, heap, etc.) of the MIRI virtual machine while it executing code. As long as MIRI only executes pure functions, we can detect if a program is in a state where it will never terminate by periodically taking a "snapshot" of this transient state and comparing it to previous ones. If any two states are exactly equal, the machine must be in an infinite loop. Instead of fully cloning a snapshot every time the detector is run, we store a snapshot's hash. Only when a hash collision occurs do we fully clone the interpreter state. Future snapshots which cause a collision will be compared against this clone, causing the interpreter to abort if they are equal. At the moment, snapshots are not taken until MIRI has progressed a certain amount. After this threshold, snapshots are taken every `DETECTOR_SNAPSHOT_PERIOD` steps. This means that an infinite loop with period `P` will be detected after a maximum of `2 * P * DETECTOR_SNAPSHOT_PERIOD` interpreter steps. The factor of 2 arises because we only clone a snapshot after it causes a hash collision.
☀️ Test successful - status-appveyor, status-travis |
Fix issue #52475: Make loop detector only consider reachable memory As [suggested](#51702 (comment)) by @oli-obk `alloc_id`s should be ignored by traversing all `Allocation`s in interpreter memory at a given moment in time, beginning by `ByRef` locals in the stack. - [x] Generalize the implementation of `Hash` for `EvalSnapshot` to traverse `Allocation`s - [x] Generalize the implementation of `PartialEq` for `EvalSnapshot` to traverse `Allocation`s - [x] Commit regression tests Fixes #52626 Fixes #52849
Resolves #50637.
An
EvalContext
stores the transient state (stack, heap, etc.) of the MIRI virtual machine while it executing code. As long as MIRI only executes pure functions, we can detect if a program is in a state where it will never terminate by periodically taking a "snapshot" of this transient state and comparing it to previous ones. If any two states are exactly equal, the machine must be in an infinite loop.Instead of fully cloning a snapshot every time the detector is run, we store a snapshot's hash. Only when a hash collision occurs do we fully clone the interpreter state. Future snapshots which cause a collision will be compared against this clone, causing the interpreter to abort if they are equal.
At the moment, snapshots are not taken until MIRI has progressed a certain amount. After this threshold, snapshots are taken every
DETECTOR_SNAPSHOT_PERIOD
steps. This means that an infinite loop with periodP
will be detected after a maximum of2 * P * DETECTOR_SNAPSHOT_PERIOD
interpreter steps. The factor of 2 arises because we only clone a snapshot after it causes a hash collision.