Refactor `validate_scalar` to take advantage of type information #53826

oli-obk · 2018-08-30T15:26:12Z

Currently we have no way of detecting that

#![feature(const_raw_ptr_deref)]

const X: &i32 = unsafe { &*(8 as *const i32) };

fn main() {}

is UB (an integer is not a valid safe address at compile-time as you can safely dereference it in another constant and then you'd get an error and UB if done at runtime.

The basic idea is to take the function at

rust/src/librustc_mir/interpret/validity.rs

Line 99 in 0e98621

fn validate_scalar(

and split it into two parts

validate_scalar_by_type which first matches on the scalar's type and then decides how to operate on it
- the existing match should basically be pulled out and the correctness checks happen after one knows which type one is operating on
- new arm: ty::Ref can just use to_ptr() on the value and convert the error into a validation_failure! error (see how this is done elsewhere in the same file)
  - do the pointer recursion only here
validate_scalar_by_layout which pretty much does everything else that the current validate_scalar does, minus the type checks and pointer recursion.
run validate_scalar_by_layout on every scalar (maybe here?), and not just on leaf fields. This is necessary to catch const FOO: NonZeroU8 = unsafe { NonZeroU8::new_unchecked(0) }; because right now we're just checking the field of NonZeroU8, which is u8 and thus fine to be 0.

cc @RalfJung

The text was updated successfully, but these errors were encountered:

RalfJung · 2018-08-30T17:29:28Z

the existing match

(which is for Char only)

new arm: ty::Ref can just use to_ptr() on the value and convert the error into a validation_failure! error

Not exactly. First we have to do memory.check_align on the scalar. Then, if the pointee type is a ZST we are done. And only then (for non-ZST) we can call to_ptr().

run validate_scalar_by_layout on every scalar

Also figure out a way to maybe not do redundant checks...

For example, currently I think we have the invariant that the layout only can get more permissive as we descend into fields, right? (@eddyb)
If that is true, we could add a parameter to validate_operand saying that the Scalar has already been validated so we do not have to do that again. (This might be Option<ZeroSizedPrivateToken> to be sure :D )

shivanth · 2018-09-05T13:18:12Z

Can I take this up (Because it has an E-mentor Tab 😄 )?

oli-obk · 2018-09-05T13:33:09Z

sure!

As a first step I think it would be prudent to split the function into a Ptr checking function and a Bits checking function, and have the current function just forward to either of those two

As a second step you can then make the outer function do a match on the type instead of the two matches that are now inside the ptr/bits functions.
This requires some untangling of when to actually call the inner function and when to error.

RalfJung · 2018-09-05T14:05:07Z

As a first step I think it would be prudent to split the function into a Ptr checking function and a Bits checking function, and have the current function just forward to either of those two

I think we should dispatch on the type first, not on the value. We want to validate the value, and hence not use it to decide which codepath to take.

oli-obk · 2018-09-05T14:14:08Z

The refactoring is easier if you have a function to call for specific values instead of having to do both refactorings at once. At some point you'll want to execute the bits checking code or the ptr checking code. And these points will be in separate match arms in the type match.

RalfJung · 2018-09-05T14:50:17Z

Not sure what you mean, but the "specific value" is a Scalar.

This is essentially implementing the validity invariant (and then some more stuff for const safety, but that's most a detail), which is primarily a type-based property. Ideally, the code is readable as almost a definition of validity. Any deviation from an outermost match on the type will make it less readable.

I begrudgingly accepted validate_by_layout. I think it'd actually better to treat NonNull* as primitive types and replicate this part of the logic from layout computation (it's not like that is very complicated logic, and for all intents and purposes as far as UB and language semantics are concerned, lang items are primitive types): The validity check is the definition that makes it UB to ever violate this NonNull; layout then is the part of the compiler that exploits that definition. But well.

However, for bits vs. pointers, I just see no good reason to deviate even further from what I see as the ideal structure. What do we get for this loss in readability? All the types that require bits just call to_bits()?, done (wrapped in some macro for nicer error messages; the current code already shows the need for such a macro).

RalfJung · 2018-10-02T19:22:10Z

I'm afraid I already did this as part of an upcoming PR, because I needed to change validation somewhat anyway... sorry @shivanth :/

@oli-obk

Prepare miri engine for enforcing validity invariant during execution In particular, make recursive checking of references optional, and add a `const_mode` parameter that says whether `usize` is allowed to contain a pointer. Also refactor validation a bit to be type-driven at the "leafs" (primitive types), and separately validate scalar layout to catch `NonNull` violations (which it did not properly validate before). Fixes #53826 Also fixes #54751 r? @oli-obk

@oli-obk

Prepare miri engine for enforcing validity invariant during execution In particular, make recursive checking of references optional, and add a `const_mode` parameter that says whether `usize` is allowed to contain a pointer. Also refactor validation a bit to be type-driven at the "leafs" (primitive types), and separately validate scalar layout to catch `NonNull` violations (which it did not properly validate before). Fixes #53826 Also fixes #54751 r? @oli-obk

oli-obk added E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. A-const-eval Area: Constant evaluation (MIR interpretation) labels Aug 30, 2018

oli-obk mentioned this issue Aug 31, 2018

Optimize miri engine performance for sanity checks of arrays #53845

Closed

RalfJung self-assigned this Oct 2, 2018

RalfJung mentioned this issue Oct 2, 2018

Prepare miri engine for enforcing validity invariant during execution #54762

Merged

bors closed this as completed in #54762 Oct 9, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor `validate_scalar` to take advantage of type information #53826

Refactor `validate_scalar` to take advantage of type information #53826

oli-obk commented Aug 30, 2018

RalfJung commented Aug 30, 2018

shivanth commented Sep 5, 2018

oli-obk commented Sep 5, 2018

RalfJung commented Sep 5, 2018

oli-obk commented Sep 5, 2018

RalfJung commented Sep 5, 2018 •

edited

Loading

RalfJung commented Oct 2, 2018

Refactor validate_scalar to take advantage of type information #53826

Refactor validate_scalar to take advantage of type information #53826

Comments

oli-obk commented Aug 30, 2018

RalfJung commented Aug 30, 2018

shivanth commented Sep 5, 2018

oli-obk commented Sep 5, 2018

RalfJung commented Sep 5, 2018

oli-obk commented Sep 5, 2018

RalfJung commented Sep 5, 2018 • edited Loading

RalfJung commented Oct 2, 2018

Refactor `validate_scalar` to take advantage of type information #53826

Refactor `validate_scalar` to take advantage of type information #53826

RalfJung commented Sep 5, 2018 •

edited

Loading