Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue: Automatically Verify Memory Initialization #3300

Open
7 of 14 tasks
artemagvanian opened this issue Jun 27, 2024 · 5 comments
Open
7 of 14 tasks

Tracking Issue: Automatically Verify Memory Initialization #3300

artemagvanian opened this issue Jun 27, 2024 · 5 comments
Assignees
Labels
[C] Feature / Enhancement A new feature request or enhancement to an existing feature. T-TrackingIssue Issues used to track a large amount of work related to a feature

Comments

@artemagvanian
Copy link
Contributor

artemagvanian commented Jun 27, 2024

Goal: Implement automated and efficient checks for memory initialization in Kani.

Motivation

Safe Rust guarantees that every variable is initialized, and every reference points to a properly allocated and initialized memory position. The same guarantees do not apply to raw pointers, transmute operations and unions, which are available in unsafe Rust. The safe usage of these constructs and operations must be upheld by the Rust developer.

Failures to uphold these safety properties can have unwanted consequences: it can affect the application behavior and even result in the application exposing critical data.

Kani should be able to detect uninitialized memory accesses automatically and efficiently.

Examples

Currently, Kani does not catch the following cases of UB due to accessing uninitialized memory:

#[repr(C)]
struct S(u32, u8);

#[kani::proof]
fn check_uninit_padding() {
    let s = S(0, 0);
    let ptr: *const u8 = addr_of!(s) as *const u8;
    let padding = unsafe { *(ptr.add(5)) }; // ~ERROR: padding bytes are uninitialized, so reading them is UB.
}
#[kani::proof]
fn check_vec_read_uninit() {
    let v: Vec<u8> = Vec::with_capacity(10);
    let uninit = unsafe { *v.as_ptr().add(5) }; // ~ERROR: reading from unitialized memory is UB.
}

Tasks

@artemagvanian artemagvanian added [C] Feature / Enhancement A new feature request or enhancement to an existing feature. T-TrackingIssue Issues used to track a large amount of work related to a feature labels Jun 27, 2024
@artemagvanian artemagvanian self-assigned this Jun 27, 2024
@zhassan-aws
Copy link
Contributor

// ~ERROR: padding bytes are uninitialized, so reading them is UB.

The Rust reference mentions that reading padding bytes is not UB:

In other words, the only cases in which reading uninitialized memory is permitted are inside unions and in "padding" (the gaps between the fields/elements of a type).

@artemagvanian
Copy link
Contributor Author

@zhassan-aws thanks for pointing that out -- I interpret the Rust reference a bit differently in this case.

I take the words you quoted to mean that it is indeed not UB to read padding bytes and treat them as padding bytes. I.e. reading a mix of data bytes and padding bytes and then storing it in a struct with an appropriate layout is not UB. However, the example above reads padding bytes into the type without valid value restrictions, which should be UB.

@zhassan-aws
Copy link
Contributor

I see. Can you post the full example (e.g. the definition of S)? We can check what MIRI says.

@artemagvanian
Copy link
Contributor Author

Updated the example above, sorry for that!

Here is the output from MIRI:

error: Undefined Behavior: using uninitialized data, but this operation requires initialized memory
 --> src/main.rs:9:28
  |
9 |     let padding = unsafe { *(ptr.add(5)) }; // ~ERROR: padding bytes are uninitialized, so reading them is UB.
  |                            ^^^^^^^^^^^^^ using uninitialized data, but this operation requires initialized memory
  |
  = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
  = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
  = note: BACKTRACE:
  = note: inside `main` at src/main.rs:9:28: 9:41

@zhassan-aws
Copy link
Contributor

Got it. Thanks for the clarification.

artemagvanian added a commit that referenced this issue Jul 2, 2024
This PR enables automatic memory initialization proofs for raw pointers
in Kani. This is done without any extra instrumentation from the user.

Currently, due to high memory consumption and only partial support of
pointee types for which memory initialization proofs work, this feature
is gated behind `-Z uninit-checks` flag. Note that because it uses
shadow memory under the hood, programs using this feature need to pass
`-Z ghost-state` flag as well.

This PR is a part of working towards #3300.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.

---------

Co-authored-by: Celina G. Val <[email protected]>
artemagvanian added a commit that referenced this issue Jul 12, 2024
…cks (#3313)

This is a follow-up PR for #3264, a part of the larger work towards
#3300.

This PR introduces:

- Use of demonic non-determinism (prophecy variables) to improve memory
initialization checks performance;
- Instead of keeping track of memory initialization of each possibly
uninitialized byte pointed to by some pointer, one is chosen
non-deterministically;
- Tests for proper memory initialization checks injection for compiler
intrinsics;
- Separate functions for checking memory initialization of slice chunks.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.

---------

Co-authored-by: Zyad Hassan <[email protected]>
artemagvanian added a commit that referenced this issue Jul 19, 2024
… reinstrumentation (#3344)

Recently added memory initialization checks (see #3300 for an overview)
code suffers from re-instrumentation due to automatically added pointer
checks to the instrumentation code.

This PR adds an internal `kanitool::disable_checks` attribute to disable
automatically generated CBMC checks inside internal instrumentation.
This, in turn, relies on CBMC pragmas
(https://www.cprover.org/cprover-manual/properties/#usinggotoinstrument).

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.
github-merge-queue bot pushed a commit that referenced this issue Aug 5, 2024
)

As #3324 mentioned, delayed UB triggered by accessing uninitialized
memory was previously mitigated by injecting `assert!(false)` for every
possible source of it; this PR adds all the necessary infrastructure and
minimal support for detecting it properly.

The process of detecting and instrumenting places is as follows:
- Find all sources of delayed uninitialized memory UB (mutable pointer
casts with different pointee padding and copy intrinsics);
- Compute conservative aliasing graph between all memory places
reachable from each harness;
- Instrument all places pointed to by each source we found in step 1.

Not all language constructs are currently supported -- see the tracking
issue (#3300) for the full list of limitations.

Resolves #3324

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.

---------

Co-authored-by: Celina G. Val <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue Aug 27, 2024
This PR introduces very basic support for memory initialization checks
for unions.

As of now, the following is supported:
- Unions can be created
- Union fields can be assigned to
- Union fields can be read from
- Unions can be assigned directly to another union

For more information about planned functionality, see #3300 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.
github-merge-queue bot pushed a commit that referenced this issue Sep 4, 2024
This PR introduces support for memory initialization checks for unions
passed across the function boundary.

Whenever a union is passed as an argument, we need to make sure that its
initialization state is preserved. Unlike pointers, unions do not have a
stable memory address which could identify them in shadow memory. Hence,
we need to pass extra information across function boundary since unions
are passed “by value”.

We introduce a global variable to store the previous address of unions
passed as function arguments, which allows us to effectively tie the
initialization state of unions passed between functions. This struct is
written to by the caller and read from by the callee.

For more information about planned functionality, see
#3300

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.
@carolynzech carolynzech self-assigned this Oct 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[C] Feature / Enhancement A new feature request or enhancement to an existing feature. T-TrackingIssue Issues used to track a large amount of work related to a feature
Projects
None yet
Development

No branches or pull requests

3 participants