Try to fix attestation and measurements bugs #342

L0czek · 2024-07-01T16:06:58Z

This PR tries to resolve issues with tests regarding measurements and attestation. Currently, the results are as follows:

	ACS TEST SUIT	PASS/FAIL
1	measurement_immutable_rim	PASS
2	measurement_initial_rem_is_zero	PASS
3	measurement_rim_order	PASS
4	attestation_token_verify	PASS
5	attestation_rpv_value	PASS
6	attestation_challenge_data_verification	PASS
7	attestation_token_init	PASS
8	attestation_realm_measurement_type	PASS
9	attestation_platform_challenge_size	PASS
10	attestation_rem_extend_check	PASS
11	attestation_rem_extend_check_realm_token	PASS
12	attestation_rec_exit_irq	!@???##&?

Legend:

!@???##&? - This test passes only when I run it without others (using the --selected-tests option). As of now, I wasn't able to debug why the interrupt doesn't trigger when the entire test suite is running. Unfortunately, there are more problems with this test, which will be discussed later.

Encountered issues

Some simple mistakes (these were fixed and doesn't require further discussion)

In ATTEST_TOKEN_INIT the challenge was read from wrong registers.
During the attestation token generation, the Realm Personalization Value was stubbed.

Using improperly initialized memory

Upon Realm's REC creation in the rmi REC_CREATE handler Islet does the following:

        let mut rec_granule = get_granule_if!(rec, GranuleState::Delegated)?;
        #[cfg(not(kani))]
        // `page_table` is currently not reachable in model checking harnesses
        rmm.page_table.map(rec, true);
        let mut rec = rec_granule.content_mut::<Rec<'_>>()?;

It uses the SafetyAssured and SafetyAssumed traits to basically convert a raw pointer to a Rust reference. Later, the rec object is initialized by a safe function Rec::init:

        match prepare_args(&mut rd, params.mpidr) {
            Ok((vcpuid, vttbr, vmpidr)) => {
                ret[1] = vcpuid;
                rec.init(owner, vcpuid, params.flags, vttbr, vmpidr)?;
            }
            Err(_) => return Err(Error::RmiErrorInput),
        }

Inside this Rec::init function, someone forgot to initialize the attest_state field. This is of course a simple mistake, but I think that it was caused by a bigger issue. I understand that Rust doesn't yet have any syntax allowing to construct an object in-place by providing some pointer. The current solution implemented in Islet is to cast the raw pointer to Rust reference and then call a some init function which assumes that it is operating on a valid object. That's why it is possible to forget about some fields in a constructor. I think we should discuss better ways of handling this issue, as either the Rec::init function should be made unsafe as this initialization is semantically unsafe, or we should design a better mechanism. Naturally, marking the init function as unsafe will not solve this problem, but it will be a warning that here Rust cannot ensure complete safety. The other solution might be to implement a mechanism similar to how the Box pointer can be created using a custom memory allocator. On the high level, this ensures that the created object will be placed in a memory region controlled by the provided allocator. In this scenario, the object is first created on the stack using the Rust typical struct creation syntax and is the moved to the allocated memory region. To minimize performance impact, Rust compiles relies on the LLVM framework to perform Return Value optimization. This should initialize the object directly in the destination, instead of copying it. In a nutshell, we could implement something like this:

    /// Allocates memory in the given allocator then places `x` into it.
    ///
    /// This doesn't actually allocate if `T` is zero-sized.
    ///
    /// # Examples
    ///
    /// ```
    /// #![feature(allocator_api)]
    ///
    /// use std::alloc::System;
    ///
    /// let five = Box::new_in(5, System);
    /// ```
    #[cfg(not(no_global_oom_handling))]
    #[unstable(feature = "allocator_api", issue = "32838")]
    #[must_use]
    #[inline]
    pub fn new_in(x: T, alloc: A) -> Self
    where
        A: Allocator,
    {
        let mut boxed = Self::new_uninit_in(alloc);
        unsafe {
            boxed.as_mut_ptr().write(x);
            boxed.assume_init()
        }
    }

I quickly patched this issue by initializing the attest_state field inside the Rec::init. Nevertheless, I still think that we should discuss how to rework the Recs, and Rds initialization as this defeats the point of using Rust and really starts looking like C or C++.

The `!@???##&?` test

The attestation_rec_exit_irq attempts to test whether reading the attestation token from rmm can be interrupted by an IRQ. Since, the IRQ is a lower level exception, it cannot interrupt the control flow of Islet. As a result, it needs to be actively checked during RSI handling. For example, the TF-rmm does this as follows:

while (attest_data->token_sign_ctx.state == ATTEST_SIGN_IN_PROGRESS) {
	attest_token_continue_sign_state(attest_data, res);
	if (check_pending_irq()) {
		res->action = UPDATE_REC_EXIT_TO_HOST;
		rec_exit->exit_reason = RMI_EXIT_IRQ;
		return;
	}
}

In Islet this is currently impossible to recreate.

Issue 1

pub fn get_token(challenge: &[u8], measurements: &[Measurement], hash_algo: u8) -> Vec<u8> {
    // TODO: consider storing attestation object somewhere,
    // as RAK and token do not change during rmm lifetime.
    Attestation::new(&plat_token(), &realm_attest_key()).create_attestation_token(
        challenge,
        measurements,
        hash_algo,
    )
}

... and ..


fn get_token_part(
    rd: &Rd,
    context: &mut Rec<'_>,
    size: usize,
) -> core::result::Result<(Vec<u8>, usize), Error> {
    let hash_algo = rd.hash_algo();
    let measurements = rd.measurements;

    // FIXME: This should be stored instead of generating it for every call.
    let token =
        crate::rsi::attestation::get_token(context.attest_challenge(), &measurements, hash_algo);

To make this test work as ARM has intended, we should fix these two TODOs, as these issues are a performance nightmare. When the creation of the realm token will be interruptible by IRQs we cannot regenerate the token every time the Realm asks for it.

Issue 2

    fn sign(secret_key: p384::SecretKey, data: &[u8]) -> Vec<u8> {
        let signing_key = p384::ecdsa::SigningKey::from_bytes(&secret_key.to_bytes())
            .expect("Failed to generate signing key");

        let signature: p384::ecdsa::Signature = signing_key
            .try_sign(data)
            .expect("Failed to create P384 signature");
        signature.to_vec()
    }

This code is responsible for signing the attestation token. We must rewrite it to a "init, update, finish" type API to implement an IRQ checking loop similar to how TF-rmm does it.

DUCT TAPE solution

To make this test working, I applied the following makeshift:

        {
            let (token_part, token_left) = get_token_part(&rd, rec, buffer_size)?;

            if is_irq_pending() {
                error!("IRQ is pending while fetching token");
                set_reg(rec, 0, INCOMPLETE)?;
                set_reg(rec, 1, 0)?;
                ret[0] = rmi::SUCCESS_REC_ENTER;
                return Ok(());
            }

            unsafe {
                let pa_ptr = attest_pa as *mut u8;
                core::ptr::copy(token_part.as_ptr(), pa_ptr.add(pa_offset), token_part.len());
            }
            if token_left == 0 {
                set_reg(rec, 0, SUCCESS)?;
                rec.set_attest_state(RmmRecAttestState::NoAttestInProgress);
            } else {
                set_reg(rec, 0, INCOMPLETE)?;
            }
            set_reg(rec, 1, token_part.len())?;
        }

The helper function:

pub fn is_irq_pending() -> bool {
    let val: u64;

    unsafe {
        core::arch::asm!(
            "mrs {}, ISR_EL1",
            out(reg) val
        )
    }

    return val != 0;
}

Thanks to this, the test passes, but for some reason when I run the entire test suite it fails. Besides that, this solution is a performance nightmare as it generated the token and then checks for IRQ and discards the token if an IRQ is pending.

Summary

This PR in its current state provides fixes and some makeshifts solutions that are necessary for the application provisioning to work. Naturally, the last test is not essential, as it only tests the performance and interrupt handling responsiveness. Nevertheless, we should fix it as will improve performance even on fvp emulator.

bitboom · 2024-07-02T05:02:52Z

Inside this Rec::init function, someone forgot to initialize the attest_state field. This is of course a simple mistake, but I think that it was caused by a bigger issue. I understand that Rust doesn't yet have any syntax allowing to construct an object in-place by providing some pointer. The current solution implemented in Islet is to cast the raw pointer to Rust reference and then call a some init function which assumes that it is operating on a valid object. That's why it is possible to forget about some fields in a constructor. I think we should discuss better ways of handling this issue, as either the Rec::init function should be made unsafe as this initialization is semantically unsafe, or we should design a better mechanism. Naturally, marking the init function as unsafe will not solve this problem, but it will be a warning that here Rust cannot ensure complete safety.

First, thank you for raising this issue. 👍

Overall, I very much agree with your opinion.
I also agree with making Islet a safer Rust project rather than using C or C++.

I agree with making the Rec::init function unsafe. However, is it enough to make only the init function unsafe? How can we prevent mistakenly calling rec without initializing it through init? Perhaps we should mark all cases where we receive a pointer from the host and cast it to an object as unsafe.

In Islet v1.0, Based on expressions, the ratio of unsafe code is 21%. (refer to here).
While it is important to precisely mark unsafe keywords in the current code
and decide whether to mark them based on the appropriate call paths, I also believe that unsafe should not lower development performance in a fast-paced development environment.

Therefore, I prefer the latter part of your suggestion, which involves design improvements.

Additionally, I believe that we should actively use unsafe analysis tools. Using unsafe can cause other side effects.

In the above example, the developer can set the state appropriately within the unsafe fn init function,
but can also set an invalid value. Since it is unsafe, it allows that.

In this case, using MIRI, which I have been applying, can detect such issues.

Below is an example mocked and verified using MIRI.

#[repr(u8)]
#[derive(Debug)]
enum State {
    Run = 1,
    Wait = 2,
}

#[derive(Debug)]
struct Rec {
    state: State,
}

fn main() {
    let mut a = Rec { state: State::Run };
    println!("Valid? => {:?}", a);

    let raw = std::ptr::addr_of_mut!(a.state);
    unsafe {
        let raw = raw as *mut u8;
        *raw = 0xff;
    }

    println!("Really? => {:?}", a);
}

When we run the above, the State ends up as Wait, but this is not the result we intended.

$ cargo run
Valid? => Rec { state: Run }
Really? => Rec { state: Wait }

MIRI can detect this.

$ cargo miri run
Valid? => Rec { state: Run }
error: Undefined Behavior: enum value has invalid tag: 0xff
  --> src/main.rs:2:10

Currently, I am integrating MIRI into Islet #341. I plan to verify areas where unsafe is used. Also adding it to the CI so that other developers can easily verify it.

zpzigi754 · 2024-07-02T05:25:18Z

Really nice to bring up all these issues in detail! I think that it would be good to post these on an issue page as well in addition to the current PR description. I've also encountered that non-initialized fields in REC such as psci_pending and host_call_pending, which are missed in rec init, could cause a problem.

jinbpark · 2024-07-03T23:26:20Z

Great description-!

The other solution might be to implement a mechanism similar to how the Box pointer can be created using a custom memory allocator.

I like this idea. I'm not sure if there is another solution (I think there might be) that can achieve the same goal, making a safe init function for Rec, Rd, or whatever having similar strict memory requirements.

@bitboom , can we have "Safety Trait" handle this initialization stuff? (there might be some difficulty, e.g., deciding when to initialize and when to not) It sounds like "Safety Trait" might be a proper place to do that. But, we might need to spend some time checking if it technically makes sense-

bitboom · 2024-07-04T01:42:02Z

@L0czek , @jinbpark

I wrote a draft PR #344 to discuss this.

Previously, safe-abstraction crates provides to parse an instance from a given address assume_safe.
This draft adds assume_safe_uninit_with to initialize an instance at a given address.

This approach has been applied to create instances from granules. Let me know if you have any feedback and suggestions.

bitboom

Nice work and description!
(Please apply cargo fmt to pass CI.)

nook1208 · 2024-07-04T04:40:46Z

Thank you for the good description !

The current solution implemented in Islet is to cast the raw pointer to Rust reference and then call a some init function which assumes that it is operating on a valid object. That's why it is possible to forget about some fields in a constructor.

I agree that it's easy to forget some fields when intializing it. Because most of memory like Rec are allocated from the host, it's hard to force that all of the fields should be non-null when they're created.

Another way to handle this is to have the init function return a structure with all fields set like @bitboom did in #344

    pub fn new() -> Self {
        Self {
            context: Context::new(),
            attest_state: RmmRecAttestState::NoAttestInProgress,
            attest_challenge: [0; MAX_CHALLENGE_SIZE],
            attest_token_offset: 0,
            owner: OnceCell::new(),
            vcpuid: 0,
            runnable: false,
            psci_pending: false,
            state: State::Ready,
            ripas: Ripas {
                start: 0,
                end: 0,
                addr: 0,
                state: 0,
                flags: 0,
            },
            vtcr: 0,
            host_call_pending: false,
        }
    }
...
let mut rec = rec_granule.new_uninit_with::<Rec<'_>>(Rec::new())?;

It would be nice if there is a way to always guarantee this rule, but unfortunately I can't think of any right now.. :(

Signed-off-by: Michał Szaknis <[email protected]>

L0czek · 2024-07-05T10:34:56Z

I repushed the commits after formatting with cargo in case you want to merge it. On our side, we can start implementing the application provisioning on host. Most likely, we will need the entire stack with RMM somewhere in August / September to start integrating the developed components. As a result, you can take your time fixing stuff the proper way. Nevertheless, if you want to merge this PR feel free to do so. I such case I will create an issue and copy the description there so we can continue the discussion.

@bitboom I saw your draft, it looks promising. I will review it when you're ready (ping me somewhere).

L0czek force-pushed the dev/m.szaknis/fix_eac5_api_errors branch from e2f6cd9 to 4bbe2dd Compare July 3, 2024 08:40

L0czek marked this pull request as ready for review July 3, 2024 08:41

bitboom mentioned this pull request Jul 4, 2024

Safe instance creation at given memory address #344

Merged

bitboom approved these changes Jul 4, 2024

View reviewed changes

L0czek force-pushed the dev/m.szaknis/fix_eac5_api_errors branch 2 times, most recently from 9dc8fce to 37479ac Compare July 5, 2024 10:09

L0czek added 2 commits July 5, 2024 12:14

Patch api errors

edf99c2

Signed-off-by: Michał Szaknis <[email protected]>

Use DUCT-TAPE to fix irq test

fd2da19

Signed-off-by: Michał Szaknis <[email protected]>

L0czek force-pushed the dev/m.szaknis/fix_eac5_api_errors branch from 37479ac to fd2da19 Compare July 5, 2024 10:14

bokdeuk-jeong merged commit 2e0f9ff into main Jul 16, 2024
8 checks passed

bokdeuk-jeong deleted the dev/m.szaknis/fix_eac5_api_errors branch July 16, 2024 11:14

L0czek mentioned this pull request Jul 16, 2024

Issues with safety traits and interrupting attestation #352

Open

p-sawicki2 mentioned this pull request Jul 17, 2024

RMM v1.0-eac5 support #291

Closed

L0czek mentioned this pull request Jul 18, 2024

realm measurement using veraison #350

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Try to fix attestation and measurements bugs #342

Try to fix attestation and measurements bugs #342

L0czek commented Jul 1, 2024

bitboom commented Jul 2, 2024 •

edited

Loading

zpzigi754 commented Jul 2, 2024

jinbpark commented Jul 3, 2024

bitboom commented Jul 4, 2024

bitboom left a comment

nook1208 commented Jul 4, 2024

L0czek commented Jul 5, 2024

Try to fix attestation and measurements bugs #342

Try to fix attestation and measurements bugs #342

Conversation

L0czek commented Jul 1, 2024

Encountered issues

Some simple mistakes (these were fixed and doesn't require further discussion)

Using improperly initialized memory

The !@???##&? test

Issue 1

Issue 2

DUCT TAPE solution

Summary

bitboom commented Jul 2, 2024 • edited Loading

zpzigi754 commented Jul 2, 2024

jinbpark commented Jul 3, 2024

bitboom commented Jul 4, 2024

bitboom left a comment

Choose a reason for hiding this comment

nook1208 commented Jul 4, 2024

L0czek commented Jul 5, 2024

The `!@???##&?` test

bitboom commented Jul 2, 2024 •

edited

Loading