Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to fix attestation and measurements bugs #342

Merged
merged 2 commits into from
Jul 16, 2024

Conversation

L0czek
Copy link
Collaborator

@L0czek L0czek commented Jul 1, 2024

This PR tries to resolve issues with tests regarding measurements and attestation. Currently, the results are as follows:

ACS TEST SUIT PASS/FAIL
1 measurement_immutable_rim PASS
2 measurement_initial_rem_is_zero PASS
3 measurement_rim_order PASS
4 attestation_token_verify PASS
5 attestation_rpv_value PASS
6 attestation_challenge_data_verification PASS
7 attestation_token_init PASS
8 attestation_realm_measurement_type PASS
9 attestation_platform_challenge_size PASS
10 attestation_rem_extend_check PASS
11 attestation_rem_extend_check_realm_token PASS
12 attestation_rec_exit_irq !@???##&?

Legend:

  • !@???##&? - This test passes only when I run it without others (using the --selected-tests option). As of now, I wasn't able to debug why the interrupt doesn't trigger when the entire test suite is running. Unfortunately, there are more problems with this test, which will be discussed later.

Encountered issues

Some simple mistakes (these were fixed and doesn't require further discussion)

  • In ATTEST_TOKEN_INIT the challenge was read from wrong registers.
  • During the attestation token generation, the Realm Personalization Value was stubbed.

Using improperly initialized memory

Upon Realm's REC creation in the rmi REC_CREATE handler Islet does the following:

        let mut rec_granule = get_granule_if!(rec, GranuleState::Delegated)?;
        #[cfg(not(kani))]
        // `page_table` is currently not reachable in model checking harnesses
        rmm.page_table.map(rec, true);
        let mut rec = rec_granule.content_mut::<Rec<'_>>()?;

It uses the SafetyAssured and SafetyAssumed traits to basically convert a raw pointer to a Rust reference. Later, the rec object is initialized by a safe function Rec::init:

        match prepare_args(&mut rd, params.mpidr) {
            Ok((vcpuid, vttbr, vmpidr)) => {
                ret[1] = vcpuid;
                rec.init(owner, vcpuid, params.flags, vttbr, vmpidr)?;
            }
            Err(_) => return Err(Error::RmiErrorInput),
        }

Inside this Rec::init function, someone forgot to initialize the attest_state field. This is of course a simple mistake, but I think that it was caused by a bigger issue. I understand that Rust doesn't yet have any syntax allowing to construct an object in-place by providing some pointer. The current solution implemented in Islet is to cast the raw pointer to Rust reference and then call a some init function which assumes that it is operating on a valid object. That's why it is possible to forget about some fields in a constructor. I think we should discuss better ways of handling this issue, as either the Rec::init function should be made unsafe as this initialization is semantically unsafe, or we should design a better mechanism. Naturally, marking the init function as unsafe will not solve this problem, but it will be a warning that here Rust cannot ensure complete safety. The other solution might be to implement a mechanism similar to how the Box pointer can be created using a custom memory allocator. On the high level, this ensures that the created object will be placed in a memory region controlled by the provided allocator. In this scenario, the object is first created on the stack using the Rust typical struct creation syntax and is the moved to the allocated memory region. To minimize performance impact, Rust compiles relies on the LLVM framework to perform Return Value optimization. This should initialize the object directly in the destination, instead of copying it. In a nutshell, we could implement something like this:

    /// Allocates memory in the given allocator then places `x` into it.
    ///
    /// This doesn't actually allocate if `T` is zero-sized.
    ///
    /// # Examples
    ///
    /// ```
    /// #![feature(allocator_api)]
    ///
    /// use std::alloc::System;
    ///
    /// let five = Box::new_in(5, System);
    /// ```
    #[cfg(not(no_global_oom_handling))]
    #[unstable(feature = "allocator_api", issue = "32838")]
    #[must_use]
    #[inline]
    pub fn new_in(x: T, alloc: A) -> Self
    where
        A: Allocator,
    {
        let mut boxed = Self::new_uninit_in(alloc);
        unsafe {
            boxed.as_mut_ptr().write(x);
            boxed.assume_init()
        }
    }

I quickly patched this issue by initializing the attest_state field inside the Rec::init. Nevertheless, I still think that we should discuss how to rework the Recs, and Rds initialization as this defeats the point of using Rust and really starts looking like C or C++.

The !@???##&? test

The attestation_rec_exit_irq attempts to test whether reading the attestation token from rmm can be interrupted by an IRQ. Since, the IRQ is a lower level exception, it cannot interrupt the control flow of Islet. As a result, it needs to be actively checked during RSI handling. For example, the TF-rmm does this as follows:

while (attest_data->token_sign_ctx.state == ATTEST_SIGN_IN_PROGRESS) {
	attest_token_continue_sign_state(attest_data, res);
	if (check_pending_irq()) {
		res->action = UPDATE_REC_EXIT_TO_HOST;
		rec_exit->exit_reason = RMI_EXIT_IRQ;
		return;
	}
}

In Islet this is currently impossible to recreate.

Issue 1

pub fn get_token(challenge: &[u8], measurements: &[Measurement], hash_algo: u8) -> Vec<u8> {
    // TODO: consider storing attestation object somewhere,
    // as RAK and token do not change during rmm lifetime.
    Attestation::new(&plat_token(), &realm_attest_key()).create_attestation_token(
        challenge,
        measurements,
        hash_algo,
    )
}

... and ..


fn get_token_part(
    rd: &Rd,
    context: &mut Rec<'_>,
    size: usize,
) -> core::result::Result<(Vec<u8>, usize), Error> {
    let hash_algo = rd.hash_algo();
    let measurements = rd.measurements;

    // FIXME: This should be stored instead of generating it for every call.
    let token =
        crate::rsi::attestation::get_token(context.attest_challenge(), &measurements, hash_algo);

To make this test work as ARM has intended, we should fix these two TODOs, as these issues are a performance nightmare. When the creation of the realm token will be interruptible by IRQs we cannot regenerate the token every time the Realm asks for it.

Issue 2

    fn sign(secret_key: p384::SecretKey, data: &[u8]) -> Vec<u8> {
        let signing_key = p384::ecdsa::SigningKey::from_bytes(&secret_key.to_bytes())
            .expect("Failed to generate signing key");

        let signature: p384::ecdsa::Signature = signing_key
            .try_sign(data)
            .expect("Failed to create P384 signature");
        signature.to_vec()
    }

This code is responsible for signing the attestation token. We must rewrite it to a "init, update, finish" type API to implement an IRQ checking loop similar to how TF-rmm does it.

DUCT TAPE solution

To make this test working, I applied the following makeshift:

        {
            let (token_part, token_left) = get_token_part(&rd, rec, buffer_size)?;

            if is_irq_pending() {
                error!("IRQ is pending while fetching token");
                set_reg(rec, 0, INCOMPLETE)?;
                set_reg(rec, 1, 0)?;
                ret[0] = rmi::SUCCESS_REC_ENTER;
                return Ok(());
            }

            unsafe {
                let pa_ptr = attest_pa as *mut u8;
                core::ptr::copy(token_part.as_ptr(), pa_ptr.add(pa_offset), token_part.len());
            }
            if token_left == 0 {
                set_reg(rec, 0, SUCCESS)?;
                rec.set_attest_state(RmmRecAttestState::NoAttestInProgress);
            } else {
                set_reg(rec, 0, INCOMPLETE)?;
            }
            set_reg(rec, 1, token_part.len())?;
        }

The helper function:

pub fn is_irq_pending() -> bool {
    let val: u64;

    unsafe {
        core::arch::asm!(
            "mrs {}, ISR_EL1",
            out(reg) val
        )
    }

    return val != 0;
}

Thanks to this, the test passes, but for some reason when I run the entire test suite it fails. Besides that, this solution is a performance nightmare as it generated the token and then checks for IRQ and discards the token if an IRQ is pending.

Summary

This PR in its current state provides fixes and some makeshifts solutions that are necessary for the application provisioning to work. Naturally, the last test is not essential, as it only tests the performance and interrupt handling responsiveness. Nevertheless, we should fix it as will improve performance even on fvp emulator.

@bitboom
Copy link
Collaborator

bitboom commented Jul 2, 2024

Inside this Rec::init function, someone forgot to initialize the attest_state field. This is of course a simple mistake, but I think that it was caused by a bigger issue. I understand that Rust doesn't yet have any syntax allowing to construct an object in-place by providing some pointer. The current solution implemented in Islet is to cast the raw pointer to Rust reference and then call a some init function which assumes that it is operating on a valid object. That's why it is possible to forget about some fields in a constructor. I think we should discuss better ways of handling this issue, as either the Rec::init function should be made unsafe as this initialization is semantically unsafe, or we should design a better mechanism. Naturally, marking the init function as unsafe will not solve this problem, but it will be a warning that here Rust cannot ensure complete safety.

First, thank you for raising this issue. 👍

Overall, I very much agree with your opinion.
I also agree with making Islet a safer Rust project rather than using C or C++.

I agree with making the Rec::init function unsafe. However, is it enough to make only the init function unsafe? How can we prevent mistakenly calling rec without initializing it through init? Perhaps we should mark all cases where we receive a pointer from the host and cast it to an object as unsafe.

In Islet v1.0, Based on expressions, the ratio of unsafe code is 21%. (refer to here).
While it is important to precisely mark unsafe keywords in the current code
and decide whether to mark them based on the appropriate call paths, I also believe that unsafe should not lower development performance in a fast-paced development environment.

Therefore, I prefer the latter part of your suggestion, which involves design improvements.


Additionally, I believe that we should actively use unsafe analysis tools. Using unsafe can cause other side effects.

In the above example, the developer can set the state appropriately within the unsafe fn init function,
but can also set an invalid value. Since it is unsafe, it allows that.

In this case, using MIRI, which I have been applying, can detect such issues.

Below is an example mocked and verified using MIRI.

#[repr(u8)]
#[derive(Debug)]
enum State {
    Run = 1,
    Wait = 2,
}

#[derive(Debug)]
struct Rec {
    state: State,
}

fn main() {
    let mut a = Rec { state: State::Run };
    println!("Valid? => {:?}", a);

    let raw = std::ptr::addr_of_mut!(a.state);
    unsafe {
        let raw = raw as *mut u8;
        *raw = 0xff;
    }

    println!("Really? => {:?}", a);
}

When we run the above, the State ends up as Wait, but this is not the result we intended.

$ cargo run
Valid? => Rec { state: Run }
Really? => Rec { state: Wait }

MIRI can detect this.

$ cargo miri run
Valid? => Rec { state: Run }
error: Undefined Behavior: enum value has invalid tag: 0xff
  --> src/main.rs:2:10

Currently, I am integrating MIRI into Islet #341. I plan to verify areas where unsafe is used. Also adding it to the CI so that other developers can easily verify it.

@zpzigi754
Copy link
Collaborator

Really nice to bring up all these issues in detail! I think that it would be good to post these on an issue page as well in addition to the current PR description. I've also encountered that non-initialized fields in REC such as psci_pending and host_call_pending, which are missed in rec init, could cause a problem.

@L0czek L0czek force-pushed the dev/m.szaknis/fix_eac5_api_errors branch from e2f6cd9 to 4bbe2dd Compare July 3, 2024 08:40
@L0czek L0czek marked this pull request as ready for review July 3, 2024 08:41
@jinbpark
Copy link
Collaborator

jinbpark commented Jul 3, 2024

Great description-!

The other solution might be to implement a mechanism similar to how the Box pointer can be created using a custom memory allocator.

I like this idea. I'm not sure if there is another solution (I think there might be) that can achieve the same goal, making a safe init function for Rec, Rd, or whatever having similar strict memory requirements.

@bitboom , can we have "Safety Trait" handle this initialization stuff? (there might be some difficulty, e.g., deciding when to initialize and when to not) It sounds like "Safety Trait" might be a proper place to do that. But, we might need to spend some time checking if it technically makes sense-

@bitboom
Copy link
Collaborator

bitboom commented Jul 4, 2024

@L0czek , @jinbpark

I wrote a draft PR #344 to discuss this.

Previously, safe-abstraction crates provides to parse an instance from a given address assume_safe.
This draft adds assume_safe_uninit_with to initialize an instance at a given address.

This approach has been applied to create instances from granules. Let me know if you have any feedback and suggestions.

Copy link
Collaborator

@bitboom bitboom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work and description!
(Please apply cargo fmt to pass CI.)

@nook1208
Copy link
Collaborator

nook1208 commented Jul 4, 2024

Thank you for the good description !

The current solution implemented in Islet is to cast the raw pointer to Rust reference and then call a some init function which assumes that it is operating on a valid object. That's why it is possible to forget about some fields in a constructor.

I agree that it's easy to forget some fields when intializing it. Because most of memory like Rec are allocated from the host, it's hard to force that all of the fields should be non-null when they're created.

Another way to handle this is to have the init function return a structure with all fields set like @bitboom did in #344

    pub fn new() -> Self {
        Self {
            context: Context::new(),
            attest_state: RmmRecAttestState::NoAttestInProgress,
            attest_challenge: [0; MAX_CHALLENGE_SIZE],
            attest_token_offset: 0,
            owner: OnceCell::new(),
            vcpuid: 0,
            runnable: false,
            psci_pending: false,
            state: State::Ready,
            ripas: Ripas {
                start: 0,
                end: 0,
                addr: 0,
                state: 0,
                flags: 0,
            },
            vtcr: 0,
            host_call_pending: false,
        }
    }
...
let mut rec = rec_granule.new_uninit_with::<Rec<'_>>(Rec::new())?;

It would be nice if there is a way to always guarantee this rule, but unfortunately I can't think of any right now.. :(

@L0czek L0czek force-pushed the dev/m.szaknis/fix_eac5_api_errors branch 2 times, most recently from 9dc8fce to 37479ac Compare July 5, 2024 10:09
L0czek added 2 commits July 5, 2024 12:14
Signed-off-by: Michał Szaknis <[email protected]>
Signed-off-by: Michał Szaknis <[email protected]>
@L0czek L0czek force-pushed the dev/m.szaknis/fix_eac5_api_errors branch from 37479ac to fd2da19 Compare July 5, 2024 10:14
@L0czek
Copy link
Collaborator Author

L0czek commented Jul 5, 2024

I repushed the commits after formatting with cargo in case you want to merge it. On our side, we can start implementing the application provisioning on host. Most likely, we will need the entire stack with RMM somewhere in August / September to start integrating the developed components. As a result, you can take your time fixing stuff the proper way. Nevertheless, if you want to merge this PR feel free to do so. I such case I will create an issue and copy the description there so we can continue the discussion.

@bitboom I saw your draft, it looks promising. I will review it when you're ready (ping me somewhere).

@bokdeuk-jeong bokdeuk-jeong merged commit 2e0f9ff into main Jul 16, 2024
8 checks passed
@bokdeuk-jeong bokdeuk-jeong deleted the dev/m.szaknis/fix_eac5_api_errors branch July 16, 2024 11:14
@p-sawicki2 p-sawicki2 mentioned this pull request Jul 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants