soundness improvements around hypervisor-shared memory #451

Merged
merged 9 commits into from
Oct 2, 2024

Freax13
Contributor

@Freax13 Freax13 commented Aug 30, 2024

This PR improves the soundness of code around hypervisor-shared memory.

The first patch, 174274d, is blocked on google/zerocopy#1601. Let me know if you want me to drop that patch if we don't want to wait on a new zerocopy release. I used the following patch to override zerocopy for testing:

```diff
diff --git a/Cargo.toml b/Cargo.toml
index b7bdb46..d87444d 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -49,6 +49,9 @@ zerocopy = { version = "0.7.32", features = ["alloc", "derive"] }
 # other repos
 packit = { git = "https://github.com/coconut-svsm/packit", version = "0.1.1" }
 
+[patch.crates-io]
+zerocopy = { git = "https://github.com/Freax13/zerocopy.git", rev = "68e1cc8" }
+
 [workspace.lints.rust]
 future_incompatible = { level = "deny", priority = 127 }
 nonstandard_style = { level = "deny", priority = 126 }
```

@Freax13 Freax13 force-pushed the more-zerocopy branch 3 times, most recently from bfe8671 to a084a9a Compare August 30, 2024 07:24
@joergroedel
Member

> This PR improves the soundness of code around hypervisor-shared memory.
>
> The first patch, 174274d, is blocked on google/zerocopy#1601. Let me know if you want me to drop that patch if we don't want to wait on a new zerocopy release.

Please move that patch to a separate draft-PR, which you can then "undraft" once all blockers are solved.

In general I like these changes, especially the SharedBox implementation. That will simplify a lot of things.

Once updated, this needs testing by @msft-jlange and possibly also a review by @cclaudio.

@Freax13
Contributor Author

Freax13 commented Sep 12, 2024

> Please move that patch to a separate draft-PR, which you can then "undraft" once all blockers are solved.

Done.

@joergroedel joergroedel added the wait-for-review PR needs for approval by reviewers label Sep 16, 2024
This makes it possible to implement get_aad_slice without any unsafe
code.

Signed-off-by: Tom Dohrmann <[email protected]>
Given that the hypervisor has write access to that memory, we need to
treat the memory as interiorly mutable.

Signed-off-by: Tom Dohrmann <[email protected]>
@Freax13
Contributor Author

Freax13 commented Sep 18, 2024

Just rebased onto main. I resolved the TODOs by switching to the functions in crate::cpu::mem.

kernel/src/cpu/mem.rs (outdated review thread, resolved)
kernel/src/sev/ghcb.rs (outdated review thread, resolved)
Cell doesn't allow concurrent accesses. This is a problem because we
share the memory with the host and the host could write to the memory
while we're reading it. Use atomic accesses instead. Atomic accesses
can tolerate concurrent writes.

Signed-off-by: Tom Dohrmann <[email protected]>
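
The switch from `Cell` to atomics described in the commit message above can be illustrated with a minimal sketch. `SharedHeader` and its fields are hypothetical and are not taken from the SVSM sources; the only point is that every field of a host-visible structure is an atomic, so all accesses go through `&self`.

```rust
use core::sync::atomic::{AtomicU32, AtomicU64, Ordering};

/// Hypothetical layout of a structure that lives in host-shared memory.
/// Every field is an atomic, so the type is interiorly mutable and
/// every guest access is an atomic load or store.
#[repr(C)]
pub struct SharedHeader {
    length: AtomicU32,
    payload: AtomicU64,
}

impl SharedHeader {
    pub const fn new() -> Self {
        Self {
            length: AtomicU32::new(0),
            payload: AtomicU64::new(0),
        }
    }

    /// Guest-side write. Takes `&self`, never `&mut self`, because the
    /// guest cannot claim exclusive access to memory the host can write.
    pub fn set_payload(&self, value: u64) {
        self.payload.store(value, Ordering::Relaxed);
    }

    /// Guest-side read. The value may have been changed by the other
    /// side at any time, so callers must validate whatever they get back.
    pub fn payload(&self) -> u64 {
        self.payload.load(Ordering::Relaxed)
    }

    pub fn length(&self) -> u32 {
        self.length.load(Ordering::Relaxed)
    }
}
```

`Ordering::Relaxed` is an assumption made for this sketch; whether stronger ordering is needed depends on the protocol spoken with the host.
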
This resolves a TOCTOU (time-of-check to time-of-use) issue.
Furthermore, we don't need to check the entire content: if the
certificate data is not empty, there will be non-zero bytes in the
first 24 bytes.

Signed-off-by: Tom Dohrmann <[email protected]>
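
One common way to avoid this kind of race is to copy the shared data into private memory exactly once and run every check on the copy. The sketch below assumes that approach; `snapshot` and `read_certs` are hypothetical names, and only the 24-byte figure comes from the commit message above.

```rust
use core::sync::atomic::{AtomicU8, Ordering};

/// Number of leading bytes inspected, as stated in the commit message.
const HEADER_LEN: usize = 24;

/// Copies the shared buffer into private memory exactly once.
fn snapshot(shared: &[AtomicU8]) -> Vec<u8> {
    shared.iter().map(|b| b.load(Ordering::Relaxed)).collect()
}

/// Returns the private copy if its first 24 bytes contain a non-zero
/// byte, and `None` if they are all zero or the buffer is too short.
/// The check and the returned data both refer to the same private copy,
/// so later writes to `shared` cannot invalidate the check.
fn read_certs(shared: &[AtomicU8]) -> Option<Vec<u8>> {
    let private = snapshot(shared);
    let header = private.get(..HEADER_LEN)?;
    if header.iter().any(|&b| b != 0) {
        Some(private)
    } else {
        None
    }
}
```
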
SharedBox is a safe wrapper around memory pages shared with the host.

Signed-off-by: Tom Dohrmann <[email protected]>
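
The idea of a safe wrapper can be sketched without reproducing the actual `SharedBox` API, which is not shown in this conversation. The hypothetical `SharedBuf` below never hands out a `&T` or `&mut T` into the shared region; callers can only copy bytes in and out. A real implementation would also make the pages host-visible on construction and revoke that access on drop, which this sketch omits.

```rust
use core::sync::atomic::{AtomicU8, Ordering};

/// Hypothetical stand-in for a wrapper around host-shared memory.
pub struct SharedBuf {
    bytes: Box<[AtomicU8]>,
}

impl SharedBuf {
    /// Allocates a zeroed buffer. Sharing it with the host is omitted.
    pub fn new_zeroed(len: usize) -> Self {
        let bytes = (0..len).map(|_| AtomicU8::new(0)).collect();
        Self { bytes }
    }

    /// Copies `src` into the shared region. Returns `false` and writes
    /// nothing if the lengths differ.
    pub fn write_from(&self, src: &[u8]) -> bool {
        if src.len() != self.bytes.len() {
            return false;
        }
        for (dst, &b) in self.bytes.iter().zip(src) {
            dst.store(b, Ordering::Relaxed);
        }
        true
    }

    /// Copies the shared region into `dst`. Returns `false` and leaves
    /// `dst` untouched if the lengths differ.
    pub fn read_into(&self, dst: &mut [u8]) -> bool {
        if dst.len() != self.bytes.len() {
            return false;
        }
        for (d, s) in dst.iter_mut().zip(self.bytes.iter()) {
            *d = s.load(Ordering::Relaxed);
        }
        true
    }
}
```
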
HVDoorbellPage was only used in one place, and leak was immediately
called on it. Given that we never need to free a doorbell page,
let's just implement this as a single function returning a static
reference.

Signed-off-by: Tom Dohrmann <[email protected]>
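
A function that allocates once and returns a `'static` reference might look like the following sketch. `Doorbell`, its field, and `allocate_doorbell` are hypothetical, and a plain heap allocation stands in for the page allocation and host-sharing steps of the real code.

```rust
use core::sync::atomic::{AtomicU8, Ordering};

/// Hypothetical doorbell layout with a single atomic field.
#[repr(C)]
pub struct Doorbell {
    pending_vector: AtomicU8,
}

impl Doorbell {
    /// Records a pending vector.
    pub fn post(&self, vector: u8) {
        self.pending_vector.store(vector, Ordering::Relaxed);
    }

    /// Returns the pending vector and resets it to zero.
    pub fn take(&self) -> u8 {
        self.pending_vector.swap(0, Ordering::Relaxed)
    }
}

/// Allocates a doorbell that is never freed and returns a shared
/// `'static` reference to it. No owning wrapper type is needed because
/// there is nothing to release later.
pub fn allocate_doorbell() -> &'static Doorbell {
    let page = Box::new(Doorbell {
        pending_vector: AtomicU8::new(0),
    });
    Box::leak(page)
}
```
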
This is better for a couple of reasons:
1. drop_in_place destroys the object rather than mutating it to release
   resources. The downside of mutating without destroying is that the
   object must remain in a valid state, which limits the shutdown code
   (for example, it can't release the memory associated with a
   PageBox).
2. After the object has been dropped, it can't be accessed anymore.
   This means that the shutdown code doesn't have to worry about later
   accesses like the previous code had to.
3. All resources are freed, not just the GHCB.

This also fixes a soundness issue: calling shutdown twice on the same
GHCB would result in a double-pvalidate bug.

Signed-off-by: Tom Dohrmann <[email protected]>
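
The third point in the list above, that every resource is freed and not just one, can be seen in a small sketch. All types here are hypothetical; `ManuallyDrop` stands in for storage that Rust never drops automatically, and a shared counter records how many destructors ran.

```rust
use core::mem::ManuallyDrop;
use core::ptr;
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical resource that records when it is released.
struct Page {
    released: Rc<Cell<u32>>,
}

impl Drop for Page {
    fn drop(&mut self) {
        self.released.set(self.released.get() + 1);
    }
}

/// Hypothetical per-CPU state that owns more than one resource.
#[allow(dead_code)]
struct PerCpuState {
    ghcb: Page,
    scratch: Page,
}

/// Builds the state in storage that is never dropped automatically,
/// destroys it in place, and returns how many resources were released.
fn destroy_in_place(released: &Rc<Cell<u32>>) -> u32 {
    let mut slot = ManuallyDrop::new(PerCpuState {
        ghcb: Page { released: Rc::clone(released) },
        scratch: Page { released: Rc::clone(released) },
    });
    let state: *mut PerCpuState = &mut *slot;
    // SAFETY: `state` points to a valid, initialized value that is
    // destroyed exactly once here and never accessed afterwards.
    unsafe { ptr::drop_in_place(state) };
    released.get()
}
```

Both `Page` destructors run from the single `drop_in_place` call, whereas a hand-written shutdown method only releases what it explicitly mentions.
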
This impl is unused. It is also unsound because we can never have
unique ownership over the GHCB as long as it is shared with the host.

Signed-off-by: Tom Dohrmann <[email protected]>
Now that the shutdown code is only called from the Drop impl, we might
as well move it in there. This also makes it impossible to call
shutdown more than once (or to call shutdown and then drop the
GhcbPage).

Signed-off-by: Tom Dohrmann <[email protected]>
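
Why moving the cleanup into `Drop` rules out a second call can be shown with a hypothetical type. `HostSharedPage` is not an SVSM type, and incrementing a counter stands in for the real cleanup work.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical page wrapper whose cleanup lives only in `Drop`.
struct HostSharedPage {
    cleanups: Rc<Cell<u32>>,
}

impl Drop for HostSharedPage {
    fn drop(&mut self) {
        // Stand-in for the real cleanup. The compiler runs this at
        // most once per value.
        self.cleanups.set(self.cleanups.get() + 1);
    }
}

/// Creates a page and releases it. Releasing takes the value by move,
/// so a second release is a compile-time error, not a runtime bug.
fn use_and_release(cleanups: &Rc<Cell<u32>>) {
    let page = HostSharedPage {
        cleanups: Rc::clone(cleanups),
    };
    drop(page);
    // drop(page); // would not compile: use of moved value `page`
}
```
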
@joergroedel joergroedel added in-review PR is under active review and not yet approved and removed wait-for-review PR needs for approval by reviewers labels Sep 26, 2024
@joergroedel joergroedel merged commit 48b4294 into coconut-svsm:main Oct 2, 2024
3 checks passed
@joergroedel joergroedel removed the in-review PR is under active review and not yet approved label Oct 2, 2024

3 participants