Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support reset for net device #4389

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

acj
Copy link
Contributor

@acj acj commented Jan 22, 2024

Changes

This PR implements support for resetting the net device. Resolves #3074.

Reason

The VirtIO spec says in section 3.1.1:

The driver MUST follow this sequence to initialize a device:

  1. Reset the device.

Operating systems like FreeBSD and NetBSD reset the device as part of initializing it, which currently causes Firecracker to put the device into a FAILED state. As noted in #3074, some guest OSes continue to have functional networking because they're unaware that the device is in a failed state, but we shouldn't rely on this behavior.

Notes

  • This is partly an RFC because I'm not familiar enough with the vmm code to be confident that the device is being reset properly. Still, this does work in my Linux and FreeBSD test cases. Linux doesn't seem to reset the net device. FreeBSD does reset it, and then the device is activated and becomes usable.
  • reset() returns a copy of the interrupt FD and queue event FDs, but the MMIO driver does not use them. It's unclear to me how they should be used. Is that handshake only needed for specific drivers? If there were a way to signal that reset had succeeded without needing to clone the FDs, then I could remove the seccomp changes. Perhaps the return type could be Result<Option<...>> instead.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following
Developer Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • If a specific issue led to this PR, this PR closes the issue.
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this PR.
  • API changes follow the Runbook for Firecracker API changes.
  • User-facing changes are mentioned in CHANGELOG.md.
  • All added/changed functionality is tested.
  • New TODOs link to an issue.
  • Commits meet contribution quality standards.

  • This functionality cannot be added in rust-vmm.

@acj
Copy link
Contributor Author

acj commented Jan 29, 2024

Add two reviewers to your pull request (a maintainer will do that for you if you're new).

@kalyazin @sudanl0 👋

@roypat roypat requested a review from bchalios January 29, 2024 08:58
Copy link

codecov bot commented Jan 29, 2024

Codecov Report

Attention: Patch coverage is 70.58824% with 5 lines in your changes are missing coverage. Please review.

Project coverage is 81.51%. Comparing base (43aae6e) to head (a3f3ad9).
Report is 210 commits behind head on main.

Files Patch % Lines
src/vmm/src/devices/virtio/mmio.rs 33.33% 4 Missing ⚠️
src/vmm/src/devices/virtio/mod.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4389      +/-   ##
==========================================
- Coverage   81.52%   81.51%   -0.01%     
==========================================
  Files         241      241              
  Lines       29296    29308      +12     
==========================================
+ Hits        23883    23891       +8     
- Misses       5413     5417       +4     
Flag Coverage Δ
4.14-c5n.metal 78.79% <70.58%> (-0.01%) ⬇️
4.14-c7g.metal ?
4.14-m5d.metal ?
4.14-m5n.metal 78.78% <70.58%> (-0.01%) ⬇️
4.14-m6a.metal 77.90% <70.58%> (-0.01%) ⬇️
4.14-m6g.metal 76.86% <70.58%> (-0.01%) ⬇️
4.14-m6i.metal 78.77% <70.58%> (-0.01%) ⬇️
4.14-m7g.metal 76.86% <70.58%> (-0.01%) ⬇️
5.10-c5n.metal 81.48% <70.58%> (-0.01%) ⬇️
5.10-c7g.metal ?
5.10-m5d.metal ?
5.10-m5n.metal 81.47% <70.58%> (-0.01%) ⬇️
5.10-m6a.metal 80.68% <70.58%> (-0.01%) ⬇️
5.10-m6g.metal 79.79% <70.58%> (-0.01%) ⬇️
5.10-m6i.metal 81.46% <70.58%> (-0.01%) ⬇️
5.10-m7g.metal 79.79% <70.58%> (-0.01%) ⬇️
6.1-c5n.metal 81.48% <70.58%> (-0.01%) ⬇️
6.1-c7g.metal ?
6.1-m5d.metal ?
6.1-m5n.metal 81.47% <70.58%> (-0.01%) ⬇️
6.1-m6a.metal 80.68% <70.58%> (-0.01%) ⬇️
6.1-m6g.metal 79.79% <70.58%> (-0.01%) ⬇️
6.1-m6i.metal 81.46% <70.58%> (-0.01%) ⬇️
6.1-m7g.metal 79.79% <70.58%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@acj
Copy link
Contributor Author

acj commented Jan 30, 2024

Rebased. At least some of the build failures were from the branch being stale, I think. The rest ("failed to run custom build command") look like they might be related to seccomp, but there's not much output.

@roypat
Copy link
Contributor

roypat commented Jan 30, 2024

Hi @acj,
thank you for your contribution, and sorry for taking a while to respond.

Rebased. At least some of the build failures were from the branch being stale, I think. The rest ("failed to run custom build command") look like they might be related to seccomp, but there's not much output.

I think this is because there is no readlink syscall on ARM, instead the seccomp entry needs to be readlinkat.

However, my team talked about this internally and I think we can avoid adding new syscalls. The callsite of reset does not make use of the returned file descriptors (and I'm also not sure what it could use them for), so I think we can simply change the signature of reset to return nothing, saving us the need to dup the fds (and thus the syscalls). cc @bchalios because he has more context on this.

@acj
Copy link
Contributor Author

acj commented Jan 31, 2024

@roypat Hi! No worries, and thanks for the feedback. Changing the reset signature sounds good.

I went with Result as the return type so that the existing reset behavior doesn't change too much, i.e. set_device_status will still mark the device as failed if reset isn't implemented.

Copy link
Contributor

@bchalios bchalios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Adam,

Thanks a lot for your contribution to Firecracker!

This looks a lot like what I had in mind. One thing that I'd like to understand better is what we should do with the TAP device. I've considered two main alternatives:

  1. Leave everything as it is. It should probably work.
  2. Close the open end we are holding during reset and re-create it once we get activated. In fact, we could always defer the opening of the TAP device during activation. This would express better our intentions I think, but there are three problems that I see:
    • We will need to change the TAP type we are holding to be an Option, which is not ideal
    • We would need to handle the (de)registration of the TAP device with the event loop which might not be trivial
    • Opening the TAP device now, adds an extra system call to the boot time which is in the hot path.

One final thing that bothers me is testing this through integration tests. We can't use FreeBSD to do this. It is not feasible for us to maintain FreeBSD artifacts for testing, so it would need to be done using Linux drivers. However, I haven't found a way to trigger a reset through Linux.

Comment on lines +875 to +879
self.device_state = DeviceState::Inactive;
self.rx_bytes_read = 0;
self.rx_deferred_frame = false;
self.rx_frame_buf = [0u8; MAX_BUFFER_SIZE];
self.metrics = NetMetricsPerDevice::alloc(self.id.clone());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely need to reset the acked_features here, too.

Also, @sudanl0 do you think it is ok to reset the metrics here? I think it's more of a philosophical question, on whether we want our metrics to reflect the fact that the device has been reset by zeroing them out. Maybe, it would make sense to add a reset counter metric as well.

// Test corresponding queues events.
assert_eq!(net.queue_events().len(), NET_QUEUE_SIZES.len());
// Test reset.
let mut net = net;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you can just mark as mut the net device at line 2025?

Comment on lines +178 to +179
fn reset(&mut self) -> Result<(), ResetError> {
Err(ResetError::NotImplemented)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what I had in mind honestly. I assume the initial idea of giving back the EventFds was to "ensure" through the type system that they can't be used, but that is a very dubious choice, IMHO.

@acj
Copy link
Contributor Author

acj commented Feb 1, 2024

@bchalios Thanks, that's very helpful. I'll take a closer look at the TAP handling.

It is not feasible for us to maintain FreeBSD artifacts for testing, so it would need to be done using Linux drivers. However, I haven't found a way to trigger a reset through Linux.

I think this might be an issue on the Firecracker side. My understanding of Linux's virtio implementation is crude at best, but it looks like it should reset the device during registration. I added logging code and verified that the Linux driver sets status to 0 during boot, prior to activation. But Firecracker ignores this because it only calls reset if the device is already activated. Do you think it would be okay to handle a reset even if the device is inactive?

@bchalios
Copy link
Contributor

Hey @acj. Really sorry for silence on this. We got context switched in other topics.

I think this might be an issue on the Firecracker side. My understanding of Linux's virtio implementation is crude at best, but it looks like it should reset the device during registration. I added logging code and verified that the Linux driver sets status to 0 during boot, prior to activation. But Firecracker ignores this because it only calls reset if the device is already activated. Do you think it would be okay to handle a reset even if the device is inactive?

It is. We don't support reset :) I had seen this reset there when I looked at resetting some time ago, but I'm not sure it is enough for testing that reset works (I might be convinced otherwise :)). Ideally, what we could do is load/unload the virtio-net module in the guest. But we explicitly don't want to use modules in the guest for safety reasons.

@bchalios
Copy link
Contributor

Hi @acj, just wanted to touch base in order to see if you're still working on this or have plans to do in the near future.

@acj
Copy link
Contributor Author

acj commented Sep 13, 2024

@bchalios 👋 Yes, sorry for the delay on my end. I've been busy with other projects but would still like to land this. I'll find time soon to rebase and understand what's changed. If your team needs this sooner, though, don't let me hold you up. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] support for resetting virtio devices
3 participants