-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial integration with Oxide Packet Transformation Engine #955
Conversation
This is awesome @bnaecker! Really excited to see this work getting integrated into the control plane 🎉 . Just a few comments on comments above. |
Though I'm moving this to a draft, this PR is definitely ready for review. There are a few reasons for the draft status. First, any CI is blocked on being able to pull OPTE as a dependency. @rzezeski has started turning the crank on making OPTE public, but it won't be instantaneous. Second, given that fact, I'm going to split out the updates to the scripts in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this all together @bnaecker. I realize that the work and effort required here is not fully appreciated if one were to look at the lines of code alone. Especially given all the testing woes you've hit, plenty of which were my fault. So thank you for your patience and diligence in getting this across the line.
For the most part this seems as I would expect. Some of my comments don't require any action, just stuff I wanted to call out. Though I did not think to explicitly differentiate what needs action and what doesn't. Feel free to reach out for clarification in that regard or if any of my comments are confusing.
b6d0e7d
to
9c326f5
Compare
9c326f5
to
97ceda5
Compare
97ceda5
to
a5d0fb5
Compare
I've verified that all the tests pass locally, and that I can successfully ping between two Alpine guests. This is no longer blocked on OPTE going public, but the linking failure we've seen before is back. I'm going to mark this open, with the assumption it'll wait until that gets fixed to merge. |
a5d0fb5
to
0c126a1
Compare
@rcgoodfellow @rmustacc @rzezeski I'm planning to merge this once CI is green, unless there are objections. Below is some output showing successfully Guest 1This is the
Guest 2This is the
OPTE trace outputHere's the output of
CaveatThere's a caveat here. The firewall rules are not propagated correctly in this branch. In particular, the guests still can't communicate with one another without adding a rule like:
where |
b244926
to
3f171b5
Compare
@bnaecker I think that makes perfect sense. There is plenty of major improvement here already that seems ready for integration. I think doing the firewall stuff in a follow up makes it cleaner and easier to focus review anyways. So +1 to your plan. |
- Brings in OPTE via the `opte-ioctl` and `opte` crates. - Modifies the instance-ensure request from Nexus to the sled agent, to carry the actual information required for setting up the guest OPTE port. This includes the actual IP subnet and MAC, rather than things like the VPC Subnet UUID. - Adds a database query and method to extract the above information from both the network interface and VPC subnet tables. - Adds OPTE port for the guests (and currently still a VNIC on top), with the right OPTE settings for traffic to flow between two guests in the same VPC subnet. That's the virtual-to-physical mapping and a router entry for the subnet. - Adds the VNICs over each OPTE port to the running zone. Note that this removes the specification of guest NICs for the zone itself as VNICs. They are passed as OPTE ports, and the VNIC is pulled out internally, so hopefully little will need to change when the VNIC is removed entirely. - Store the main underlay address for the sled agent, currently its dropshot server IP address, in the instance manager, and forward to each instance. It's then used as the underlay address when setting up the OPTE ports for the guest. Addressing review comments Add a unique VNI to each VPC Updating OPTE dependency and package repos Add dummy/mock module for OPTE on non-illumos systems Update error handling to be more self-contained in the callee
3f171b5
to
5ae31bd
Compare
Crucible updates all Crucible connections should set TCP_NODELAY (#983) Use a fixed size for tag and nonce (#957) Log crucible opts on start, order crutest options (#974) Lock the Downstairs less (#966) Cache dirty flag locally, reducing SQLite operations (#970) Make stats mutex synchronous (#961) Optimize requeue during flow control conditions (#962) Update Rust crate base64 to 0.21.4 (#950) Do less in control (#949) Fix --flush-per-blocks (#959) Fast dependency checking (#916) Update actions/checkout action to v4 (#960) Use `cargo hakari` for better workspace deps (#956) Update actions/checkout digest to 8ade135 (#939) Cache block size in Guest (#947) Update Rust crate ringbuffer to 0.15.0 (#954) Update Rust crate toml to 0.8 (#955) Update Rust crate reedline to 0.24.0 (#953) Update Rust crate libc to 0.2.148 (#952) Update Rust crate indicatif to 0.17.7 (#951) Remove unused async (#943) Use a synchronous mutex for bw/iop_tokens (#946) Make flush ID non-locking (#945) Use `oneshot` channels instead of `mpsc` for notification (#918) Use a strong type for upstairs negotiation (#941) Add a "dynamometer" option to crucible-downstairs (#931) Get new work and active count in one lock (#938) A bunch of misc test cleanup stuff (#937) Wait for a snapshot to finish on all downstairs (#920) dsc and clippy cleanup. (#935) No need to sort ackable_work (#934) Use a strong type for repair ID (#928) Keep new jobs sorted (#929) Remove state_count function on Downstairs (#927) Small cleanup to IOStateCount (#932) let cmon and IOStateCount use ClientId (#930) Fast return for zero length IOs (#926) Use a strong type for client ID (#925) A few Crucible Agent fixes (#922) Use a newtype for `JobId` (#919) Don't pass MutexGuard into functions (#917) Crutest updates, rename tests, new options (#911) Propolis updates Update tungstenite crates to 0.20 Use `strum` crate for enum-related utilities Wire up bits for CPUID customization PHD: improve artifact store (#529) Revert abort-on-panic in 'dev' cargo profile
Crucible updates all Crucible connections should set TCP_NODELAY (#983) Use a fixed size for tag and nonce (#957) Log crucible opts on start, order crutest options (#974) Lock the Downstairs less (#966) Cache dirty flag locally, reducing SQLite operations (#970) Make stats mutex synchronous (#961) Optimize requeue during flow control conditions (#962) Update Rust crate base64 to 0.21.4 (#950) Do less in control (#949) Fix --flush-per-blocks (#959) Fast dependency checking (#916) Update actions/checkout action to v4 (#960) Use `cargo hakari` for better workspace deps (#956) Update actions/checkout digest to 8ade135 (#939) Cache block size in Guest (#947) Update Rust crate ringbuffer to 0.15.0 (#954) Update Rust crate toml to 0.8 (#955) Update Rust crate reedline to 0.24.0 (#953) Update Rust crate libc to 0.2.148 (#952) Update Rust crate indicatif to 0.17.7 (#951) Remove unused async (#943) Use a synchronous mutex for bw/iop_tokens (#946) Make flush ID non-locking (#945) Use `oneshot` channels instead of `mpsc` for notification (#918) Use a strong type for upstairs negotiation (#941) Add a "dynamometer" option to crucible-downstairs (#931) Get new work and active count in one lock (#938) A bunch of misc test cleanup stuff (#937) Wait for a snapshot to finish on all downstairs (#920) dsc and clippy cleanup. (#935) No need to sort ackable_work (#934) Use a strong type for repair ID (#928) Keep new jobs sorted (#929) Remove state_count function on Downstairs (#927) Small cleanup to IOStateCount (#932) let cmon and IOStateCount use ClientId (#930) Fast return for zero length IOs (#926) Use a strong type for client ID (#925) A few Crucible Agent fixes (#922) Use a newtype for `JobId` (#919) Don't pass MutexGuard into functions (#917) Crutest updates, rename tests, new options (#911) Propolis updates Update tungstenite crates to 0.20 Use `strum` crate for enum-related utilities Wire up bits for CPUID customization PHD: improve artifact store (#529) Revert abort-on-panic in 'dev' cargo profile --------- Co-authored-by: Alan Hanson <[email protected]>
opte-ioctl
andopte
crates.carry the actual information required for setting up the guest OPTE
port. This includes the actual IP subnet and MAC, rather than things
like the VPC Subnet UUID.
both the network interface and VPC subnet tables.
with the right OPTE settings for traffic to flow between two guests in
the same VPC subnet. That's the virtual-to-physical mapping and a
router entry for the subnet.
removes the specification of guest NICs for the zone itself as VNICs.
They are passed as OPTE ports, and the VNIC is pulled out internally,
so hopefully little will need to change when the VNIC is removed
entirely.
dropshot server IP address, in the instance manager, and forward to
each instance. It's then used as the underlay address when setting up
the OPTE ports for the guest.
setup. Removes the unnecessary loopback address, since the sled agent
provides its main listening address as the underlay address when
create OPTE ports.
on. Don't install
opteadm
orxde
directly, do everything throughtheir package repos.