Stop abusing OPTE's source NAT configuration for external IPs #1479

Merged
merged 1 commit into from
Jul 25, 2022

bnaecker
Collaborator

  • Update the OPTE dependency to include the API for setting an external
    IP address explicitly, rather than abusing the SNAT configuration for
    that purpose.
  • Carve up SNAT addresses into port ranges of 16K ports each. We
    previously used the entire port range, since OPTE used the whole range
    anyway to allow inbound connections on any port. Each range allows 16K
    outbound connections from a guest that doesn't need inbound
    connectivity, which is a reasonable starting point.
  • Add explicit SNAT and external IP addresses to the sled agent and
    client-library types, as well as to the representation of instances
    and OPTE ports.
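
To make the port-range carving concrete, here is a minimal sketch of how a single external IP could be split into 16K-port SNAT ranges, alongside a separate explicit external IP. It is an illustration only, not the code in this PR: the names `SourceNat`, `InstanceAddresses`, and `snat_range` are hypothetical, and it assumes that "16K" means 16 * 1024 ports and that ranges are aligned to multiples of that size.

```rust
use std::net::IpAddr;

/// Assumed size of one SNAT port range: "16K" taken as 16 * 1024 ports.
const PORTS_PER_RANGE: u32 = 16 * 1024;

/// Number of such ranges in the 16-bit port space (65536 / 16384 = 4).
const RANGES_PER_IP: u32 = (u16::MAX as u32 + 1) / PORTS_PER_RANGE;

/// Hypothetical SNAT assignment: a shared external IP plus an inclusive
/// range of ports that one guest may use for outbound connections.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SourceNat {
    pub ip: IpAddr,
    pub first_port: u16,
    pub last_port: u16,
}

/// Hypothetical per-instance addressing: the SNAT assignment and the
/// external IP are separate fields, so neither stands in for the other.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InstanceAddresses {
    pub source_nat: SourceNat,
    /// Present only for guests that need inbound connectivity.
    pub external_ip: Option<IpAddr>,
}

/// Return the `index`-th port range on `ip`, or `None` if the index is
/// past the last range that fits in the port space.
pub fn snat_range(ip: IpAddr, index: u32) -> Option<SourceNat> {
    if index >= RANGES_PER_IP {
        return None;
    }
    let first = index * PORTS_PER_RANGE;
    let last = first + PORTS_PER_RANGE - 1;
    // Both values are at most 65535 here, so the casts cannot truncate.
    Some(SourceNat {
        ip,
        first_port: first as u16,
        last_port: last as u16,
    })
}
```

Under these assumptions each external IP yields four ranges: 0 to 16383, 16384 to 32767, 32768 to 49151, and 49152 to 65535. Whether the lowest ports should actually be handed out to guests is a policy question this sketch does not address.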

@bnaecker bnaecker requested review from rzezeski and smklein July 22, 2022 19:18
@bnaecker
Collaborator Author

Here are a few details about testing. I populated images using the script at tools/populate/populate-images.sh, and then created two guests using the Debian image. The IPs assigned to them are:

bnaecker@feldspar : ~/omicron $ oxide api /organizations/o/projects/p/instances/i0/external-ips
[
  {
    "ip": "192.168.1.211",
    "kind": "ephemeral"
  }
]
bnaecker@feldspar : ~/omicron $ oxide api /organizations/o/projects/p/instances/i1/external-ips
[
  {
    "ip": "192.168.1.212",
    "kind": "ephemeral"
  }
]

I can verify that pings make it through, and I can SSH into the guests:

bnaecker@feldspar : ~/omicron $ ping 192.168.1.211
192.168.1.211 is alive
bnaecker@feldspar : ~/omicron $ ping 192.168.1.212
192.168.1.212 is alive
bnaecker@feldspar : ~/omicron $ ssh -i demo debian@192.168.1.211
The authenticity of host '192.168.1.211 (192.168.1.211)' can't be established.
ED25519 key fingerprint is SHA256:nbDot/N3qT8kfVqLRTzENqIADojiQyuq/w6f2Mixf84.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.1.211' (ED25519) to the list of known hosts.
Linux myinst 5.10.0-16-cloud-amd64 #1 SMP Debian 5.10.127-1 (2022-06-30) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
debian@myinst:~$ ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether a8:40:25:fe:ea:5a brd ff:ff:ff:ff:ff:ff
    inet 172.30.0.5/32 brd 172.30.0.5 scope global dynamic enp0s8
       valid_lft 86280sec preferred_lft 86280sec
    inet6 fe80::aa40:25ff:fefe:ea5a/64 scope link
       valid_lft forever preferred_lft forever
debian@myinst:~$ ping -c 4 172.30.0.6
PING 172.30.0.6 (172.30.0.6) 56(84) bytes of data.
64 bytes from 172.30.0.6: icmp_seq=1 ttl=64 time=0.340 ms
64 bytes from 172.30.0.6: icmp_seq=2 ttl=64 time=0.395 ms
64 bytes from 172.30.0.6: icmp_seq=3 ttl=64 time=0.275 ms
64 bytes from 172.30.0.6: icmp_seq=4 ttl=64 time=0.256 ms

--- 172.30.0.6 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3071ms
rtt min/avg/max/mdev = 0.256/0.316/0.395/0.055 ms
debian@myinst:~$

Note that I can also ping i1 from i0, i.e., guest-to-guest traffic now works. Those addresses (172.30.0.5 and 172.30.0.6) come from the VPC Subnet, and we can see them here:

bnaecker@feldspar : ~/omicron $ oxide api /organizations/o/projects/p/vpcs/default/subnets/default/network-interfaces
{
  "items": [
    {
      "description": "default primary interface for i0",
      "id": "a98e02f5-ed18-4411-abed-a7fbe9cd0781",
      "instance_id": "56525c9e-da4d-40b6-a0b7-c3c1467dac80",
      "ip": "172.30.0.5",
      "mac": "A8:40:25:FE:EA:5A",
      "name": "net0",
      "primary": true,
      "subnet_id": "e1acdd79-5b4d-435e-9f91-3784e9ac2449",
      "time_created": "2022-07-22T18:47:24.106830Z",
      "time_modified": "2022-07-22T18:47:24.106830Z",
      "vpc_id": "2981145c-7126-4593-8c1b-6ff0f162c534"
    },
    {
      "description": "default primary interface for i1",
      "id": "c12389c9-234d-42cb-b3e5-a108926c9e86",
      "instance_id": "44bf8faa-7716-4e32-881c-742be319e485",
      "ip": "172.30.0.6",
      "mac": "A8:40:25:FE:BA:3E",
      "name": "net0",
      "primary": true,
      "subnet_id": "e1acdd79-5b4d-435e-9f91-3784e9ac2449",
      "time_created": "2022-07-22T18:47:36.897538Z",
      "time_modified": "2022-07-22T18:47:36.897538Z",
      "vpc_id": "2981145c-7126-4593-8c1b-6ff0f162c534"
    }
  ],
  "next_page": "eyJ2IjoidjEiLCJwYWdlX3N0YXJ0Ijp7InNvcnRfYnkiOiJuYW1lX2FzY2VuZGluZyIsImxhc3Rfc2VlbiI6Im5ldDAifX0="
}
bnaecker@feldspar : ~/omicron $

@bnaecker
Collaborator Author

This also addresses #1471

@bnaecker bnaecker merged commit 62565ea into main Jul 25, 2022
@bnaecker bnaecker deleted the stop-abusing-opte-snat branch July 25, 2022 17:26
leftwo pushed a commit that referenced this pull request Oct 25, 2024
Crucible changes
Add test and fix for replay race condition (#1519)
Fix clippy warnings (#1517)
Add edition to `crucible-workspace-hack` (#1516)
Split out Downstairs-specific stats (#1511)
Move remaining `GuestWork` functionality into `Downstairs` (#1510)
Track both jobs and bytes in each IO state (#1507)
Fix new `rustc` and `clippy` warnings (#1509)
Remove IOP/BW limits (for now) (#1506)
Move `GuestBlockRes` into `DownstairsIO` (#1502)
Update actions/checkout digest to eef6144 (#1499)
Update Rust crate hyper-staticfile to 0.10.1 (#1411)
Turn off test-up-2region-encrypted.sh (#1504)
Add `IOop::Barrier` (#1494)
Fix IPv6 addresses in `crutest` (#1503)
Add region set options to more tests. (#1496)
Simplify `CompleteJobs` (#1493)
Removed ignored CI jobs (#1497)
Minor cleanups to `print_last_completed` (#1501)
Remove remaining `Arc<Volume>` instances (#1500)
Add `VolumeBuilder` type (#1492)
remove old unused scripts (#1495)
More multiple region support. (#1484)
Simplify matches (#1490)
Move complete job tracker to a helper object (#1489)
Expand summary and add documentation references to the README. (#1486)
Remove `GuestWorkId` (2/2) (#1482)
Remove `JobId` from `DownstairsIO` (1/2) (#1481)
Remove unused `#[derive(..)]` (#1483)
Update more tests to use dsc (#1480)
Crutest now Volume only (#1479)

Propolis changes
manually impl Deserialize for PciPath for validation purposes (#801)
phd: gate OS-specific tests, make others more OS-agnostic (#799)
lib: log vCPU diagnostics on triple fault and for some unhandled exit types (#795)
add marker trait to help check safety of guest memory reads (#794)
clippy fixes for 1.82 (#796)
lib: move cpuid::Set to cpuid_utils; prevent semantic subleaf conflicts (#782)
PHD: write efivars in one go (#786)
PHD: support guest-initiated reboot (#785)
server: accept CPUID values in instance specs and plumb them to bhyve (#780)
PHD: allow patched Crucible dependencies (#778)
server: add a first-class error type to machine init (#777)
PciPath to Bdf conversion is infallible; prove it and refactor (#774)
instance spec rework: flatten InstanceSpecV0 (#767)
Make PUT /instance/state 503 when waiting to init
Less anxiety-inducing `Vm::{get, state_watcher}`
leftwo added a commit that referenced this pull request Oct 30, 2024
Crucible changes
Add test and fix for replay race condition (#1519)
Fix clippy warnings (#1517)
Add edition to `crucible-workspace-hack` (#1516)
Split out Downstairs-specific stats (#1511)
Move remaining `GuestWork` functionality into `Downstairs` (#1510)
Track both jobs and bytes in each IO state (#1507)
Fix new `rustc` and `clippy` warnings (#1509)
Remove IOP/BW limits (for now) (#1506)
Move `GuestBlockRes` into `DownstairsIO` (#1502)
Update actions/checkout digest to eef6144 (#1499)
Update Rust crate hyper-staticfile to 0.10.1 (#1411)
Turn off test-up-2region-encrypted.sh (#1504)
Add `IOop::Barrier` (#1494)
Fix IPv6 addresses in `crutest` (#1503)
Add region set options to more tests. (#1496)
Simplify `CompleteJobs` (#1493)
Removed ignored CI jobs (#1497)
Minor cleanups to `print_last_completed` (#1501)
Remove remaining `Arc<Volume>` instances (#1500)
Add `VolumeBuilder` type (#1492)
remove old unused scripts (#1495)
More multiple region support. (#1484)
Simplify matches (#1490)
Move complete job tracker to a helper object (#1489)
Expand summary and add documentation references to the README. (#1486)
Remove `GuestWorkId` (2/2) (#1482)
Remove `JobId` from `DownstairsIO` (1/2) (#1481)
Remove unused `#[derive(..)]` (#1483)
Update more tests to use dsc (#1480)
Crutest now Volume only (#1479)

Propolis changes
manually impl Deserialize for PciPath for validation purposes (#801)
phd: gate OS-specific tests, make others more OS-agnostic (#799)
lib: log vCPU diagnostics on triple fault and for some unhandled exit types (#795)
add marker trait to help check safety of guest memory reads (#794)
clippy fixes for 1.82 (#796)
lib: move cpuid::Set to cpuid_utils; prevent semantic subleaf conflicts (#782)
PHD: write efivars in one go (#786)
PHD: support guest-initiated reboot (#785)
server: accept CPUID values in instance specs and plumb them to bhyve (#780)
PHD: allow patched Crucible dependencies (#778)
server: add a first-class error type to machine init (#777)
PciPath to Bdf conversion is infallible; prove it and refactor (#774)
instance spec rework: flatten InstanceSpecV0 (#767)
Make PUT /instance/state 503 when waiting to init
Less anxiety-inducing `Vm::{get, state_watcher}`

---------

Co-authored-by: Alan Hanson <[email protected]>