-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Static Routes #18369
Static Routes #18369
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Cydox The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
d72facd
to
dd337c9
Compare
cf31eeb
to
a62cc53
Compare
/hold You need to open a PR in c/common with the required changes there. And in either case we need to merge the implementation in netavark first merging anything in c/common or podman. |
cmd/podman/networks/create.go
Outdated
flags.BoolVar(&networkCreateOptions.NoAutoGateway, "no-auto-gateway", false, "do not automatically add gateways to subnets") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a fan of this flag. Can you explain the use case more? Compared to e.g. --internal?
I really do not want to add more top level options to the Network type, if this is truly needed I think it is a much better fit via the generic options map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On macvlan it does the exact same thing as --internal
. I confirmed that all internal does on macvlan is prevent a default gateway from being added automatically.
On bridge is where the difference comes in. Here, --internal
also applies firewall rules that prevent external communication. So let's say you don't want a default route but add static routes to certain destinations. You can't achieve that with --internal
as the firewall rules will block communication to those destinations.
Now that I write this, I think it might be possible to simply have the firewall rules allow communication to those static route destinations. Although that makes the meaning of --internal
a little murky. In that case --internal
would be internal except for those allowed packets to static route destinations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--internal does not add firewall rules to prevent external communication with bridge, we just set a sysctl to block routing on the bridge interface so it is impossible to reach ips outside the subnet.
I think having this as a use case is fine, but it should not be top level. Just put it in the existing --opt
which just maps to a [string]string
map, so something like --opt no_auto_gateway=1
could be used.
We shouldn't change the meaning of internal at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bridge network is a NAT router between the host and the pod/container. This NAT router uses the host route table. To make this work a default gateway for the bridge network is necessary.
On a macvlan network the gateway is explicitly defined with the --gateway option. We don't need no_auto_gateway for macvlan.
Combining bridge and macvlan in 1 pod/container is complex and requires rule-based routing. But this is complex and out of scope.
I only see multiple networks per pod/container with macvlan networks. I wouldn't allow a combination of macvlan and bridge networks. Therefore the no_auto_gateway flag is not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edited the above comment, because I wrote no_default_gateway instead of no_default_route.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The no_auto_gateway flag in this PR will be replaced by no_default_route. The relevant PR is containers/netavark#697
A bridge network is a NAT router between the host and the pod/container. This NAT router uses the host route table. To make this work a default gateway for the bridge network is necessary.
For a bridge network a default route is not necessary if you have a relevant static route set up via the gateway.
You can create static routes if you know the gateway address automatically defined by the bridge driver.
In a normal IP network, packets are allowed to follow different routes. Only the destination determines the route.
This does not apply to NAT routing. For connections that go over NAT, all packets must go through the same NAT router (because of connection tracking). This can be achieved with rule-based routing. However, this is out of scope I think.
Therefore I wouldn't recommend using bridge and macvlan in the same pod/container.
On a macvlan network the gateway is explicitly defined with the --gateway option. We don't need no_auto_gateway for macvlan.
On macvlan even without the --gateway option a gateway is added to the network config.
If the gateway address is not defined what address is then used? It is not wise not to make assumptions about what the gateway address might be (there is no standard). If this is already done by the macvlan driver, than this is unexpected behavior in my opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can create static routes if you know the gateway address automatically defined by the bridge driver.
You can use the --gateway option to define the gateway address yourself. So in that case you know the ip.
In a normal IP network, packets are allowed to follow different routes. Only the destination determines the route. This does not apply to NAT routing. For connections that go over NAT, all packets must go through the same NAT router (because of connection tracking).
I mean there can be setups where you wanne define a static route that goes to another container (certain VPN or Tor setups come to mind) instead of to the gateway.
Even without a setup like that you might want to not have a default route and only a specific route via the gateway. Example use-case is if you are using an ipv6 ULA prefix, but don't have ipv6 connectivity to the internet. In that case you want the container to not have an ipv6 default route and only a route to that ULA prefix via the gateway.
Therefore I wouldn't recommend using bridge and macvlan in the same pod/container.
Not sure what you are referring to here.
If the gateway address is not defined what address is then used?
Example:
sudo podman network create -d macvlan --subnet 10.0.0.0/24 testnet
sudo podman network inspect testnet
Output:
[
{
"name": "testnet123",
"id": "cb5b9938629cced32336b1f6eb58eb68371c87cd27b5a19c8bf3ed543a538635",
"driver": "macvlan",
"created": "2023-05-22T22:48:43.981931988+02:00",
"subnets": [
{
"subnet": "10.0.0.0/24",
"gateway": "10.0.0.1"
}
],
"ipv6_enabled": false,
"internal": false,
"dns_enabled": false,
"ipam_options": {
"driver": "host-local"
}
}
]
It is not wise not to make assumptions about what the gateway address might be (there is no standard). If this is already done by the macvlan driver, than this is unexpected behavior in my opinion.
Podman's behavior currently is to use the first address in the subnet as the gateway. If you want another gateway you can specify it with --gateway, or disable the default route with the no_default_route option that I'll push to this PR later tonight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To justify the feature in one (or two) sentence: If we give the user the ability to add routes to the container network namespace, then we should give him the ability to disable the otherwise automatically generated default route. That way he has full control over the routes in the network namespace if he desires.
Greetings from Delft, btw.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I totally agree. multiple networks without static routes are almost useless. I don't like the no_default_path flag. But since podman assumes a gateway address without specifying one, I don't see any other solution.
We are researching podman for our internet gateways. A pod with 2 networks. 1 with a real internet ip number and 1 with an internal private ip address and containers with bind, nginx, postfix, asterisk, etc. So i hope this PR is in the next release :)
Podman also assumes a ip-range without specifying one (based on the subnet). This can result in duplicate IP addresses if you forget to specify an ip_address when creating the pod :-(.
It would be nice to have 'none' for --ip-range and --gateway.
I also miss a standard way to create firewall rules (nftable/iptables) for a pod.
It was a little difficult to deal with the stuff that's in common. Am I doing this workflow the right way? I made a fork of common with the changes and added a |
Yes this is the correct flow to test it here in podman. For actual merging you need to open the c/common changes as PR there. When it is merged there you just update the dependency here instead of the replace. |
Made the changes. So now the way to prevent a default gateway is with Looks like copr builds don't work when there are merge conflicts. Won't rebase at this point since at least the go.mod needs to be updated anyways. |
All tests passing, will just need a rebase once the PR in netavark and the PR in common are in to make rpm-builds work. |
cdb2dc7
to
ac4bcc3
Compare
Rebased this commit and incorporated the changes needed to switch to |
CI failed because some mirror seems to be down. Trying again. |
@Cydox: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
e6b1a02
to
4dd6478
Compare
The one failed test appears to be a gateway timeout from quay.io Can't quite make sense of the other failure at this hour. |
Updated this PR with all the latest changes in containers/common#1440 to test everything together. |
@@ -114,6 +116,10 @@ The `macvlan` and `ipvlan` driver support the following options: | |||
- Supported values for `macvlan` are `bridge`, `private`, `vepa`, `passthru`. Defaults to `bridge`. | |||
- Supported values for `ipvlan` are `l2`, `l3`, `l3s`. Defaults to `l2`. | |||
|
|||
#### **--route**=*route* | |||
|
|||
A static route in the format \<destination in CIDR notaion>,\<gateway>,\<route metric (optional)>. This route will be added to every container in this network. Only available with the netavark backend. Can be specified multiple times if more than one static route is desired. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A static route in the format \<destination in CIDR notaion>,\<gateway>,\<route metric (optional)>. This route will be added to every container in this network. Only available with the netavark backend. Can be specified multiple times if more than one static route is desired. | |
A static route in the format \<destination in CIDR notation>,\<gateway>,\<route metric (optional)>. This route will be added to every container in this network. Only available with the netavark backend. It can be specified multiple times if more than one static route is desired. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and do you need to backslash the <
, but not the >
symbols?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No those "\" are a typo (not sure how that happened...). I'll fix them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not xml. Here's an example route:
--route 10.0.10.0/24,10.0.0.1,100
Not sure if my notation here is the best way to describe that.
99afb8a
to
be0426b
Compare
Signed-off-by: Jan Hendrik Farr <[email protected]>
add routes using the --route flag. the no_default_route option in --opt prevents a default route from getting added automatically. Signed-off-by: Jan Hendrik Farr <[email protected]>
I vendored in the latest containers/common with containers/common#1440. Obviously this pulls in a lot of unrelated changes from c/common, so I'm not sure if this should be split into a separate PR. I split all the vendoring stuff into a separate commit (let me know if you prefer having this squashed) Now the CI is failing because of the bloat-check. Size of bin/podman increases by 171 kB. In the previous CI run were I only vendored in a branch of common with the changes needed for this PR, the size increase was only 1.9 kB. So the lion-share of this increase is from the unrelated upstream changes in c/common. @Luap99 How do we proceed? |
Also, sys podman debian-12 rootless host boltdb failed, but I believe this might be a weird random failure. I think I've seen a similar failure before that disappeared without any changes. |
Yes vendoring is broken right now: #18510. You have to wait until somebody from our team fixes that. |
Sorry for the delay, I cherry-picked your commit into #18835 because I had to fix the docs there to pass CI. It was just merged so we can close this PR, if anything is missing/not working let me know. |
Thanks @Luap99 Saw that it was cherry picked into that PR. It should all be working as expected, I've been running the rpm from this PR on a local fedora coreos server and it's all working as expected for my use-case. Will this correctly make its way into the release notes with this PR being closed? |
This PR adds the ability to add static routes to a network. Discussion is in containers/netavark#678. The PR for support in netavark is containers/netavark#680
Does this PR introduce a user-facing change?