Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Static Routes #18369

Closed
wants to merge 2 commits into from
Closed

Static Routes #18369

wants to merge 2 commits into from

Conversation

Cydox
Copy link
Contributor

@Cydox Cydox commented Apr 26, 2023

This PR adds the ability to add static routes to a network. Discussion is in containers/netavark#678. The PR for support in netavark is containers/netavark#680

Does this PR introduce a user-facing change?

This PR adds the --route flag and the no_default_route option in --opt. 

Example:
podman network create --subnet 10.10.10.0/24 --route 10.11.0.0/24,10.10.10.1,120 --opt no_default_route=1 testnet

This example creates a network with a static route to `10.11.0.0/24` via `10.10.10.1` 
and a route metric of `120`. In this example no default route will be added to 
containers in this network due to `no-default-route=1`. The route metric is optional. 
Multiple routes can be specified.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 26, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Cydox
Once this PR has been reviewed and has the lgtm label, please assign edsantiago for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 26, 2023
@Cydox Cydox force-pushed the static-route-pr branch 2 times, most recently from d72facd to dd337c9 Compare April 27, 2023 00:26
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 27, 2023
@Cydox Cydox force-pushed the static-route-pr branch 4 times, most recently from cf31eeb to a62cc53 Compare April 27, 2023 03:35
@Luap99
Copy link
Member

Luap99 commented Apr 27, 2023

/hold

You need to open a PR in c/common with the required changes there. And in either case we need to merge the implementation in netavark first merging anything in c/common or podman.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 27, 2023
Comment on lines 57 to 58
flags.BoolVar(&networkCreateOptions.NoAutoGateway, "no-auto-gateway", false, "do not automatically add gateways to subnets")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of this flag. Can you explain the use case more? Compared to e.g. --internal?
I really do not want to add more top level options to the Network type, if this is truly needed I think it is a much better fit via the generic options map.

Copy link
Contributor Author

@Cydox Cydox Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On macvlan it does the exact same thing as --internal. I confirmed that all internal does on macvlan is prevent a default gateway from being added automatically.

On bridge is where the difference comes in. Here, --internal also applies firewall rules that prevent external communication. So let's say you don't want a default route but add static routes to certain destinations. You can't achieve that with --internal as the firewall rules will block communication to those destinations.

Now that I write this, I think it might be possible to simply have the firewall rules allow communication to those static route destinations. Although that makes the meaning of --internal a little murky. In that case --internal would be internal except for those allowed packets to static route destinations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--internal does not add firewall rules to prevent external communication with bridge, we just set a sysctl to block routing on the bridge interface so it is impossible to reach ips outside the subnet.

I think having this as a use case is fine, but it should not be top level. Just put it in the existing --opt which just maps to a [string]string map, so something like --opt no_auto_gateway=1 could be used.

We shouldn't change the meaning of internal at this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bridge network is a NAT router between the host and the pod/container. This NAT router uses the host route table. To make this work a default gateway for the bridge network is necessary.

On a macvlan network the gateway is explicitly defined with the --gateway option. We don't need no_auto_gateway for macvlan.

Combining bridge and macvlan in 1 pod/container is complex and requires rule-based routing. But this is complex and out of scope.

I only see multiple networks per pod/container with macvlan networks. I wouldn't allow a combination of macvlan and bridge networks. Therefore the no_auto_gateway flag is not needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edited the above comment, because I wrote no_default_gateway instead of no_default_route.

Copy link
Contributor

@pjannesen pjannesen May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The no_auto_gateway flag in this PR will be replaced by no_default_route. The relevant PR is containers/netavark#697

A bridge network is a NAT router between the host and the pod/container. This NAT router uses the host route table. To make this work a default gateway for the bridge network is necessary.

For a bridge network a default route is not necessary if you have a relevant static route set up via the gateway.

You can create static routes if you know the gateway address automatically defined by the bridge driver.
In a normal IP network, packets are allowed to follow different routes. Only the destination determines the route.
This does not apply to NAT routing. For connections that go over NAT, all packets must go through the same NAT router (because of connection tracking). This can be achieved with rule-based routing. However, this is out of scope I think.
Therefore I wouldn't recommend using bridge and macvlan in the same pod/container.

On a macvlan network the gateway is explicitly defined with the --gateway option. We don't need no_auto_gateway for macvlan.

On macvlan even without the --gateway option a gateway is added to the network config.

If the gateway address is not defined what address is then used? It is not wise not to make assumptions about what the gateway address might be (there is no standard). If this is already done by the macvlan driver, than this is unexpected behavior in my opinion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can create static routes if you know the gateway address automatically defined by the bridge driver.

You can use the --gateway option to define the gateway address yourself. So in that case you know the ip.

In a normal IP network, packets are allowed to follow different routes. Only the destination determines the route. This does not apply to NAT routing. For connections that go over NAT, all packets must go through the same NAT router (because of connection tracking).

I mean there can be setups where you wanne define a static route that goes to another container (certain VPN or Tor setups come to mind) instead of to the gateway.

Even without a setup like that you might want to not have a default route and only a specific route via the gateway. Example use-case is if you are using an ipv6 ULA prefix, but don't have ipv6 connectivity to the internet. In that case you want the container to not have an ipv6 default route and only a route to that ULA prefix via the gateway.

Therefore I wouldn't recommend using bridge and macvlan in the same pod/container.

Not sure what you are referring to here.

If the gateway address is not defined what address is then used?

Example:

sudo podman network create -d macvlan --subnet 10.0.0.0/24 testnet
sudo podman network inspect testnet

Output:

[
     {
          "name": "testnet123",
          "id": "cb5b9938629cced32336b1f6eb58eb68371c87cd27b5a19c8bf3ed543a538635",
          "driver": "macvlan",
          "created": "2023-05-22T22:48:43.981931988+02:00",
          "subnets": [
               {
                    "subnet": "10.0.0.0/24",
                    "gateway": "10.0.0.1"
               }
          ],
          "ipv6_enabled": false,
          "internal": false,
          "dns_enabled": false,
          "ipam_options": {
               "driver": "host-local"
          }
     }
]

It is not wise not to make assumptions about what the gateway address might be (there is no standard). If this is already done by the macvlan driver, than this is unexpected behavior in my opinion.

Podman's behavior currently is to use the first address in the subnet as the gateway. If you want another gateway you can specify it with --gateway, or disable the default route with the no_default_route option that I'll push to this PR later tonight.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To justify the feature in one (or two) sentence: If we give the user the ability to add routes to the container network namespace, then we should give him the ability to disable the otherwise automatically generated default route. That way he has full control over the routes in the network namespace if he desires.

Greetings from Delft, btw.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree. multiple networks without static routes are almost useless. I don't like the no_default_path flag. But since podman assumes a gateway address without specifying one, I don't see any other solution.

We are researching podman for our internet gateways. A pod with 2 networks. 1 with a real internet ip number and 1 with an internal private ip address and containers with bind, nginx, postfix, asterisk, etc. So i hope this PR is in the next release :)

Podman also assumes a ip-range without specifying one (based on the subnet). This can result in duplicate IP addresses if you forget to specify an ip_address when creating the pod :-(.
It would be nice to have 'none' for --ip-range and --gateway.

I also miss a standard way to create firewall rules (nftable/iptables) for a pod.

@Cydox
Copy link
Contributor Author

Cydox commented Apr 27, 2023

You need to open a PR in c/common with the required changes there.

It was a little difficult to deal with the stuff that's in common. Am I doing this workflow the right way? I made a fork of common with the changes and added a replace in the go.mod here. So when I open a PR in common and that is merged, this PR requires a change to the go.mod before merging. Is there a better way around this?

@Luap99
Copy link
Member

Luap99 commented Apr 27, 2023

You need to open a PR in c/common with the required changes there.

It was a little difficult to deal with the stuff that's in common. Am I doing this workflow the right way? I made a fork of common with the changes and added a replace in the go.mod here. So when I open a PR in common and that is merged, this PR requires a change to the go.mod before merging. Is there a better way around this?

Yes this is the correct flow to test it here in podman. For actual merging you need to open the c/common changes as PR there. When it is merged there you just update the dependency here instead of the replace.

@Cydox Cydox force-pushed the static-route-pr branch from a62cc53 to d300ebc Compare April 27, 2023 11:23
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 27, 2023
@Cydox
Copy link
Contributor Author

Cydox commented Apr 27, 2023

Made the changes. So now the way to prevent a default gateway is with --opt no_auto_gateway=1. Edited the "Does this PR introduce a user-facing change?" section accordingly. Also opened a PR in common with the needed changes.

Looks like copr builds don't work when there are merge conflicts. Won't rebase at this point since at least the go.mod needs to be updated anyways.

@Cydox
Copy link
Contributor Author

Cydox commented Apr 27, 2023

All tests passing, will just need a rebase once the PR in netavark and the PR in common are in to make rpm-builds work.

@Cydox Cydox force-pushed the static-route-pr branch from d300ebc to 74434d0 Compare May 23, 2023 01:21
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 23, 2023
@Cydox Cydox force-pushed the static-route-pr branch 2 times, most recently from cdb2dc7 to ac4bcc3 Compare May 23, 2023 02:49
@Cydox
Copy link
Contributor Author

Cydox commented May 23, 2023

Rebased this commit and incorporated the changes needed to switch to no_default_route instead of no_auto_gateway, as well as the other changes from containers/common#1440

@Cydox
Copy link
Contributor Author

Cydox commented May 23, 2023

CI failed because some mirror seems to be down. Trying again.
/test all

@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 23, 2023

@Cydox: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

CI failed because some mirror seems to be down. Trying again.
/test all

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Cydox Cydox force-pushed the static-route-pr branch 2 times, most recently from e6b1a02 to 4dd6478 Compare May 23, 2023 04:19
@Cydox
Copy link
Contributor Author

Cydox commented May 23, 2023

The one failed test appears to be a gateway timeout from quay.io

Can't quite make sense of the other failure at this hour.

@Cydox Cydox force-pushed the static-route-pr branch from 4dd6478 to 423f960 Compare May 24, 2023 00:14
@Cydox
Copy link
Contributor Author

Cydox commented May 24, 2023

Updated this PR with all the latest changes in containers/common#1440 to test everything together.

@@ -114,6 +116,10 @@ The `macvlan` and `ipvlan` driver support the following options:
- Supported values for `macvlan` are `bridge`, `private`, `vepa`, `passthru`. Defaults to `bridge`.
- Supported values for `ipvlan` are `l2`, `l3`, `l3s`. Defaults to `l2`.

#### **--route**=*route*

A static route in the format \<destination in CIDR notaion>,\<gateway>,\<route metric (optional)>. This route will be added to every container in this network. Only available with the netavark backend. Can be specified multiple times if more than one static route is desired.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
A static route in the format \<destination in CIDR notaion>,\<gateway>,\<route metric (optional)>. This route will be added to every container in this network. Only available with the netavark backend. Can be specified multiple times if more than one static route is desired.
A static route in the format \<destination in CIDR notation>,\<gateway>,\<route metric (optional)>. This route will be added to every container in this network. Only available with the netavark backend. It can be specified multiple times if more than one static route is desired.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and do you need to backslash the <, but not the > symbols?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No those "\" are a typo (not sure how that happened...). I'll fix them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not xml. Here's an example route:

--route 10.0.10.0/24,10.0.0.1,100

Not sure if my notation here is the best way to describe that.

go.mod Outdated Show resolved Hide resolved
@Cydox Cydox force-pushed the static-route-pr branch 2 times, most recently from 99afb8a to be0426b Compare May 25, 2023 00:55
Cydox added 2 commits May 25, 2023 01:07
Signed-off-by: Jan Hendrik Farr <[email protected]>
add routes using the --route flag.
the no_default_route option in --opt prevents a default route from
getting added automatically.

Signed-off-by: Jan Hendrik Farr <[email protected]>
@Cydox Cydox force-pushed the static-route-pr branch from be0426b to 6b7ec92 Compare May 25, 2023 01:08
@Cydox
Copy link
Contributor Author

Cydox commented May 25, 2023

I vendored in the latest containers/common with containers/common#1440. Obviously this pulls in a lot of unrelated changes from c/common, so I'm not sure if this should be split into a separate PR.

I split all the vendoring stuff into a separate commit (let me know if you prefer having this squashed)

Now the CI is failing because of the bloat-check. Size of bin/podman increases by 171 kB. In the previous CI run were I only vendored in a branch of common with the changes needed for this PR, the size increase was only 1.9 kB. So the lion-share of this increase is from the unrelated upstream changes in c/common.

@Luap99 How do we proceed?

@Cydox
Copy link
Contributor Author

Cydox commented May 25, 2023

Also, sys podman debian-12 rootless host boltdb failed, but I believe this might be a weird random failure. I think I've seen a similar failure before that disappeared without any changes.

@Luap99
Copy link
Member

Luap99 commented May 25, 2023

Also, sys podman debian-12 rootless host boltdb failed, but I believe this might be a weird random failure. I think I've seen a similar failure before that disappeared without any changes.

Yes vendoring is broken right now: #18510. You have to wait until somebody from our team fixes that.

@Luap99
Copy link
Member

Luap99 commented Jun 12, 2023

Sorry for the delay, I cherry-picked your commit into #18835 because I had to fix the docs there to pass CI. It was just merged so we can close this PR, if anything is missing/not working let me know.

@Luap99 Luap99 closed this Jun 12, 2023
@Cydox
Copy link
Contributor Author

Cydox commented Jun 12, 2023

Thanks @Luap99

Saw that it was cherry picked into that PR.

It should all be working as expected, I've been running the rpm from this PR on a local fedora coreos server and it's all working as expected for my use-case.

Will this correctly make its way into the release notes with this PR being closed?

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 11, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 11, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants