libpod: make use of new pasta option from c/common #23791
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Luap99 The full list of commands accepted by this bot can be found here. The pull request process is described here
e99eb0d
to
c1e3950
Compare
Mostly LGTM, at least with my limited familiarity with the podman codebase. A few nits noted in comments.
libpod/container_internal_common.go
@@ -2306,8 +2308,13 @@ func (c *Container) addHosts() error {
	}

	var exclude []net.IP
	var hostContainersInernalIP string
I'm guessing you meant Internal instead of Inernal?
yes
	// mapGuestAddrIpv4 static ip used as forwarder address inside the netns to reach the host,
	// given this is a "link local" ip it should be very unlikely that it causes conflicts
	mapGuestAddrIpv4 = "169.254.0.2"
)
So, this is a pre-existing issue with the --dns-forward address, but RFC 3927 says the first and last 256 addresses of 169.254.0.0/16 are "reserved for future use". So it might be wiser to use 169.254.1.XXX.
FYI: vendor code is just a copy of the actual upstream code, so it is best to add your comments on containers/common#2136.
You are right that these are "reserved":
The first 256 and last 256 addresses in the 169.254/16
prefix are reserved for future use and MUST NOT be selected by a host
using this dynamic configuration mechanism.
I mean, sure, it is reserved, so it might get a special meaning which can break us, but using another ip from the proper range also risks conflicts with real-world users.
As an example, all our GCE CI VMs currently have nameserver 169.254.169.254 set, so if we used that ip it could no longer be reached directly from the container, which currently works fine. We do not really know how link-local addresses are used in the wild, and we should have a default that is unlikely to cause conflicts.
Of course, we also might get in trouble if this range is assigned to some new use. If you have other suggestions for unused ip ranges, I am happy to hear them.
FYI: vendor code is just a copy of the actual upstream code,
I'm aware, but I didn't know how to find the relevant upstream PR..
so it is best to add your comments on containers/common#2136.
.. until now.
You are right that these are "reserved":
The first 256 and last 256 addresses in the 169.254/16
prefix are reserved for future use and MUST NOT be selected by a host
using this dynamic configuration mechanism.
I mean, sure, it is reserved, so it might get a special meaning which can break us, but using another ip from the proper range also risks conflicts with real-world users.
As an example, all our GCE CI VMs currently have nameserver 169.254.169.254 set, so if we used that ip it could no longer be reached directly from the container, which currently works fine. We do not really know how link-local addresses are used in the wild, and we should have a default that is unlikely to cause conflicts.
I think it should be ok to use something in the non-reserved range. AIUI, if guests want to allocate a link-local themselves they should do duplicate address detection via ARP. Since we know we're the first thing on the link, we should be ok. Actually, right now pasta responds to all ARPs, so I'm pretty sure a guest can't allocate a link local addr (everything will appear to be a duplicate). Which is something I want to fix, but largely independent of this.
It is not that a guest allocates this. If some service on the network is using a link-local address and we use the same address for the dns or map guest option, this ip will no longer be reachable from within the container because pasta maps it to a different ip on the host.
To be clear, here the GCE instance has the ip 10.128.15.199/32 assigned to its interface, with these routes:
default via 10.128.0.1 dev eth0 proto dhcp src 10.128.15.199 metric 100
10.128.0.1 dev eth0 proto dhcp scope link src 10.128.15.199 metric 100
Yet it uses 169.254.169.254 as nameserver, which routes fine in this network. I do not know if other networks do something similar where the link-local range is used. So there is very well a chance of breaking connectivity to that host if we use the same link-local ip for the option as that host.
And yes, I agree this very much "might break" only in super weird networks and is very likely not a problem for basically everyone. So if others think using 169.254.1.x is better in order to avoid reserved ranges, I am fine to change it.
@sbrivio-rh @mheon WDYT?
Reviewing all this in a bit, and then I'll come back to this.
Actually, let me answer here first, as @dgibson already reviewed (I might try to review as well but don't count on it).
if some service on the network is using a link-local address and we use the same address for the dns or map guest option, this ip will no longer be reachable from within the container because pasta maps it to a different ip on the host
I don't see that as a real concern because those addresses are link-local, and anything a container might reasonably expect to access on the same network segment (that's the scope of link-local addresses) is pretty much just DNS nowadays, which we already handle in a special way.
Strictly speaking, I guess Podman should perform duplicate address detection (for IPv4, it's done via ARP, RFC 3927 2.5). It's an optimistic form of detection so it doesn't cause additional delays in a general case, but it's not exceedingly simple either, and the risk looks so low to me that I'm not sure it makes sense to implement this.
So, yeah, I would use 169.254.1.x/24.
The risk of using reserved ranges is not just that they might be assigned to something one day, but you might fail sanity checks in applications, or in a future version of the kernel, etc.
This kind of all comes back to what we consider "the link". Is it just the virtual link from the container to pasta, or does it include things that are local on one of the host's links? At the moment we're not really consistent either way, which is something I have slowly progressing plans to sanitize.
Allowing access to things on the host link is arguably "more transparent", and does let you reach those sites. However it makes it much less clear what the semantics should be in the case of multiple host links, or a disconnected host, or a host moving from one link to another. Plus, I'm not sure it's even possible to adequately maintain the illusion of being on the host link in enough cases.
So, I'm generally trying to push us in the direction of considering the pasta<->container connection "the link", at least in default configurations. This has the extremely convenient additional effect that it means that link-local addresses all become free to allocate for internal purposes.
updated in containers/common#2136
vendor/github.com/containers/common/libnetwork/pasta/pasta_linux.go
@@ -10,6 +10,9 @@ type SetupResult struct {
	// DNSForwardIP is the ip used in --dns-forward, it should be added as first
	// entry to resolv.conf in the container.
	DNSForwardIPs []string
	// MapGuestIps are the ips used for the --map-guest-addr option which
	// we can use for the host.containers.internal entry.
	MapGuestAddrIPs []string
Note that this can have at most one IPv4 and one IPv6 address (for now, anyway).
Per feedback[1] the 169.254.0.0/24 range is reserved for future use in RFC 3927. As such we should not use it here as it might break in the future if the range gets assigned a new meaning. Switch to 169.254.1.1.
[1] containers/podman#23791 (comment)
Signed-off-by: Paul Holzinger <[email protected]>
c1e3950
to
28135e0
Compare
Includes my pasta changes. Signed-off-by: Paul Holzinger <[email protected]>
pasta added a new --map-guest-addr option that maps an address to the actual host ip. This is exactly what we need for the host.containers.internal entry. So we now make use of this option by default but still have to keep the exclude fallback because the option is very new and some users/distros will not have it yet.
This also fixes an issue where the --dns-forward ips were not used in bridge network mode. That fix only matters when not using aardvark-dns, which already used the proper ips from the rootless netns resolv.conf file.
Fixes containers#19213
Signed-off-by: Paul Holzinger <[email protected]>
28135e0
to
a1e6603
Compare
/lgtm
f22f4cf
into
containers:main
pasta added a new --map-guest-addr option that maps an address to the actual
host ip. This is exactly what we need for the host.containers.internal
entry. So we now make use of this option by default but still have to
keep the exclude fallback because the option is very new and some
users/distros will not have it yet.
This also fixes an issue where the --dns-forward ips were not used in
bridge network mode. That fix only matters when not using aardvark-dns,
which already used the proper ips from the rootless netns
resolv.conf file.
Fixes #19213
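The "use the new option by default, keep the exclude fallback" behaviour described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: pickHostInternalIP is a hypothetical helper, the slice stands in for the MapGuestAddrIPs field of SetupResult shown in the diff, and the sample address is made up.

```go
package main

import "fmt"

// pickHostInternalIP returns the ip to write for host.containers.internal
// when pasta reported a --map-guest-addr ip. When the list is empty (for
// example because the installed pasta is too old to know the option), it
// signals that the caller has to fall back to the older exclude-based
// lookup instead. Hypothetical helper for illustration only.
func pickHostInternalIP(mapGuestAddrIPs []string) (ip string, useExcludeFallback bool) {
	if len(mapGuestAddrIPs) > 0 {
		return mapGuestAddrIPs[0], false
	}
	return "", true
}

func main() {
	// Sample value; the address actually chosen by containers/common may differ.
	ip, fallback := pickHostInternalIP([]string{"169.254.1.2"})
	fmt.Printf("ip=%q fallback=%t\n", ip, fallback)

	// No ips reported: the caller would use the exclude fallback.
	ip, fallback = pickHostInternalIP(nil)
	fmt.Printf("ip=%q fallback=%t\n", ip, fallback)
}
```

The sketch simply takes the first entry; per the review note above, the list can hold at most one IPv4 and one IPv6 address, so a real implementation may want to choose by address family.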