-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rootless cni without infra container #9423
rootless cni without infra container #9423
Conversation
acdd288
to
55d974d
Compare
55d974d
to
bd2f58c
Compare
48e7065
to
627329c
Compare
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cevich, Luap99 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
bbf4eea
to
d3afcdd
Compare
Well the new test failed. Most likely a flake. |
Instead of creating an extra container create a network and mount namespace inside the podman user namespace. This ns is used to for rootless cni operations. This helps to align the rootless and rootful network code path. If we run as rootless we just have to set up a extra net ns and initialize slirp4netns in it. The ocicni lib will be called in that net ns. This design allows allows easier maintenance, no extra container with pause processes, support for rootless cni with --uidmap and possibly more. The biggest problem is backwards compatibility. I don't think live migration can be possible. If the user reboots or restart all cni containers everything should work as expected again. The user is left with the rootless-cni-infa container and image but this can safely be removed. To make the existing cni configs work we need execute the cni plugins in a extra mount namespace. This ensures that we can safely mount over /run and /var which have to be writeable for the cni plugins without removing access to these files by the main podman process. One caveat is that we need to keep the netns files at `XDG_RUNTIME_DIR/netns` accessible. `XDG_RUNTIME_DIR/rootless-cni/{run,var}` will be mounted to `/{run,var}`. To ensure that we keep the netns directory we bind mount this relative to the new root location, e.g. XDG_RUNTIME_DIR/rootless-cni/run/user/1000/netns before we mount the run directory. The run directory is mounted recursive, this makes the netns directory at the same path accessible as before. This also allows iptables-legacy to work because /run/xtables.lock is now writeable. Signed-off-by: Paul Holzinger <[email protected]>
This is supported with the new rootless cni logic. Signed-off-by: Paul Holzinger <[email protected]>
For rootless users the infra container used the slirp4netns net mode even when bridge was requested. We can support bridge networking for rootless users so we have allow this. The default is not changed. Signed-off-by: Paul Holzinger <[email protected]>
This should make maintenance easier. Signed-off-by: Paul Holzinger <[email protected]>
With the new rootless cni supporting network connect/disconnect is easy. Combine common setps into extra functions to prevent code duplication. Signed-off-by: Paul Holzinger <[email protected]>
Signed-off-by: Paul Holzinger <[email protected]>
Do not invoke the rootlesskit port forwarder when the container has no ports. Signed-off-by: Paul Holzinger <[email protected]>
Signed-off-by: Paul Holzinger <[email protected]>
Make sure the DOCKER_SOCK location is accessible by the user when run rootless. Alos set the DOCKER_HOST env var to ensure docker-compose will use the non default location. Cleanup steps such as `rm` or `umount` must be run inside podman unshare otherwise they can fail due missing privileges. Change the curl test to use --retry-all-errors otherwise the tests will flake. The web server inside the container will return http code 500 sometimes, most likely because it is not fully ready to accept connections. With --retry-all-errors curl will retry instead of failing and thus the test will work. Signed-off-by: Paul Holzinger <[email protected]>
Also fix the tests so we can use the podman function with the output. Signed-off-by: Paul Holzinger <[email protected]>
Delte the network namespace and kill the slirp4netns process when it is no longer needed. Signed-off-by: Paul Holzinger <[email protected]>
If a user only has a local dns server in the resolv.conf file the dns resolution will fail. Instead we create a new resolv.conf which will use the slirp4netns dns. Signed-off-by: Paul Holzinger <[email protected]>
Signed-off-by: Paul Holzinger <[email protected]>
d3afcdd
to
d1e32dc
Compare
LGTM as in looks GREAT to me! well done |
/lgtm |
Instead of creating an extra container create a network and mount
namespace inside the podman user namespace. This ns is used to
for rootless cni operations.
This helps to align the rootless and rootful network code path.
If we run as rootless we just have to set up a extra net ns and
initialize slirp4netns in it. The ocicni lib will be called in
that net ns.
This design allows allows easier maintenance, no extra container
with pause processes, support for rootless cni with --uidmap
and possibly more.
The biggest problem is backwards compatibility. I don't think
live migration can be possible. If the user reboots or restart
all cni containers everything should work as expected again.
The user is left with the rootless-cni-infa container and image
but this can safely be removed.
To make the existing cni configs work we need execute the cni plugins
in a extra mount namespace. This ensures that we can safely mount over
/run and /var which have to be writeable for the cni plugins without
removing access to these files by the main podman process. One caveat
is that we need to keep the netns files at
XDG_RUNTIME_DIR/netns
accessible.
XDG_RUNTIME_DIR/rootless-cni/{run,var}
will be mounted to/{run,var}
.To ensure that we keep the netns directory we bind mount this relative
to the new root location, e.g. XDG_RUNTIME_DIR/rootless-cni/run/user/1000/netns
before we mount the run directory. The run directory is mounted recursive,
this makes the netns directory at the same path accessible as before.
This also allows iptables-legacy to work because /run/xtables.lock is
now writeable.
Add rootless support for cni and --uidmap. This is supported with the new rootless cni logic.
Lastly enable network connect/disconnect for rootless users.
The docker-compose test has been changed to work rootless and also test for network connect.
Refer to the individual commits for more details.
Fixes #9169
Fixes #8709