Rootless: Add ability to select port driver #6488

Closed
zroug opened this issue Nov 13, 2022 · 11 comments
@zroug

zroug commented Nov 13, 2022

Is your feature request related to a problem? Please describe.
It does not seem to be possible to get the source IP of ingress traffic in rootless mode.

Describe the solution you'd like
K3s currently always uses the builtin port driver of RootlessKit. According to https://github.com/rootless-containers/rootlesskit/blob/1920341cd41e047834a21007424162a2dc946315/docs/port.md, this driver does not propagate source IP addresses. The slirp4netns port driver does propagate them, but it is slower. The user should be able to choose between the two and make that tradeoff.

Describe alternatives you've considered
Inspecting the traffic outside of the cluster.

Additional context
https://github.com/rootless-containers/rootlesskit/blob/1920341cd41e047834a21007424162a2dc946315/docs/port.md

opt.PortDriver, err = portbuiltin.NewParentDriver(debugWriter, stateDir)

opt.PortDriver = portbuiltin.NewChildDriver(&logrusDebugWriter{})

This is a copy of #5405 because a bot closed that one incorrectly.

@brandond
Member

We bundle slirp4netns; if we're not using it, then I'm not sure why we include it.
cc @AkihiroSuda

@AkihiroSuda
Contributor

RootlessKit is used for ingress traffic; slirp4netns is used for egress traffic. So slirp4netns cannot be removed.

@brandond
Member

So would this need to be configurable on an ingress/egress basis? We also have a request in another issue to allow configuring the MTU. It would be nice if there was a way to allow all of this without adding a bunch more flags.

@brandond
Member

Check out the PR I just merged. You should be able to use a commit build from the installer once CI is done.

@brandond
Member

brandond commented Nov 15, 2022

@AkihiroSuda with the changes from that PR, I am now able to run both server and agent on the same node in different rootless user units. Unfortunately, they both end up having the same address by default unless I manually change the slirp4netns CIDR block to something unique for each of them. Can you think of a better way to handle this?

brandond@dev01:~$ kubectl get node -A -o wide
NAME         STATUS   ROLES                  AGE     VERSION                INTERNAL-IP   EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION    CONTAINER-RUNTIME
k3s-server   Ready    control-plane,master   3m31s   v1.25.4+k3s-b09dea28   10.41.0.100   <none>        Ubuntu 22.10   5.19.0-1011-aws   containerd://1.6.8-k3s1
k3s-agent    Ready    <none>                 2m58s   v1.25.4+k3s-b09dea28   10.41.0.100   <none>        Ubuntu 22.10   5.19.0-1011-aws   containerd://1.6.8-k3s1

Even after doing that, I have other problems... I need to override the server's advertise-address to the node's actual IP in order for the agent to connect properly, and even after doing that metrics-server doesn't work right because the node IPs cannot route to each other. I suspect flannel isn't routing pod-to-pod traffic between nodes correctly either. I guess that in order to make this work properly, there would need to be some way to forward traffic between tunnel interfaces? I have no idea where to even begin with that.

There is also the larger issue of making this work across multiple hosts, which I think would have a bunch of challenges that I haven't even thought of yet.

@AkihiroSuda
Contributor

> Can you think of a better way to handle this?

Maybe hash the username into the CIDR?
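
A minimal sketch of that idea, assuming a wrapper script sets K3S_ROOTLESS_CIDR before K3s starts. The helper name `cidr_for_user` and the 10.64.0.0/16 to 10.127.0.0/16 range are assumptions chosen for illustration (the range stays clear of the K3s default pod and service CIDRs, 10.42.0.0/16 and 10.43.0.0/16). Nothing here is implemented in K3s or RootlessKit.

```shell
#!/bin/sh
# Hypothetical helper: map a username to a /16 between 10.64.0.0 and 10.127.0.0.
# The range and the use of cksum are illustrative assumptions, not K3s behavior.
cidr_for_user() {
    user="$1"
    # cksum prints "<crc> <byte count>"; keep only the CRC.
    crc=$(printf '%s' "$user" | cksum | cut -d ' ' -f 1)
    # Fold the CRC into one of 64 second octets (64..127).
    octet=$((64 + crc % 64))
    printf '10.%d.0.0/16\n' "$octet"
}

# Export the variable that the rootless unit file uses for the slirp4netns CIDR.
K3S_ROOTLESS_CIDR=$(cidr_for_user "$(id -un)")
export K3S_ROOTLESS_CIDR
```

With only 64 buckets, two users on the same host can still collide, so this reduces the need for manual assignment rather than removing it.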

> across multiple hosts

Flannel (VXLAN) is known to work in rootless mode, but it is not integrated into rootless k3s yet.
https://github.com/rootless-containers/usernetes#multi-node-docker-compose

@brandond
Member

brandond commented Nov 15, 2022

What about routing between multiple user namespaces on the same host? I see that the slirp4netns docs mention needing to use an external tool like vde_plug to make this work between instances; would it be possible to somehow get k3s to share the slirp4netns instance instead?

@AkihiroSuda
Contributor

> would it be possible to somehow get k3s to share the slirp4netns instance instead?

Theoretically yes, but not implemented yet.

@brandond
Member

Not implemented on the rootlesskit side, or just not in K3s yet? If it just needs some work on our side I'm glad to continue poking at it.

@AkihiroSuda
Contributor

> What about routing between multiple user namespaces on the same host?

Maybe we should add optional support for lxc-user-nic: https://github.com/rootless-containers/rootlesskit/blob/master/docs/network.md#--netlxc-user-nic-experimental

This assigns a "real" IP address that is accessible from other user namespaces, but it requires configuring /etc/lxc/lxc-usernet for each user.

@VestigeJ

VestigeJ commented Dec 6, 2022

## Environment Details
COMMIT=b5d39df9294627cbfa3081acb92e2be54f02b0d6
VERSION=v1.25.4+k3s1

Infrastructure

  • Cloud

Node(s) CPU architecture, OS, and version:

Linux 5.15.0-1019-aws x86_64 GNU/Linux Ubuntu 22.04.1 LTS

Cluster Configuration:

NAME               STATUS   ROLES                  AGE   VERSION
ip-1-1-2-1   Ready    control-plane,master   26m   v1.25.4+k3s-b5d39df9 

k3s-rootless.service

[Unit]
Description=k3s (Rootless)

[Service]
Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Environment=K3S_ROOTLESS_CIDR="10.41.0.0/16"
Environment=K3S_ROOTLESS_PORT_DRIVER=slirp4netns
Environment=K3S_ROOTLESS_DISABLE_HOST_LOOPBACK=true
Environment=K3S_ROOTLESS_MTU=1500
# NOTE: Don't try to run `k3s server --rootless` on a terminal, as it doesn't enable cgroup v2 delegation.
# If you really need to try it on a terminal, prepend `systemd-run --user -p Delegate=yes --tty` to create a systemd scope.
ExecStart=/usr/local/bin/k3s server --rootless --snapshotter=fuse-overlayfs
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
Type=simple
KillMode=mixed

[Install]
WantedBy=default.target
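
Instead of editing the unit file directly, the same variables could be set from a systemd user drop-in. The sketch below is one way to do that. The helper name `write_rootless_override` and the file name `override.conf` are hypothetical, and the check assumes that `builtin` and `slirp4netns` are the only port driver values of interest, since those are the two discussed in this issue.

```shell
#!/bin/sh
# Hypothetical helper: write a drop-in that selects the rootless port driver.
# The variable name comes from the unit file above; everything else is an assumption.
write_rootless_override() {
    dropin_dir="$1"   # e.g. ~/.config/systemd/user/k3s-rootless.service.d
    driver="$2"       # port driver to select
    case "$driver" in
        builtin|slirp4netns) ;;
        *) echo "unsupported port driver: $driver" >&2; return 1 ;;
    esac
    mkdir -p "$dropin_dir" || return 1
    # Drop-in files only need the section and the keys being overridden.
    cat > "$dropin_dir/override.conf" <<EOF
[Service]
Environment=K3S_ROOTLESS_PORT_DRIVER=$driver
EOF
}
```

A possible invocation is `write_rootless_override "$HOME/.config/systemd/user/k3s-rootless.service.d" slirp4netns`, followed by `systemctl --user daemon-reload` and `systemctl --user restart k3s-rootless.service`.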

Default k3s-rootless.service output

$  curl https://get.k3s.io --output install-k3s.sh
$  sudo chmod +x install-k3s.sh
$  sudo groupadd --system etcd && sudo useradd -s /sbin/nologin --system -g etcd etcd
$  wget https://raw.githubusercontent.com/k3s-io/k3s/master/k3s-rootless.service
$  mkdir -p /home/ubuntu/.config/systemd/user/
$  cp k3s-rootless.service /home/ubuntu/.config/systemd/user/k3s-rootless.service
$  printf "[Service]\nDelegate=cpu cpuset io memory pids\n" > delegate.conf
$  sudo mkdir -p /etc/systemd/system/user@.service.d/
$  sudo cp ~/delegate.conf /etc/systemd/system/user@.service.d/delegate.conf
$  sudo tee -a /etc/modules <<EOF
fuse
tun
tap 
bridge
br_netfilter 
veth
ip_tables
ip6_tables
iptable_nat
ip6table_nat
iptable_filter
ip6table_filter
nf_tables
x_tables
xt_MASQUERADE
xt_addrtype
xt_comment
xt_conntrack
xt_mark
xt_multiport
xt_nat
xt_tcpudp
EOF

$  sudo vim /etc/default/grub
$  sudo update-grub
$  VERSION=v1.25.4+k3s1
$  sudo INSTALL_K3S_VERSION=$VERSION INSTALL_K3S_SKIP_ENABLE=true ./install-k3s.sh 
$  sudo cat k3s-rootless.service 
$  sudo vim .config/systemd/user/k3s-rootless.service 
$  printf "net.ipv4.ip_forward=1\nnet.ipv6.conf.all.forwarding=1\n" | sudo tee -a /etc/sysctl.conf > /dev/null
$  sudo apt update
$  sudo apt install uidmap
$  sudo reboot
$  systemctl --user enable --now k3s-rootless.service
$  systemctl --user status k3s-rootless

Results:
With the default k3s-rootless.service file

$ systemctl --user status k3s-rootless
● k3s-rootless.service - k3s (Rootless)
     Loaded: loaded (/home/ubuntu/.config/systemd/user/k3s-rootless.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2022-12-06 19:44:39 UTC; 24s ago
   Main PID: 1023 (k3s-server)
      Tasks: 60
     Memory: 620.9M
        CPU: 20.910s
     CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/k3s-rootless.service
             ├─k3s
             │ └─1072 "k3s server" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
             └─k3s_evac
               ├─1023 "/usr/local/bin/k3s server" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "">
               ├─1038 "/proc/self/exe init" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "">
               ├─1048 slirp4netns --mtu 65520 -r 3 --disable-host-loopback --cidr 10.41.0.0/16 1038 tap0
               ├─1051 "k3s server" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
               └─1105 containerd -c /home/ubuntu/.rancher/k3s/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/container>

Additional Environment Variables k3s-rootless.service output

*Note the additional Environment variables at the top of the service file.*

Validation Steps

$  curl https://get.k3s.io --output install-k3s.sh
$  sudo chmod +x install-k3s.sh
$  sudo groupadd --system etcd && sudo useradd -s /sbin/nologin --system -g etcd etcd
$  wget https://raw.githubusercontent.com/k3s-io/k3s/master/k3s-rootless.service
$  mkdir -p /home/ubuntu/.config/systemd/user/
$  cp k3s-rootless.service /home/ubuntu/.config/systemd/user/k3s-rootless.service
$  printf "[Service]\nDelegate=cpu cpuset io memory pids\n" > delegate.conf
$  sudo mkdir -p /etc/systemd/system/user@.service.d/
$  sudo cp ~/delegate.conf /etc/systemd/system/user@.service.d/delegate.conf
$  sudo tee -a /etc/modules <<EOF
fuse
tun
tap 
bridge
br_netfilter 
veth
ip_tables
ip6_tables
iptable_nat
ip6table_nat
iptable_filter
ip6table_filter
nf_tables
x_tables
xt_MASQUERADE
xt_addrtype
xt_comment
xt_conntrack
xt_mark
xt_multiport
xt_nat
xt_tcpudp
EOF

$  sudo vim /etc/default/grub
$  sudo update-grub
$  COMMIT=b5d39df9294627cbfa3081acb92e2be54f02b0d6
$  sudo INSTALL_K3S_COMMIT=$COMMIT INSTALL_K3S_SKIP_ENABLE=true ./install-k3s.sh 
$  sudo cat k3s-rootless.service 
$  sudo vim .config/systemd/user/k3s-rootless.service 
$  printf "net.ipv4.ip_forward=1\nnet.ipv6.conf.all.forwarding=1\n" | sudo tee -a /etc/sysctl.conf > /dev/null
$  sudo apt update
$  sudo apt install uidmap
$  sudo reboot
$  systemctl --user enable --now k3s-rootless.service
$  systemctl --user status k3s-rootless

$ systemctl --user status k3s-rootless

● k3s-rootless.service - k3s (Rootless)
     Loaded: loaded (/home/ubuntu/.config/systemd/user/k3s-rootless.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2022-12-06 19:26:46 UTC; 36min ago
   Main PID: 4182 (k3s-server)
      Tasks: 173
     Memory: 718.6M
        CPU: 2min 26.570s
     CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/k3s-rootless.service
             ├─k3s
             │ └─4229 "k3s server" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─k3s_evac
             │ ├─4182 "/usr/local/bin/k3s server" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             │ ├─4195 "/proc/self/exe init" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             │ ├─4205 slirp4netns --mtu 1500 -r 3 --disable-host-loopback --cidr 10.41.0.0/16 --api-socket /tmp/rootless3607429376/.s4nn.sock 4195 >
             │ ├─4210 "k3s server" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
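
The two status outputs differ in the slirp4netns command line: the MTU changes from 65520 to 1500, and `--api-socket` appears only in the second run. A small helper can pull those two details out of a command line for comparison. `describe_slirp4netns` is a hypothetical name, and reading the presence of `--api-socket` as a sign that the slirp4netns port driver is active is an assumption, not something this output proves.

```shell
#!/bin/sh
# Hypothetical helper: summarize the MTU and API socket usage of a slirp4netns command line.
describe_slirp4netns() {
    cmdline="$1"
    # Extract the numeric argument that follows --mtu, if any.
    mtu=$(printf '%s\n' "$cmdline" | sed -n 's/.*--mtu \([0-9][0-9]*\).*/\1/p')
    # Report whether an API socket was passed at all.
    case "$cmdline" in
        *--api-socket*) api=yes ;;
        *) api=no ;;
    esac
    printf 'mtu=%s api-socket=%s\n' "${mtu:-unknown}" "$api"
}
```

On a running node, the command line could be fed in from `pgrep -a slirp4netns`, for example `pgrep -a slirp4netns | while read -r pid cmd; do describe_slirp4netns "$cmd"; done`.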
  

Additional context / logs:

It appears we aren't using builtin by default, so I'll confer on that and either patch the docs or see if we can get a fix.
