
kubeadm and selinux #1654

Closed
rcythr opened this issue Jul 4, 2019 · 9 comments
Labels
area/security, priority/important-longterm

Comments

rcythr commented Jul 4, 2019

BUG REPORT

Versions

kubeadm version (use kubeadm version):
kubeadm version: &version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:37:41Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:40:16Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:32:14Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    vmware vSphere virtual machine. 8 CPU, 16GB RAM, 50GB disk.
  • OS: CentOS 7
  • Kernel (e.g. uname -a): Linux learning-master.k8s.local 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Others:

What happened?

The documentation says to disable selinux for the entire system, but that's a deal-breaker in my organization. As such, I did the install with selinux enabled, hoping to find a simple solution that would let me use kubeadm for k8s deployments in my organization. I did find a solution; details are at the end.

What you expected to happen?

I expected it would break horribly, and be a complete mess to resolve. It did break, but it was easy to resolve. See bottom section.

How to reproduce it (as minimally and precisely as possible)?

Follow the usual kubeadm cluster setup documentation, except ignore the part about disabling selinux. Leave it in enforcing mode instead.
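To double-check the starting state before the install (standard CentOS tooling; not part of the original repro steps):

# confirm SELinux really is enforcing, not permissive or disabled
getenforce    # expect: Enforcing
sestatus      # prints the current mode, policy type, and mount point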

Anything else we need to know?

I was able to resolve this problem pretty easily by running the following before kubeadm init:

# create the directories kubeadm will use, if they don't already exist
mkdir -p /var/lib/etcd/
mkdir -p /etc/kubernetes/pki/
# relabel them so confined containers are allowed to access their files
chcon -R -t svirt_sandbox_file_t /var/lib/etcd
chcon -R -t svirt_sandbox_file_t /etc/kubernetes/
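One caveat: chcon changes live only in the files' extended attributes and are lost on a full relabel (e.g. restorecon or touch /.autorelabel). A sketch of making the same labeling persistent with semanage, assuming the policycoreutils-python package is installed:

# record the labeling in local policy so future relabels preserve it
semanage fcontext -a -t svirt_sandbox_file_t "/var/lib/etcd(/.*)?"
semanage fcontext -a -t svirt_sandbox_file_t "/etc/kubernetes(/.*)?"
# apply the recorded contexts and verify the result
restorecon -Rv /var/lib/etcd /etc/kubernetes
ls -dZ /var/lib/etcd /etc/kubernetes/pki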

I've been using the cluster now for a couple days as part of a certification course, so (almost) every aspect of the kubernetes api has been exercised to some extent. I've had no further issues.

*For the selinux uninitiated:
Due to the type labels on these directories, selinux would normally prevent containers from using their files -- even when they are mounted into a container. The svirt_sandbox_file_t type allows containers to access them. Ideally the labels would use MCS so that only the api server container could use the /etc/kubernetes/pki/ files and only the etcd container could use the /var/lib/etcd directory.
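To make the MCS idea concrete, a sketch of pinning a directory and a single container to the same categories (the category pair c100,c200 and the image tag are arbitrary choices for illustration):

# give the data dir a fixed category pair on top of the container file type
chcon -R system_u:object_r:svirt_sandbox_file_t:s0:c100,c200 /var/lib/etcd
# start the container pinned to the same categories; containers with other
# (randomly assigned) categories are denied access to these files
docker run --security-opt label=level:s0:c100,c200 \
    -v /var/lib/etcd:/var/lib/etcd k8s.gcr.io/etcd:3.3.10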

@rhatdan Can you opine? I'd like to use MCS if possible, but I'm less experienced with it. Since we need to set the labels before the container is created, we can't set the labels on the directory using the randomly generated "c1,c2" labels. Can we simply set something like "system_u:object_r:container_file_t:s0:api_server" on the files and then start the api server container with that label instead (or in addition to the generated labels)?

Once we agree on the solution I'm happy to create a PR with that change. I suspect it's a very simple change to just add arguments to the volume and directory creation with the proper labels.

rcythr (author) commented Jul 4, 2019

Looking at the kube-apiserver.yaml manifest, I'm now wondering if this solution is fully generalizable given that the kube-apiserver also mounts the /etc/ssl/certs and /etc/pki directories. I guess my simple cluster does not use these?
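A quick way to check what those extra mounts carry, assuming stock CentOS labeling:

# show the SELinux contexts on the host cert directories; on a stock
# system these usually show cert_t rather than a container type
ls -dZ /etc/ssl/certs /etc/pki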

rosti commented Jul 4, 2019

Thank you very much @rcythr !
That's a very important discovery! I'll be glad to review a PR for SELinux enablement.

/priority important-longterm
/area security

k8s-ci-robot added the priority/important-longterm and area/security labels Jul 4, 2019
neolit123 (member) commented Jul 4, 2019

> Looking at the kube-apiserver.yaml manifest, I'm now wondering if this solution is fully generalizable given that the kube-apiserver also mounts the /etc/ssl/certs and /etc/pki directories. I guess my simple cluster does not use these?

possibly.

i think we should document this in our kubeadm troubleshooting (or setup?) guide instead of having kubeadm itself manipulate directory labels with tools like chcon.

cc @detiber

neolit123 (member) commented Jul 4, 2019

> Once we agree on the solution I'm happy to create a PR with that change. I suspect it's a very simple change to just add arguments to the volume and directory creation with the proper labels.

kubeadm requires root and i'm not sure we should set SELinux labels without the consent of the user.
what is the standard practice in the ecosystem?

rcythr (author) commented Jul 4, 2019

In my case at least the /var/lib/etcd and /etc/kubernetes/pki/ directories were created by kubeadm, so kubeadm can set whatever context is required to make it work with the processes that need to read the directories. That said, /etc/ssl/certs or /etc/pki are more problematic as we definitely should not relabel those.

In this case the kube-apiserver and etcd processes created are not selinux aware, and so they get the default labels from the container selinux module that all containers get when they are created (i.e. container_t). Since kubeadm doesn't have any custom labels, we're stuck with the pre-existing labels which are available from the container selinux module.

@rhatdan can correct me if I'm wrong here, but I believe the best solution is to create a policy module which has the types we need and install it if the system in use is running selinux. At the moment this seems like a light lift given that there's only 4 directories (that I'm aware of) in scope, with a couple of different labels.

The 4 commands I put above are a nice workaround in the meantime which others can use if they need selinux in enforcing mode.
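For reference, the build-and-load mechanics for such a module on CentOS 7 look roughly like this; the module name and type below are a hypothetical skeleton that compiles but grants nothing, since the allow rules tying the new type to the container domains are exactly the part that needs container-selinux expertise:

# write a skeleton module (hypothetical name and type)
cat > kubeadm_k8s.te <<'EOF'
policy_module(kubeadm_k8s, 1.0.0)
type kubeadm_pki_t;
files_type(kubeadm_pki_t)
EOF
# build and load it (requires the selinux-policy-devel package)
make -f /usr/share/selinux/devel/Makefile kubeadm_k8s.pp
semodule -i kubeadm_k8s.pp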

rcythr (author) commented Jul 4, 2019

So rather than stop at the 4 command workaround above, I'd really like to run this all the way to ground. In order to do that, I need to figure out what files/directories are needed by each component, and whether it's read/write or read only access. The manifests don't specify, but I suspect that write is not required for most.

Here is what I was able to gather so far, someone please correct me if I'm wrong anywhere:

etcd

  • /etc/kubernetes/pki/etcd (RO)
  • /var/lib/etcd (RW)

kube-apiserver

  • /etc/ssl/certs (RO)
  • /etc/pki (RO)
  • /etc/kubernetes/pki (RO)

kube-controller-manager

  • /etc/ssl/certs (RO)
  • /etc/pki (RO)
  • /usr/libexec/kubernetes/kubelet-plugins/volume/exec (??)
  • /etc/kubernetes/pki (RO)
  • /etc/kubernetes/controller-manager.conf (RO?)

kube-scheduler

  • /etc/kubernetes/scheduler.conf (RO?)

I think the best path forward is to create custom types for each service which extend container_t and allow these accesses. Then we just need to set the appropriate context when starting each service. Perhaps these types could become part of container-selinux?
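Assuming such per-service types existed (the type name below is hypothetical), a runtime can be told to use one at container start, and Kubernetes exposes the same knob through securityContext.seLinuxOptions in the pod spec:

# run the apiserver under a hypothetical custom domain instead of container_t
docker run --security-opt label=type:kube_apiserver_t k8s.gcr.io/kube-apiserver:v1.15.0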

neolit123 (member) commented Jul 4, 2019

> /usr/libexec/kubernetes/kubelet-plugins/volume/exec (??)

this one is created by the kubelet and should be RW for it.
possibly RO for the controller-manager.

the kubelet also has a number of other paths that the user can configure:

--log-dir
--cert-dir
--root-dir
--volume-plugin-dir (this is the one above)
--cni-bin-dir
--cni-conf-dir
--bootstrap-kubeconfig
--kubeconfig

> Here is what I was able to gather so far, someone please correct me if I'm wrong anywhere:

one problem is that each component allows so many different paths for certs, logs and so on. the user can pick custom paths in the kubeadm config and selinux might block them.

also we have long-term plans to make /etc/kubernetes configurable from kubeadm, which means that users who apply the selinux changes would need to match their kubeadm config.

it feels to me that instead of creating policies we should just document a small section in our setup guide along the lines of "if you wish to run kubeadm with selinux you need these steps... please replace the paths in this guide with the paths you wish to use, etc..."

rcythr (author) commented Jul 4, 2019

Understood. Given the amount of configurability the ideal change, if any, would need to be very flexible about all the paths.

I'm less concerned about the kubelet right now because it runs as unconfined_t, which is why it works fine without any additional pain and suffering. The other services run inside containers, so they get container_t, which is far more restrictive. If we wanted to declare true selinux support, the kubelet's context would need to change too, but right now I'm focused on achieving selinux compatibility.

Before deciding I really would like to get some input from @rhatdan, but perhaps a better workaround/implementation is to have the 4 services on the master run with spc_t instead of container_t. This would give them access similar to kubelet, and probably resolve this whole issue while allowing users to move files around to (almost) wherever they like.
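For what it's worth, runtimes already expose spc_t: disabling label separation for a container assigns it that domain, e.g.:

# run a container unconfined by the container policy; the process type
# becomes spc_t instead of container_t (image name is a placeholder)
docker run --security-opt label=disable <image>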

neolit123 (member) commented:
closing in favor of #279, which is a much older issue. let's continue the discussion there.
