This effort rebases the k8s images onto distroless/static to make them thinner, safer and less vulnerable. The scope covers not only the core containers but also the master and node add-ons, which have their own release processes. For the core containers, the effort targets the v1.15 release.
Rebasing the k8s images to distroless/static can make the images thinner, safer and less vulnerable.
Meanwhile, it will drastically reduce churn in the number of k8s image versions. Many images are based on debian base, so each vulnerability in debian base (a couple of times a month) forces a rebuild of every image. Moving those images from debian base to distroless/static avoids these rebuilds and the new image versions they produce.
What's more, it reduces the burden of managing and maintaining multiple k8s images with respect to security (e.g. CVEs), compatibility and the build process.
Use the image gcr.io/distroless/static:latest as the only base image for the following kubernetes images:
- Images based FROM scratch
- Images based on debian/alpine solely for the purpose of redirecting logs with a shell.
- Images based on k8s.gcr.io/debian-base due to previous rebasing from busybox.
Help the community and contributors better understand and maintain the images.
- Set up the policy that only `distroless/static` and `k8s.gcr.io/debian-base` are used (as the base image) for the images hosted in the official k8s.gcr.io image repository. If an image is based on debian-base, it should be documented in the exception list.
- Improve the presubmit prow test to guarantee that upcoming k8s/kubernetes PRs won't introduce dependencies that distroless/static doesn't support.
- Document the base image list for important kubernetes components, including both core containers and important add-ons. Also, document the exception list (unable to base on distroless).
- Do not change images based on debian/alpine that require fluentd (e.g. hyperkube).
- Do not change images that have hard dependencies on non-static binaries.
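The base image policy described above could be checked mechanically. The following is a minimal sketch, not part of the KEP: the function names, the returned messages and the shape of the exception list are hypothetical, and only the two base image repositories come from the policy itself.

```python
# Repositories named in the policy (tags and digests are ignored by this sketch).
DISTROLESS = "gcr.io/distroless/static"
DEBIAN_BASE = "k8s.gcr.io/debian-base"


def repository(image_ref):
    """Strip a digest or tag from an image reference, keeping the repository."""
    ref = image_ref.split("@", 1)[0]
    # A colon after the last slash separates the tag; an earlier colon is a registry port.
    if ref.rfind(":") > ref.rfind("/"):
        ref = ref[: ref.rfind(":")]
    return ref


def check_base_image(component, base_ref, exceptions):
    """Return 'ok', or a short reason why the base image violates the policy."""
    repo = repository(base_ref)
    if repo == DISTROLESS:
        return "ok"
    if repo == DEBIAN_BASE:
        if component in exceptions:
            return "ok"
        return "debian-base is used but the component is not in the exception list"
    return "base image is not allowed by the policy"
```

For example, `check_base_image("kube-apiserver", "gcr.io/distroless/static:latest", set())` returns `"ok"`, while an image built on debian-base passes only if its component name appears in the exception set.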
This section discusses how the goals and scope were shaped by practical constraints. It also covers the real use cases.
Kubernetes not only runs images in containers; its own components are also built and deployed as images. Each component image can be built from a different base image.
Currently, kubernetes uses three main types of base images to build its components.
This docker image is based “FROM scratch” and has no external dependencies. The original motivation for using “FROM scratch” is to keep the image extremely thin, containing only what a static binary needs. However, caveats arise when running Go static binaries because some non-binary dependencies are missing, such as ca-certificates, resolv.conf, hosts, and nsswitch (see issue/69195).
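To make the caveat concrete, the sketch below reports which of those non-binary files are absent from an unpacked image root filesystem. The file paths are assumptions based on conventional Linux locations, and the function name is hypothetical; neither is taken from issue/69195.

```python
import os
import tempfile

# Assumed conventional locations of the non-binary dependencies named above.
EXPECTED_FILES = (
    "etc/ssl/certs/ca-certificates.crt",
    "etc/resolv.conf",
    "etc/hosts",
    "etc/nsswitch.conf",
)


def missing_runtime_files(rootfs):
    """Return the expected files that do not exist under the given root filesystem."""
    return [path for path in EXPECTED_FILES
            if not os.path.exists(os.path.join(rootfs, path))]


# Usage: an almost empty root filesystem, similar to a "FROM scratch" image.
with tempfile.TemporaryDirectory() as rootfs:
    os.makedirs(os.path.join(rootfs, "etc"))
    open(os.path.join(rootfs, "etc", "hosts"), "w").close()
    print(missing_runtime_files(rootfs))
```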
An image can be based on Debian for several reasons.
- One big reason is that the image needs a shell to redirect the glog output. This base image is now overkill because K8s 1.13 supports klog, which accepts a --log-file flag pointing to the log path directly. (Historically, images doing this mostly relied on busybox or alpine. A recent change migrated them to debian-base; see PR/70245.)
- Another reason for using debian is CVE concerns. Those images were originally rebased from busybox to debian for better CVE feeds and management (see PR/70245).
- A third type of image uses debian for certain external dependencies.
The reasons for basing images on alpine are similar to those for debian. Debian is more widely used because of the previous “Establish base image policy for system containers” effort (see issue/40248).
“Distroless images contain only your application and its runtime dependencies. They do not contain package managers, shells or any other programs you would expect to find in a standard Linux distribution.” (See distroless/README.) Distroless supplies the dependencies that “FROM scratch” lacks and is more lightweight than debian or alpine. Meanwhile, distroless “improves the signal to noise of scanners (e.g. CVE) and reduces the burden of establishing provenance to just what you need.” (From distroless/README.)
Using distroless/static as a common base image was originally proposed as an exploration area in the base image policy for system containers. Tim (tallclair@) drove the effort to define and establish the base image policy (main changes):
- Add Alpine iptables as the base image for kube-proxy. Previously, kube-proxy was based on the debian iptables image. (This direction was scrapped; see issue/39696 for details.)
- Rebase busybox images to debian-base.
- Rebase certain alpine images to debian-base.
The distroless/static solution is filed separately in issue/70249. This KEP, as a more up-to-date version, differs slightly from the original issue.
The approaches to rebasing different containers can vary significantly depending on the function of the containers, the cloud providers’ release workflows, and legacy reasons (repo migration plans, retirement plans, etc.). This section discusses four main types of image rebasing strategies, which should cover the majority of the kubernetes containers.
- The images are built via bazel. In this case, we will update the bazel BUILD rule to switch the base image to distroless/static. This method applies to core containers like kube-apiserver, kube-controller-manager, cloud-controller-manager and kube-scheduler. (See the detailed solution in Core Master Images.)
- The images have dependencies that are not supported by distroless/static. One typical example is the use of a shell. Previously, a shell was widely used for redirecting glog output to a certain directory. This use case is no longer needed since we've switched from glog to klog, which accepts a flag to specify the log output path. A generic approach is to remove the dependencies that distroless/static doesn't support and then rebase the images to distroless (e.g. issue/1787). Meanwhile, we limit the introduction of new dependencies (e.g. pr/74690).
- Images based "FROM scratch" are safe to switch to distroless/static directly.
- Images from kubernetes incubator won't be changed directly by this KEP, and their release plan is not estimated here. We notify the project OWNERs and defer to them on whether those images should be updated.
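For the second strategy, the shell dependencies of an image can often be spotted in its Dockerfile before rebasing. The sketch below is one possible check, not tooling from this KEP: it flags `RUN`, `CMD` and `ENTRYPOINT` instructions in the final build stage that are written in shell form or that call a shell explicitly. The function names and messages are hypothetical, and line continuations are deliberately not handled.

```python
import json

SHELLS = {"sh", "bash", "/bin/sh", "/bin/bash"}
CHECKED = {"RUN", "CMD", "ENTRYPOINT"}


def exec_form_args(argument):
    """Return the argument list if `argument` is a JSON array of strings, else None."""
    try:
        parsed = json.loads(argument)
    except ValueError:
        return None
    if isinstance(parsed, list) and all(isinstance(item, str) for item in parsed):
        return parsed
    return None


def shell_dependencies(dockerfile_text):
    """List (line_number, reason) pairs for final-stage instructions that need a shell."""
    findings = []
    for number, raw in enumerate(dockerfile_text.splitlines(), start=1):
        parts = raw.strip().split(None, 1)
        if not parts or parts[0].startswith("#"):
            continue
        instruction = parts[0].upper()
        argument = parts[1] if len(parts) > 1 else ""
        if instruction == "FROM":
            findings = []  # a new stage starts; earlier build stages may use a shell
        elif instruction in CHECKED:
            args = exec_form_args(argument)
            if args is None:
                findings.append((number, instruction + " uses shell form"))
            elif args and args[0] in SHELLS:
                findings.append((number, instruction + " invokes " + args[0]))
    return findings
```

Findings are reset at every `FROM` so that a multi-stage build, where only the last stage is based on distroless/static, is not penalized for using a shell in its builder stage.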
The core master images include `kube-apiserver`, `kube-controller-manager`, `kube-scheduler` and `kube-proxy`. `kube-proxy` is based on debian-iptables, and distroless/static doesn't support iptables yet. Thus, `kube-proxy` won't be changed.
Currently, there are three different workflows for building the core master images, and each cloud provider determines which workflow to use.
Run `make release` under the kubernetes repo. This approach is the most commonly used; it uses bash scripts (see build/release.sh for details) to build the images. In this workflow, the base image is specified in `build/common.sh`.
Run `make bazel-release` under the kubernetes repo. This approach uses bazel to build the image artifact based on this BUILD rule (more details in bazel.bzl/release-filegroup). In this workflow, the base image is specified in the build/BUILD rule.
Run `kubetest` or `hack/e2e`. See details in the test-infra repo. This approach is recommended for development testing and is broadly used by contributors. However, it runs in a test environment and uses a different config than the two official workflows described above. In this workflow, the base image is specified to use k8s.gcr.io/pause:3.1.
This KEP is expected to rebase images for all three workflows. This requires each cloud provider team to be involved in the manifest updates and the release workflow testing (see the graph below). Before we switch the base images to `distroless/static`, each cloud provider team should make sure their manifest config is updated so that the command doesn’t require a shell to run the executable binaries and involves no log redirection. Otherwise, rebasing images to distroless will break the core containers running in the cluster master VMs. The test release should also be updated to `distroless/static` so that we can guarantee further changes cannot add unexpected dependencies (otherwise, they will fail the e2e tests in the GitHub prow test stage).
- For log redirection, please use the `log-file` flag (e.g. `--log-file=/var/log/kube-controller-manager.log`) and also disable logging to standard error (e.g. `--logtostderr=false`).
- When removing the shell from the manifest command, please also update the parameter format to exec form: `["executable", "param1", "param2"]`.
- See the example in PR/75624.
- Detailed timelines for switching to distroless/static will be announced later. Please make sure the manifest change is well tested in the release workflow (as shown in the right blue part).
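The manifest change described in the bullets above can be sketched as a small conversion from a shell-wrapped command to exec form. This is an illustration under assumptions, not the change made in PR/75624: the function name is hypothetical, and the input is assumed to be a simple `exec BINARY FLAGS... >>LOGFILE 2>&1` command with no pipes, quoting tricks or variable expansion.

```python
import re
import shlex

# Matches a stdout redirection token such as ">>/path", "1>>/path", ">" or ">>".
STDOUT_REDIRECT = re.compile(r"^1?>>?(.*)$")


def to_exec_form(shell_command):
    """Turn a shell command with log redirection into an exec-form argument list."""
    tokens = shlex.split(shell_command)
    if tokens and tokens[0] == "exec":
        tokens = tokens[1:]
    args, log_file = [], None
    remaining = iter(tokens)
    for token in remaining:
        if token == "2>&1":
            continue  # stderr no longer needs to be merged into the redirected stdout
        match = STDOUT_REDIRECT.match(token)
        if match:
            # The target is either attached (">>/path") or the next token (">> /path").
            log_file = match.group(1) or next(remaining, None)
            continue
        args.append(token)
    if log_file:
        # Flags named in this KEP: write to a file and stop logging to stderr.
        args += ["--log-file=" + log_file, "--logtostderr=false"]
    return args
```

For instance, `to_exec_form("exec /usr/local/bin/kube-controller-manager --v=2 1>>/var/log/kube-controller-manager.log 2>&1")` yields a list starting with the binary path and ending with `--log-file=/var/log/kube-controller-manager.log` and `--logtostderr=false`, which can be placed directly in the manifest's `command` field.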
Whether and how the add-on images should be rebased depends largely on the add-on OWNERs’ judgment. Below is the process we proposed to the OWNERs; it should apply to most use cases.
- (If the image depends on a k8s version earlier than v1.13) Sync up with current k8s head. Since kubernetes 1.13, OSS kubernetes no longer uses glog, which requires a shell to redirect the log file. Instead, k8s uses klog, which accepts a log path flag. This sync-up is necessary to remove log redirection.
- (If the image uses glog) Replace glog with klog inside the add-on files.
- Update the base image to distroless and remove packages that distroless preinstalls, like ca-certificates.
- (If necessary) Update the container startup command to avoid using bash. (For log redirection, see the examples in the For Core Master Images section.)
- If bash scripts can’t be easily removed, document the container as an exception in this list.
- After the above steps are done, ask release engineers for help monitoring the performance.
ingress-gce/fuzzer was based on alpine and couldn't be switched to distroless directly because it needed a shell to redirect the glog file. To allow the image to be based on distroless (which doesn't contain a shell), we first need to remove the dependency on the shell (use klog instead of glog), and then rebase the image. (Related PRs: pr/682, pr/666)
This KEP targets the v1.15 release. The full list of images switched to distroless/static will be updated later.
- Rebased the following images to `gcr.io/distroless/static:latest` or `k8s.gcr.io/debian-base:v1.0.0`.
- Investigated these images as exceptions (can't be based on distroless).
- Triaged and fixed the following issues which blocked rebasing images:
- Triage klog for the performance regression on core master containers:
- Affected images: kube-controller-manager, kube-scheduler, kube-apiserver
- Blocked PRs:
- Avoid using exec in kube-controller-manager for flexvolume.
Component Name | on Master/Node | Previous Image --> Current Image | Image | Code Complete | Release Complete | Contact |
---|---|---|---|---|---|---|
addon-resizer | Master + Node | busybox --> distroless | k8s.gcr.io/addon-resizer:1.8.5 | Done | Done | @bskiba @yuwenma |
cluster-proportional-autoscaler | Master + Node | scratch --> distroless | k8s.gcr.io/cluster-proportional-autoscaler-arm:v1.6.0 | Done | Done | @yuwenma @MrHohn |
cluster-proportional-vertical-autoscaler | Master + Node | scratch --> distroless | k8s.gcr.io/cpvpa-amd64:v0.7.1 | Done | Done | @yuwenma @MrHohn |
event-exporter | Master + Node | debian-base --> distroless | k8s.gcr.io/event-exporter:v0.2.5 | Done | Done | @x13n @yuwenma |
node-termination-handler | Master + Node | alpine --> distroless | k8s.gcr.io/gke-node-termination-handler | Done | Done | @yuwenma |
metadata-proxy | Master + Node | scratch --> distroless | k8s.gcr.io/metadata-proxy:v0.1.12 | Done | Done | @dekkagaijin @yuwenma |
metrics-server | Master + Node | busybox --> distroless | k8s.gcr.io/metrics-server:v0.3.3 | Done | Done | @yuwenma @kawych |
prometheus-to-sd | Master + Node | debian-base --> distroless | k8s.gcr.io/metrics-server:v0.5.2 | Done | Done | @loburm |
ip-masq-agent | Master + Node | busybox --> debian-iptables | k8s.gcr.io/ip-masq-agent:v2.4.1 | Done | Done | @BenTheElder @yuwenma |
slo-monitor | Master | alpine --> distroless | k8s.gcr.io/slo-monitor:0.11.2 | Done | Done | @yuwenma |
kubelet-to-gcm | Master | scratch --> distroless | k8s.gcr.io/kubelet-to-gcm:v1.2.11 | Done | wait for next release | @yuwenma |
etcd-version-monitor | Master | scratch --> distroless | k8s.gcr.io/etcd-version-monitor:v0.1.3 | Done | Done | @yuwenma |
etcd-empty-dir-cleanup | Master | busybox --> distroless | k8s.gcr.io/etcd-empty-dir-cleanup:3.3.10.1 | Done | Done | @yuwenma |
etcd | Master | busybox --> distroless | k8s.gcr.io/etcd:3.3.10-1 | Done | Done | @yuwenma |
defaultbackend | Master + Node | scratch --> distroless | Wait for next release | Done | targeting v1.16 | @rramkumar1 @yuwenma |
fuzzer | Master + Node | alpine --> distroless | Wait for next release | Done | targeting v1.16 | @rramkumar1 @yuwenma |
ingress-gce-glbc | Master + Node | alpine --> distroless | Wait for next release | Done | targeting v1.16 | @rramkumar1 @yuwenma |
k8s-dns-kube-dns | Master + Node | alpine --> debian-base | k8s.gcr.io/k8s-dns-kube-dns:1.15.3 | Done | Done | @yuwenma @prameshj |
k8s-dns-sidecar | Master + Node | alpine --> debian-base | k8s.gcr.io/k8s-dns-sidecar:1.15.3 | Done | Done | @yuwenma @prameshj |
k8s-dns-dnsmasq-nanny | Master + Node | alpine --> debian-base | k8s.gcr.io/k8s-dns-dnsmasq-nanny:1.15.3 | Done | Done | @yuwenma @prameshj |
k8s-dns-node-cache | Node | debian:stable-slim --> debian-base | k8s.gcr.io/k8s-dns-node-cache:1.15.3 | Done | Done | @yuwenma @prameshj |
cluster-autoscaler | Master | debian-base --> distroless | k8s.gcr.io/cluster-autoscaler:v1.16.0 | Done | Done | @losipiuk |