From a99cbda74e6ec189655a64d7ad685a9192f9ad31 Mon Sep 17 00:00:00 2001 From: Michael Taufen Date: Mon, 18 Sep 2017 11:27:12 -0700 Subject: [PATCH] Add task doc for alpha dynamic kubelet configuration --- _data/tasks.yml | 1 + .../administer-cluster/reconfigure-kubelet.md | 470 ++++++++++++++++++ 2 files changed, 471 insertions(+) create mode 100644 docs/tasks/administer-cluster/reconfigure-kubelet.md diff --git a/_data/tasks.yml b/_data/tasks.yml index 2586a06a10cb6..a9585a808df17 100644 --- a/_data/tasks.yml +++ b/_data/tasks.yml @@ -144,6 +144,7 @@ toc: - docs/tasks/administer-cluster/reserve-compute-resources.md - docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods.md - docs/tasks/administer-cluster/declare-network-policy.md + - docs/tasks/administer-cluster/reconfigure-kubelet.md - title: Install Network Policy Provider section: - docs/tasks/administer-cluster/calico-network-policy.md diff --git a/docs/tasks/administer-cluster/reconfigure-kubelet.md b/docs/tasks/administer-cluster/reconfigure-kubelet.md new file mode 100644 index 0000000000000..550898f7d523f --- /dev/null +++ b/docs/tasks/administer-cluster/reconfigure-kubelet.md @@ -0,0 +1,470 @@ +--- +approvers: +- mtaufen +- dawnchen +title: Reconfigure a Node's Kubelet in a Live Cluster +--- + +{% capture overview %} +{% include feature-state-alpha.md %} +As of Kubernetes 1.8, the new +[Dynamic Kubelet Configuration](https://github.com/kubernetes/features/issues/281) +feature is available in alpha. This allows you to change the configuration of +Kubelets in a live Kubernetes cluster via first-class Kubernetes concepts. +Specifically, this feature allows you to configure individual Nodes' Kubelets +via ConfigMaps. + +**Warning:** All Kubelet configuration parameters may be changed dynamically, +but not all parameters are safe to change dynamically. This feature is intended +for system experts who have a strong understanding of how configuration changes +will affect behavior. 
No documentation currently exists which plainly lists +"safe to change" fields, but we plan to add it before this feature graduates +from alpha. +{% endcapture %} + +{% capture prerequisites %} +- A live Kubernetes cluster with both Master and Node at v1.8 or higher must be +running, with the `DynamicKubeletConfig` feature gate enabled and the Kubelet's +`--dynamic-config-dir` flag set to a writeable directory on the Node. +This flag must be set to enable Dynamic Kubelet Configuration. +- The kubectl command-line tool must also be v1.8 or higher, and must be +configured to communicate with the cluster. +{% endcapture %} + +{% capture steps %} + +## Reconfiguring the Kubelet on a Live Node in your Cluster + +### Basic Workflow Overview + +The basic workflow for configuring a Kubelet in a live cluster is as follows: + +1. Write a YAML or JSON configuration file containing the +Kubelet's configuration. +2. Wrap this file in a ConfigMap and save it to the Kubernetes control plane. +3. Update the Kubelet's corresponding Node object to use this ConfigMap. + +Each Kubelet watches a configuration reference on its respective Node object. +When this reference changes, the Kubelet downloads the new configuration and +exits. For the feature to work correctly, you must be running a process manager +(like systemd) which will restart the Kubelet when it exits. When the Kubelet is +restarted, it will begin using the new configuration. + +The new configuration completely overrides the old configuration; unspecified +fields in the new configuration will receive their canonical default values. +Some CLI flags do not have an associated configuration field, and will not be +affected by the new configuration. These flags are defined by the KubeletFlags +structure, [here](https://github.com/kubernetes/kubernetes/blob/master/cmd/kubelet/app/options/options.go). + +The status of the Node's Kubelet configuration is reported via the `ConfigOK` +condition in the Node status. 
Once you have updated a Node to use the new +ConfigMap, you can observe this condition to confirm that the Node is using the +intended configuration. A table describing the possible conditions can be found +at the end of this article. + +This document describes editing Nodes using `kubectl edit`. +There are other ways to modify a Node's spec, including `kubectl patch`, for +example, which facilitate scripted workflows. + +This document only describes a single Node consuming each ConfigMap. Keep in +mind that it is also valid for multiple Nodes to consume the same ConfigMap. + +### Node Authorizer Workarounds + +The Node Authorizer does not yet pay attention to which ConfigMaps are assigned +to which Nodes. If you currently use the Node authorizer, your Kubelets will not +be automatically granted permission to download their respective ConfigMaps. + +The temporary workaround used in this document is to manually create the RBAC +Roles and RoleBindings for each ConfigMap. The Node Authorizer will be extended +before the Dynamic Kubelet Configuration feature graduates from alpha, so doing +this in production should never be necessary. + +### Generating a file that contains the current configuration + +The Dynamic Kubelet Configuration feature allows you to provide an override for +the entire configuration object, rather than a per-field overlay. This is a +simpler model that makes it easier to trace the source of configuration values +and debug issues. The compromise, however, is that you must start with knowledge +of the existing configuration to ensure that you only change the fields you +intend to change. + +In the future, the Kubelet will be bootstrapped from a file on disk, and you +will simply edit a copy of this file (which, as a best practice, should live in +version control) while creating the first Kubelet ConfigMap. Today, however, the +Kubelet is still bootstrapped with command-line flags. 
Fortunately, there is a +dirty trick you can use to generate a config file containing a Node's current +configuration. The trick involves hitting the Kubelet server's `configz` +endpoint via the kubectl proxy. This endpoint, in its current implementation, is +intended to be used only as a debugging aid, which is part of why this is a +dirty trick. There is ongoing work to improve the endpoint, and in the future +this will be a less "dirty" operation. This trick also requires the `jq` command +to be installed on your machine, for unpacking and editing the JSON response +from the endpoint. + +Do the following to generate the file: + +1. Pick a Node to reconfigure. We will refer to this Node's name as NODE_NAME. +2. Start the kubectl proxy in the background with `kubectl proxy --port=8001 &` +3. Run the following command to download and unpack the configuration from the +configz endpoint: + +``` +$ export NODE_NAME=the-name-of-the-node-you-are-reconfiguring +$ curl -sSL http://localhost:8001/api/v1/proxy/nodes/${NODE_NAME}/configz | jq '.kubeletconfig|.kind="KubeletConfiguration"|.apiVersion="kubeletconfig/v1alpha1"' > kubelet_configz_${NODE_NAME} +``` + +Note that we have to manually add the `kind` and `apiVersion` to the downloaded +object, as these are not reported by the configz endpoint. This is one of the +limitations of the endpoint that is planned to be fixed in the future. + +### Edit the configuration file + +Using your editor of choice, change one of the parameters in the +`kubelet_configz_${NODE_NAME}` file from the previous step. A QPS parameter, +`eventRecordQPS` for example, is a good candidate. 
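The edit described above can also be scripted. The following is a minimal sketch, assuming `jq` is installed (the previous step already relies on it). The function name `set_event_record_qps` and the value `10` are illustrative only and are not part of kubectl or the Kubelet.

```shell
# Hypothetical helper (not part of kubectl or the Kubelet): set the
# eventRecordQPS field of a Kubelet configuration file in place.
# Usage: set_event_record_qps FILE QPS
set_event_record_qps() {
  config_file="$1"
  new_qps="$2"
  tmp_file="$(mktemp)" || return 1
  # jq cannot edit a file in place, so write to a temporary file first
  # and only replace the original if jq succeeded.
  if jq --argjson qps "$new_qps" '.eventRecordQPS = $qps' "$config_file" > "$tmp_file"; then
    mv "$tmp_file" "$config_file"
  else
    rm -f "$tmp_file"
    return 1
  fi
}

# Example (assumes the file from the previous step exists):
#   set_event_record_qps "kubelet_configz_${NODE_NAME}" 10
```

Because the new configuration replaces the old one entirely, changing a single field in a copy of the current configuration keeps every other value as it was.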
+ +### Push the configuration file to the control plane + +Push the edited configuration file to the control plane with the +following command: + +``` +$ kubectl -n kube-system create configmap my-node-config --from-file=kubelet=kubelet_configz_${NODE_NAME} --append-hash -o yaml +``` + +You should see a response similar to: + +``` +apiVersion: v1 +data: + kubelet: | + {...} +kind: ConfigMap +metadata: + creationTimestamp: 2017-09-14T20:23:33Z + name: my-node-config-gkt4c2m4b2 + namespace: kube-system + resourceVersion: "119980" + selfLink: /api/v1/namespaces/kube-system/configmaps/my-node-config-gkt4c2m4b2 + uid: 946d785e-998a-11e7-a8dd-42010a800006 +``` + +Note that the configuration data must appear under the ConfigMap's +`kubelet` key. + +We create the ConfigMap in the `kube-system` namespace, which is appropriate +because this ConfigMap configures a Kubernetes system component - the Kubelet. + +The `--append-hash` option appends a short checksum of the ConfigMap contents +to the name. This is convenient for an edit->push workflow, as it will +automatically, yet deterministically, generate new names for new ConfigMaps. + +We use the `-o yaml` output format so that the name, namespace, and uid are all +reported following creation. We will need these in the next step. We will refer +to the name as CONFIG_MAP_NAME and the uid as CONFIG_MAP_UID. + +### Authorize your Node to read the new ConfigMap + +Now that you've created a new ConfigMap, you need to authorize your node to +read it. 
First, create a Role for your new ConfigMap with the +following commands: + +``` +$ export CONFIG_MAP_NAME=name-from-previous-output +$ kubectl -n kube-system create role ${CONFIG_MAP_NAME}-reader --verb=get --resource=configmap --resource-name=${CONFIG_MAP_NAME} +``` + +Next, create a RoleBinding to associate your Node with the new Role: + +``` +$ kubectl -n kube-system create rolebinding ${CONFIG_MAP_NAME}-reader --role=${CONFIG_MAP_NAME}-reader --user=system:node:${NODE_NAME} +``` + +Once the Node Authorizer is updated to do this automatically, you will +be able to skip this step. + +### Set the Node to use the new configuration + +Edit the Node's reference to point to the new ConfigMap with the +following command: + +``` +$ kubectl edit node ${NODE_NAME} +``` + +Once in your editor, add the following YAML under `spec`: + +``` +configSource: + configMapRef: + name: CONFIG_MAP_NAME + namespace: kube-system + uid: CONFIG_MAP_UID +``` + +Be sure to specify all three of `name`, `namespace`, and `uid`. + +### Observe that the Node begins using the new configuration + +Retrieve the Node with `kubectl get node ${NODE_NAME} -o yaml`, and look for the +`ConfigOK` condition in `status.conditions`. You should see the message +`using current (UID: CONFIG_MAP_UID)` when the Kubelet starts using the new +configuration. + +For convenience, you can use the following command (using `jq`) to filter down +to the `ConfigOK` condition: + +``` +$ kubectl get no ${NODE_NAME} -o json | jq '.status.conditions|map(select(.type=="ConfigOK"))' +[ + { + "lastHeartbeatTime": "2017-09-20T18:08:29Z", + "lastTransitionTime": "2017-09-20T18:08:17Z", + "message": "using current (UID: \"2ebc8d1a-9e2a-11e7-a8dd-42010a800006\")", + "reason": "passing all checks", + "status": "True", + "type": "ConfigOK" + } +] +``` + +If something goes wrong, you may see one of several different error conditions, +detailed in the Understanding ConfigOK Conditions section, below. 
When this happens, you +should check the Kubelet's log for more details. + +### Edit the configuration file again + +To change the configuration again, we simply repeat the above workflow. +Try editing the `kubelet_configz_${NODE_NAME}` file, changing the previously changed parameter to a +new value. + +### Push the newly edited configuration to the control plane + +Push the new configuration to the control plane in a new ConfigMap with the +following command: + +``` +$ kubectl create configmap my-node-config --namespace=kube-system --from-file=kubelet=kubelet_configz_${NODE_NAME} --append-hash -o yaml +``` + +This new ConfigMap will get a new name, as we have changed the contents. +We will refer to the new name as NEW_CONFIG_MAP_NAME and the new uid +as NEW_CONFIG_MAP_UID. + +### Authorize your Node to read the new ConfigMap + +Now that you've created a new ConfigMap, you need to authorize your node to +read it. First, create a Role for your new ConfigMap with the +following commands: + +``` +$ export NEW_CONFIG_MAP_NAME=name-from-previous-output +$ kubectl -n kube-system create role ${NEW_CONFIG_MAP_NAME}-reader --verb=get --resource=configmap --resource-name=${NEW_CONFIG_MAP_NAME} +``` + +Next, create a RoleBinding to associate your Node with the new Role: + +``` +$ kubectl -n kube-system create rolebinding ${NEW_CONFIG_MAP_NAME}-reader --role=${NEW_CONFIG_MAP_NAME}-reader --user=system:node:${NODE_NAME} +``` + +Once the Node Authorizer is updated to do this automatically, you will +be able to skip this step. + +### Configure the Node to use the new configuration + +Once more, edit the Node's `spec.configSource` with +`kubectl edit node ${NODE_NAME}`. 
Your new `spec.configSource` should look like +the following, with `name` and `uid` substituted as necessary: + +``` +configSource: + configMapRef: + name: NEW_CONFIG_MAP_NAME + namespace: kube-system + uid: NEW_CONFIG_MAP_UID +``` + +### Observe that the Kubelet is using the new configuration + +Once more, retrieve the Node with `kubectl get node ${NODE_NAME} -o yaml`, and +look for the `ConfigOK` condition in `status.conditions`. You should see the message +`using current (UID: NEW_CONFIG_MAP_UID)` when the Kubelet starts using the +new configuration. + +### Deauthorize your Node from reading the old ConfigMap + +Once you know your Node is using the new configuration and are confident that +the new configuration has not caused any problems, it is a good idea to +deauthorize the node from reading the old ConfigMap. Run the following +commands to remove the RoleBinding and Role: + +``` +$ kubectl -n kube-system delete rolebinding ${CONFIG_MAP_NAME}-reader +$ kubectl -n kube-system delete role ${CONFIG_MAP_NAME}-reader +``` + +Note that this does not necessarily prevent the Node from reverting to the old +configuration, as it may locally cache the old ConfigMap for an indefinite +period of time. + +You may optionally also choose to remove the old ConfigMap: + +``` +$ kubectl -n kube-system delete configmap ${CONFIG_MAP_NAME} +``` + +Once the Node Authorizer is updated to do this automatically, you will +be able to skip this step. + +### Reset the Node to use its local default configuration + +Finally, if you wish to reset the Node to use the configuration it was +provisioned with, simply edit the Node with `kubectl edit node ${NODE_NAME}` and +remove the `spec.configSource` subfield. + +### Observe that the Node is using its local default configuration + +After removing this subfield, you should eventually observe that the ConfigOK +condition's message reverts to either `using current (default)` or +`using current (init)`, depending on how the Node was provisioned. 
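Because the Kubelet only picks up the change after it exits and is restarted, the condition may take a moment to update. The following is a rough sketch of how the check could be polled from a script. The function names and the 60-second default timeout are assumptions made for illustration; the only kubectl feature used is the `-o jsonpath` output format.

```shell
# Hypothetical helper: print the message of the Node's ConfigOK condition.
config_ok_message() {
  kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="ConfigOK")].message}'
}

# Hypothetical helper: poll once per second until the ConfigOK message
# starts with the expected text, or give up after TIMEOUT seconds.
# Usage: wait_for_config_message NODE EXPECTED_PREFIX [TIMEOUT]
wait_for_config_message() {
  node="$1"
  expected="$2"
  remaining="${3:-60}"
  while [ "$remaining" -gt 0 ]; do
    case "$(config_ok_message "$node")" in
      "$expected"*) return 0 ;;
    esac
    sleep 1
    remaining=$((remaining - 1))
  done
  return 1
}

# Example: after removing spec.configSource, wait for the local default.
#   wait_for_config_message "${NODE_NAME}" 'using current (default)' ||
#     wait_for_config_message "${NODE_NAME}" 'using current (init)'
```

The same function can be used after setting a ConfigMap by waiting for the prefix `using current (UID:`.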
+ +### Deauthorize your Node from reading the old ConfigMap + +Once you know your Node is using the default configuration again, it is a good +idea to deauthorize the node from reading the old ConfigMap. Run the following +commands to remove the RoleBinding and Role: + +``` +$ kubectl -n kube-system delete rolebinding ${NEW_CONFIG_MAP_NAME}-reader +$ kubectl -n kube-system delete role ${NEW_CONFIG_MAP_NAME}-reader +``` + +Note that this does not necessarily prevent the Node from reverting to the old +ConfigMap, as it may locally cache the old ConfigMap for an indefinite +period of time. + +You may optionally also choose to remove the old ConfigMap: + +``` +$ kubectl -n kube-system delete configmap ${NEW_CONFIG_MAP_NAME} +``` + +Once the Node Authorizer is updated to do this automatically, you will +be able to skip this step. + +{% endcapture %} + +{% capture discussion %} +## Kubectl Patch Example +As mentioned above, there are many ways to change a Node's configSource. +Here is an example command that uses `kubectl patch`: + +``` +kubectl patch node ${NODE_NAME} -p "{\"spec\":{\"configSource\":{\"configMapRef\":{\"name\":\"${CONFIG_MAP_NAME}\",\"namespace\":\"kube-system\",\"uid\":\"${CONFIG_MAP_UID}\"}}}}" +``` + +## Understanding ConfigOK Conditions + +The following table describes several of the `ConfigOK` Node conditions you +might encounter in a cluster that has Dynamic Kubelet Config enabled. If you +observe a condition with `status=False`, you should check the Kubelet log for +more error details by searching for the message or reason text. + + + + +
+<table>
+<thead>
+<tr>
+<th>Possible Messages</th>
+<th>Possible Reasons</th>
+<th>Status</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>using current (default)</td>
+<td>current is set to the local default, and no init config was provided</td>
+<td>True</td>
+</tr>
+<tr>
+<td>using current (init)</td>
+<td>current is set to the local default, and an init config was provided</td>
+<td>True</td>
+</tr>
+<tr>
+<td>using current (UID: CURRENT_CONFIG_MAP_UID)</td>
+<td>passing all checks</td>
+<td>True</td>
+</tr>
+<tr>
+<td>using last-known-good (default)</td>
+<td>
+<ul>
+<li>failed to load current (UID: CURRENT_CONFIG_MAP_UID)</li>
+<li>failed to parse current (UID: CURRENT_CONFIG_MAP_UID)</li>
+<li>failed to validate current (UID: CURRENT_CONFIG_MAP_UID)</li>
+</ul>
+</td>
+<td>False</td>
+</tr>
+<tr>
+<td>using last-known-good (init)</td>
+<td>
+<ul>
+<li>failed to load current (UID: CURRENT_CONFIG_MAP_UID)</li>
+<li>failed to parse current (UID: CURRENT_CONFIG_MAP_UID)</li>
+<li>failed to validate current (UID: CURRENT_CONFIG_MAP_UID)</li>
+</ul>
+</td>
+<td>False</td>
+</tr>
+<tr>
+<td>using last-known-good (UID: LAST_KNOWN_GOOD_CONFIG_MAP_UID)</td>
+<td>
+<ul>
+<li>failed to load current (UID: CURRENT_CONFIG_MAP_UID)</li>
+<li>failed to parse current (UID: CURRENT_CONFIG_MAP_UID)</li>
+<li>failed to validate current (UID: CURRENT_CONFIG_MAP_UID)</li>
+</ul>
+</td>
+<td>False</td>
+</tr>
+<tr>
+<td>
+<p>
+ The reasons in the next column could potentially appear for any of
+ the above messages.
+</p>
+<p>
+ This condition indicates that the Kubelet is having trouble
+ reconciling `spec.configSource`, and thus no change to the in-use
+ configuration has occurred.
+</p>
+<p>
+ The "failed to sync" reasons are specific to the failure that
+ occurred, and the next column does not necessarily contain all
+ possible failure reasons.
+</p>
+</td>
+<td>
+<p>failed to sync, reason:</p>
+<ul>
+<li>failed to read Node from informer object cache</li>
+<li>failed to reset to local (default or init) config</li>
+<li>invalid NodeConfigSource, exactly one subfield must be non-nil, but all were nil</li>
+<li>invalid ObjectReference, all of UID, Name, and Namespace must be specified</li>
+<li>invalid ObjectReference, UID SOME_UID does not match UID of downloaded ConfigMap SOME_OTHER_UID</li>
+<li>failed to determine whether object with UID SOME_UID was already checkpointed</li>
+<li>failed to download ConfigMap with name SOME_NAME from namespace SOME_NAMESPACE</li>
+<li>failed to save config checkpoint for object with UID SOME_UID</li>
+<li>failed to set current config checkpoint to default</li>
+<li>failed to set current config checkpoint to object with UID SOME_UID</li>
+</ul>
+</td>
+<td>False</td>
+</tr>
+</tbody>
+</table>
+{% endcapture %} + + +{% include templates/task.md %}