
Rework documentation
Remove the cgroup schema as it's not really actionable => the link to the
kubernetes documentation and design doc here already covers that.
VannTen committed Dec 14, 2023
1 parent e279ba4 commit 4ea8f19
Showing 2 changed files with 39 additions and 80 deletions.
docs/cgroups.md (74 changes: 22 additions & 52 deletions)

To prevent containers from competing for resources or impacting the host, the kubelet in Kubernetes relies on cgroups to limit each container's resource usage.
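
For example, the resource requests and limits declared on a container are what the kubelet and the container runtime translate into cgroup settings on the node. A minimal illustrative Pod spec (the name, image and values are hypothetical, not part of this documentation):

```yaml
# Illustrative only: the limits below end up enforced through cgroups on the node.
apiVersion: v1
kind: Pod
metadata:
  name: limited-app            # hypothetical name
spec:
  containers:
    - name: app
      image: nginx:1.25        # hypothetical image
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 500m            # enforced as a CPU quota in the container's cgroup
          memory: 256Mi        # enforced as a memory limit in the container's cgroup
```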

## Node Allocatable

Node Allocatable is calculated by subtracting the following from the node capacity (see the worked example below):

- kube-reserved reservations
- system-reserved reservations
- hard eviction thresholds

You can use `kubelet_enforce_node_allocatable` to choose which of those levels the kubelet enforces:

```yaml
# A comma separated list of levels of node allocatable enforcement to be enforced by kubelet.
kubelet_enforce_node_allocatable: "pods"
# kubelet_enforce_node_allocatable: "pods,kube-reserved"
# kubelet_enforce_node_allocatable: "pods,kube-reserved,system-reserved"
```

Note that to enforce kube-reserved or system-reserved, `kube_reserved_cgroups` or `system_reserved_cgroups` needs to be specified respectively.
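
As a rough worked example, assume a hypothetical node with 8192Mi of memory, the memory reservations used in the example below, and the kubelet's default hard eviction threshold for `memory.available` (100Mi):

```yaml
# Hypothetical node with 8192Mi of memory capacity:
#   kube_memory_reserved:     256Mi
#   system_memory_reserved:   512Mi
#   hard eviction threshold:  100Mi   (kubelet default for memory.available)
#
# Allocatable memory = 8192Mi - 256Mi - 512Mi - 100Mi = 7324Mi
```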

You can set those reservations as in the following example:

```yaml
kubelet_enforce_node_allocatable: "pods,kube-reserved,system-reserved"

# Set to true to reserve resources for kube daemons
kube_reserved: true
kube_reserved_cgroups_for_service_slice: kube.slice
kube_reserved_cgroups: "/{{ kube_reserved_cgroups_for_service_slice }}"
# Kubelet and container engine
kube_memory_reserved: 256Mi
kube_cpu_reserved: 100m
kube_ephemeral_storage_reserved: 2Gi
kube_pid_reserved: "1000"
# Reservation for master hosts
kube_master_memory_reserved: 512Mi
kube_master_cpu_reserved: 200m
# kube_master_ephemeral_storage_reserved: 2Gi
# kube_master_pid_reserved: "1000"

# Set to true to reserve resources for system daemons
system_reserved: true
system_reserved_cgroups_for_service_slice: system.slice
system_reserved_cgroups: "/{{ system_reserved_cgroups_for_service_slice }}"
# System daemons (sshd, network manager, ...)
system_memory_reserved: 512Mi
system_cpu_reserved: 500m
system_ephemeral_storage_reserved: 2Gi
system_pid_reserved: "1000"
# Reservation for master hosts
system_master_memory_reserved: 256Mi
system_master_cpu_reserved: 250m
# system_master_ephemeral_storage_reserved: 2Gi
# system_master_pid_reserved: "1000"
```
By default, the kubelet enforces Node Allocatable for pods, which means pods will be
evicted when their resource usage exceeds Allocatable.
You can optionally enforce the reservations for kube-reserved and
system-reserved as well, but proceed with caution (see [the Kubernetes
guidelines](https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#general-guidelines)).

After the setup, the cgroups hierarchy is as follows:

```bash
/ (Cgroups Root)
├── kubepods.slice
│   ├── ...
│   ├── kubepods-besteffort.slice
│   ├── kubepods-burstable.slice
│   └── ...
├── kube.slice
│   ├── ...
│   ├── {{container_manager}}.service
│   ├── kubelet.service
│   └── ...
├── system.slice
│   └── ...
└── ...
```

You can control this enforcement with the following variables:

```yaml
enforce_allocatable_pods: true # default
enforce_allocatable_kube_reserved: true
enforce_allocatable_system_reserved: true
```
You can learn more in the [official kubernetes documentation](https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/).
inventory/sample/group_vars/k8s_cluster/k8s-cluster.yml (45 changes: 17 additions & 28 deletions)
# Download kubectl onto the host that runs Ansible in {{ bin_dir }}
# kubectl_localhost: false

# A comma separated list of levels of node allocatable enforcement to be enforced by kubelet.
# Acceptable options are 'pods', 'system-reserved', 'kube-reserved' and ''. Default is "".
# kubelet_enforce_node_allocatable: pods
## Reserving compute resources
# https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/

## Set runtime and kubelet cgroups when using systemd as cgroup driver (default)
# kubelet_runtime_cgroups: "/{{ kube_service_cgroups }}/{{ container_manager }}.service"
# kubelet_kubelet_cgroups: "/{{ kube_service_cgroups }}/kubelet.service"

## Set runtime and kubelet cgroups when using cgroupfs as cgroup driver
# kubelet_runtime_cgroups_cgroupfs: "/system.slice/{{ container_manager }}.service"
# kubelet_kubelet_cgroups_cgroupfs: "/system.slice/kubelet.service"

# Optionally reserve resources for kube daemons.
# kube_reserved: false
## Uncomment to override default values
## The following two items need to be set when kube_reserved is true
# kube_reserved_cgroups_for_service_slice: kube.slice
# kube_reserved_cgroups: "/{{ kube_reserved_cgroups_for_service_slice }}"
# kube_memory_reserved: 256Mi
# kube_cpu_reserved: 100m
# kube_ephemeral_storage_reserved: 2Gi
# kube_pid_reserved: "1000"
# Reservation for master hosts
# kube_master_memory_reserved: 512Mi
# kube_master_cpu_reserved: 200m
# kube_master_ephemeral_storage_reserved: 2Gi
# kube_master_pid_reserved: "1000"

## Optionally reserve resources for OS system daemons.
# system_reserved: true
## Uncomment to override default values
## The following two items need to be set when system_reserved is true
# system_reserved_cgroups_for_service_slice: system.slice
# system_reserved_cgroups: "/{{ system_reserved_cgroups_for_service_slice }}"
# system_memory_reserved: 512Mi
# system_cpu_reserved: 500m
# system_ephemeral_storage_reserved: 2Gi
## Reservation for master hosts
# system_master_memory_reserved: 256Mi
# system_master_cpu_reserved: 250m
# system_master_ephemeral_storage_reserved: 2Gi
# system_pid_reserved: "1000"
#
# Make the kubelet enforce the resource limits of Pods with cgroups
# enforce_allocatable_pods: true

# Enforce kube_*_reserved as limits
# WARNING: this limits the resources the kubelet and the container engine can
# use, which can cause instability on your nodes
# enforce_allocatable_kube_reserved: false

# Enforce system_*_reserved as limits
# WARNING: this limits the resources system daemons can use, which can lock you
# out of your nodes (by OOM-killing sshd, for instance)
# enforce_allocatable_system_reserved: false

## Eviction Thresholds to avoid system OOMs
# https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#eviction-thresholds
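## For illustration only: these are KubeletConfiguration fields (not kubespray
## variables); a hard eviction threshold in the kubelet configuration looks like:
##   evictionHard:
##     memory.available: "100Mi"
##     nodefs.available: "10%"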
