This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

Merge pull request #1384 from kinvolk/knrt10/new-component-node-problem-detector

Add component node-problem-detector
knrt10 authored Feb 25, 2021
2 parents 5463b50 + 99d4024 commit 87e9c0d
Showing 29 changed files with 1,213 additions and 0 deletions.
20 changes: 20 additions & 0 deletions assets/charts/components/node-problem-detector/Chart.yaml
@@ -0,0 +1,20 @@
apiVersion: v1
name: node-problem-detector
version: "1.8.6"
appVersion: v0.8.5
home: https://github.com/kubernetes/node-problem-detector
description: |
  This chart installs a [node-problem-detector](https://github.com/kubernetes/node-problem-detector) daemonset. The tool makes various node problems visible to the upstream layers in the cluster management stack. It is a daemon that runs on each node, detects node problems, and reports them to the apiserver.
icon: https://github.com/kubernetes/kubernetes/raw/master/logo/logo.png
keywords:
- node
- problem
- detector
- monitoring
sources:
- https://github.com/kubernetes/node-problem-detector
- https://kubernetes.io/docs/concepts/architecture/nodes/#condition
maintainers:
- name: max-rocket-internet
email: [email protected]
engine: gotpl
90 changes: 90 additions & 0 deletions assets/charts/components/node-problem-detector/README.md
@@ -0,0 +1,90 @@
# node-problem-detector

![Version: 1.8.6](https://img.shields.io/badge/Version-1.8.6-informational?style=flat-square) ![AppVersion: v0.8.5](https://img.shields.io/badge/AppVersion-v0.8.5-informational?style=flat-square)

This chart installs a [node-problem-detector](https://github.com/kubernetes/node-problem-detector) daemonset. The tool makes various node problems visible to the upstream layers in the cluster management stack. It is a daemon that runs on each node, detects node problems, and reports them to the apiserver.

**Homepage:** <https://github.com/kubernetes/node-problem-detector>

## How to install this chart

Add Delivery Hero public chart repo:

```console
helm repo add deliveryhero https://charts.deliveryhero.io/
```

A simple install with default values:

```console
helm install deliveryhero/node-problem-detector --generate-name
```

To install the chart with the release name `my-release`:

```console
helm install my-release deliveryhero/node-problem-detector
```

To install with some set values:

```console
helm install my-release deliveryhero/node-problem-detector --set values_key1=value1 --set values_key2=value2
```

To install with custom values file:

```console
helm install my-release deliveryhero/node-problem-detector -f values.yaml
```

## Source Code

* <https://github.com/kubernetes/node-problem-detector>
* <https://kubernetes.io/docs/concepts/architecture/nodes/#condition>

## Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| affinity | object | `{}` | |
| annotations | object | `{}` | |
| env | string | `nil` | |
| extraVolumeMounts | list | `[]` | |
| extraVolumes | list | `[]` | |
| fullnameOverride | string | `""` | |
| hostNetwork | bool | `false` | Run Node Problem Detector on the host's network. This is typically not recommended, but may be useful for certain use cases. |
| hostPID | bool | `false` | |
| hostpath.logdir | string | `"/var/log/"` | Log directory path on K8s host |
| image.pullPolicy | string | `"IfNotPresent"` | |
| image.repository | string | `"k8s.gcr.io/node-problem-detector/node-problem-detector"` | |
| image.tag | string | `"v0.8.5"` | |
| imagePullSecrets | list | `[]` | |
| labels | object | `{}` | |
| maxUnavailable | int | `1` | The max pods unavailable during an update |
| metrics.serviceMonitor.additionalLabels | object | `{}` | |
| metrics.serviceMonitor.enabled | bool | `false` | |
| nameOverride | string | `""` | |
| nodeSelector | object | `{}` | |
| priorityClassName | string | `""` | |
| rbac.create | bool | `true` | |
| rbac.pspEnabled | bool | `false` | |
| resources | object | `{}` | |
| securityContext.privileged | bool | `true` | |
| serviceAccount.create | bool | `true` | |
| serviceAccount.name | string | `nil` | |
| settings.custom_monitor_definitions | object | `{}` | User-specified custom monitor definitions, mounted as config files |
| settings.custom_plugin_monitors | list | `[]` | |
| settings.heartBeatPeriod | string | `"5m0s"` | Syncing interval with API server |
| settings.log_monitors | list | `["/config/kernel-monitor.json","/config/docker-monitor.json"]` | System log monitor config files |
| settings.prometheus_address | string | `"0.0.0.0"` | Prometheus exporter address |
| settings.prometheus_port | int | `20257` | Prometheus exporter port |
| tolerations[0].effect | string | `"NoSchedule"` | |
| tolerations[0].operator | string | `"Exists"` | |
| updateStrategy | string | `"RollingUpdate"` | Manage the daemonset update strategy |
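
As an illustration, a minimal custom `values.yaml` built from the keys documented above might look like the following (the specific resource figures are examples, not recommendations):

```yaml
# Example overrides for keys in the table above
settings:
  prometheus_port: 20257
  heartBeatPeriod: "5m0s"
resources:
  requests:
    cpu: 10m
    memory: 80Mi
  limits:
    cpu: 10m
    memory: 80Mi
tolerations:
  - effect: NoSchedule
    operator: Exists
```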

## Maintainers

| Name | Email | Url |
| ---- | ------ | --- |
| max-rocket-internet | [email protected] | |
@@ -0,0 +1,3 @@
To verify that the node-problem-detector pods have started, run:

kubectl --namespace={{ .Release.Namespace }} get pods -l "app.kubernetes.io/name={{ template "node-problem-detector.name" . }},app.kubernetes.io/instance={{ .Release.Name }}"
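
For example, for a hypothetical release named `my-release` installed into the `default` namespace, the templated command above would render to roughly:

```console
kubectl --namespace=default get pods -l "app.kubernetes.io/name=node-problem-detector,app.kubernetes.io/instance=my-release"
```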
@@ -0,0 +1,61 @@
{{/* vim: set filetype=mustache: */}}

{{/*
Expand the name of the chart.
*/}}
{{- define "node-problem-detector.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "node-problem-detector.fullname" -}}
{{- if .Values.fullnameOverride -}}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- if contains $name .Release.Name -}}
{{- .Release.Name | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}
{{- end -}}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "node-problem-detector.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{/* Create the name of the service account to use */}}
{{- define "node-problem-detector.serviceAccountName" -}}
{{- if .Values.serviceAccount.create -}}
{{ default (include "node-problem-detector.fullname" .) .Values.serviceAccount.name }}
{{- else -}}
{{ default "default" .Values.serviceAccount.name }}
{{- end -}}
{{- end -}}

{{/*
Create the name of the configmap for storing custom monitor definitions
*/}}
{{- define "node-problem-detector.customConfig" -}}
{{- $fullname := include "node-problem-detector.fullname" . -}}
{{- printf "%s-custom-config" $fullname | replace "+" "_" | trunc 63 -}}
{{- end -}}

{{/*
Return the appropriate apiVersion for podSecurityPolicy.
*/}}
{{- define "podSecurityPolicy.apiVersion" -}}
{{- if semverCompare ">=1.10-0" .Capabilities.KubeVersion.GitVersion -}}
{{- print "policy/v1beta1" -}}
{{- else -}}
{{- print "extensions/v1beta1" -}}
{{- end -}}
{{- end -}}
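
The `fullname` helper above can be approximated in plain shell to see how release names are derived (a sketch, assuming no `fullnameOverride` is set):

```shell
# Sketch of node-problem-detector.fullname: if the release name already
# contains the chart name, use it as-is; otherwise join them, then
# truncate to 63 characters and trim a trailing dash.
release="my-release"
chart="node-problem-detector"
case "$release" in
  *"$chart"*) full="$release" ;;
  *)          full="$release-$chart" ;;
esac
full=$(printf '%s' "$full" | cut -c1-63)
full="${full%-}"
echo "$full"  # prints "my-release-node-problem-detector"
```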
@@ -0,0 +1,32 @@
{{- if .Values.rbac.create -}}
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: {{ template "node-problem-detector.fullname" . }}
labels:
app.kubernetes.io/name: {{ include "node-problem-detector.name" . }}
helm.sh/chart: {{ include "node-problem-detector.chart" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
rules:
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- update
{{- end -}}
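
Once installed, the permissions granted above can be spot-checked with `kubectl auth can-i` (the release name and namespace below are hypothetical examples):

```console
kubectl auth can-i patch nodes/status \
  --as=system:serviceaccount:default:my-release-node-problem-detector
```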
@@ -0,0 +1,19 @@
{{- if .Values.rbac.create -}}
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: {{ template "node-problem-detector.fullname" . }}
labels:
app.kubernetes.io/name: {{ include "node-problem-detector.name" . }}
helm.sh/chart: {{ include "node-problem-detector.chart" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
subjects:
- kind: ServiceAccount
name: {{ template "node-problem-detector.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
kind: ClusterRole
name: {{ template "node-problem-detector.fullname" . }}
apiGroup: rbac.authorization.k8s.io
{{- end -}}
@@ -0,0 +1,11 @@
apiVersion: v1
data:
{{ .Values.settings.custom_monitor_definitions | toYaml | indent 2 }}
kind: ConfigMap
metadata:
name: {{ include "node-problem-detector.customConfig" . }}
labels:
app.kubernetes.io/name: {{ include "node-problem-detector.name" . }}
helm.sh/chart: {{ include "node-problem-detector.chart" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
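
The `settings.custom_monitor_definitions` map above is rendered verbatim into this ConfigMap, so each key becomes a file mounted under `/custom-config`. A hypothetical values fragment (the monitor name and contents are examples only) might be:

```yaml
settings:
  custom_monitor_definitions:
    # Each key becomes a file in the ConfigMap, mounted at /custom-config
    example-monitor.json: |
      {
        "plugin": "custom",
        "source": "example-monitor"
      }
```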
@@ -0,0 +1,113 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: {{ include "node-problem-detector.fullname" . }}
labels:
app.kubernetes.io/name: {{ include "node-problem-detector.name" . }}
helm.sh/chart: {{ include "node-problem-detector.chart" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- range $key, $val := .Values.labels }}
{{ $key }}: {{ $val | quote }}
{{- end}}
spec:
updateStrategy:
type: {{ .Values.updateStrategy }}
{{- if eq .Values.updateStrategy "RollingUpdate"}}
rollingUpdate:
maxUnavailable: {{ .Values.maxUnavailable }}
{{- end}}
selector:
matchLabels:
app.kubernetes.io/name: {{ include "node-problem-detector.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app: {{ include "node-problem-detector.name" . }}
template:
metadata:
labels:
app.kubernetes.io/name: {{ include "node-problem-detector.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app: {{ include "node-problem-detector.name" . }}
{{- range $key, $val := .Values.labels }}
{{ $key }}: {{ $val | quote }}
{{- end}}
annotations:
checksum/config: {{ include (print $.Template.BasePath "/custom-config-configmap.yaml") . | sha256sum }}
scheduler.alpha.kubernetes.io/critical-pod: ''
{{- if .Values.annotations }}
{{ toYaml .Values.annotations | indent 8 }}
{{- end }}
spec:
serviceAccountName: {{ template "node-problem-detector.serviceAccountName" . }}
{{- if .Values.imagePullSecrets }}
imagePullSecrets: {{ toYaml .Values.imagePullSecrets | nindent 8 }}
{{- end }}
hostNetwork: {{ .Values.hostNetwork }}
hostPID: {{ .Values.hostPID }}
terminationGracePeriodSeconds: 30
{{- if .Values.priorityClassName }}
priorityClassName: {{ .Values.priorityClassName | quote }}
{{- end }}
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
imagePullPolicy: {{ .Values.image.pullPolicy | default "IfNotPresent" | quote }}
command:
- "/bin/sh"
- "-c"
- "exec /node-problem-detector --logtostderr --config.system-log-monitor={{- range $index, $monitor := .Values.settings.log_monitors }}{{if ne $index 0}},{{end}}{{ $monitor }}{{- end }} {{- if .Values.settings.custom_plugin_monitors }} --custom-plugin-monitors={{- range $index, $monitor := .Values.settings.custom_plugin_monitors }}{{if ne $index 0}},{{end}}{{ $monitor }}{{- end }} {{- end }} --prometheus-address={{ .Values.settings.prometheus_address }} --prometheus-port={{ .Values.settings.prometheus_port }} --k8s-exporter-heartbeat-period={{ .Values.settings.heartBeatPeriod }}"
{{- if .Values.securityContext }}
securityContext:
{{ toYaml .Values.securityContext | indent 12 }}
{{- end }}
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
{{- if .Values.env }}
{{ toYaml .Values.env | indent 12 }}
{{- end }}
volumeMounts:
- name: log
mountPath: {{ .Values.hostpath.logdir }}
- name: localtime
mountPath: /etc/localtime
readOnly: true
- name: custom-config
mountPath: /custom-config
readOnly: true
{{- if .Values.extraVolumeMounts }}
{{ toYaml .Values.extraVolumeMounts | indent 12 }}
{{- end }}
ports:
- containerPort: {{ .Values.settings.prometheus_port }}
name: exporter
resources:
{{ toYaml .Values.resources | indent 12 }}
{{- with .Values.affinity }}
affinity:
{{ toYaml . | indent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{ toYaml . | indent 8 }}
{{- end }}
{{- if .Values.nodeSelector }}
nodeSelector:
{{ toYaml .Values.nodeSelector | indent 8 }}
{{- end }}
volumes:
- name: log
hostPath:
path: {{ .Values.hostpath.logdir }}
- name: localtime
hostPath:
path: /etc/localtime
type: "FileOrCreate"
- name: custom-config
configMap:
name: {{ include "node-problem-detector.customConfig" . }}
{{- if .Values.extraVolumes }}
{{ toYaml .Values.extraVolumes | indent 8 }}
{{- end }}
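
With the chart's default values, the templated container command above renders to approximately the following (line breaks added here for readability):

```console
exec /node-problem-detector --logtostderr \
  --config.system-log-monitor=/config/kernel-monitor.json,/config/docker-monitor.json \
  --prometheus-address=0.0.0.0 \
  --prometheus-port=20257 \
  --k8s-exporter-heartbeat-period=5m0s
```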
@@ -0,0 +1,17 @@
{{- if .Values.rbac.pspEnabled }}
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: {{ template "node-problem-detector.fullname" . }}-psp
labels:
app.kubernetes.io/name: {{ include "node-problem-detector.name" . }}
helm.sh/chart: {{ include "node-problem-detector.chart" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
rules:
- apiGroups: ['extensions']
resources: ['podsecuritypolicies']
verbs: ['use']
resourceNames:
- {{ template "node-problem-detector.fullname" . }}
{{- end }}
@@ -0,0 +1,19 @@
{{- if .Values.rbac.pspEnabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ template "node-problem-detector.fullname" . }}-psp
labels:
app.kubernetes.io/name: {{ include "node-problem-detector.name" . }}
helm.sh/chart: {{ include "node-problem-detector.chart" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: {{ template "node-problem-detector.fullname" . }}-psp
subjects:
- kind: ServiceAccount
name: {{ template "node-problem-detector.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
{{- end }}