Add node-problem-detector
Signed-off-by: Furkan <[email protected]>
Co-authored-by: Batuhan <[email protected]>
Dentrax and developer-guy committed Jul 4, 2023
1 parent 5208cf1 commit 1d38a72
Showing 9 changed files with 241 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -99,6 +99,7 @@
| [newrelic](./images/newrelic) | `cgr.dev/chainguard/newrelic` | stable | [![](https://storage.googleapis.com/chainguard-images-build-outputs/badges/newrelic.build.status.latest.svg)](https://registry-ui.chainguard.app/?image=cgr.dev/chainguard/newrelic:latest) |
| [nginx](./images/nginx) | `cgr.dev/chainguard/nginx` | stable | [![](https://storage.googleapis.com/chainguard-images-build-outputs/badges/nginx.build.status.latest.svg)](https://registry-ui.chainguard.app/?image=cgr.dev/chainguard/nginx:latest) |
| [node](./images/node) | `cgr.dev/chainguard/node` | stable | [![](https://storage.googleapis.com/chainguard-images-build-outputs/badges/node.build.status.18.svg)](https://registry-ui.chainguard.app/?image=cgr.dev/chainguard/node:18) |
| [node-problem-detector](./images/node-problem-detector) | `cgr.dev/chainguard/node-problem-detector` | stable | [![](https://storage.googleapis.com/chainguard-images-build-outputs/badges/node-problem-detector.build.status.latest.svg)](https://registry-ui.chainguard.app/?image=cgr.dev/chainguard/node-problem-detector:latest) |
| [nodetaint](./images/nodetaint) | `cgr.dev/chainguard/nodetaint` | stable | [![](https://storage.googleapis.com/chainguard-images-build-outputs/badges/nodetaint.build.status.latest.svg)](https://registry-ui.chainguard.app/?image=cgr.dev/chainguard/nodetaint:latest) |
| [ntpd-rs](./images/ntpd-rs) | `cgr.dev/chainguard/ntpd-rs` | experimental | [![](https://storage.googleapis.com/chainguard-images-build-outputs/badges/ntpd-rs.build.status.latest.svg)](https://registry-ui.chainguard.app/?image=cgr.dev/chainguard/ntpd-rs:latest) |
| [nvidia-device-plugin](./images/nvidia-device-plugin) | `cgr.dev/chainguard/nvidia-device-plugin` | stable | [![](https://storage.googleapis.com/chainguard-images-build-outputs/badges/nvidia-device-plugin.build.status.latest.svg)](https://registry-ui.chainguard.app/?image=cgr.dev/chainguard/nvidia-device-plugin:latest) |
39 changes: 39 additions & 0 deletions images/node-problem-detector/README.md
@@ -0,0 +1,39 @@
<!--monopod:start-->
# node-problem-detector
| | |
| - | - |
| **Status** | stable |
| **OCI Reference** | `cgr.dev/chainguard/node-problem-detector` |


* [View Image in Chainguard Academy](https://edu.chainguard.dev/chainguard/chainguard-images/reference/node-problem-detector/overview/)
* [View Image Catalog](https://console.enforce.dev/images/catalog) for a full list of available tags.
*[Contact Chainguard](https://www.chainguard.dev/chainguard-images) for enterprise support, SLAs, and access to older tags.*

---
<!--monopod:end-->

[node-problem-detector](https://github.com/kubernetes/node-problem-detector) aims to make various node problems visible to the upstream layers in the cluster management stack.

## Get It!

The image is available on `cgr.dev`:

```
docker pull cgr.dev/chainguard/node-problem-detector
```
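
A quick way to confirm the pulled image starts is to run it outside a cluster and look for the `unable to load in-cluster config` message, which is what the image's `tests/01-runs.sh` smoke test checks. The sketch below is only an illustration: the helper name `npd_ran_outside_cluster` and the `IMAGE` variable are hypothetical, and the `docker` call is left as a commented usage line.

```shell
#!/bin/sh
# Sketch only: IMAGE and npd_ran_outside_cluster are illustrative names,
# not part of the image or of node-problem-detector itself.
IMAGE="${IMAGE:-cgr.dev/chainguard/node-problem-detector}"

# Succeeds when the captured output contains the message that
# tests/01-runs.sh greps for.
npd_ran_outside_cluster() {
  printf '%s\n' "$1" | grep -q "unable to load in-cluster config"
}

# Usage (requires docker; not executed here):
#   output="$(docker run --rm "${IMAGE}" 2>&1 || true)"
#   npd_ran_outside_cluster "${output}" && echo "image starts as expected"
```

The `|| true` in the usage line is there because the container is not expected to exit cleanly outside a cluster; only the message matters for this check.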

## Usage

Install with `helm` using the upstream chart shown below:

```bash
helm repo add deliveryhero https://charts.deliveryhero.io/
helm upgrade --install npd deliveryhero/node-problem-detector \
  --namespace node-problem-detector \
  --create-namespace \
  --set image.repository=cgr.dev/chainguard/node-problem-detector \
  --set image.tag=latest
```

> WARNING: The example above may _not_ work as-is. The official image places the binary under the root path (`/`), whereas this image places it under `/usr/bin/`. You may have to patch the DaemonSet: `$ kubectl patch daemonsets.apps npd-node-problem-detector --namespace node-problem-detector --type='json' -p='[{"op": "remove", "path": "/spec/template/spec/containers/0/command"}, {"op": "add", "path": "/spec/template/spec/containers/0/args", "value": ["--config.system-log-monitor=/config/kernel-monitor.json,/config/docker-monitor.json", "--prometheus-address=0.0.0.0", "--prometheus-port=20257", "--k8s-exporter-heartbeat-period=5m0s"]}]'`
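
The patch from the warning above is easier to read and reuse when the JSON is kept in one place. The following is a minimal sketch, assuming the release is named `npd` and was installed into the `node-problem-detector` namespace as in the `helm` command above; the helper name `npd_json_patch` is hypothetical, and the `kubectl` calls are left as commented usage lines.

```shell
#!/bin/sh
# Sketch only: npd_json_patch is an illustrative helper name.
# It prints the same JSON patch as the warning above: drop the chart's
# `command` and pass the flags as `args` so the image entrypoint is used.
npd_json_patch() {
  cat <<'EOF'
[
  {"op": "remove", "path": "/spec/template/spec/containers/0/command"},
  {"op": "add", "path": "/spec/template/spec/containers/0/args", "value": [
    "--config.system-log-monitor=/config/kernel-monitor.json,/config/docker-monitor.json",
    "--prometheus-address=0.0.0.0",
    "--prometheus-port=20257",
    "--k8s-exporter-heartbeat-period=5m0s"
  ]}
]
EOF
}

# Usage (requires kubectl and a cluster with the release installed; not executed here):
#   kubectl patch daemonsets.apps npd-node-problem-detector \
#     --namespace node-problem-detector --type=json -p "$(npd_json_patch)"
#   kubectl rollout status daemonset/npd-node-problem-detector \
#     --namespace node-problem-detector --timeout=120s
```

Passing `--namespace` explicitly matters here because the chart is installed into `node-problem-detector`, not the current context's default namespace.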
50 changes: 50 additions & 0 deletions images/node-problem-detector/configs/latest.apko.yaml
@@ -0,0 +1,50 @@
contents:
  packages:
    - node-problem-detector
    - health-checker
    - log-counter

accounts:
  groups:
    - groupname: nonroot
      gid: 65532
  users:
    - username: nonroot
      uid: 65532
      gid: 65532
  run-as: 0

paths:
  - path: /config
    type: directory
    uid: 65532
    gid: 65532
    permissions: 0o777
    recursive: true
  - path: /custom-config
    type: directory
    uid: 65532
    gid: 65532
    permissions: 0o777
    recursive: true
  - path: /var/log
    type: directory
    uid: 65532
    gid: 65532
    permissions: 0o777
    recursive: true
  - path: /dev
    type: directory
    uid: 65532
    gid: 65532
    permissions: 0o777
    recursive: true

entrypoint:
  command: /usr/bin/node-problem-detector
cmd: --config.system-log-monitor=/config/kernel-monitor.json

annotations:
  "org.opencontainers.image.authors": "Chainguard Team https://www.chainguard.dev/"
  "org.opencontainers.image.url": https://edu.chainguard.dev/chainguard/chainguard-images/reference/node-problem-detector/
  "org.opencontainers.image.source": https://github.com/chainguard-images/images/tree/main/images/node-problem-detector
3 changes: 3 additions & 0 deletions images/node-problem-detector/image.yaml
@@ -0,0 +1,3 @@
versions:
  - apko:
      config: configs/latest.apko.yaml
72 changes: 72 additions & 0 deletions images/node-problem-detector/main.tf
@@ -0,0 +1,72 @@
terraform {
  required_providers {
    apko = { source = "chainguard-dev/apko" }
    oci  = { source = "chainguard-dev/oci" }
  }
}

variable "target_repository" {
  description = "The docker repo into which the image and attestations should be published."
}

variable "extra_repositories" {
  type        = list(string)
  default     = ["https://packages.wolfi.dev/os"]
  description = "The list of additional repositories to append to the apko configuration."
}

variable "extra_keyring" {
  type        = list(string)
  default     = ["https://packages.wolfi.dev/os/wolfi-signing.rsa.pub"]
  description = "The list of additional keyring entries to append to the apko configuration."
}

variable "extra_packages" {
  type        = list(string)
  default     = ["wolfi-baselayout"]
  description = "The list of additional packages to append to the apko configuration."
}

provider "apko" {
  extra_repositories = var.extra_repositories
  extra_keyring      = var.extra_keyring
  default_archs      = ["x86_64", "aarch64"]
}

module "latest" {
  source = "../../tflib/publisher"

  target_repository = var.target_repository
  config            = file("${path.module}/configs/latest.apko.yaml")
  extra_packages    = var.extra_packages
}

module "version-tags" {
  source  = "../../tflib/version-tags"
  package = "node-problem-detector"
  config  = module.latest.config
}

module "test-latest" {
  source = "./tests"
  digest = module.latest.image_ref
}

module "test-latest-dev" {
  source = "./tests"
  digest = module.latest-dev.image_ref
}

module "tagger" {
  source = "../../tflib/tagger"

  depends_on = [
    module.test-latest,
    module.test-latest-dev,
  ]

  tags = merge(
    { for t in toset(concat(["latest"], module.version-tags.tag_list)) : t => module.latest.image_ref },
    { for t in toset(concat(["latest"], module.version-tags.tag_list)) : "${t}-dev" => module.latest-dev.image_ref },
  )
}
11 changes: 11 additions & 0 deletions images/node-problem-detector/tests/01-runs.sh
@@ -0,0 +1,11 @@
#!/usr/bin/env bash

set -o errexit -o nounset -o errtrace -o pipefail -x

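# Use a default expansion so an unset variable reaches this check instead of
# tripping `nounset` first.
if [[ "${IMAGE_NAME:-}" == "" ]]; then
  echo "Must set IMAGE_NAME environment variable. Exiting."
  exit 1
fi

# Outside a cluster the container is not expected to exit cleanly, so disable
# pipefail and let grep on the expected message decide the result.
set +o pipefail
docker run --rm "${IMAGE_NAME}" 2>&1 | grep "unable to load in-cluster config"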
41 changes: 41 additions & 0 deletions images/node-problem-detector/tests/02-helm.sh
@@ -0,0 +1,41 @@
#!/usr/bin/env bash

# monopod:tag:k8s

set -o errexit -o nounset -o errtrace -o pipefail -x

# Default expansions keep `nounset` from aborting before the messages print.
function preflight() {
  if [[ "${IMAGE_REGISTRY:-}" == "" ]]; then
    echo "Must set IMAGE_REGISTRY environment variable. Exiting."
    exit 1
  fi

  if [[ "${IMAGE_REPOSITORY:-}" == "" ]]; then
    echo "Must set IMAGE_REPOSITORY environment variable. Exiting."
    exit 1
  fi

  if [[ "${IMAGE_TAG:-}" == "" ]]; then
    echo "Must set IMAGE_TAG environment variable. Exiting."
    exit 1
  fi
}

preflight

helm repo add deliveryhero https://charts.deliveryhero.io/
helm upgrade --install npd deliveryhero/node-problem-detector \
  --namespace node-problem-detector \
  --create-namespace \
  --set image.repository="${IMAGE_REGISTRY}/${IMAGE_REPOSITORY}" \
  --set image.tag="${IMAGE_TAG}"

sleep 3

# This image puts the `node-problem-detector` binary under /usr/bin/, whereas the official image puts it under the root path (/). The chart's command assumes the latter, so we have to adjust the command to make it work.
# https://github.com/deliveryhero/helm-charts/blob/d2b99b2d0dec9d1d879e99cbd8bffa135eb9b4e6/stable/node-problem-detector/templates/daemonset.yaml#L59-L62
kubectl patch daemonsets.apps npd-node-problem-detector --namespace node-problem-detector --type='json' -p='[{"op": "remove", "path": "/spec/template/spec/containers/0/command"}, {"op": "add", "path": "/spec/template/spec/containers/0/args", "value": ["--config.system-log-monitor=/config/kernel-monitor.json,/config/docker-monitor.json", "--prometheus-address=0.0.0.0", "--prometheus-port=20257", "--k8s-exporter-heartbeat-period=5m0s"]}]'

sleep 3

kubectl wait --for=condition=ready pod -n node-problem-detector --selector "app.kubernetes.io/name=npd" --timeout=120s
19 changes: 19 additions & 0 deletions images/node-problem-detector/tests/main.tf
@@ -0,0 +1,19 @@
terraform {
  required_providers {
    oci = { source = "chainguard-dev/oci" }
  }
}

variable "digest" {
  description = "The image digest to run tests over."
}

data "oci_exec_test" "runs" {
  digest = var.digest
  script = "${path.module}/01-runs.sh"
}

data "oci_exec_test" "helm" {
  digest = var.digest
  script = "${path.module}/02-helm.sh"
}
5 changes: 5 additions & 0 deletions main.tf
@@ -534,6 +534,11 @@ module "nodetaint" {
  target_repository = "${var.target_repository}/nodetaint"
}

module "node-problem-detector" {
  source            = "./images/node-problem-detector"
  target_repository = "${var.target_repository}/node-problem-detector"
}

module "ntpd-rs" {
  source            = "./images/ntpd-rs"
  target_repository = "${var.target_repository}/ntpd-rs"