[FEATURE] add metrics to zarf-agent #1853

Closed
wants to merge 48 commits into from
Changes from 22 commits
Commits
48 commits
987a4bb
[FEATURE] add metrics to zarf-agent
cmwylie19 Jun 26, 2023
72d51b3
Merge branch 'main' into 1849
cmwylie19 Jun 27, 2023
480b5f6
[DOCS] show how to scrape zarf-agent
cmwylie19 Jun 28, 2023
9c97983
[TASK] add label to agent-hook service for Service Monitor to scrape;
cmwylie19 Jun 28, 2023
1ecbf9b
Merge branch 'main' into 1849
cmwylie19 Jun 28, 2023
5dd63d0
Merge branch 'main' into 1849
cmwylie19 Jun 28, 2023
0039cd5
Update examples/big-bang/README.md
cmwylie19 Jun 28, 2023
c750db9
Merge branch 'main' into 1849
cmwylie19 Jun 29, 2023
77604b7
[TASK] zarf port-forward suggestion
cmwylie19 Jun 29, 2023
9d54451
[TASK] zarf tools k create suggestion
cmwylie19 Jun 29, 2023
402cea3
init
cmwylie19 Jun 29, 2023
4a8098d
[DOCS] scraping example
cmwylie19 Jun 29, 2023
e8de544
[DOCS] prom operator bundle
cmwylie19 Jun 29, 2023
5932a5a
[TEST] e2e test for Prometheus
cmwylie19 Jun 29, 2023
1a57006
Merge branch 'main' into 1849
cmwylie19 Jun 29, 2023
f6d0aa3
[TASK] fix tests
cmwylie19 Jun 29, 2023
94e7f44
Merge branch 'main' into 1849
cmwylie19 Jun 29, 2023
46b0860
[TASK] Make sure prometheus comes up
cmwylie19 Jun 29, 2023
c2039e1
Merge branch 'main' into 1849
cmwylie19 Jun 30, 2023
da58342
[TASK] undo tests
cmwylie19 Jun 30, 2023
c6d84cd
[TASK] revert changes and inadvertent test
cmwylie19 Jun 30, 2023
888be41
[TASK] end to end tests
cmwylie19 Jun 30, 2023
f5c3795
Update examples/scraping-zarf-agent/zarf.yaml
cmwylie19 Jul 5, 2023
65a3b5e
Update examples/big-bang/README.md
cmwylie19 Jul 5, 2023
1614dd9
Update examples/scraping-zarf-agent/README.md
cmwylie19 Jul 5, 2023
087b84f
[TASK] upstream BB README
cmwylie19 Jul 5, 2023
9b75e7a
Merge branch 'main' into 1849
cmwylie19 Jul 5, 2023
63009bb
[TASK] fix newlines and prom-service
cmwylie19 Jul 5, 2023
158c7ac
[TASK] WIP
cmwylie19 Jul 6, 2023
5ef8e6d
[TASK] Updates
cmwylie19 Jul 6, 2023
be7eb94
Merge branch 'main' into 1849
cmwylie19 Jul 10, 2023
31ea09b
Merge branch 'main' into 1849
cmwylie19 Jul 10, 2023
78697ef
Merge branch 'main' into 1849
cmwylie19 Jul 10, 2023
59717d9
[TASK] Update example
cmwylie19 Jul 10, 2023
09eaaff
[TASK] update scraping example
cmwylie19 Jul 10, 2023
7d7998d
[TASK] prepare demo
cmwylie19 Jul 11, 2023
afcfa7e
Merge branch 'main' into 1849
cmwylie19 Jul 11, 2023
6cdd0c7
[TASK] reduce size in values.yaml
cmwylie19 Jul 12, 2023
f592280
[TASK] fix tests
cmwylie19 Jul 12, 2023
e2ed6b0
[TASK] sboms local testing
cmwylie19 Jul 12, 2023
e2e1046
[TASK] zarf init file
cmwylie19 Jul 12, 2023
f3c4e38
[TASK] update prometheus svc name
cmwylie19 Jul 12, 2023
5fcfe39
[TASK] format README.md
cmwylie19 Jul 12, 2023
7eaf174
Merge branch 'defenseunicorns:main' into 1849
cmwylie19 Jul 12, 2023
16a7c5d
Merge branch 'main' into 1849
cmwylie19 Jul 12, 2023
ac0b17c
[TASK] add link to original values file
cmwylie19 Jul 12, 2023
a98ed2c
Merge branch 'main' into 1849
cmwylie19 Jul 20, 2023
0b32193
Merge branch 'main' into 1849
cmwylie19 Jul 20, 2023
2 changes: 2 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -140,6 +140,8 @@ build-examples: ## Build all of the example packages

	@test -s ./build/zarf-package-yolo-$(ARCH).tar.zst || $(ZARF_BIN) package create examples/yolo -o build -a $(ARCH) --confirm

	@test -s ./build/zarf-package-scrape-zarf-agent-$(ARCH).tar.zst || $(ZARF_BIN) package create examples/scraping-zarf-agent -o build -a $(ARCH) --confirm

## NOTE: Requires an existing cluster or the env var APPLIANCE_MODE=true
.PHONY: test-e2e
test-e2e: build-examples ## Run all of the core Zarf CLI E2E tests (builds any deps that aren't present)
Expand Down
6 changes: 6 additions & 0 deletions examples/big-bang/README.md
Expand Up @@ -42,3 +42,9 @@ To view the example in its entirety, select the `Edit this page` link below the
:::

<ExampleYAML example="big-bang/yolo" showLink={false} />

## Big Bang Zarf Agent Metrics Support

Using Big Bang's Prometheus instance, we can add a `ServiceMonitor` to scrape the Zarf Agent's metrics. Check out the [scraping-zarf-agent example](../scraping-zarf-agent/README.md) for more details. Make sure monitoring is enabled.

:::
76 changes: 76 additions & 0 deletions examples/scraping-zarf-agent/README.md
@@ -0,0 +1,76 @@
import ExampleYAML from "@site/src/components/ExampleYAML";

# Scrape Zarf Agent

This example demonstrates how to scrape the Zarf Agent's metrics with a Prometheus instance managed by the Prometheus Operator.

## Prerequisites

- A running K8s cluster.

:::note

The cluster does not need to have the Zarf init package installed or any other Zarf-related bootstrapping.

:::

## Instructions

Initialize Zarf (interactively):

```bash
zarf init
# Make these choices at the prompts
# ? Do you want to download this init package? Yes
# ? Deploy this Zarf package? Yes
# ? Deploy the k3s component? No
# ? Deploy the logging component? No
# ? Deploy the git-server component? No
```

Create the package:

```bash
zarf package create --confirm
```

Deploy the package:

```bash
# Run the following command to deploy the created package to the cluster
zarf package deploy

# Choose the scrape-zarf-agent package from the list
? Choose or type the package file [tab for suggestions]
> zarf-package-scrape-zarf-agent-<ARCH>.tar.zst

# Confirm the deployment
? Deploy this Zarf package? (y/N) [y]
```

Wait for the `prometheus-k8s` StatefulSet's replicas to become ready:

```bash
zarf tools kubectl wait --for=jsonpath='{.status.availableReplicas}'=1 sts/prometheus-k8s -n monitoring --timeout=180s
```

Port-forward the Prometheus Operator's Prometheus instance:

```bash
zarf connect --name=prometheus-operated --namespace monitoring --remote-port 9090 --local-port=9090
```

Navigate to the [Prometheus UI targets](http://localhost:9090/targets) page.

Check out the metrics emitted by the Zarf Agent by querying the `agent-hook` job. While the port-forward is running, this [link](http://localhost:9090/graph?g0.expr=%7Bjob%3D%22agent-hook%22%7D&g0.tab=1&g0.stacked=0&g0.show_exemplars=0&g0.range_input=1h) opens that query in the Prometheus UI.
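
The same `{job="agent-hook"}` selector can also be sent to the Prometheus HTTP API instead of the UI. The following is a minimal sketch, assuming the port-forward above is still listening on `localhost:9090`; `buildQueryURL` is a hypothetical helper name and is not part of this example package. It only shows how the selector is percent-encoded into an instant-query URL.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildQueryURL returns an instant-query URL for the Prometheus HTTP API
// (/api/v1/query). base is assumed to be the locally forwarded address.
func buildQueryURL(base, expr string) string {
	q := url.Values{}
	q.Set("query", expr) // Encode() percent-escapes the braces, '=' and quotes
	return base + "/api/v1/query?" + q.Encode()
}

func main() {
	// Same selector as the UI link: every series scraped under the agent-hook job.
	fmt.Println(buildQueryURL("http://localhost:9090", `{job="agent-hook"}`))
}
```

The encoded selector matches the `g0.expr` value in the UI link, so the printed URL can be fetched with any HTTP client while the port-forward is active.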


## `zarf.yaml` {#zarf.yaml}

:::info

To view the example in its entirety, select the `Edit this page` link below the article and select the parent folder.

:::

<ExampleYAML example="scraping-zarf-agent" showLink={false} />
39,624 changes: 39,624 additions & 0 deletions examples/scraping-zarf-agent/manifests/prometheus-deps/bundle.yaml

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: null
  name: scrape-cr
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - pods/status
  - endpoints
  - services
  verbs:
  - watch
  - get
  - list
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  creationTimestamp: null
  name: scrape-cr-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: scrape-cr
subjects:
- kind: ServiceAccount
  name: prometheus-operator
  namespace: monitoring
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
kind: Prometheus
apiVersion: monitoring.coreos.com/v1
metadata:
  name: k8s
  namespace: monitoring
spec:
  serviceMonitorSelector: {}
  serviceMonitorNamespaceSelector: {}
  logLevel: debug
  logFormat: json
  replicas: 1
  image: quay.io/prometheus/prometheus:v2.45.0
  serviceAccountName: prometheus-operator
  resources:
    requests:
      memory: 400Mi

Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    artifact: monitoring-agent-hook
  name: monitoring-agent-hook
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    port: https
    # targetPort: 443
    scheme: https
    tlsConfig:
      caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecureSkipVerify: true
      # host name for the TLS handshake
      serverName: agent-hook.zarf.svc
  jobLabel: zarf-agent
  namespaceSelector:
    matchNames:
    - zarf
  selector:
    matchLabels:
      app: agent-hook
38 changes: 38 additions & 0 deletions examples/scraping-zarf-agent/zarf.yaml
@@ -0,0 +1,38 @@
kind: ZarfPackageConfig
metadata:
  name: scrape-zarf-agent
  yolo: false
  description: Scrape Zarf Agent with Prometheus Operator.

components:
  - name: prometheus-deps
    required: true
    manifests:
      - name: prometheus-deps
        namespace: monitoring
        files:
          - manifests/prometheus-deps/bundle.yaml
    images:
      - quay.io/prometheus-operator/prometheus-operator:v0.66.0
    actions:
      onDeploy:
        after:
          - wait:
              cluster:
                kind: deployment
                name: prometheus-operator
                namespace: monitoring
                condition: available
  - name: prometheus-operator
    required: true
    manifests:
      - name: prometheus-operator
        namespace: monitoring
        files:
          - manifests/prometheus-operator/prometheus.yaml
          - manifests/prometheus-operator/servicemonitor-agent.yaml
          - manifests/prometheus-operator/clusterrole.yaml
          - manifests/prometheus-operator/clusterrolebinding.yaml
    images:
      - quay.io/prometheus/prometheus:v2.45.0
      - quay.io/prometheus-operator/prometheus-config-reloader:v0.66.0
3 changes: 3 additions & 0 deletions packages/zarf-agent/manifests/service.yaml
Expand Up @@ -3,9 +3,12 @@ kind: Service
metadata:
  name: agent-hook
  namespace: zarf
  labels:
    app: agent-hook
spec:
  selector:
    app: agent-hook
  ports:
    - port: 443
      targetPort: 8443
      name: https
3 changes: 3 additions & 0 deletions src/internal/agent/http/server.go
Expand Up @@ -10,6 +10,7 @@ import (

	"github.com/defenseunicorns/zarf/src/internal/agent/hooks"
	"github.com/defenseunicorns/zarf/src/pkg/message"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// NewAdmissionServer creates an http.Server for the mutating webhook admission handler.
Expand All @@ -26,6 +27,7 @@ func NewAdmissionServer(port string) *http.Server {
	mux.Handle("/healthz", healthz())
	mux.Handle("/mutate/pod", ah.Serve(podsMutation))
	mux.Handle("/mutate/flux-gitrepository", ah.Serve(gitRepositoryMutation))
	mux.Handle("/metrics", promhttp.Handler())

	return &http.Server{
		Addr: fmt.Sprintf(":%s", port),
Expand All @@ -40,6 +42,7 @@ func NewProxyServer(port string) *http.Server {
	mux := http.NewServeMux()
	mux.Handle("/healthz", healthz())
	mux.Handle("/", ProxyHandler())
	mux.Handle("/metrics", promhttp.Handler())

	return &http.Server{
		Addr: fmt.Sprintf(":%s", port),
Expand Down
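
The hunks above register a `/metrics` route next to `/healthz` on the same mux. As a rough illustration of that pattern only, the sketch below swaps `promhttp.Handler()` for a hand-written handler that emits the Prometheus text exposition format, so it runs with the standard library alone. The names `newMux`, `scrape`, and the metric `demo_healthz_requests_total` are hypothetical and do not appear in the PR.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// newMux wires /healthz and /metrics onto one ServeMux, mirroring the
// layout in the PR. The counter is a made-up metric for illustration.
func newMux() *http.ServeMux {
	var healthzCalls int64

	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt64(&healthzCalls, 1)
		w.WriteHeader(http.StatusOK)
	})
	// Stand-in for promhttp.Handler(): writes one counter in text format.
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		fmt.Fprintf(w, "# TYPE demo_healthz_requests_total counter\n")
		fmt.Fprintf(w, "demo_healthz_requests_total %d\n", atomic.LoadInt64(&healthzCalls))
	})
	return mux
}

// scrape issues an in-process GET against path and returns status and body.
func scrape(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	body, _ := io.ReadAll(rec.Result().Body)
	return rec.Code, string(body)
}

func main() {
	mux := newMux()
	scrape(mux, "/healthz") // bump the counter once
	code, body := scrape(mux, "/metrics")
	fmt.Println(code)
	fmt.Print(body)
}
```

In the PR itself the handler comes from `promhttp`, which serves whatever is registered with the default Prometheus registry; the hand-rolled handler here is only meant to show the routing shape and the wire format a scraper reads.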
5 changes: 5 additions & 0 deletions src/internal/cluster/tunnel.go
Expand Up @@ -44,6 +44,7 @@ const (
	ZarfLogging = "LOGGING"
	ZarfGit = "GIT"
	ZarfInjector = "INJECTOR"
	Prometheus = "PROMETHEUS"

	// See https://regex101.com/r/OWVfAO/1.
	serviceURLPattern = `^(?P<name>[^\.]+)\.(?P<namespace>[^\.]+)\.svc\.cluster\.local$`
Expand Down Expand Up @@ -244,6 +245,10 @@ func (tunnel *Tunnel) Connect(target string, blocking bool) error {
		tunnel.resourceName = "zarf-injector"
		tunnel.remotePort = 5000

	case Prometheus:
		tunnel.resourceName = "prometheus-operator"
		tunnel.remotePort = 8080

	default:
		if target != "" {
			if err := tunnel.checkForZarfConnectLabel(target); err != nil {
Expand Down
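
For context on the `serviceURLPattern` constant visible in the first hunk, here is a small sketch of how such a pattern splits an in-cluster hostname into service name and namespace. The pattern string is copied from the diff; `parseServiceURL` is a hypothetical helper written for this illustration and may differ from how `tunnel.go` actually uses the pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

// Pattern copied verbatim from the tunnel.go hunk.
const serviceURLPattern = `^(?P<name>[^\.]+)\.(?P<namespace>[^\.]+)\.svc\.cluster\.local$`

var serviceURLRegexp = regexp.MustCompile(serviceURLPattern)

// parseServiceURL extracts the named groups from a fully qualified
// in-cluster service hostname. ok is false when the host does not match.
func parseServiceURL(host string) (name, namespace string, ok bool) {
	m := serviceURLRegexp.FindStringSubmatch(host)
	if m == nil {
		return "", "", false
	}
	return m[serviceURLRegexp.SubexpIndex("name")],
		m[serviceURLRegexp.SubexpIndex("namespace")], true
}

func main() {
	name, ns, ok := parseServiceURL("agent-hook.zarf.svc.cluster.local")
	fmt.Println(name, ns, ok)

	// The short form without ".cluster.local" does not match this pattern.
	_, _, ok = parseServiceURL("agent-hook.zarf.svc")
	fmt.Println(ok)
}
```

Note that the pattern only accepts the full `.svc.cluster.local` suffix, so a short name such as the `agent-hook.zarf.svc` used as `serverName` in the ServiceMonitor would not be parsed by it.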
26 changes: 26 additions & 0 deletions src/test/e2e/26_simple_packages_test.go
Expand Up @@ -13,6 +13,32 @@ import (
	"github.com/stretchr/testify/require"
)

func TestPrometheus(t *testing.T) {
	t.Log("E2E: Prometheus")
	e2e.SetupWithCluster(t)

	path := fmt.Sprintf("build/zarf-package-scrape-zarf-agent-%s.tar.zst", e2e.Arch)

	// Deploy Prometheus
	stdOut, stdErr, err := e2e.Zarf("package", "deploy", path, "--confirm")
	require.NoError(t, err, stdOut, stdErr)

	// tunnel, err := cluster.NewTunnel("monitoring", "svc", "prometheus-operator", 8080, 8080)
	tunnel, err := cluster.NewTunnel("monitoring", "svc", "", 8080, 0)
	require.NoError(t, err)
	err = tunnel.Connect("PROMETHEUS", false)
	require.NoError(t, err)
	defer tunnel.Close()

	// Check that 'curl' returns something.
	resp, err := http.Get(tunnel.HTTPEndpoint() + "/healthz")
	require.NoError(t, err, resp)
	require.Equal(t, 200, resp.StatusCode)

	stdOut, stdErr, err = e2e.Zarf("package", "remove", "scrape-zarf-agent", "--confirm")
	require.NoError(t, err, stdOut, stdErr)

}
func TestDosGames(t *testing.T) {
	t.Log("E2E: Dos games")
	e2e.SetupWithCluster(t)
Expand Down