charts,salt,tools: Add Thanos querier in front of Prometheus
As part of the kube-prometheus-stack chart, deploy the Thanos sidecar in
the Prometheus Pod, along with the Thanos sidecar service that the
Thanos querier uses to discover the sidecars' IPs.

Re-render the chart using:
```
./charts/render.py prometheus-operator \
  charts/kube-prometheus-stack.yaml \
  charts/kube-prometheus-stack/ \
  --namespace metalk8s-monitoring \
  --service-config grafana \
  metalk8s-grafana-config \
  metalk8s/addons/prometheus-operator/config/grafana.yaml \
  metalk8s-monitoring \
  --service-config prometheus \
  metalk8s-prometheus-config \
  metalk8s/addons/prometheus-operator/config/prometheus.yaml \
  metalk8s-monitoring \
  --service-config alertmanager \
  metalk8s-alertmanager-config \
  metalk8s/addons/prometheus-operator/config/alertmanager.yaml \
  metalk8s-monitoring \
  --service-config dex \
  metalk8s-dex-config \
  metalk8s/addons/dex/config/dex.yaml.j2 metalk8s-auth \
  --drop-prometheus-rules charts/drop-prometheus-rules.yaml \
  > salt/metalk8s/addons/prometheus-operator/deployed/chart.sls
```

Import Thanos helm chart from banzaicloud:
```
helm repo add banzaicloud-stable https://kubernetes-charts.banzaicloud.com
helm repo update
helm fetch -d charts --untar banzaicloud-stable/thanos
```

Note that we only deploy the Thanos querier and none of the other
components.
We also bump the Thanos image to v0.23.1, since the version set in the
helm chart (`appVersion` 0.17.1) is too old and does not support all the
Prometheus endpoints we need.

Render the Thanos helm chart using:
```
./charts/render.py thanos \
  charts/thanos.yaml charts/thanos/ \
  --namespace metalk8s-monitoring \
  > salt/metalk8s/addons/prometheus-operator/deployed/thanos-chart.sls
```

Then we point the Grafana Prometheus datasource at this Thanos querier,
and update the proxy ingress used by the MetalK8s UI to target the
Thanos querier as well.
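
For illustration only, here is a minimal sketch of how a client could
build an instant-query URL against the querier. The base URL is the one
set for the Grafana datasource in this commit; the `build_query_url`
helper is hypothetical, and the `dedup` parameter is assumed to be
honoured by the Thanos query API on top of the Prometheus-compatible
`/api/v1/query` endpoint.

```python
from urllib.parse import urlencode, urljoin

# Base URL taken from the Grafana datasource in charts/kube-prometheus-stack.yaml.
THANOS_QUERY_URL = "http://thanos-query-http:10902/"


def build_query_url(promql: str, dedup: bool = True,
                    base: str = THANOS_QUERY_URL) -> str:
    """Build an instant-query URL (hypothetical helper, not part of MetalK8s)."""
    # `dedup` is assumed to be a Thanos-specific query parameter.
    params = urlencode({"query": promql, "dedup": str(dedup).lower()})
    return urljoin(base, "api/v1/query") + "?" + params


print(build_query_url("up"))
```

The URL only resolves from inside the cluster, in the namespace where
the `thanos-query-http` service is deployed.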

Since we now go through Thanos to retrieve Prometheus information, the
order of the Prometheus rules changed slightly, so we extract them again
using:
```
./tools/rule_extractor/rule_extractor.py \
  -i <control-plane-ip> -p 8443 -t rules
```

NOTE: Tests were not changed much. We only added a sanity check to
ensure the Thanos querier Pod is running: the existing Prometheus tests
now all go through Thanos, so they implicitly test the Thanos querier.
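
As a rough idea of what such a sanity check amounts to (this is not the
test added by this commit), the sketch below inspects a Pod list shaped
like the output of `kubectl get pods -o json`. The `all_pods_running`
helper and the sample Pod name are hypothetical.

```python
from typing import Any, Dict


def all_pods_running(pod_list: Dict[str, Any]) -> bool:
    """Return True if the list is non-empty and every Pod phase is Running."""
    items = pod_list.get("items", [])
    return bool(items) and all(
        pod.get("status", {}).get("phase") == "Running" for pod in items
    )


# Hypothetical, trimmed-down Pod list for the Thanos querier.
sample = {"items": [
    {"metadata": {"name": "thanos-query-0"}, "status": {"phase": "Running"}},
]}
print(all_pods_running(sample))
```

An empty list is treated as a failure, so a missing Deployment does not
pass the check silently.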
TeddyAndrieux committed Oct 21, 2021
1 parent af16ae3 commit 889fee7
Showing 61 changed files with 7,731 additions and 2,290 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -48,6 +48,10 @@
PR[#3420](https://github.com/scality/metalk8s/pull/3420),
PR[#3501](https://github.com/scality/metalk8s/pull/3501))

- Deploy Thanos querier in front of Prometheus in order to make metrics
highly-available when we have multiple Prometheus instances
(PR[#3573](https://github.com/scality/metalk8s/pull/3573))

## Release 2.10.5 (in development)

## Release 2.10.4
1 change: 1 addition & 0 deletions buildchain/buildchain/constants.py
@@ -30,6 +30,7 @@
PROMETHEUS_ADAPTER_REPOSITORY: str = "docker.io/directxman12"
PROMETHEUS_OPERATOR_REPOSITORY: str = "quay.io/prometheus-operator"
PROMETHEUS_REPOSITORY: str = "quay.io/prometheus"
THANOS_REPOSITORY: str = "quay.io/thanos"

# Paths {{{

3 changes: 3 additions & 0 deletions buildchain/buildchain/image.py
@@ -222,6 +222,9 @@ def _operator_image(name: str, **kwargs: Any) -> targets.OperatorImage:
"node-exporter",
"prometheus",
],
constants.THANOS_REPOSITORY: [
"thanos",
],
}

REMOTE_NAMES: Dict[str, str] = {
1 change: 1 addition & 0 deletions buildchain/buildchain/salt_tree.py
@@ -357,6 +357,7 @@ def task(self) -> types.TaskDict:
Path("salt/metalk8s/addons/prometheus-operator/deployed/namespace.sls"),
Path("salt/metalk8s/addons/prometheus-operator/deployed/prometheus-rules.sls"),
Path("salt/metalk8s/addons/prometheus-operator/deployed/service-configuration.sls"),
Path("salt/metalk8s/addons/prometheus-operator/deployed/thanos-chart.sls"),
Path("salt/metalk8s/addons/ui/deployed/dependencies.sls"),
Path("salt/metalk8s/addons/ui/deployed/ingress.sls"),
Path("salt/metalk8s/addons/ui/deployed/init.sls"),
5 changes: 5 additions & 0 deletions buildchain/buildchain/versions.py
@@ -210,6 +210,11 @@ def _version_prefix(version: str, prefix: str = "v") -> str:
version="v0.48.1",
digest="sha256:2e7b61c86ee8b0aef4f5da8b6a4e51ecef249c9ccf4a329c5aa0c81e3fd074c1",
),
Image(
name="thanos",
version="v0.23.1",
digest="sha256:2f7d1ddc7877b076efbc3fa626b5003f7f197efbd777cff0eec2b20c2cd68d20",
),
# Local images
Image(
name="metalk8s-alert-logger",
10 changes: 10 additions & 0 deletions charts/kube-prometheus-stack.yaml
@@ -87,7 +87,13 @@ prometheusOperator:


prometheus:
  thanosService:
    enabled: true

  prometheusSpec:
    thanos:
      image: '__full_image__(thanos)'

    image:
      repository: '__image__(prometheus)'

@@ -149,6 +155,10 @@ grafana:
    image:
      repository: '__image__(k8s-sidecar)'

    datasources:
      # Service deployed by Thanos
      url: http://thanos-query-http:10902/

  nodeSelector:
    node-role.kubernetes.io/infra: ''

46 changes: 46 additions & 0 deletions charts/thanos.yaml
@@ -0,0 +1,46 @@
store:
  enabled: false

queryFrontend:
  enabled: false

compact:
  enabled: false

bucket:
  enabled: false

rule:
  enabled: false

# This one is deployed by Prometheus operator
sidecar:
  enabled: false

image:
  repository: '__image__(thanos)'
  tag: v0.23.1

query:
  enabled: true

  replicaLabels:
    - prometheus_replica

  storeDNSDiscovery: false
  sidecarDNSDiscovery: false

  stores:
    # Service deployed by Prometheus operator to expose Thanos sidecars
    - dnssrv+_grpc._tcp.prometheus-operator-thanos-discovery

  tolerations:
    - key: 'node-role.kubernetes.io/bootstrap'
      operator: 'Exists'
      effect: 'NoSchedule'
    - key: 'node-role.kubernetes.io/infra'
      operator: 'Exists'
      effect: 'NoSchedule'

  nodeSelector:
    node-role.kubernetes.io/infra: ''
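
The `replicaLabels` entry above tells the querier which label
distinguishes otherwise identical series coming from several Prometheus
replicas. The sketch below only illustrates the idea on instant samples
and is not Thanos's actual algorithm, which merges series at the sample
level. The `deduplicate` helper and the sample data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Labels = Dict[str, str]
Series = Tuple[Labels, float]


def deduplicate(series: List[Series],
                replica_label: str = "prometheus_replica") -> List[Series]:
    """Keep one sample per label set once the replica label is ignored."""
    seen: Dict[FrozenSet[Tuple[str, str]], Series] = {}
    for labels, value in series:
        stripped = {k: v for k, v in labels.items() if k != replica_label}
        key = frozenset(stripped.items())
        # Simplification: keep the first replica seen for each label set.
        seen.setdefault(key, (stripped, value))
    return list(seen.values())


# The same `up` series, scraped by two hypothetical Prometheus replicas.
samples = [
    ({"__name__": "up", "job": "node", "prometheus_replica": "prometheus-0"}, 1.0),
    ({"__name__": "up", "job": "node", "prometheus_replica": "prometheus-1"}, 1.0),
]
print(len(deduplicate(samples)))
```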
21 changes: 21 additions & 0 deletions charts/thanos/.helmignore
@@ -0,0 +1,21 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*~
# Various IDEs
.project
.idea/
*.tmproj
18 changes: 18 additions & 0 deletions charts/thanos/Chart.yaml
@@ -0,0 +1,18 @@
apiVersion: v1
appVersion: 0.17.1
description: Thanos is a set of components that can be composed into a highly available
  metric system with unlimited storage capacity, which can be added seamlessly on
  top of existing Prometheus deployments.
icon: https://raw.githubusercontent.com/thanos-io/thanos/master/docs/img/Thanos-logo_fullmedium.png
keywords:
- thanos
- prometheus
- metrics
maintainers:
- email: [email protected]
  name: Banzai Cloud
name: thanos
sources:
- https://github.com/thanos-io/thanos
- https://github.com/banzaicloud/banzai-charts/tree/master/thanos
version: 0.4.6