Add Alertmanager config and templates in Helm chart #188

Merged (32 commits) on Jan 15, 2025

@TheoBrigitte TheoBrigitte commented Dec 10, 2024

Towards: giantswarm/roadmap#3746

This PR does a couple of things to get the Alertmanager config into a Secret in the Helm chart:

  • Helm
    • Add secret resource, embedding raw and templated alertmanager files
    • Expose alertmanager templates values as helm chart values
  • Alertmanager
    • Remove all Mimir-related conditions from templates
    • Escape template-in-template expressions
    • Split template into URL and notification templates, to reduce template-in-template escaping
    • Reuse Slack actions, to reduce template-in-template escaping
    • Drop template directive, dynamically set by the operator
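
The escaping item in the list above is needed because Helm and Alertmanager both use `{{ }}` delimiters. Wrapping an Alertmanager expression in a Helm raw-string action (`{{`, a backtick-quoted string, then `}}`) makes Helm emit the inner text unchanged. Below is a rough sketch of that rewrite for single-quoted values, assuming a shell with `sed -E`. The helper name `escape_for_helm` is hypothetical; the PR itself made these changes with a patch, not with this script.

```shell
# Hypothetical helper (not part of the PR): wrap a single-quoted
# Alertmanager template value in a Helm raw-string action so that
# Helm leaves the inner {{ ... }} expression alone.
escape_for_helm() {
  # Only values that start with {{ and end with }} inside single
  # quotes at the end of the line are rewritten; other lines pass through.
  sed -E "s/'(\{\{.*\}\})'\$/{{\`\1\`}}/"
}

# Sample lines taken from the Slack actions shown in the patch.
escape_for_helm <<'EOF'
      url: '{{ template "__runbookurl" . }}'
      style: '{{ if eq .Status "firing" }}primary{{ else }}default{{ end }}'
      text: ':mag: Query'
EOF
```

The `text:` line is left untouched because it contains no template expression. Double-quoted values, such as the `tags:` line in the patch, are not handled by this sketch.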

How I generated the new Alertmanager config and notification template

Alertmanager config
wget https://raw.githubusercontent.com/giantswarm/prometheus-meta-operator/refs/heads/main/files/templates/alertmanager/alertmanager.yaml
sed -i -e 's/\[\[/{{/g' -e 's/\]\]/}}/g' alertmanager.yaml
patch alertmanager.yaml < <following patch>
13,15d12
< templates:
< - '/etc/alertmanager/config/*.tmpl'
<
24d20
<   {{- if .MimirEnabled }}
34d29
<   {{- end }}
173d167
< {{- if .MimirEnabled }}
187d180
< {{- end }}
206c199
<     actions:
---
>     actions: &slack-actions
209,210c202,203
<       url: '{{ template "__runbookurl" . }}'
<       style: '{{ if eq .Status "firing" }}primary{{ else }}default{{ end }}'
---
>       url: {{`{{ template "__runbookurl" . }}`}}
>       style: {{`{{ if eq .Status "firing" }}primary{{ else }}default{{ end }}`}}
213c206
<       url: '{{ template "__alert_linked_postmortems" . }}'
---
>       url: {{`{{ template "__alert_linked_postmortems" . }}`}}
216c209
<       url: '{{ template "__alerturl" . }}'
---
>       url: {{`{{ template "__alerturl" . }}`}}
219c212
<       url: '{{ template "__dashboardurl" . }}'
---
>       url: {{`{{ template "__dashboardurl" . }}`}}
222,223c215,216
<       url: '{{ template "__alert_silence_link" .}}'
<       style: '{{ if eq .Status "firing" }}danger{{ else }}default{{ end }}'
---
>       url: {{`{{ template "__alert_silence_link" .}}`}}
>       style: {{`{{ if eq .Status "firing" }}danger{{ else }}default{{ end }}`}}
242,259c235
<     actions:
<     - type: button
<       text: ':green_book: OpsRecipe'
<       url: '{{ template "__runbookurl" . }}'
<       style: '{{ if eq .Status "firing" }}primary{{ else }}default{{ end }}'
<     - type: button
<       text: ':coffin: Linked PMs'
<       url: '{{ template "__alert_linked_postmortems" . }}'
<     - type: button
<       text: ':mag: Query'
<       url: '{{ template "__alerturl" . }}'
<     - type: button
<       text: ':grafana: Dashboard'
<       url: '{{ template "__dashboardurl" . }}'
<     - type: button
<       text: ':no_bell: Silence'
<       url: '{{ template "__alert_silence_link" .}}'
<       style: '{{ if eq .Status "firing" }}danger{{ else }}default{{ end }}'
---
>     actions: *slack-actions
278,295c254
<     actions:
<     - type: button
<       text: ':green_book: OpsRecipe'
<       url: '{{ template "__runbookurl" . }}'
<       style: '{{ if eq .Status "firing" }}primary{{ else }}default{{ end }}'
<     - type: button
<       text: ':coffin: Linked PMs'
<       url: '{{ template "__alert_linked_postmortems" . }}'
<     - type: button
<       text: ':mag: Query'
<       url: '{{ template "__alerturl" . }}'
<     - type: button
<       text: ':grafana: Dashboard'
<       url: '{{ template "__dashboardurl" . }}'
<     - type: button
<       text: ':no_bell: Silence'
<       url: '{{ template "__alert_silence_link" . }}'
<       style: '{{ if eq .Status "firing" }}danger{{ else }}default{{ end }}'
---
>     actions: *slack-actions
314,331c273
<     actions:
<     - type: button
<       text: ':green_book: OpsRecipe'
<       url: '{{ template "__runbookurl" . }}'
<       style: '{{ if eq .Status "firing" }}primary{{ else }}default{{ end }}'
<     - type: button
<       text: ':coffin: Linked PMs'
<       url: '{{ template "__alert_linked_postmortems" . }}'
<     - type: button
<       text: ':mag: Query'
<       url: '{{ template "__alerturl" . }}'
<     - type: button
<       text: ':grafana: Dashboard'
<       url: '{{ template "__dashboardurl" . }}'
<     - type: button
<       text: ':no_bell: Silence'
<       url: '{{ template "__alert_silence_link" . }}'
<       style: '{{ if eq .Status "firing" }}danger{{ else }}default{{ end }}'
---
>     actions: *slack-actions
350,367c292
<     actions:
<     - type: button
<       text: ':green_book: OpsRecipe'
<       url: '{{ template "__runbookurl" . }}'
<       style: '{{ if eq .Status "firing" }}primary{{ else }}default{{ end }}'
<     - type: button
<       text: ':coffin: Linked PMs'
<       url: '{{ template "__alert_linked_postmortems" . }}'
<     - type: button
<       text: ':mag: Query'
<       url: '{{ template "__alerturl" . }}'
<     - type: button
<       text: ':grafana: Dashboard'
<       url: '{{ template "__dashboardurl" . }}'
<     - type: button
<       text: ':no_bell: Silence'
<       url: '{{ template "__alert_silence_link" . }}'
<       style: '{{ if eq .Status "firing" }}danger{{ else }}default{{ end }}'
---
>     actions: *slack-actions
382,399c307
<     actions:
<     - type: button
<       text: ':green_book: OpsRecipe'
<       url: '{{ template "__runbookurl" . }}'
<       style: '{{ if eq .Status "firing" }}primary{{ else }}default{{ end }}'
<     - type: button
<       text: ':coffin: Linked PMs'
<       url: '{{ template "__alert_linked_postmortems" . }}'
<     - type: button
<       text: ':mag: Query'
<       url: '{{ template "__alerturl" . }}'
<     - type: button
<       text: ':grafana: Dashboard'
<       url: '{{ template "__dashboardurl" . }}'
<     - type: button
<       text: ':no_bell: Silence'
<       url: '{{ template "__alert_silence_link" .}}'
<       style: '{{ if eq .Status "firing" }}danger{{ else }}default{{ end }}'
---
>     actions: *slack-actions
418,435c326
<     actions:
<     - type: button
<       text: ':green_book: OpsRecipe'
<       url: '{{ template "__runbookurl" . }}'
<       style: '{{ if eq .Status "firing" }}primary{{ else }}default{{ end }}'
<     - type: button
<       text: ':coffin: Linked PMs'
<       url: '{{ template "__alert_linked_postmortems" . }}'
<     - type: button
<       text: ':mag: Query'
<       url: '{{ template "__alerturl" . }}'
<     - type: button
<       text: ':grafana: Dashboard'
<       url: '{{ template "__dashboardurl" . }}'
<     - type: button
<       text: ':no_bell: Silence'
<       url: '{{ template "__alert_silence_link" . }}'
<       style: '{{ if eq .Status "firing" }}danger{{ else }}default{{ end }}'
---
>     actions: *slack-actions
450,467c341
<     actions:
<     - type: button
<       text: ':green_book: OpsRecipe'
<       url: '{{ template "__runbookurl" . }}'
<       style: '{{ if eq .Status "firing" }}primary{{ else }}default{{ end }}'
<     - type: button
<       text: ':coffin: Linked PMs'
<       url: '{{ template "__alert_linked_postmortems" . }}'
<     - type: button
<       text: ':mag: Query'
<       url: '{{ template "__alerturl" . }}'
<     - type: button
<       text: ':grafana: Dashboard'
<       url: '{{ template "__dashboardurl" . }}'
<     - type: button
<       text: ':no_bell: Silence'
<       url: '{{ template "__alert_silence_link" .}}'
<       style: '{{ if eq .Status "firing" }}danger{{ else }}default{{ end }}'
---
>     actions: *slack-actions
482,499c356
<     actions:
<     - type: button
<       text: ':green_book: OpsRecipe'
<       url: '{{ template "__runbookurl" . }}'
<       style: '{{ if eq .Status "firing" }}primary{{ else }}default{{ end }}'
<     - type: button
<       text: ':coffin: Linked PMs'
<       url: '{{ template "__alert_linked_postmortems" . }}'
<     - type: button
<       text: ':mag: Query'
<       url: '{{ template "__alerturl" . }}'
<     - type: button
<       text: ':grafana: Dashboard'
<       url: '{{ template "__dashboardurl" . }}'
<     - type: button
<       text: ':no_bell: Silence'
<       url: '{{ template "__alert_silence_link" .}}'
<       style: '{{ if eq .Status "firing" }}danger{{ else }}default{{ end }}'
---
>     actions: *slack-actions
504c361
<     tags: "{{ (index .Alerts 0).Labels.alertname }},{{ (index .Alerts 0).Labels.cluster_type }},{{ (index .Alerts 0).Labels.severity }},{{ (index .Alerts 0).Labels.team }},{{ (index .Alerts 0).Labels.area }},{{ (index .Alerts 0).Labels.service_priority }},{{ (index .Alerts 0).Labels.provider }},{{ (index .Alerts 0).Labels.installation }},{{ (index .Alerts 0).Labels.pipeline }},{{ (index .Alerts 0).Labels.customer }}"
---
>     tags: {{`{{ (index .Alerts 0).Labels.alertname }},{{ (index .Alerts 0).Labels.cluster_type }},{{ (index .Alerts 0).Labels.severity }},{{ (index .Alerts 0).Labels.team }},{{ (index .Alerts 0).Labels.area }},{{ (index .Alerts 0).Labels.service_priority }},{{ (index .Alerts 0).Labels.provider }},{{ (index .Alerts 0).Labels.installation }},{{ (index .Alerts 0).Labels.pipeline }},{{ (index .Alerts 0).Labels.customer }}`}}
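
The `sed` step in the commands above rewrites the `[[ ]]` delimiters used in the prometheus-meta-operator file to `{{ }}` before the patch is applied. Here is a minimal sketch of that conversion on sample input, assuming a POSIX shell with `sed`. The function name `convert_delims` is illustrative only, and it reads stdin instead of editing `alertmanager.yaml` in place.

```shell
# Hypothetical helper: the same two substitutions as the sed command
# in the PR description, applied to stdin rather than to a file.
convert_delims() {
  sed -e 's/\[\[/{{/g' -e 's/\]\]/}}/g'
}

# The first line uses the [[ ]] delimiters; the second already uses
# {{ }} and passes through unchanged.
convert_delims <<'EOF'
  [[- if .MimirEnabled ]]
      url: '{{ template "__runbookurl" . }}'
EOF
```

After this step both kinds of expression share the same delimiters, which is why the patch wraps the Alertmanager expressions in raw strings.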
Notification template
wget https://raw.githubusercontent.com/giantswarm/prometheus-meta-operator/refs/heads/main/files/templates/alertmanager/notification-template.tmpl
patch notification-template.tmpl < <following patch>
3,21d2
< {{ define "__alerturl" }}
< [[- if .MimirEnabled -]]
< [[ .GrafanaAddress ]]/alerting/Mimir/{{ .CommonLabels.alertname }}/find
< [[- else -]]
< {{ .ExternalURL }}/#/alerts?receiver={{ .Receiver }}&silenced=false&inhibited=false&active=true&filter=%7Balertname%3D%22{{ .CommonLabels.alertname }}%22%7D
< [[- end -]]
< {{ end }}
<
< {{ define "__dashboardurl" -}}{{ if match "^https://.+" (index .Alerts 0).Annotations.dashboard }}{{ (index .Alerts 0).Annotations.dashboard }}{{ else }}[[ .GrafanaAddress ]]/d/{{ (index .Alerts 0).Annotations.dashboard }}{{ end }}{{- end }}
< {{ define "__runbookurl" -}}https://intranet.giantswarm.io/docs/support-and-ops/ops-recipes/{{ (index .Alerts 0).Annotations.opsrecipe }}{{- end }}
<
< {{ define "__queryurl" }}
< [[- if .MimirEnabled -]]
< [[ .GrafanaAddress ]]/alerting/Mimir/{{ .CommonLabels.alertname }}/find
< [[- else -]]
< {{ (index .Alerts 0).GeneratorURL }}
< [[- end -]]
< {{ end }}
<
59d39
< [[- if .MimirEnabled ]]
61,64d40
< [[- else ]]
< 🔔 Alertmanager {{ template "__alerturl" . }}
< 👀 Query: {{ template "__queryurl" . }}
< [[- end ]]

I would like some opinions before I continue in this direction, because I feel there are a lot of workarounds here to get this config into a Secret, and it could be easier to have it directly in code. Also, does anyone remember what the ProxyURL is used for? It seems to be OpsGenie-related; I ignored it for now, but I have a feeling it could be important.

@TheoBrigitte TheoBrigitte requested a review from a team as a code owner December 10, 2024 09:05
@TheoBrigitte TheoBrigitte self-assigned this Dec 10, 2024
@TheoBrigitte TheoBrigitte changed the base branch from main to alertmanager-config December 10, 2024 09:06
send_resolved: true
actions: *slack-actions

- name: team_turtles_slack
Contributor

Maybe this should go away.

Member Author

True

@@ -0,0 +1,18 @@
{{`
{{ define "__alerturl" }}
`}}{{ .alertmanager.grafanaAddress }}{{`/alerting/Mimir/{{ .CommonLabels.alertname }}/find
Contributor

I think we should be able to replace this with grafanaExploreURL or queryFromGeneratorURL instead (see https://grafana.com/docs/mimir/latest/references/architecture/components/alertmanager/#templating), right?

Member Author

Yes, I'll look into this later.

Contributor

I'm trying to see if we can get rid of this whole file. That would make things simpler, IMO.

Member Author

When using those, we go back to having long URLs containing the alert query. I don't think we want to go back there.

Contributor

Sure, we can discuss this :)

@TheoBrigitte TheoBrigitte force-pushed the alertmanager-config-helm branch from 4b4c161 to 11abd5f Compare December 10, 2024 15:18
@TheoBrigitte TheoBrigitte changed the base branch from alertmanager-config to main December 10, 2024 17:45
TheoBrigitte and others added 8 commits December 10, 2024 18:48
- Add secret resource, embedding raw and templated alertmanager files
- Expose alertmanager templates values as helm chart values
- Remove all Mimir related conditions
- Split template into url and notification templates
- Drop template directive, dynamically set by the operator
- Escape template in template
- Re-use slack actions
This fixes the infamous error: error calling tpl: cannot retrieve Template.Basepath from values inside tpl function

It uses .Values in templates to access values and passes the $ root context to tpl
Co-authored-by: Quentin Bisson <[email protected]>
@TheoBrigitte TheoBrigitte force-pushed the alertmanager-config-helm branch from 9d0c548 to 5d320b9 Compare December 10, 2024 17:50
* Team: {{ (index .Alerts 0).Labels.team }}
* Area: {{ (index .Alerts 0).Labels.area }} / {{ (index .Alerts 0).Labels.topic }}
* Instances:{{ range .Alerts.Firing }}
🔥 {{ if .Labels.instance }}{{ .Labels.instance }}: {{ end }}{{ .Annotations.description }}{{ end }}
{{- end }}

# This builds the silence URL. We exclude the alertname in the range
Contributor

I think we need to remove this silence URL, right?

@TheoBrigitte TheoBrigitte enabled auto-merge (squash) January 15, 2025 12:06
@TheoBrigitte TheoBrigitte disabled auto-merge January 15, 2025 12:09
@TheoBrigitte TheoBrigitte enabled auto-merge (squash) January 15, 2025 12:44
@TheoBrigitte TheoBrigitte merged commit 510b5f4 into main Jan 15, 2025
9 checks passed
@TheoBrigitte TheoBrigitte deleted the alertmanager-config-helm branch January 15, 2025 12:52
TheoBrigitte added a commit that referenced this pull request Jan 15, 2025
TheoBrigitte added a commit that referenced this pull request Jan 15, 2025
3 participants