
[Bug] Dashboards created by a GrafanaDashboard resource are not cleaned up when the resource is removed #1581

Closed
ak185158 opened this issue Jun 13, 2024 · 10 comments
Assignees
Labels
bug Something isn't working triage/needs-information Indicates an issue needs more information in order to work on it.

Comments


ak185158 commented Jun 13, 2024

Describe the bug

When a GrafanaDashboard custom resource is used to create/manage a dashboard, it is expected that the resulting dashboard in Grafana would be cleaned up when the resource is removed. This does not appear to be the case, and it results in stale/orphaned dashboards that persist.

Version

v5.9.2

To Reproduce
Steps to reproduce the behavior:

  1. Create a GrafanaDashboard custom resource
  2. Verify the corresponding dashboard instance is created in Grafana from the GrafanaDashboard resource
  3. Remove the GrafanaDashboard custom resource
  4. Verify the dashboard instance persists even though the originating custom resource that created it has been removed
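The steps above can be sketched as a minimal script. The dashboard name, namespace, and `instanceSelector` label here are hypothetical placeholders and must be adapted to match an existing Grafana CR in your cluster:

```shell
# 1. Create a minimal GrafanaDashboard CR (names/labels are placeholders)
kubectl apply -n my-grafana-resources -f - <<'EOF'
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
  name: example-dashboard
spec:
  instanceSelector:
    matchLabels:
      dashboards: gitops
  json: |
    {
      "title": "example-dashboard",
      "uid": "example-dashboard"
    }
EOF

# 2. Verify the dashboard appears in the matching Grafana instance (UI or HTTP API)

# 3. Remove the custom resource again
kubectl delete -n my-grafana-resources grafanadashboard/example-dashboard

# 4. Check whether the dashboard is gone from Grafana; per this report, it persists
```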

Expected behavior
The Grafana operator should remove the dashboard that was created by the custom resource once the resource is no longer present. Not doing so leaves stale, orphaned dashboards behind after the underlying resource that created them has been removed.
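One way to narrow this down is to check whether the operator attached a finalizer to the CR before deleting it: without a finalizer, the CR is removed immediately and cleanup depends entirely on the operator observing the deletion event. This is a generic Kubernetes inspection, not documented operator behavior; the resource name and namespace are placeholders:

```shell
# Show any finalizers on the GrafanaDashboard CR (names are placeholders)
kubectl get grafanadashboard example-dashboard -n my-grafana-resources \
  -o jsonpath='{.metadata.finalizers}{"\n"}'

# Delete without waiting, then re-read the object: if it lingers with a
# deletionTimestamp set, a finalizer is blocking until cleanup runs
kubectl delete grafanadashboard example-dashboard -n my-grafana-resources --wait=false
kubectl get grafanadashboard example-dashboard -n my-grafana-resources -o yaml
```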

@ak185158 ak185158 added bug Something isn't working needs triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 13, 2024
@theSuess theSuess self-assigned this Jun 17, 2024
@theSuess
Member

Hey, I was unable to reproduce this issue. Maybe this has something to do with the permissions of your setup. How did you deploy the Grafana operator?

@theSuess theSuess added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 17, 2024
@chaijunkin

I have a similar issue when deploying Grafana dashboards (operator-managed) via Argo CD. I'm not sure I can reproduce the steps exactly, but I'll list them below:
Steps:
1 - original dashboard
2 - upgrade the dashboard version (change the original folder path name and remove the original dashboard)
3 - the dashboard is not deleted


mkyc commented Jul 5, 2024

Exactly the same issue here, but it is inconsistent: during tests I encounter it on random occasions.

Here are steps to reproduce (I'm copying from my k3d setup script):

setup

  1. install operator
kubectl create namespace pmon-grafana-operator || true
helm upgrade -i grafana-operator oci://ghcr.io/grafana/helm-charts/grafana-operator --version v5.9.2 --namespace pmon-grafana-operator --values grafana-operator.values.yaml --wait

grafana-operator.values.yaml:

serviceMonitor:
  enabled: true

  2. install Grafana
kubectl create namespace pmon-grafana || true
kubectl apply -f grafana.yaml --namespace pmon-grafana

grafana.yaml:

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana-var-lib-grafana-pv
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/var-lib-grafana
...
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-var-lib-grafana-pvc
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
...
---
apiVersion: grafana.integreatly.org/v1beta1
kind: Grafana
metadata:
  name: grafana
  labels:
    dashboards: gitops
spec:
  deployment:
    spec:
      template:
        spec:
          containers:
            - name: grafana
              volumeMounts:
                - name: grafana-var-lib-grafana-pv
                  mountPath: /var/lib/grafana
          volumes:
            - name: grafana-var-lib-grafana-pv
              persistentVolumeClaim:
                claimName: grafana-var-lib-grafana-pvc
  service:
    spec:
      type: NodePort
    metadata:
      labels:
        app: grafana
  config:
    log:
      mode: "console"
    security:
      admin_user: root
      admin_password: secret
      disable_gravatar: "true"
    auth.anonymous:
      enabled: "false"
...
  3. install Grafana resources:
kubectl create namespace pmon-grafana-resources || true
kubectl apply -f grafana-resources.yaml --namespace pmon-grafana-resources

grafana-resources.yaml:

---
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDatasource
metadata:
  name: loki-datasource
spec:
  allowCrossNamespaceImport: true
  instanceSelector:
    matchLabels:
      dashboards: gitops
  datasource:
    name: loki
    type: loki
    uid: loki1
    access: proxy
    url: http://lgtm-loki-gateway.pmon-lgtm.svc.cluster.local
    isDefault: true
    jsonData:
      timeout: 60
      maxLines: 1000
...
---
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDatasource
metadata:
  name: mimir-datasource
spec:
  allowCrossNamespaceImport: true
  instanceSelector:
    matchLabels:
      dashboards: gitops
  datasource:
    name: mimir
    uid: mimir1
    type: prometheus
    access: proxy
    url: http://lgtm-mimir-nginx.pmon-lgtm.svc.cluster.local/prometheus
    isDefault: false
...
---
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaFolder
metadata:
  name: test-folder
spec:
  allowCrossNamespaceImport: true
  instanceSelector:
    matchLabels:
      dashboards: gitops

  # If title is not defined, the value will be taken from metadata.name
  title: lalala/lilili
  # When permissions value is empty/absent, a folder is created with default permissions
  # When empty JSON is passed ("{}"), the access is stripped for everyone except for Admin (default Grafana behaviour)
  permissions: |
    {
      "items": [
        {
          "role": "Admin",
          "permission": 4
        },
        {
          "role": "Editor",
          "permission": 2
        }, 
        {
          "role": "Viewer",
          "permission": 1
        }
      ]
    }
...
---
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
  name: coredns-test-dashboard
spec:
  allowCrossNamespaceImport: true
  instanceSelector:
    matchLabels:
      dashboards: gitops
  grafanaCom:
    id: 15762
    revision: 18
...

result

as expected:

[Screenshot 2024-07-05 at 13 23 55]

("Logs/App" was added manually to check that the operator doesn't interfere with dashboards it does not manage.)

remove

Option 1:

kubectl delete -f grafana-resources.yaml --namespace pmon-grafana-resources || true

with grafana-resources.yaml from the previous step:

[Screenshot 2024-07-05 at 13 41 58]

There are errors regarding the Loki datasource during the reconciliation loop, but they eventually go away and are, I assume, unrelated.

Option 2:

kubectl delete --namespace pmon-grafana-resources GrafanaDashboard/coredns-test-dashboard 

not even a single log message, and:

[Screenshot 2024-07-05 at 13 47 21]

so nothing was removed, and it looks like the operator didn't even notice that the resource was deleted.

But... sometimes it works. If I run the same sequence of steps 3-5 times:

kubectl apply -f grafana-resources.yaml --namespace pmon-grafana-resources
kubectl delete --namespace pmon-grafana-resources GrafanaDashboard/coredns-test-dashboard 

eventually it will start removing that dashboard:

[Screenshot 2024-07-05 at 13 55 49]

and it keeps adding and removing correctly on subsequent repeats.

It looks to me like the operator is sometimes not receiving events for removed dashboards. I didn't notice this for folders, though, only for dashboards.
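To check whether the operator sees the deletion at all, one could tail its logs while deleting the CR. The namespaces and resource names come from the repro above; the deployment name is assumed to match the Helm release and may differ in other setups:

```shell
# Terminal 1: follow operator logs, filtering for dashboard-related lines
kubectl logs -n pmon-grafana-operator deploy/grafana-operator -f | grep -i dashboard

# Terminal 2: delete the dashboard CR and watch for a corresponding log line;
# no log output at all would suggest the deletion event was never handled
kubectl delete --namespace pmon-grafana-resources grafanadashboard/coredns-test-dashboard
```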

Collaborator

pb82 commented Jul 8, 2024

Thanks @mkyc, I'll try to reproduce from the provided steps now.


github-actions bot commented Aug 8, 2024

This issue hasn't been updated for a while, marking as stale, please respond within the next 7 days to remove this label

@github-actions github-actions bot added the stale label Aug 8, 2024
@Fantaztig

As I read @mkyc's example, the commands delete both the dashboard and its containing folder at once, which results in neither being deleted from the Grafana instance.
This looks like the same behavior described in #1626, right? @ak185158, do you see the same issue when deleting only the dashboard?
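If the simultaneous folder deletion is indeed masking the dashboard deletion (as in #1626), deleting the resources one at a time, dashboard first, should isolate the behavior. The resource names are taken from the repro earlier in this thread:

```shell
# Delete only the dashboard first and give the operator time to reconcile
kubectl delete --namespace pmon-grafana-resources grafanadashboard/coredns-test-dashboard
# ...then check in Grafana whether the dashboard is gone...

# Only afterwards delete the containing folder separately
kubectl delete --namespace pmon-grafana-resources grafanafolder/test-folder
```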

@github-actions github-actions bot removed the stale label Aug 15, 2024
@yurii-kryvosheia

We recently added a PVC to our instance and noticed that some dashboards were still hanging around in the UI even though their custom resources had been deleted long ago. We guessed this was related to persistence, but a few creation/deletion tests didn't confirm that.
I can confirm the behavior is inconsistent.

@theSuess
Member

With #1728, folder deletions are now forced, which solves this issue.
