Overriding Observability

Sam Barker edited this page Jul 1, 2022 · 3 revisions

Use Cases

  • As a developer, I want to test Observability changes

Method

  1. Fork https://github.com/bf2fc6cc711aee1a0c2a/observability-resources-mk/
  2. Create a branch on your fork for the changes.
  3. Override OBSERVABILITY_CONFIG_REPO to point at your fork.
  4. Override OBSERVABILITY_CONFIG_TAG to point at your branch.
  5. Deploy kas-install as normal.
  6. Patch observability-stack to shorten the resyncPeriod, so that changes pushed to your branch are picked up sooner: oc patch observabilities.observability.redhat.com observability-stack --type='json' -p='[{"op": "replace", "path": "/spec/resyncPeriod", "value": "60s"}]'
  7. Develop your observability changes locally, pushing them to your branch.
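
The patch in step 6 can also be generated rather than typed by hand, which helps when trying different resync periods. The following is a minimal sketch only: the helper names resync_patch and resync_command are hypothetical, and the script only builds and prints the command; it does not run oc.

```python
import json
import shlex


def resync_patch(period="60s"):
    """Return the JSON patch body that replaces /spec/resyncPeriod."""
    patch = [{"op": "replace", "path": "/spec/resyncPeriod", "value": period}]
    return json.dumps(patch)


def resync_command(period="60s"):
    """Build the argument list for the oc patch call from step 6 (not executed here)."""
    return [
        "oc", "patch",
        "observabilities.observability.redhat.com", "observability-stack",
        "--type=json",
        "-p", resync_patch(period),
    ]


if __name__ == "__main__":
    # shlex.join quotes the JSON so the line can be pasted into a shell.
    print(shlex.join(resync_command("60s")))
```

Passing a different period, for example resync_command("30s"), changes only the "value" field of the patch.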

Useful debugging commands

  • oc cp -c config-reloader prometheus-kafka-prometheus-0:/etc/prometheus/config_out/prometheus.env.yaml prom.yaml
    Copies the rendered Prometheus config file out of the config-reloader container of the prometheus-kafka-prometheus-0 pod into a local file, prom.yaml. The config-reloader is responsible for turning the operator resources into a Prometheus config file.

Potential issues / fixes

  • On a clean install, prometheus-kafka-prometheus-0 entered CrashLoopBackOff with 2/3 containers running. This was caused by:
level=error ts=2022-07-01T02:11:57.511Z caller=main.go:290 msg="Error loading config (--config.file=/etc/prometheus/config_out/prometheus.env.yaml)" err="parsing YAML file /etc/prometheus/config_out/prometheus.env.yaml: relabel configuration for replace action requires 'target_label' value"

Despite fixing the configuration error and being able to see the corrected YAML in the PodMonitor definition, the pod did not recover. Deleting and re-creating the pod was required to regenerate the configuration file.
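
Since recovery needs a pod restart, it can be worth checking a PodMonitor for this class of error before applying it. The sketch below is an illustration under assumptions, not part of the operator tooling: the function name missing_target_labels is hypothetical, the PodMonitor is assumed to be already loaded into a Python dict, and only the replace action (the Prometheus default when no action is given) is checked.

```python
def missing_target_labels(pod_monitor):
    """Return (endpoint_index, field, rule_index) for replace rules without targetLabel."""
    problems = []
    endpoints = pod_monitor.get("spec", {}).get("podMetricsEndpoints", [])
    for e_idx, endpoint in enumerate(endpoints):
        for field in ("relabelings", "metricRelabelings"):
            for r_idx, rule in enumerate(endpoint.get(field, [])):
                # Prometheus treats a rule with no action as "replace".
                action = str(rule.get("action", "replace")).lower()
                if action == "replace" and not rule.get("targetLabel"):
                    problems.append((e_idx, field, r_idx))
    return problems


if __name__ == "__main__":
    broken = {
        "spec": {
            "podMetricsEndpoints": [
                {
                    "metricRelabelings": [
                        {
                            "regex": "kafka_broker_state;0",
                            "separator": ";",
                            "replacement": "NOT_RUNNING",
                            "action": "replace",
                        }
                    ]
                }
            ]
        }
    }
    for problem in missing_target_labels(broken):
        print("replace rule without targetLabel at", problem)
```

An empty result does not prove the config is valid; it only rules out the specific error quoted above.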

  • PodMonitor YAML definitions use a different syntax from Prometheus itself (and thus from relabeler.promlabs.com). For example:
spec:
  podMetricsEndpoints:
      metricRelabelings:
        - source_labels:
            - __name__
            - state_ordinal
          target_label: state
          regex: kafka_broker_state;0
          separator: ;
          replacement: NOT_RUNNING
          action: replace

is invalid: the snake_case keys are dropped, which results in:

spec:
  podMetricsEndpoints:
      metricRelabelings:
        - regex: kafka_broker_state;0
          separator: ;
          replacement: NOT_RUNNING
          action: replace

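This leaves a replace rule with no target label, which Prometheus rejects when loading its config (as in the error shown above). The fix is to use the camelCase field names: source_labels -> sourceLabels and target_label -> targetLabel.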
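
When a rule has been prototyped in Prometheus syntax, the key renaming can be done mechanically. The following is a small sketch, assuming the rule is available as a Python dict; the helper names to_camel and to_pod_monitor_rule are hypothetical, and only the keys are converted, so label names such as __name__ in the values are left alone.

```python
import re


def to_camel(key):
    """Convert a snake_case key to camelCase, e.g. source_labels -> sourceLabels."""
    return re.sub(r"_([a-z])", lambda m: m.group(1).upper(), key)


def to_pod_monitor_rule(prom_rule):
    """Return a copy of a Prometheus relabel rule with PodMonitor-style keys."""
    return {to_camel(key): value for key, value in prom_rule.items()}


if __name__ == "__main__":
    prom_rule = {
        "source_labels": ["__name__", "state_ordinal"],
        "target_label": "state",
        "regex": "kafka_broker_state;0",
        "separator": ";",
        "replacement": "NOT_RUNNING",
        "action": "replace",
    }
    # Keys become sourceLabels and targetLabel; the rest are unchanged.
    print(to_pod_monitor_rule(prom_rule))
```

Keys without underscores (regex, separator, replacement, action) pass through unchanged, which matches the fields used in the example above.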

Useful links