When running Elasticsearch on Kubernetes, we occasionally see the Elasticsearch cluster status get stuck in the ApplyingChanges phase.
The root cause of the ApplyingChanges status is #5979, which surfaces with this error:
400 Bad Request: {Status:400 Error:{CausedBy:{Reason: Type:}
Reason:
Desired nodes with history [d7bf9e8e-47a0-40ad-8156-400ae519eb6c] and version [2] already exists with a different definition
We know how to work around the issue by adding an annotation.
However, I want to configure a Prometheus alert that fires whenever the Elasticsearch cluster has been in the ApplyingChanges phase for more than 15 minutes.
The problem is that I cannot find any Prometheus metric, exported by Elasticsearch, the Elasticsearch Operator, or the Elasticsearch Exporter, that exposes the "phase" value ApplyingChanges.
The cluster health metric reports Green while the cluster is stuck, so it isn't a reliable indicator of this failure state.
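To make the goal concrete, here is a minimal sketch of the kind of metric I am after. It assumes the phase can be read from `status.phase` on the Elasticsearch custom resource (for example from `kubectl get elasticsearch -o json`); the metric name `elasticsearch_cr_phase`, the function name, and the list of phases are all hypothetical and only there for illustration.

```python
# Sketch only: turns one Elasticsearch custom resource (already parsed from
# JSON) into Prometheus text-format samples, one per phase.
# Assumption: the phase is available under status.phase of the resource.
# The phase list below is an assumed set, not an authoritative one.
ASSUMED_PHASES = ("Ready", "ApplyingChanges", "MigratingData", "Stalled", "Invalid")


def phase_metric_lines(resource, metric="elasticsearch_cr_phase"):
    """Return one gauge sample per assumed phase; the current phase gets 1."""
    meta = resource.get("metadata", {})
    current = resource.get("status", {}).get("phase", "")
    lines = []
    for phase in ASSUMED_PHASES:
        value = 1 if phase == current else 0
        lines.append(
            '%s{namespace="%s",name="%s",phase="%s"} %d'
            % (metric, meta.get("namespace", ""), meta.get("name", ""), phase, value)
        )
    return lines


if __name__ == "__main__":
    # Hypothetical resource: health is green while the phase is stuck.
    sample = {
        "metadata": {"namespace": "elastic", "name": "logs"},
        "status": {"health": "green", "phase": "ApplyingChanges"},
    }
    print("\n".join(phase_metric_lines(sample)))
```

With something like this being scraped, an alerting rule could match the sample labelled `phase="ApplyingChanges"` being equal to 1 and use a `for` duration of 15 minutes. I would still prefer an existing exported metric over maintaining a custom one.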
Does anyone know where I can get this ApplyingChanges status from in Prometheus?
Thanks.
johnswarbrick-napier changed the title on Jan 14, 2025
From: How to monitor for Elasticsearch cluster status in phase "Applying Changes" using Prometheus?
To: How to monitor for Elasticsearch cluster status in phase "ApplyingChanges" using Prometheus?