Log results of cluster health check #1203

danielmitterdorfer · 2021-03-09T12:33:45Z

When the cluster health check fails, e.g. due to an invalid setup where
we check for a green cluster on a single node but indices have been
configured with replicas, the benchmark appears to hang indefinitely.
The reason is that the cluster health check is retried by default but
the logs give no indication of this.

With this commit we log the result of the cluster health check. We also
raise the log level of retry log messages so users can analyze the
behavior by inspecting logs.

Closes #1150
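
To illustrate the first part of the change, here is a minimal sketch of logging the outcome of a health check. It is not the code from this commit: the function name, the logger name, and the success rule (status at least as good as expected, and no relocating shards) are assumptions made for the example. Only the `status` and `relocating_shards` fields of the cluster health response and the shape of the log message are taken from the real API and from the log output.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("health_sketch")

# Cluster health states ordered from worst to best.
STATUS_RANK = {"red": 0, "yellow": 1, "green": 2}


def evaluate_cluster_health(response, expected_status="green"):
    """Log the outcome of a cluster health response and return whether it succeeded.

    Assumption: success means the actual status is at least as good as the
    expected one and no shards are relocating.
    """
    actual_status = response["status"]
    relocating_shards = response["relocating_shards"]
    success = (
        STATUS_RANK[actual_status] >= STATUS_RANK[expected_status]
        and relocating_shards == 0
    )
    logger.info(
        "cluster-health: expected status=[%s], actual status=[%s], "
        "relocating shards=[%d], success=[%s].",
        expected_status, actual_status, relocating_shards, success,
    )
    return success
```

Logging the expected and actual status side by side is what makes a misconfigured setup (for example a yellow cluster when green is expected) visible in the logs.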

danielmitterdorfer · 2021-03-09T12:34:36Z

Here are some examples of the behavior from my local testing:

In the success case we see in the logs:

PID:10002 esrally.driver.runner INFO cluster-health: expected status=[green], actual status=[green], relocating shards=[0], success=[True].

When the cluster health status is not reached, we get:

PID:11154 esrally.driver.runner INFO [cluster-health] has timed out. Retrying in [0.50] seconds.

and additionally, the Python client issues a warning right before the log line above:

PID:11154 elasticsearch WARNING GET http://127.0.0.1:39200/_cluster/health/logs-*?wait_for_status=green&wait_for_no_relocating_shards=true [status:408 request:30.006s]
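
The retry behavior behind the log line above can be sketched roughly as follows. This is an illustration under assumptions, not Rally's actual retry implementation: `run_with_retries`, its parameters, and the use of the built-in `TimeoutError` to signal a timeout are all hypothetical. Only the format of the log message is taken from the output above.

```python
import logging
import time

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("retry_sketch")


def run_with_retries(name, operation, max_attempts=3, wait_seconds=0.5, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    Assumption: the operation raises TimeoutError when it times out.
    Each retry is logged at INFO level so it shows up in the default logs.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == max_attempts:
                # No attempts left: surface the timeout to the caller.
                raise
            logger.info(
                "[%s] has timed out. Retrying in [%.2f] seconds.",
                name, wait_seconds,
            )
            sleep(wait_seconds)
```

The `sleep` parameter is injectable only so that the sketch can be exercised without actually waiting.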

DJRickyB

This is great, thank you. Upping the log level for retries is a good idea.

danielmitterdorfer · 2021-03-09T14:35:06Z

Thanks for the review!

danielmitterdorfer added the enhancement (Improves the status quo) and :Usability (Makes Rally easier to use) labels Mar 9, 2021

danielmitterdorfer added this to the 2.1.0 milestone Mar 9, 2021

danielmitterdorfer requested a review from DJRickyB March 9, 2021 12:33

danielmitterdorfer self-assigned this Mar 9, 2021

DJRickyB approved these changes Mar 9, 2021

danielmitterdorfer merged commit 5d93d2b into elastic:master Mar 9, 2021

danielmitterdorfer deleted the log-cluster-health branch March 9, 2021 14:35
