Prometheus quantile metrics NaN #4254
Whoops, the last step should be: Also, in 1.2.0 I'm not seeing
The actual values show up there for ~10 seconds and then switch to NaN. My config has:
This appears to be the normal behaviour for summaries in Prometheus: the sum and count live forever, but the quantiles expire and afterwards only contain NaN. The default MaxAge in the Prometheus client library is 10m, but for reasons not clear to me it's 10s in the go-metrics library used by Consul (which wraps the Prometheus client library). This means you'd better use a scrape interval of 10s or less if you want to be able to capture quantile timings.
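As a minimal sketch of the mechanics (using the Prometheus Go client directly rather than Consul's actual go-metrics wiring; the metric name is made up, and the 10s MaxAge mirrors the go-metrics default), the following program shows the quantiles decaying to NaN once the observation window rotates out, while _sum and _count keep their totals:

package main

import (
	"fmt"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	dto "github.com/prometheus/client_model/go"
)

func main() {
	s := prometheus.NewSummary(prometheus.SummaryOpts{
		Name:       "demo_rpc_duration_seconds", // hypothetical metric name
		Help:       "Illustrates quantile decay in summaries.",
		Objectives: map[float64]float64{0.5: 0.05, 0.99: 0.001},
		MaxAge:     10 * time.Second, // go-metrics default; the upstream client default is 10m
	})

	s.Observe(0.042)

	dump := func(label string) {
		m := &dto.Metric{}
		_ = s.Write(m) // snapshot the summary's current state
		for _, q := range m.Summary.Quantile {
			fmt.Printf("%s: q%g = %g\n", label, q.GetQuantile(), q.GetValue())
		}
	}

	dump("fresh")                // quantiles reflect the observation
	time.Sleep(11 * time.Second) // let the 10s window rotate out
	dump("expired")              // quantiles now print NaN; _sum/_count are unchanged
}

If Prometheus scrapes less often than every 10s, every scrape can land in the "expired" state, which is exactly the NaN pattern reported above.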
Closing this issue out as it's expected behavior.
But that's not a solution: there will still be "holes" in the data, and most of the requests Prometheus performs will be useless. To get a meaningful p99 there should be at least 100 samples in the window, which in this case means more than 600 requests per minute per instance.
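A quick back-of-the-envelope check of that rate (a sketch; the 10s window is the go-metrics default discussed above):

package main

import "fmt"

func main() {
	const (
		windowSeconds = 10.0 // go-metrics MaxAge default
		quantile      = 0.99 // p99
	)
	// Resolving a quantile q needs at least 1/(1-q) samples in the window.
	minSamples := 1.0 / (1.0 - quantile) // 100 for p99
	perSecond := minSamples / windowSeconds
	fmt.Printf("need %.0f samples/window = %.0f req/s = %.0f req/min per instance\n",
		minSamples, perSecond, perSecond*60)
	// prints: need 100 samples/window = 10 req/s = 600 req/min per instance
}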
Overview of the Issue

Using the new /agent/metrics?format=prometheus endpoint in 1.1.0 we're seeing some of the quantile metrics reporting a value of NaN:

Reproduction Steps

curl 127.0.0.1:8500/agent/metrics?format=prometheus
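The symptom in the scrape output looks like the excerpt below (an illustrative reconstruction; the metric shown is one of Consul's summary-backed raft timings, and the _sum/_count values are made up). Note that _sum and _count keep their totals while the quantile series read NaN:

consul_raft_commitTime_sum 4.2
consul_raft_commitTime_count 1374
consul_raft_commitTime{quantile="0.5"} NaN
consul_raft_commitTime{quantile="0.9"} NaN
consul_raft_commitTime{quantile="0.99"} NaN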
Consul info for both Client and Server
Server info
agent:
check_monitors = 0
check_ttls = 0
checks = 0
services = 0
build:
prerelease =
revision = 5174058
version = 1.1.0
consul:
bootstrap = false
known_datacenters = 1
leader = false
leader_addr = 10.142.15.195:8300
server = true
raft:
applied_index = 1374721
commit_index = 1374721
fsm_pending = 0
last_contact = 8.040834ms
last_log_index = 1374721
last_log_term = 6
last_snapshot_index = 1368148
last_snapshot_term = 6
latest_configuration = [{Suffrage:Voter ID:58c87947-99ec-81b6-0f43-0acd0e801823 Address:10.142.15.195:8300} {Suffrage:Voter ID:22f22641-84b4-418a-5533-e714e29d724b Address:10.142.15.193:8300} {Suffrage:Voter ID:b5f83794-8f13-af87-2c8a-6f2c791f5a70 Address:10.142.0.45:8300}]
latest_configuration_index = 1
num_peers = 2
protocol_version = 3
protocol_version_max = 3
protocol_version_min = 0
snapshot_version_max = 1
snapshot_version_min = 0
state = Follower
term = 6
runtime:
arch = amd64
cpu_count = 1
goroutines = 87
max_procs = 1
os = linux
version = go1.10.2
serf_lan:
coordinate_resets = 0
encrypted = true
event_queue = 0
event_time = 6
failed = 0
health_score = 0
intent_queue = 0
left = 0
member_time = 32
members = 5
query_queue = 0
query_time = 1
serf_wan:
coordinate_resets = 0
encrypted = true
event_queue = 0
event_time = 1
failed = 0
health_score = 0
intent_queue = 0
left = 0
member_time = 17
members = 3
query_queue = 0
query_time = 1
Operating system and Environment details
Linux, but it seems platform-independent.
Log Fragments
I don't see any relevant logs for these metrics.