Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metrics: avoid empty api_request_latencies_* entries #692

Merged
merged 1 commit into from
May 20, 2024

Conversation

vitrvvivs
Copy link
Contributor

@vitrvvivs vitrvvivs commented May 15, 2024

Prometheus was filling up unusually fast.

$ promtool tsdb analyze /prometheus
Label names with highest cumulative label value length:
1376859 exported_endpoint
47404 __name__
37378 id
34344 mountpoint
22659 name

Highest cardinality labels:
64210 exported_endpoint
1311 __name__
519 name
357 id
282 le

Highest cardinality metric names:
787440 api_request_latencies_bucket
65620 api_request_latencies_sum
65620 api_request_latencies_count
7620 nginx_ingress_controller_request_duration_seconds_bucket
7416 etcd_request_duration_seconds_bucket

That is 1.4GB per day just from the labels' text values on that one histogram metric (probably more for the table structure overhead).

So I queried api_request_latencies_bucket and found it was creating a new label for every unique request, including 404s.

exported_endpoint="/ Copy of "
exported_endpoint="/ /evil.com/"
exported_endpoint="/ /bin/cat /etc/passwd"
exported_endpoint="/ /interact.sh/"
exported_endpoint="/ Set-Cookie:crlfinjection/.."
exported_endpoint="/ ../web-inf/web.xml"
exported_endpoint="/!.htaccess"

Cause: While the comment was correct, that prometheus.Timer doesn't create a record until it's observed, WithLabelValues does, so timer := m.RequestLatencies(metricName) was creating a bunch of new labels for every spam request the server received. https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#HistogramVec.GetMetricWithLabelValues

@pro-wh
Copy link
Collaborator

pro-wh commented May 15, 2024

what will be the result of this? pre-create a histogram at the beginning of a request instead of at the end?

@vitrvvivs
Copy link
Contributor Author

vitrvvivs commented May 15, 2024

Instead of creating 14 (12 buckets + sum + count) metrics per unique 404 per day, this should put them all under endpoint="ignored" the same way the api_requests metric does it.

No, the problem is that it was accidentally creating a histogram at the start of every request, before it had sanitized the name.

Copy link
Collaborator

@pro-wh pro-wh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh ok cool. and now we'll also get an "ignored" section for latencies too

@vitrvvivs vitrvvivs force-pushed the vitrvvivs/request-metrics-insane-cardinality branch 3 times, most recently from 824e751 to 62c72e7 Compare May 17, 2024 18:33
@pro-wh
Copy link
Collaborator

pro-wh commented May 17, 2024

woa did tests flake out?

--- a/tmp/nexus-e2e-expected-edenfast/db__contract_gas_use.csv
+++ b/home/runner/work/nexus/nexus/tests/e2e_regression/edenfast/actual/db__contract_gas_use.csv
@@ -38,6 +38,7 @@ emerald,oasis1qqjkthcmc78v606vklceh23mzhrylser4c8rs0kr,,
 emerald,oasis1qqjv74rz8e6v9q6yn7y76zvf0yv5299ejsaxnjqu,549264,
 emerald,oasis1qqkcynmpm5tj9thyfq0fg5k2slvdap9teu02n77d,,
 emerald,oasis1qqktf2lksvculs4xujqfrpfph6guqllydvucy6z8,35614,
+emerald,oasis1qqkyg72kffgjyxtcdv0qqrg029c0jtrd9ssz9p4m,,
 emerald,oasis1qqnu9prrj5nddz3ccpnm5vkq92j3cncy5vdasvsm,,
 emerald,oasis1qqqr6r25urm0y8dy5nj3sdwghv2q8h05r523cc4a,,
 emerald,oasis1qqrrud0dzcdhxfxcj4nj0ag4ydr7xdxxwqx2epy2,,
@@ -84,6 +85,7 @@ emerald,oasis1qzfekqu9xc6srvhm2gg7l37qhwjnwdm605txmcfr,146228,
 emerald,oasis1qzftgnpq09kj3745p4pdvdekgfjz8k24qg006rgf,,
 emerald,oasis1qzga6l8qt7fqj6yycza2yknur45yvzseyy7lqk5w,,
 emerald,oasis1qzj2hwfs3yjm20jz5h70yk3etsst0k8s3cnl6l08,0,
+emerald,oasis1qzjjq5zutzc047lzm5yxvnjp348zacxmvckzwqj0,,
 emerald,oasis1qzkrq8jrrt96fzxq8j0z88683e4vhf4a5cy3qrc5,,
 emerald,oasis1qzl7w57gags5wlmhhxe22h5tqx7ggw8jfq0ryr93,464617,
 emerald,oasis1qzmdhwdwfl4ycwckf4dqj4705h7kf4uj0ypz6k2m,,
diff --git a/tmp/nexus-e2e-expected-edenfast/db__evm_contracts.csv b/home/runner/work/nexus/nexus/tests/e2e_regression/edenfast/actual/db__evm_contracts.csv
index d75b513..2eee6ce 100644
--- a/tmp/nexus-e2e-expected-edenfast/db__evm_contracts.csv
+++ b/home/runner/work/nexus/nexus/tests/e2e_regression/edenfast/actual/db__evm_contracts.csv
@@ -38,6 +38,7 @@ emerald,oasis1qqjkthcmc78v606vklceh23mzhrylser4c8rs0kr,,6fc0ddfa5f061ab14f52ef63
 emerald,oasis1qqjv74rz8e6v9q6yn7y76zvf0yv5299ejsaxnjqu,,
 emerald,oasis1qqkcynmpm5tj9thyfq0fg5k2slvdap9teu02n77d,,9bf4ffba7a8bd1efeeaef57b71e183ad
 emerald,oasis1qqktf2lksvculs4xujqfrpfph6guqllydvucy6z8,,
+emerald,oasis1qqkyg72kffgjyxtcdv0qqrg029c0jtrd9ssz9p4m,,08c7610a303ab55ebe5a2fa0dc82b004
 emerald,oasis1qqnu9prrj5nddz3ccpnm5vkq92j3cncy5vdasvsm,,4873684c5e22769e0d76327ee3c2de2c
 emerald,oasis1qqqr6r25urm0y8dy5nj3sdwghv2q8h05r523cc4a,,448a1263e2f6d68c49a8ba3f0e104821
 emerald,oasis1qqrrud0dzcdhxfxcj4nj0ag4ydr7xdxxwqx2epy2,,184e087e271e4b969c501ba3750b24ff
@@ -84,6 +85,7 @@ emerald,oasis1qzfekqu9xc6srvhm2gg7l37qhwjnwdm605txmcfr,,
 emerald,oasis1qzftgnpq09kj3745p4pdvdekgfjz8k24qg006rgf,,672586d52e0b7f8745ca375d6881069c
 emerald,oasis1qzga6l8qt7fqj6yycza2yknur45yvzseyy7lqk5w,,1bbfce8db3733e2f0f2ce33de3298c5b
 emerald,oasis1qzj2hwfs3yjm20jz5h70yk3etsst0k8s3cnl6l08,,f15bcafe906358e49c4cbb2247bca0d9
+emerald,oasis1qzjjq5zutzc047lzm5yxvnjp348zacxmvckzwqj0,,c78ab33fc0c0047e7c894e14252f8305
 emerald,oasis1qzkrq8jrrt96fzxq8j0z88683e4vhf4a5cy3qrc5,,3402c18ed97fdae16f23a6f00f1e1ae6
 emerald,oasis1qzl7w57gags5wlmhhxe22h5tqx7ggw8jfq0ryr93,,
 emerald,oasis1qzmdhwdwfl4ycwckf4dqj4705h7kf4uj0ypz6k2m,,bd90d0504f98aa7ee773a198cefaf3dc

@vitrvvivs vitrvvivs force-pushed the vitrvvivs/request-metrics-insane-cardinality branch 2 times, most recently from e1b1b8f to 67c3ec9 Compare May 20, 2024 16:15
@vitrvvivs vitrvvivs force-pushed the vitrvvivs/request-metrics-insane-cardinality branch from 67c3ec9 to 2a76985 Compare May 20, 2024 22:14
@vitrvvivs vitrvvivs merged commit 6b402ac into main May 20, 2024
16 checks passed
@vitrvvivs vitrvvivs deleted the vitrvvivs/request-metrics-insane-cardinality branch May 20, 2024 22:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants