Prometheus metrics cardinality for the /decisions endpoint #446
Comments
Yeah, that makes total sense - PRs welcomed!
OK :) I'm happy to attempt the implementation, but I would like to discuss the expected behaviour a bit beforehand. I see at least a few ways to approach this, with more or less flexible solutions.
be enough?
Yeah, absolutely! I also think we can default to true here.
@aeneasr What about truncating query strings? I'm seeing high cardinality of metrics due to query strings.
Describe the bug
More than a bug, I would consider this potentially unwanted behaviour.
My use case is to expose the request metrics by `endpoint` and `status code` for a high-level understanding of how the service is performing/accessed, i.e. the number of `200s`, `403s`, `500s`, and so on.
Unfortunately, the `ory_oathkeeper_request_*` metrics contain a very high cardinality label (`request=`) for the `/decisions` endpoint. This is due to the fact that, by design, this endpoint is extremely flexible and dynamic, and the URI includes all the unique parts of the original request.
When oathkeeper is used in a multitenant environment with multiple services, the cardinality of the resulting metrics is almost unmanageable.
Reproducing the bug
Run a recent version of oathkeeper, i.e. > `v0.38.1`
Make sure the service is listening on the Prometheus port:
Use curl to send some requests to the `/decisions` endpoint, i.e.
Curl the `/metrics` endpoint:
You'll get a unique metric for each combination of `${service}`, `${uri}`
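To make the growth concrete, here is a rough sketch that counts how many distinct values the `request` label would take for some made-up traffic. The path layout `/decisions/<service>/users/<id>`, the service names, and the helper `distinctLabels` are assumptions for illustration only; they are not taken from oathkeeper.

```go
package main

import "fmt"

// distinctLabels counts how many different values a label takes across the
// observed requests. Each distinct value ends up as its own time series.
func distinctLabels(paths []string) int {
	seen := make(map[string]struct{})
	for _, p := range paths {
		seen[p] = struct{}{}
	}
	return len(seen)
}

func main() {
	// Hypothetical traffic: 3 services, 100 different resources each.
	services := []string{"service-a", "service-b", "service-c"}
	var paths []string
	for _, s := range services {
		for id := 1; id <= 100; id++ {
			paths = append(paths, fmt.Sprintf("/decisions/%s/users/%d", s, id))
		}
	}
	fmt.Println(distinctLabels(paths)) // 300
}
```

With the full path in the label, the count grows with the number of services multiplied by the number of distinct URIs, and that is before multiplying by status codes or other labels.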
Expected behavior
It would be nice to have the possibility not to include the `uri` part in the `request` label to better control the metrics' cardinality. Beyond a certain number of metrics, this may even become an availability problem for the `/metrics` endpoint itself, as the labels are potentially unbounded.
I badly hacked the code to demonstrate via a code example what would be the expected behaviour, and here it is:
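For readers following along, a minimal sketch of the idea (not the patch referred to above, and not oathkeeper's actual code) could look like the following. The function name `requestLabel` and the `hideURI` flag are hypothetical, and dropping the query string unconditionally is an assumption based on the query-string question raised in the comments.

```go
package main

import (
	"fmt"
	"strings"
)

// decisionsPrefix is the path prefix of the decision endpoint discussed here.
const decisionsPrefix = "/decisions"

// requestLabel returns the value to use for the `request` label.
// hideURI is a hypothetical option: when true, everything after
// /decisions is dropped so the label has a single bounded value.
func requestLabel(rawPath string, hideURI bool) string {
	// Drop the query string, which is another source of unbounded values.
	if i := strings.IndexByte(rawPath, '?'); i >= 0 {
		rawPath = rawPath[:i]
	}
	// Collapse /decisions and anything below it, but not e.g. /decisionsfoo.
	if hideURI && (rawPath == decisionsPrefix || strings.HasPrefix(rawPath, decisionsPrefix+"/")) {
		return decisionsPrefix
	}
	return rawPath
}

func main() {
	for _, p := range []string{
		"/decisions/service-a/users/1?token=abc",
		"/decisions/service-b/orders/42",
		"/health/alive",
	} {
		fmt.Printf("%s -> %s\n", p, requestLabel(p, true))
	}
}
```

With `hideURI` set, every path under `/decisions` maps to the same label value, so the number of series depends only on the remaining labels such as the status code. Other paths are left unchanged apart from the query string.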
Environment