slow query log #982
+1 for the use case. Should an overview of slow queries be made available to our customers? I think that would be good, so that eventually they can self-serve. I'd rather rely on our Jaeger integration since it solves much of this already, and aligning with that project long term has other benefits (becoming more familiar with Jaeger tracing is strategically useful, Jaeger will improve and will gain new ways to query/analyze the data, etc.). I see 3 options:

A) Querying the Jaeger UI seems to solve the most urgent needs, though we don't want to expose this to our customers, and there are some bugs and limitations, some of which are non-trivial: jaegertracing/jaeger#166, jaegertracing/jaeger#690, jaegertracing/jaeger#892.
B) By routing all traces through a bus (e.g. their new Kafka support, see jaegertracing/jaeger#929) we'll be able to write our own consumers, allowing us to do our own processing, serve up a slow query overview aggregated across all instances, etc. (see the sketch below).
C) Custom queries directly on the Jaeger Cassandra database.
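A minimal sketch of what an option B consumer could look like, assuming Jaeger is configured to publish spans to a Kafka topic. The topic name `jaeger-spans`, the broker address, the Shopify/sarama client, the 5-second threshold, and the simplified JSON span shape are all illustrative assumptions; the real payload on the topic is Jaeger's own span model and would need the corresponding decoder.

```go
package main

import (
	"encoding/json"
	"log"
	"time"

	"github.com/Shopify/sarama"
)

// span is a hypothetical, simplified view of a Jaeger span; the actual
// Kafka payload is Jaeger's span model, not this JSON shape.
type span struct {
	OperationName string            `json:"operationName"`
	DurationMicro int64             `json:"duration"` // microseconds
	Tags          map[string]string `json:"tags"`
}

func main() {
	slowThreshold := 5 * time.Second // what counts as "slow" is an assumption

	consumer, err := sarama.NewConsumer([]string{"kafka:9092"}, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer consumer.Close()

	// single-partition consumer, kept simple for illustration
	pc, err := consumer.ConsumePartition("jaeger-spans", 0, sarama.OffsetNewest)
	if err != nil {
		log.Fatal(err)
	}
	defer pc.Close()

	for msg := range pc.Messages() {
		var s span
		if err := json.Unmarshal(msg.Value, &s); err != nil {
			continue // skip spans we can't decode
		}
		d := time.Duration(s.DurationMicro) * time.Microsecond
		if d >= slowThreshold {
			log.Printf("slow query: op=%s duration=%s tags=%v", s.OperationName, d, s.Tags)
		}
	}
}
```

Since the consumer sees spans from every instance, the overview it builds is naturally aggregated across the cluster, unlike a per-instance log.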
Looks like B would be easy to implement, per the latest comments in the linked ticket.
There are two objectives here. I am OK with Jaeger, but it seems like a much larger project to be able to expose them to users. We need a more immediate solution.
But then it's per mt instance, not across the cluster.
The Cortex guys told us we need to sample Jaeger traces more aggressively, meaning that until jaegertracing/jaeger#425 is implemented, we may randomly discard precisely those spans corresponding to slow queries.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
We often have customers overwhelming their instances with huge queries.
When this happens it is often difficult to track down what the queries are.
To make this easier, we would define a "slow query" limit and keep a log of all queries that exceed it.
We should then add an API endpoint to get the list of slow queries.
Each log record should keep
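For the more immediate, in-process option, a minimal sketch of what the proposed slow query log could look like. The threshold, the record fields, and the `/debug/slow-queries` path are assumptions; the issue does not spell them out.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"sync"
	"time"
)

// SlowQueryRecord is a hypothetical record layout; the issue does not
// specify which fields each record should keep.
type SlowQueryRecord struct {
	Query    string    `json:"query"`
	Start    time.Time `json:"start"`
	Duration string    `json:"duration"`
}

// SlowQueryLog keeps the most recent slow queries in a bounded, in-memory list.
type SlowQueryLog struct {
	mu        sync.Mutex
	threshold time.Duration
	max       int
	records   []SlowQueryRecord
}

func NewSlowQueryLog(threshold time.Duration, max int) *SlowQueryLog {
	return &SlowQueryLog{threshold: threshold, max: max}
}

// Observe records the query if it exceeded the slow query limit.
func (l *SlowQueryLog) Observe(query string, start time.Time, took time.Duration) {
	if took < l.threshold {
		return
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	l.records = append(l.records, SlowQueryRecord{Query: query, Start: start, Duration: took.String()})
	if len(l.records) > l.max {
		l.records = l.records[len(l.records)-l.max:] // drop the oldest entries
	}
}

// ServeHTTP returns the captured slow queries as JSON.
func (l *SlowQueryLog) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	l.mu.Lock()
	defer l.mu.Unlock()
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(l.records)
}

func main() {
	slowLog := NewSlowQueryLog(5*time.Second, 100)

	// Wherever queries are executed, time them and feed the log.
	start := time.Now()
	// ... run the query ...
	slowLog.Observe("some_target_expression", start, time.Since(start))

	// Expose the list via an API endpoint (the path is an assumption).
	http.Handle("/debug/slow-queries", slowLog)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

As noted above, this would still be per-instance; aggregating the lists across the cluster would need either a shared sink or the Jaeger/Kafka route from option B.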