Remote read API not performant in v2.13 #9764
Comments
I saw the fixes for the remote read API not honoring hints (ref), but I hit this performance issue during an instant query, so it is a different issue from the one the hinting fixes address.
The first thing that comes to my mind is that remote read supports two response types (specs):
Fetching samples is much slower than fetching the encoded chunks. Could you make sure that Thanos requests
Could you also share the full trace
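One way to check which of the two response types a remote read call actually returned is to look at the response's `Content-Type` header. The sketch below assumes the header values used by the Prometheus remote read implementation (`application/x-streamed-protobuf` for streamed chunks, `application/x-protobuf` for samples); the function name `remote_read_response_type` is hypothetical and not part of Thanos, Mimir, or Prometheus.

```python
# Minimal sketch: classify a remote read response by its Content-Type header.
# Assumption: the server follows the Prometheus remote read conventions, where
# streamed chunk responses use "application/x-streamed-protobuf" and sampled
# responses use "application/x-protobuf".

STREAMED_XOR_CHUNKS = "STREAMED_XOR_CHUNKS"
SAMPLES = "SAMPLES"
UNKNOWN = "UNKNOWN"


def remote_read_response_type(content_type: str) -> str:
    """Return the remote read response type implied by a Content-Type value."""
    # Drop parameters such as "; proto=prometheus.ChunkedReadResponse".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/x-streamed-protobuf":
        return STREAMED_XOR_CHUNKS
    if media_type == "application/x-protobuf":
        return SAMPLES
    return UNKNOWN
```

If a capture of the traffic between the client and the remote read endpoint shows the samples type, the request most likely did not ask for streamed chunks.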
@pracucci Regarding encoded chunks, I confirmed that encoded chunks are being used as the response type; support for them was introduced in Thanos a few years back (reference). Regarding the full trace, I am still figuring out how to export it as full JSON. Also, we are issuing a federated query for 100s of tenants, which is far slower with remote read than with the range query APIs.
Trace-4f3442-2024-11-07 15_12_26.json Attaching a trace of an instant query that took 20+ seconds.
Thanks. I tried to load it in the Jaeger UI but it doesn't work (apparently it's an invalid format for Jaeger). What format is the trace? Which application did you use to export it? Sorry for this ping-pong, but it would be great if you could give me a trace that loads in the Jaeger UI. To test it in Jaeger you can run it with:
Then upload the
@pracucci I downloaded it from the Grafana UI. Can you try visualizing it in Grafana?
@rishabhkumar92 - I was unable to get Trace-4f3442-2024-11-07.15_12_26.json to load in Grafana Cloud; it fails with a parse error. Could you try exporting for Jaeger and/or send a trace that will upload into Grafana Cloud as an alternative?
@mattsimonsen I was able to load the JSON in the Zipkin UI to visualize the trace. Unfortunately, we don't have a way to download a trace that is compatible with Jaeger. https://github.com/openzipkin/zipkin?tab=readme-ov-file
Sorry for the late reply. In the trace we can see that the most nested item with a high latency is However,
Since you mentioned that range queries fetching the same raw series are fast, I would guess it's not the ingester being slow, but the querier. The remote read API is a streaming API, which means that everything is implemented in a streaming way. The querier may be slow because its CPU is saturated, or because the client that sent the API request is slow reading the response.
As you theorised, one reason why the querier may be slow is that remote read requests are not sharded. A single remote read request is executed single-threaded in the querier, so it doesn't scale to multiple CPU cores. Remote read is not a common use case in Mimir, so we haven't invested in all the performance optimizations we've done for instant and range queries.
Another theory is that the client is slow reading. In this case, seeing a high latency on the remote read API endpoint is just a side effect of a slow client. As a test to exclude this option, you could try to run the same query using
I hope this gives you some insights to investigate further.
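As a rough illustration of the slow-client theory, the sketch below drains a streamed response as fast as possible and records how long the first chunk and the whole body took. It is only an assumption about how one might run such a test, not the tool suggested in the comment above: `drain` and `DrainStats` are hypothetical names, and the stream can be any iterable of byte chunks, such as the body iterator of an HTTP client.

```python
# Minimal sketch: time how long it takes to drain a streamed response body.
# Assumption: `stream` yields the body as byte chunks and does no other work,
# so the measured time reflects the server and the network, not the consumer.
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class DrainStats:
    total_bytes: int
    chunks: int
    first_chunk_seconds: Optional[float]  # None if the stream was empty
    total_seconds: float


def drain(stream: Iterable[bytes],
          clock: Callable[[], float] = time.perf_counter) -> DrainStats:
    """Read every chunk, discarding the data, and record timings."""
    start = clock()
    first_chunk_seconds = None
    total_bytes = 0
    chunks = 0
    for chunk in stream:
        if first_chunk_seconds is None:
            # Time until the server produced its first chunk.
            first_chunk_seconds = clock() - start
        total_bytes += len(chunk)
        chunks += 1
    return DrainStats(total_bytes, chunks, first_chunk_seconds, clock() - start)
```

If a client that only discards data still sees a total time close to the slow latency, the reader is unlikely to be the bottleneck; if it finishes much faster, the original client's reading speed deserves a closer look.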
Describe the bug
Hello Team,
We have been testing the Thanos querier with the Mimir remote read API and are seeing a significant performance difference between range queries and remote read. One of the first things I noticed was that queries weren't sharded, which might be contributing to the majority of
To Reproduce
Steps to reproduce the behavior:
count(services_platform_service_request_count{namespace=~".*-staging$"}) by (namespace)
Expected behavior
This query takes ~2 seconds to execute when the query range API is used, and it should take approximately the same time with the remote read API too; however, it took 15+ seconds.
Environment
Additional Context
NA