-
Notifications
You must be signed in to change notification settings - Fork 257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase default in-memory cache size or disable AST parsing #2227
Comments
Yeah, this is probably my bad here for not thinking this through entirely, but I think the mistake was to make So my suggestion would be to use the upcoming 2.2 release to change that default (with a clear mention in the changelog): I reckon there is a risk of surprising someone relying on #1878, but 1) it's "for the greater good" and 2) it's unlikely to be widely used yet and we can explicitly ping the author of that PR. That said, we can additionally improve the LRU cache size computation to check if the option is set and to increase its estimation in this case. |
Since apollographql#1878 (added in 2.1.0), the full AST of subgraph fetch operations has been store in query plans. However, this is not used by the gateway by default, while it noticeably increases the size of query plans, leading to either a less efficient query plan cache (if its size is limited) or increased memory consumption/pressue (if the cache is given plenty of size). This patch disable this option by default. Users that explicitely need to access the subgraph fetch ASTs can still make the trade-off of additional memory pressure by manually using the `GatewayConfig.queryPlannerConfig.exposeDocumentNodeInFetchNode` config option. Fixes apollographql#2227.
Since #1878 (added in 2.1.0), the full AST of subgraph fetch operations has been store in query plans. However, this is not used by the gateway by default, while it noticeably increases the size of query plans, leading to either a less efficient query plan cache (if its size is limited) or increased memory consumption/pressue (if the cache is given plenty of size). This patch disable this option by default. Users that explicitely need to access the subgraph fetch ASTs can still make the trade-off of additional memory pressure by manually using the `GatewayConfig.queryPlannerConfig.exposeDocumentNodeInFetchNode` config option. Fixes #2227
@pcmanus i wonder what was the reason to expose the AST of fetch nodes on first place, i would image some processing before sending requests to subgraph, but that comes with a big cost, which is the memory usage of the query plan store, also, if the plan for long term is to allow clients to define their own store implementation like redis, having to serialize and deserialize the query plan might cause a CPU bottleneck. |
That's my understanding, some sort of "processing". I could imagine a number of things that could be helped by having the request in "usable" form: conditional logging based on the request, fine-grained gateway-side authorisation, some cost-based rate-limiting, etc... Truth be told, this was a user request (see #1352 for the original ask) and I don't have a personal rational for it, but the ask felt reasonable so I saw no reason to oppose it. It definitively was a mistake to enable it by default, and that is entirely my bad for missing/underestimating that problem.
While it's true, if you do need access to the request on sent, re-parsing it every time (which you'd have to do if it's not kept around) can be rather costly CPU-wise and potentially add latency to your queries if whatever processing you do is not async, and more memory usage could well be a worthy trade-off in such cases. Again, enabling this by default was a mistake. But letting everyone pick whichever trade-off make the most sense for his specific use case, I'm ok with. |
Closing as fixed by #2230. |
With the addition of #1878, by default in the gateway, we are now storing the full AST object in memory AND the query plan document for every operation. We turned on this feature without updating the default store size:
federation/gateway-js/src/index.ts
Lines 196 to 201 in 6bfec24
Some enterprise customers are starting to see increased usage and scaling needs to now handle this change of usage which means a big jump in resources or time process cache evictions and the latter is causing some timeouts
If we are not going to increase the default cache size we might want to consider storing the AST in a separate store or turning it on with a flag so users of the parsed AST can opt in and can also be aware of the increased memory requirements
The text was updated successfully, but these errors were encountered: