Increase default in-memory cache size or disable AST parsing #2227

Closed
smyrick opened this issue Nov 1, 2022 · 4 comments · Fixed by #2218 or #2230

@smyrick
Member

smyrick commented Nov 1, 2022

With the addition of #1878, the gateway now stores, by default, the full AST object in memory AND the query plan document for every operation. We turned this feature on without updating the default store size:

// Create ~about~ a 30MiB InMemoryLRUCache. This is less than precise
// since the technique to calculate the size of a DocumentNode is
// only using JSON.stringify on the DocumentNode (and thus doesn't account
// for unicode characters, etc.), but it should do a reasonable job at
// providing a caching document store for most operations.
maxSize: Math.pow(2, 20) * (approximateQueryPlanStoreMiB || 30),

Some enterprise customers are starting to see increased memory usage and new scaling needs from this change, which means either a big jump in resources or time spent processing cache evictions, and the latter is causing some timeouts.

If we are not going to increase the default cache size, we might want to consider storing the AST in a separate store, or gating it behind a flag so that users of the parsed AST can opt in and be aware of the increased memory requirements.
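
For context, the ~30MiB default above is already tunable per gateway instance. A minimal sketch of raising it, assuming the value is exposed through the `experimental_approximateQueryPlanStoreMiB` config option (which is what feeds `approximateQueryPlanStoreMiB` in the snippet above) and using a placeholder supergraph path:

import { readFileSync } from 'fs';
import { ApolloGateway } from '@apollo/gateway';

const gateway = new ApolloGateway({
  // Placeholder supergraph file, for illustration only.
  supergraphSdl: readFileSync('./supergraph.graphql', 'utf-8'),
  // Raise the ~30MiB default so the larger entries (query plan + AST) evict less often.
  experimental_approximateQueryPlanStoreMiB: 100,
});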

@pcmanus
Contributor

pcmanus commented Nov 2, 2022

Yeah, this is probably my bad here for not thinking this through entirely, but I think the mistake was to make exposeDocumentNodeInFetchNode true by default: having those documents available is only useful to the specific users who make explicit use of them, and it's easier to ask those few users to opt in than to increase the requirements for everyone else, forcing them to opt out if necessary.

So my suggestion would be to use the upcoming 2.2 release to change that default (with a clear mention in the changelog): I reckon there is a risk of surprising someone relying on #1878, but 1) it's "for the greater good" and 2) it's unlikely to be widely used yet and we can explicitly ping the author of that PR.

That said, we can additionally improve the LRU cache size computation to check if the option is set and to increase its estimation in this case.
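
For illustration, the opt-in described here would look something like the sketch below, using the `queryPlannerConfig.exposeDocumentNodeInFetchNode` path named in the commit message further down; the supergraph path is a placeholder:

import { readFileSync } from 'fs';
import { ApolloGateway } from '@apollo/gateway';

const gateway = new ApolloGateway({
  // Placeholder supergraph file, for illustration only.
  supergraphSdl: readFileSync('./supergraph.graphql', 'utf-8'),
  queryPlannerConfig: {
    // Explicit opt-in: keep the subgraph fetch ASTs on query plan nodes,
    // accepting the extra memory the document store will need.
    exposeDocumentNodeInFetchNode: true,
  },
});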

@pcmanus pcmanus self-assigned this Nov 2, 2022
pcmanus pushed a commit to pcmanus/federation that referenced this issue Nov 2, 2022
Since apollographql#1878 (added in 2.1.0), the full AST of subgraph fetch operations
has been stored in query plans. However, this is not used by the gateway
by default, while it noticeably increases the size of query plans,
leading to either a less efficient query plan cache (if its size is
limited) or increased memory consumption/pressure (if the cache is given
plenty of size).

This patch disables this option by default. Users that explicitly
need to access the subgraph fetch ASTs can still make the trade-off
of additional memory pressure by manually using the
`GatewayConfig.queryPlannerConfig.exposeDocumentNodeInFetchNode` config
option.

Fixes apollographql#2227.
pcmanus pushed a commit that referenced this issue Nov 2, 2022
@samuelAndalon
Contributor

@pcmanus I wonder what the reason was to expose the AST of fetch nodes in the first place. I would imagine some processing before sending requests to subgraphs, but that comes with a big cost, which is the memory usage of the query plan store. Also, if the long-term plan is to allow clients to define their own store implementation, like Redis, having to serialize and deserialize the query plan might cause a CPU bottleneck.
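
To make the serialization concern concrete, here is a hypothetical sketch (none of these names are an existing gateway API) of what a remote query plan store would have to do on every write and read; with full fetch-node ASTs included, the payload being stringified and parsed gets noticeably larger:

// Hypothetical interface standing in for a Redis-like backend.
interface RemotePlanStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

async function cachePlan(store: RemotePlanStore, hash: string, queryPlan: object): Promise<void> {
  // Every cache write pays JSON.stringify on the whole plan.
  await store.set(hash, JSON.stringify(queryPlan));
}

async function loadPlan(store: RemotePlanStore, hash: string): Promise<object | null> {
  const serialized = await store.get(hash);
  // Every cache hit pays JSON.parse on the (potentially large) payload.
  return serialized === null ? null : JSON.parse(serialized);
}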

@pcmanus
Contributor

pcmanus commented Nov 2, 2022

I would imagine some processing before sending requests to subgraphs

That's my understanding, some sort of "processing". I could imagine a number of things that could be helped by having the request in "usable" form: conditional logging based on the request, fine-grained gateway-side authorisation, some cost-based rate-limiting, etc. Truth be told, this was a user request (see #1352 for the original ask) and I don't have a personal rationale for it, but the ask felt reasonable so I saw no reason to oppose it. It definitely was a mistake to enable it by default, and that is entirely my bad for missing/underestimating that problem.

but that comes with a big cost, which is the memory usage of the query plan store

While that's true, if you do need access to the request being sent, re-parsing it every time (which you'd have to do if it's not kept around) can be rather costly CPU-wise and can add latency to your queries if whatever processing you do is not async, so more memory usage could well be a worthy trade-off in such cases.
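
A small illustration of that trade-off (hypothetical function names, not a gateway hook), using conditional logging as the example processing:

import { parse } from 'graphql';
import type { DocumentNode } from 'graphql';

// Hypothetical predicate: log only a specific, expensive operation.
function shouldLog(document: DocumentNode): boolean {
  return document.definitions.some(
    (def) => def.kind === 'OperationDefinition' && def.name?.value === 'ExpensiveQuery',
  );
}

// Without the stored AST, every request pays a parse before any inspection...
function checkByReparsing(operationString: string): boolean {
  return shouldLog(parse(operationString));
}

// ...while with exposeDocumentNodeInFetchNode the pre-parsed DocumentNode is
// already at hand, trading memory in the query plan cache for that CPU cost.
function checkStoredDocument(storedDocument: DocumentNode): boolean {
  return shouldLog(storedDocument);
}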

Again, enabling this by default was a mistake. But I'm OK with letting everyone pick whichever trade-off makes the most sense for their specific use case.

@pcmanus
Contributor

pcmanus commented Nov 18, 2022

Closing as fixed by #2230.
