Query planning performance improvements #2610

pcmanus · 2023-05-31T14:00:57Z

This PR contains two relatively small (in patch size) performance improvements for query planning, where performance means the time it takes to compute query plans:

The first commit memoizes, when possible, the result of `FetchGroup.isUseless`, which was showing up near the top of profiles of some slow planning. This optimisation is arguably a minor one in that it probably only helps specific cases, but it consistently reduced query planning time by ~15% on at least one example, so it is probably worth the small additional complexity.
The second commit is arguably more impactful: in some situations with multiple keys, the planner was not discarding some options that are trivially inefficient (see the commit message and included test for details). While not incorrect per se, this sometimes multiplied the number of plans evaluated by a lot, leading to very slow query planning. This change sometimes decreases query planning time by an order of magnitude (some plans that took close to 2s now take less than 100ms).

netlify · 2023-05-31T14:01:00Z

👷 Deploy request for apollo-federation-docs pending review.

Name	Link
🔨 Latest commit	`faf6e59`

changeset-bot · 2023-05-31T14:01:00Z

🦋 Changeset detected

Latest commit: faf6e59

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 7 packages

Name	Type
@apollo/query-planner	Patch
@apollo/query-graphs	Patch
@apollo/federation-internals	Patch
@apollo/gateway	Patch
@apollo/composition	Patch
@apollo/subgraph	Patch
apollo-federation-integration-testsuite	Patch

codesandbox-ci · 2023-05-31T14:01:49Z

This pull request is automatically built and testable in CodeSandbox.

clenfest · 2023-06-08T04:14:01Z

query-planner-js/src/buildPlan.ts

@@ -810,7 +813,21 @@ class FetchGroup {
    // key for that. Having it here saves us from re-computing it more than once.
    readonly subgraphAndMergeAtKey?: string,
    private cachedCost?: number,
+    // Cache used to save unnecessary recomputation of the `isUseless` method.
+    private isKnownUseful: boolean = false,


Wouldn't it be better to cache the status either way? I.e., the type would be `boolean | undefined`, and you could just return the value if it's not undefined and run `isUseless()` otherwise. Obviously you should change the variable name if you do that.

I don't think it'd be better. First, as I mentioned in one of the comments in the patch, when a group is shown to be useless, it is removed right away, so it doesn't really matter whether that information is cached or not. But adding to that, to properly cache the status either way, we'd have to add invalidation for that other direction: that is, changes to the group's selection would have to invalidate this cache, whereas currently only changes to the inputs (which happen to be less frequent in practice) do so. So implementing the other direction would possibly make things less efficient overall.
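
To make the asymmetry concrete, here is a minimal sketch of a one-way "known useful" cache. It is not the real `FetchGroup`: the `SketchGroup` class, its string-based fields, and the rule "a group is useless when everything it selects is already in its inputs" are simplifying assumptions made only for illustration.

```typescript
// Hypothetical, simplified stand-in for a fetch group (NOT the real FetchGroup).
// Assumption: a group is "useless" when every selected field is already an input.
class SketchGroup {
  private inputs = new Set<string>();
  private selection = new Set<string>();
  // One-way cache: we only remember "known useful", never "known useless".
  private isKnownUseful = false;
  // Counts full recomputations, purely for illustration.
  uselessChecks = 0;

  addInput(field: string): void {
    this.inputs.add(field);
    // A new input can turn a useful group into a useless one, so invalidate.
    this.isKnownUseful = false;
  }

  addSelection(field: string): void {
    // Under the assumption above, adding to the selection can never make a
    // useful group useless, so the "known useful" cache stays valid.
    this.selection.add(field);
  }

  isUseless(): boolean {
    if (this.isKnownUseful) {
      return false;
    }
    this.uselessChecks++;
    const useless = Array.from(this.selection).every((f) => this.inputs.has(f));
    this.isKnownUseful = !useless;
    return useless;
  }
}

const group = new SketchGroup();
group.addInput('id');
group.addSelection('id');
group.isUseless(); // recomputed: only selects what it already has
group.addSelection('name');
group.isUseless(); // recomputed: now useful, and cached as such
group.addSelection('price');
group.isUseless(); // served from the cache, no recomputation
```

Caching the "useless" answer as well would require `addSelection` to invalidate the cache too, which is the extra invalidation the reply above refers to.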

clenfest · 2023-06-08T04:17:15Z

query-graphs-js/src/graphPath.ts

+// Given a list of just computed indirect paths and a field that we're trying to advance after those paths, this
+// method filters out any path that should not be considered.
+//
+// Currently, this handles the case where the key used at the end of the indirect path contains (at top level) the field
+// being queried. Or to make this more concrete, if we're trying to collect field `id`, and the path's last edge was using
+// key `id`, then we can ignore that path because this implies that there is a way to fetch `id` "some other way" (also see
+// the `does not evaluate plans relying on a key field to fetch that same field` test in `buildPlan` for more details).
+function filterNonCollectingPathsForField<V extends Vertex>(
+  paths: OpIndirectPaths<V>,
+  field: Field,
+): OpIndirectPaths<V> {
+  // We only handle leaf fields. Things are more complex for non-leaf fields.
+  if (!field.isLeafField()) {
+    return paths;
+  }
+
+  const filtered = paths.paths.filter((p) => {
+    const lastEdge = p.lastEdge();
+    if (!lastEdge || lastEdge.transition.kind !== 'KeyResolution') {
+      return true;
+    }
+
+    const conditions = lastEdge.conditions;
+    return !(conditions && conditions.containsTopLevelField(field));
+  });
+  return filtered.length === paths.paths.length
+    ? paths
+    : {
+      ...paths,
+      paths: filtered
+    };
+
+}
+


Can you add tests for this function?

Added some tests. Not directly for this function, because I preferred keeping it non-exported/private, but for the function that directly calls it (`advanceSimultaneousPathsWithOperation`) and for the behaviours that this method tackles.

Profiling of some slow query planning shows `FetchGroup.isUseless` as one of the hot paths. This commit caches the result of this method for a group, and only invalidates that cache when we know the result may need to be recomputed. On the planning of some queries, this is shown to provide a 15% improvement to query planning time.

When a type has multiple keys, query planning was sometimes considering an option where some key `x` was used to get field `y`, but then key `y` was used to get that same `y` field from another subgraph. This is obviously not very useful, and we know we can ignore those paths because the first part of each such path already does what we want. But considering those (useless) options, while harmless for correctness, was in some cases drastically increasing the number of plans that were evaluated, leading to long query planning times. In other words, this commit improves query planning times in some cases (and at times significantly).
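
The pruning rule described in this commit message can be sketched on a toy model. The `SketchPath` shape and the `filterPathsForLeafField` name below are hypothetical; the real code works on `OpIndirectPaths` and edge conditions, which this sketch reduces to a plain list of key field names.

```typescript
// Hypothetical, simplified model of an indirect path: a description plus the
// top-level fields of the key used by its last edge (undefined when the last
// edge is not a key resolution).
interface SketchPath {
  description: string;
  lastKeyFields?: string[];
}

// Drops any path whose last key already contains the leaf field being collected:
// if key `y` was needed to make the jump, `y` was already available before it.
function filterPathsForLeafField(paths: SketchPath[], field: string): SketchPath[] {
  const filtered = paths.filter(
    (p) => !(p.lastKeyFields && p.lastKeyFields.indexOf(field) !== -1),
  );
  // Return the original array when nothing was removed, similar in spirit to
  // the length check in the diff above.
  return filtered.length === paths.length ? paths : filtered;
}

const candidates: SketchPath[] = [
  { description: 'A -[key x]-> B', lastKeyFields: ['x'] },
  { description: 'A -[key x]-> B -[key y]-> C', lastKeyFields: ['y'] },
];
// Collecting `y`: the second path uses key `y` to fetch `y` again, so it is dropped.
const kept = filterPathsForLeafField(candidates, 'y');
console.log(kept.map((p) => p.description));
```

Each dropped path is one fewer option to combine with the options for every other field, which is why this kind of pruning can shrink the number of evaluated plans so much.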

pcmanus requested a review from a team as a code owner May 31, 2023 14:00

pcmanus self-assigned this May 31, 2023

clenfest approved these changes Jun 8, 2023

pcmanus added 2 commits June 20, 2023 10:12

pcmanus force-pushed the qp-optims branch from 324eeee to 97648ae Compare June 21, 2023 09:15

Review feedback

faf6e59

pcmanus force-pushed the qp-optims branch from 97648ae to faf6e59 Compare June 21, 2023 09:16

pcmanus merged commit 7ac8345 into apollographql:main Jun 21, 2023

github-actions bot mentioned this pull request Jun 21, 2023

release: on branch main #2634

Merged

github-actions bot mentioned this pull request Jun 30, 2023

release: on branch next #2597

Merged
