Expose sub-query DocumentNode to RemoteGraphQLDataSource#willSendRequest hook #1352

alexlopashev · 2022-01-05T13:44:53Z

Currently sub-query generated for each RemoteGraphQLDataSource needed to complete the query is passed to RemoteGraphQLDataSource as a string, which limits possible interactions with the query on #willSendRequest level, e.g. introducing custom permission checking on field/type level.

Generated DocumentNode is immediately serialized into string, see buildPlan.ts#L803, and exposing it won't cause additional performance toll.

In order to achieve this I propose:

Updating interface FetchNode to have operation field of type DocumentNode instead of string.
Change sendOperation to also accept query as DocumentNode instead of string.
Change type GraphQLDataSourceProcessOptions to have DocumentNode included:

export type GraphQLDataSourceProcessOptions<
  TContext extends Record<string, any> = Record<string, any>,
> = {
  /**
   * The request to send to the subgraph.
   */
  request: GraphQLRequestContext<TContext>['request'];
} & {
  document: GraphQLRequestContext<TContext>['document'];
} & ( ...

Finally, change GraphQLDataSource#process calls to fit new GraphQLRequestContext definition:

    const response = await service.process({
      kind: GraphQLDataSourceRequestKind.INCOMING_OPERATION,
      request: {
        query: stripIgnoredCharacters(print(source)), // updated line
        variables,
        http,
      },
      document: source, // new line
      incomingRequestContext: context.requestContext,
      context: context.requestContext.context,
    });

The text was updated successfully, but these errors were encountered:

csabakos · 2022-01-05T23:02:08Z

It's worth noting that we would also want to mutate the query (remove any fields that fail the authorization check).

pcmanus · 2022-01-10T11:12:35Z

I believe the original reason for keeping the query as a string in the query plan is to avoid having to reserialize it on every requests, as the change of point 4) above would entail. And I do think it's an "optimisation" worth preserving "in general", but of course that doesn't necessarily preclude from having different paths for different cases. That said, I'd like to dig into this a bit more first because ...

we would also want to mutate the query

I think this is a very very dangerous idea and I really don't think this is something federation should attempt to officially support.

Query plans can be somewhat complex, and the inputs of some of the queries in a plan routinely depend on the result of a previous query, so mutating the queries can extremely easily break things (and as federation evolves and we add more "smarts", this will not get better), and I can too easily see it breaking it in ways that are very hard to debug.

I could see how having the queries to services as a read-only DocumentNode could have uses for debugging, metrics collecting, etc... and I'm not strongly opposed to provide that out of the box (but for the reason above, I think that'd probably mean having both the string and document), but I'm again very much for saying that "mutating" the query is strictly unsupported.

I'll note that taking a step back, if I was trying to enforce permission/authorisation at the gateway level, I'd try to enforce it at the user-query level, before it gets passed to the gateway for query-planning/execution (the alternative being to enforce it within each services, though you can certainly mix both). But obviously you may have specific constraints I ignore and I'm happy to help dealing with those in a way that don't involve mutating the service queries :).

lennyburdette · 2022-01-10T21:06:59Z

Thanks for the detailed response @pcmanus ! I agree that mutating the subgraph request is fraught, especially when it comes to fields used in @key and @requires directives.

Taking a step back to look at all the possible Policy Enforcement Points (PEP) along the request path:

client -> network ingress -> node server -> apollo server -> query planner -> remote data source -> subgraph ingress -> subgraph server

here's my off-the-cuff evaluation of each:

client is obviously not a candidate, that would defeat the purpose.
Any operation mutation before apollo server would affect field usage reporting, which would have a negative impact in operation checks — Studio wouldn't know that redacted fields were referenced in operations and breaking change detection would fail.
- I've also thought about adding @skip(if: true) to fields instead of removing them from the operation. This might works, but any mutation at these components requires costly and redundant json and graphql serialization/deserialization work.
I've tried using an apollo server plugin to mutate operations. Not only did I have to disable the typescript checker to get around the readonly keyword, but it didn't work anyway because AS caches the operation AST.
The query planner doesn't have any functionality for this (but maybe it could?)
Any operation mutation after query planning could have undesirable and hard-to-debug effects on executing query plans (as Sylvain explains above.)
This customer mentioned that in their ideal architecture, the PEP would be in the subgraph ingress (as an Envoy plugin), because they have have clients that call subgraphs directly and don't go through the router. (I don't know how hard of a requirement this is though.)

After that exercise, it's not any clearer to me where the PEP should go 😁.

I see two options potential options for prototyping an authz solution:

Go with Sylvain's suggestion on enforcing field-level auth in the user query, but do so in a new hook that occurs after field metrics are collected and before query planning. I don't know the level of effort for creating this new hook, and whether it would live in apollo-server or @apollo/gateway.
Add field redaction to any PEP after query planning, but include special logic that avoids redacting any fields used in @requires or @key. I've worked on a redaction algorithm that supports an arbitrary predicate function. We could use the schema to build a set of Type.fieldName values that cannot be redacted because they're referenced in federation directives.

Long term, I like either option number 1, or building new functionality in the query planner for redacting fields from subgraph calls. This does not meet the requirement of enforcing authz for direct subgraph calls that skip the router, but that might just be out of scope for Apollo.

theJC · 2022-01-11T04:29:43Z

Thank you for this great analysis and feedback both Sylvain and Lenny, to help make sure we dont shoot ourselves in the foot.

I would like to +1 on the original ask for having the sub-query document AST exposed, but for read-only use only (for usage scenarios that doest need to mutate) to prevent customers having to pay the additional toll of re-parsing.

cpeacock added triage and removed triage labels Jan 26, 2022

theJC mentioned this issue May 21, 2022

feat: Expose document representation of sub-query request within GraphQLDataSourceProcessOptions so that it is available to RemoteGraphQLDataSource.process and RemoteGraphQLDataSource.willSendRequest #1878

Merged

pcmanus closed this as completed in #1878 Jul 1, 2022

pcmanus mentioned this issue Nov 2, 2022

Increase default in-memory cache size or disable AST parsing #2227

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expose sub-query DocumentNode to RemoteGraphQLDataSource#willSendRequest hook #1352

Expose sub-query DocumentNode to RemoteGraphQLDataSource#willSendRequest hook #1352

alexlopashev commented Jan 5, 2022 •

edited

Loading

csabakos commented Jan 5, 2022

pcmanus commented Jan 10, 2022

lennyburdette commented Jan 10, 2022 •

edited

Loading

theJC commented Jan 11, 2022 •

edited

Loading

Expose sub-query DocumentNode to RemoteGraphQLDataSource#willSendRequest hook #1352

Expose sub-query DocumentNode to RemoteGraphQLDataSource#willSendRequest hook #1352

Comments

alexlopashev commented Jan 5, 2022 • edited Loading

csabakos commented Jan 5, 2022

pcmanus commented Jan 10, 2022

lennyburdette commented Jan 10, 2022 • edited Loading

theJC commented Jan 11, 2022 • edited Loading

alexlopashev commented Jan 5, 2022 •

edited

Loading

lennyburdette commented Jan 10, 2022 •

edited

Loading

theJC commented Jan 11, 2022 •

edited

Loading