Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Replace "descendant unions" with fields to simplify querying #209

Closed
chillu opened this issue Feb 1, 2019 · 22 comments
Closed

RFC: Replace "descendant unions" with fields to simplify querying #209

chillu opened this issue Feb 1, 2019 · 22 comments

Comments

@chillu
Copy link
Member

chillu commented Feb 1, 2019

Overview

It's hard to sync an ActiveRecord-style inheritance tree with a type system that doesn't know about inheritance. We've tried to reconcile this by creating unions in Scaffolding object type inheritance. But the proposal was flawed, because it misunderstood how unions work in GraphQL. The code examples provided illustrate the assumption that fields shared by all types in the union can be queried without defining the type context. That would make sense, and is technically possible, but not part of the GraphQL language spec.

There are two problems with the current implementation:

  • It forces usage of fragments to avoid a large amount of field repetition (made worse by every subclass you add)
  • It forces recompilation of any frontend code with embedded GraphQL queries, whenever a new subclass gets added. Even if you don't need any fields from this subclass in your own GraphQL query.

I'm proposing that we flatten subclasses, and put them on a specialised extended field instead.

  • Pro: Doesn't require changes to GraphQL query when new types are added
  • Pro: Avoids repetition of fields
  • Con: Forces null fields on elements which are of a different ancestry
  • Con: Harder to work with since GraphQL clients now return a mix of toplevel and nested fields, when conceptually they should all be toplevel. This makes it a bit awkward to pass GraphQL results into other signatures like React prop types.

Example Before

Type system:

type SiteTree {
    id
    created
    title
    content
}

type Page {
    id
    created
    title
    content
    myField
}

type RedirectorPage {
    id
    created
    title
    content
    myField
    redirectionType
    externalUrl
}

union SiteTreeWithDescendants = SiteTree | Page | RedirectorPage

type readSiteTree {
  [SiteTreeWithDescendants]
}

Query:

query readSiteTree() {
  ...on SiteTree {
    id
    created
    title
    content
  }
  ...on Page {
    created
    title
    content
    myField
  }
  ...on RedirectorPage {
    created
    title
    content
    myField
    redirectionType
    externalUrl
  }
}

Example After

Type system:

type SiteTree {
    id
    created
    title
    content
    extends: SiteTreeDescendants
}

type Page {
    myField
}

type RedirectorPage {
    redirectionType
    externalUrl
}

union SiteTreeDescendants = Page | RedirectorPage

type readSiteTree {
  [SiteTree]
}

Query:

query readSiteTree() {
  id
  created
  title
  content
  extends {
    ... on Page {
      myField
    }
    ... on RedirectorPage {
      redirectionType
      externalUrl
    }
  }
}

Using SiteTree as an example is a bit misleading, because the query above still looks awkward. Most devs regard Page as the "pivot" in their data model, and would choose to express that in their GraphQL queries as well.

query readPage() {
  id
  created
  title
  content
  myField
  extends {
    ... on RedirectorPage {
      redirectionType
      externalUrl
    }
  }
}
@chillu
Copy link
Member Author

chillu commented Feb 1, 2019

Given that the current GraphQL 3.x implementation has been broken until two days ago, I'm not sure how many people are already relying on <type>WithDescendants. We're hitting this issue with silverstripe/silverstripe-versioned-admin#98 now, and need to find a solution for the current 4.3.x release line there. At the moment, I'm proposing to revert the "fix" for this regression which re-enabled <type>WithDescendants unions in scaffolds. That reintroduces the issue, but fixes the versioned admin.

We could then introduce this new signature in GraphQL 4.x. We'd need to change existing core queries (in SilverStripe 4.4 or 4.5), and everyone else needs to change their queries as well. That's annoying, but the reality of a young module in a young ecosystem.

@chillu
Copy link
Member Author

chillu commented Feb 17, 2019

Option A: New type fields

readPages {
  Title
  Content
  typeMyPage {
    MyField
  }
  typeMyExtendedPage {
    MyExtendedField
  }
}

Option B1: __extends pseudo-union

Same as original RFC post, but using underscores

readPages {
  Title
  Content
  __extends {
    MyPage {
      MyField
    }
    MyExtendedPage {
      MyExtendedField
    }
  }
}

Option B2: __extends real union

Same as original RFC post, but using underscores

readPages {
  Title
  Content
  __extends {
    ... on MyPage {
      MyField
    }
    ... on MyExtendedPage {
      MyExtendedField
    }
  }
}

Option C: Flattened properties

Just for sake of completeness, not advertising for that option. Note that ContentFul’s GraphQL API handles it this way. There are no “sub types” in their content model to begin with, so you don’t have that issue in the GraphQL typing. But it means when you create a type such as a “content block” in ContentFul, you either have different blocks as entirely different toplevel types (with potentially overlapping fields), or you end up with the same “null field depending on implied type” issue that we’ve got here due to Active Record style inheritance.

readPages {
  Title
  Content
  MyField
  MyExtendedField
}

@chillu
Copy link
Member Author

chillu commented Feb 18, 2019

I've done a bit of work on Option C, and it's a bit of a mess. We can flatten all the fields into a "base type", but then we still need to create other types. For example, a readPages query would only generate a Page type (which contains fields for all its subclasses). But when you have a has_one: RedirectorPage, should it create a RedirectorPage type, or use the existing Page type? Either option could be confusing to API consumers.

Option D: Interface shared between all types on union

This option is pretty close to what we already have, with one distinction: All types implement the interface of the base type. In our examples, that's Page. The union still works based on types (since unions don't support interfaces). But you don't need to know the full list of types just to query fields on the base type - which would solve our current problem. It's still a bit weird to query some fields on an interface, and other fields on the type.

Aside: This might make the lives of frontend devs harder when using fragments or unions on interfaces. We're already doing that by using unions though, so interfaces+unions doesn't make it any worse. And it sounds like Apollo gracefully handles that through heuristics, it can just be more efficient with more context on those schema relationships.

Type system

interface PageInterface {
    id
    created
    title
    content
}

type Page implements PageInterface {
    id
    created
    title
    content
}

type MyPage implements PageInterface {
    id
    created
    title
    content
    myField
}

type MyExtendedPage implements PageInterface {
    id
    created
    title
    content
    myField
    myExtendedField
}

union PageWithDescendants = Page | MyPage | MyExtendedPage

type readPage {
  [PageWithDescendants]
}

Query

query readPage {
  readPage {
    ...on PageInterface {
      ID
      Content
    }
    ...on MyPage {
      MyField
    }
    ...on MyExtendedPage {
      MyExtendedField
    }
  }
}

I've created a PoC with this on the webonyx implementation, it works: https://gist.github.com/chillu/0afd4e22e3c8b58f1d848288fa07892a

@chillu
Copy link
Member Author

chillu commented Feb 19, 2019

Further complications with Option C. Take the following hypothetical scaffolding config:

scaffolding:
  types:
    Page:
      fields: ['Content', 'MyExtendedField']
      operations:
        read: true
      nestedQueries:
        MyPageRelation: true
    MyPage:
      description: MyPage description
      fields: '*'
    MyExtendedPage:
      fields: []
  • Field selection: The flattened type for Page now has conflicting instructions: Should it include only the Content field, and none of the fields on subclasses? Does MyPage.fields: '*' indicate that the flattened Page type should make an exception for fields on this particular subclass? Should MyExtendedPage.fields: [] override Page.fields = [Content, MyExtendedField]?
  • Inferred Type creation: Assuming Page.MyPageRelation refers to the MyPage subclass, should that now create a new type MyPage, or use the existing Page type? If it's creating a new type MyPage, how does it tell which fields to include from Page for flattened fields?

We could disallow defining fields on parent classes (Page.fields = [MyExtendedField]). But it's still a bit weird to list "scaffolding types" which then don't actually create types. Which means settings like MyPage.description become meaningless as well. Or we could disallow defining subclass types if type flattening is enabled, unless these subclass types are exposed with their own operations. That needs to be the case even with type flattening for create and update operations though. You can use createPage() to create a PageInputType, but need createMyPage for the subclass.

I'm leaning towards Option D for that reason. It's going to be a bit too easy to type ...on Page rather than ...on PageInterface, which results in broken queries when you add a new page subclass. Particularly since that query will still work ... until it doesn't.

A middle ground could be that we go for Option D, but implicitly support Option C by allowing listing of subclass fields on the parent type, alongside with a "don't create descendant types" setting. It makes "is valid field" calculations a bit harder (you need to loop through all subclasses), but it's possible. But given we'd still have the "Inferred Type creation" issue above, I'm a bit cautious if it's even possible to do this consistently.

@ScopeyNZ
Copy link
Contributor

I like option D. The implicit support for option C just seems like complex overhead for some hard to define behaviour. It just feels like a trap. I don't think that's worth it...

Anyone who would prefer simpler queries can define their own query resolvers still. Is that a potential option E that we implement multiple options with an adapter interface and then allow schemas (or individual scaffolding) to indicate which type they want? Maybe that's going down the crazy path.

chillu added a commit to open-sausages/silverstripe-graphql that referenced this issue Feb 19, 2019
Approximating "Option C" in silverstripe#209.
Probably will be abandoned for an alternative "Option D" approach.
chillu added a commit to open-sausages/silverstripe-versioned that referenced this issue Feb 21, 2019
chillu added a commit to open-sausages/silverstripe-graphql that referenced this issue Feb 21, 2019
This enables more stable query creation with union types.
Instead of referring to specific types (e.g. "... on Page"),
you can use the relevant interface for that type (e.g. "...on PageInterface").
Since all types in the union share those interfaces, this results in less duplication
of fields in the query. It also makes the query resilient against adding new types
to the union (or subclasses to the DataObject), which would previously result in NULL query results.
@chillu
Copy link
Member Author

chillu commented Feb 21, 2019

Alright, PoC for Option D in https://github.com/open-sausages/silverstripe-graphql/tree/pulls/3/scaffold-interface-types and https://github.com/open-sausages/silverstripe-versioned/tree/pulls/1/scaffold-type-interfaces.

A few implementation notes:

  • Ended up creating interface types for each DataObject subclass (not just the root class). Which is more schema bloat, but has the advantage that we can give consistent advice to devs: Always use interfaces in your scaffolded DataObject queries, rather than only for the base class.
  • We're increasing the schema size (and generation time) because the existing type fields are now also placed in one or more interface types. I think we can cancel out the effect by not creating peripheral operations by default
  • Each DataObject interface type contains all fields from descendant classes, rather than just the fields defined on the specific class. Since those fields can include relations and fields from DataExtension, it's quite hard to define what fields belong to a class (I didn't find much help in ClassInfo). Suggestions welcome, it would reduce the overall schema size.
  • I'm still trying to get each Versions relation to have the same union type, which would cut down the schema size. See the current $excludeFields workaround in InterfaceScaffolder
  • This is likely a backwards compatible change which we can introduce in the 3.x branch. That's assuming that we treat the current behaviour as a regression (only fields on base class, no descendant types), and the previous behaviour as the default (descendant types). We'll need to forward port this to the GraphQL schema caching PR which won't be fun at all.
  • Versioned required changes in order to fit this (we can't have the Versions relation on an interface with different return types). Which had the nice side effect of reusing the new interfaces for versions as well. The resulting type system roughly looks like below.
interface PageInterface {
    Content: String
    Versions: [PageVersionWithDescendants]
}

interface MyPageInterface {
    Content: String
    MyField: String
    Versions: [PageVersionWithDescendants]
}

interface VersionInterface {
    Version: Int
    Author: String
}

type Page implements PageInterface {
    Content: String
    Versions: [PageVersionWithDescendants]
}

type MyPage implements PageInterface, MyPageInterface{
    Content: String
    MyField: String
    Versions: [PageVersionWithDescendants]
}

type PageVersion implements PageInterface, VersionInterface {
    Content: String
    Version: Int
    Author: String
}

type MyPageVersion implements PageInterface, MyPageInterface, VersionInterface {
    Content: String
    MyField: String
    Version: Int
    Author: String
}

type PageVersionWithDescendants = PageVersion | MyPageVersion

type PageVersion = PageVersion | MyPageVersion

query readPages {
    [Page]
}

@unclecheese
Copy link

unclecheese commented Mar 30, 2020

Alright, I'm ready to crack open this chestnut again.

On the status quo

I think the core problem here is that the scaffolded queries are non-deterministic. A declaration of a readProducts query may return Product or ProductWithDescendants, depending on adjacent conditions that aren't directly related to your GraphQL schema, and that's not great. A GraphQL schema if nothing else should be predictable. Having to recompile your frontend code because you added a subclass in the backend is inexcusable.

Option D, "always use interfaces"

Good that we're to adding some determinism and convention to these queries. The idea here is that every read query returns a union, no matter what? That's a good idea in principle, and it certainly solves the aforementioned problem, but I worry about the DX. It's not intuitive at all, it adds a burden of comms and documentation, and overall I think it positions Silverstripe weakly on the landscape of headless options.

I'm afraid it might be throwing a super clever solution at an uncommon problem with a cost of steepening the learning curve for devs.

A few overly broad-based assumptions:

  • Most apps are querying dataobjects, not pages
  • Most dataobjects don't have descendants
  • Most queries are for a specific type. Extended properties are an edge case.

In other words, that I have to do this:

query {
  readProducts {
    ... on Product {
      Price
    }
  }
}

in lieu of the more intuitive

query {
  readProducts {
    Price
  }
}

simply because it buys me resilience against a future subclass of Product seems gadgety and flawed.

Further, I've been playing with this stuff for a while, particularly as it pertains to Gatsby, and I have yet to find a common use case for querying extended fields from the base type. Even just using ORM code as an analogue, this isn't very common:

foreach (SiteTree::get() as $page) {
  $page->Title;
  $page->RedirectURL;
}

Typically, when you're querying from the base, you're building menus and indexes that only need the base properties.

Option D, all types are interfaces, with all their descendant fields

This one confuses me. What's the point of separating the types if they're just all-inclusive collections of their ancestral and descendant fields? What is the difference between Page and RedirectorPage in this example? I think you're basically making the case that types are only mapped to base classes, and queries for descendants merely apply filters to base queries.

type Page {
  ID
  Content
  MyPageField
  RedirectURL
}
type RedirectorPage {
  ID
  Content
  MyPageField
  RedirectURL
}

And then your queries just become:

readPages: [Page]
resolve: () => Page::get();

readRedirectorPages: [Page]
resolve: () => RedirectorPage::get();

This seems even more weird than the last option.

The case for Option B

Querying base types and asking for their descendant fields does happen, and there's a reason our ORM supports that approach, but if and when it does, I think it's an edge case. For that reason, relying on the __extends pseudo-union seems acceptable to me. Yes, it exposes too much of the ORM to GraphQL, and yes it creates an annoying destructuring problem for your frontend code, but I'd rather see those tradeoffs on the margins than punish the mainstream with the "always-a-union, always-use-interfaces" approach or the "all inherited fields" approach that is sure to stymie every single dev who tries out graphql for the first time, and perhaps many subsequent trials thereafter. It could easily become another "you forgot to flush your private statics" black hole.

@robbieaverill
Copy link
Contributor

I largely agree with what @unclecheese wrote.

In terms of an example: we have one in elemental. In the designs, each element in the list has a little summary/preview section. That might contain a summary of the content in a content block, or a thumbnail of the image in an image block. This doesn't work with GraphQL because the type is variable, so we built in a custom type in that module for a JSON serialised object, and treated it as a workaround that we were OK with because it's an uncommon scenario.

I think that we've previously encouraged things like this in SilverStripe development:

foreach (SiteTree::get() as $page) {
    if ($page->hasMethod('getLocation')) {
        return $page->Title . ', written from ' . $page->getLocation();
    }
    return $page->Title;
}

To me this is an anti-pattern - it doesn't really work with SOLID principles. You should instead add getLocation() to SiteTree so you always know it will be there, but return null when it does. I don't think we should build our GraphQL API in a way that also supports doing what's in that example, especially if it affects the standard dev experience as @unclecheese mentioned.

The example in the main post uses readSiteTree() and queries fields on various subclasses of SiteTree. I don't think I agree that this is a valid example of how we should encourage people to use our API (personally).

If you were building a frontend with GraphQL and elemental you'd know the structure of the page you want to render, so you'd be able to write a query for it. For elemental however, it needs to be able to work with any custom element type without requiring a re-compile of the JavaScript - this requirement is unique to us as module developers.

Ideally our GraphQL's primary consumers are website/app developers, who would prefer an intuitive and standard implementation without having to learn a unique data structure too much before using it. We'll obviously have a number of module developers using it in the CMS too, but I think these people are less of our focus in this regard.

If we do support this, I'd be in favour of the option B too. When it comes to module development, I think we can demand a little more from the API consumers.

@adiwidjaja
Copy link

I am working on using Silverstripe as a headless CMS (with NextJS/React) and stumbled on the "descendant union" problem when trying to render a page. For now I went back to a REST approach, but I really would like to use GraphQL.

In my case, querying base types with the fields of the descendants is not an edge case but the main reason to use GraphQL: With it, I have a typed API and can be sure what I will get from my backend. If I am using some kind of flattening as @unclecheese did in silverstripe-gatsby, I'm getting the same as in a untyped REST approach. Also, it's ok to just include everything in a static generator scenario, but not in my NextJS situation where the first api call is on the server but api calls thereafter are from the client.

The adequate example for me is not the rendering of a menu, but the rendering of a Page with Elements in it, with a query like this:

query readElements() { id ...on Text { content } ...on TextImage { content image } ...on Gallery { # Very different things } }

Then you could iterate the elements, create components via a factory and render the components.

Now, if I understand the options correctly, I am in favour of Option D, because it looks the closest to the standard. Maybe it's possible to make it an option in the scaffolding? So that you get the base object per default but you can mark an operation as "with descendants union"?

@adiwidjaja
Copy link

In my case, querying base types with the fields of the descendants is not an edge case but the main reason to use GraphQL: With it, I have a typed API and can be sure what I will get from my backend. If I am using some kind of flattening as @unclecheese did in silverstripe-gatsby, I'm getting the same as in a untyped REST approach. Also, it's ok to just include everything in a static generator scenario, but not in my NextJS situation where the first api call is on the server but api calls thereafter are from the client.

My use case has changed in the meantime with the transition from getInitialProps to getServerSideProps in NextJS 9.3. As the API now won't be called directly from the client anymore, but from the SSR, it's more or less ok to include all properties in the API result like in silverstripe-gatsby. So the requirement to query base types with the fields of the descendants is becoming an edge case for me, too.

@chillu
Copy link
Member Author

chillu commented Aug 31, 2020

@unclecheese Can you please sanity check that we still have a path to implementing Option B or Option D on your GraphQL v4 refactor?

I've since been working a bit more with Github's GraphQL API. They don't really have "subclasses", but lots of types with overlapping fields with liberal use of interfaces and unions. Interestingly, they have very few queries, for example there's no issues() query - you have either use search(type:ISSUE), or go really highlevel with node(id:<uuid>). The search() query is only workable because they've created a powerful DSL within the query string, and don't strongly type every potential search field in the query arguments. Some targeted queries return a single type, e.g. organization(). I think that's a good way to design an API around highly related data structures. Abbreviated GraphQL types to illustrate the concept.

interface Node {
  id:ID!
}

interface RepositoryNode {
  repository: Repository!
}

type Issue implements Node,RepositoryNode {
  id: ID!
  repository: Repository
}

type PullRequest implements Node,RepositoryNode {
  id: ID!
  repository: Repository
}

type SearchResultItem = Issue | PullRequest

query nodes(id:ID!) {
  [Node]
}

query search(query:String!, type:SearchType) {
  [SearchResultItem]
}

So if you translate this back into Silverstripe, Option D is effectively the search() or nodes() query, Option B is the organization() query. Neither of these is as unstructured as the silverstripe-gatsby schema though, which is just a glorified JSON dump. While I acknowledge that it's an easy way to get data into Gatsby, I'm not sure we should take that as our guiding principle for building expressive API defaults in silverstripe/graphql. Then we might as well go back to good old hand-tailored Controller actions with json_encode() calls ;)

The main argument for Option D I've seen here is rendering of blocks (as @adiwidjaja outlined), where you actually need type-specific fields on subclasses because they drive what's rendered. But maybe that's enough of an edge case to address @unclecheese concerns about DX by implementing this as Option B (__extends pseudo union).

Regarding the CMS use case (stable GraphQL queries without recompiling JS bundles, while supporting type additions through PHP): My hunch is that strong typing will actually get in the way for CMS UIs which need to support custom types with all of their fields (GridField, edit forms). Hence this discussion is somewhat tied into React GridField :/

@unclecheese
Copy link

Yeah, all really good points. This needs to be added to the upgrade guide.

In GraphQL 4, we're using Option B, and I think it's the best fit.

Before:

query readSiteTrees {
  ... on SiteTree {
    Title
  }
  ... on RedirectorPage {
    RedirectURL
  }
}

After:

query readSiteTrees {
  title
  __extends {
    RedirectorPage {
       redirectURL
    }
  }
}

More information here https://pulls-master-schemageddon--ss-docs.netlify.app/en/4/developer_guides/graphql/working_with_dataobjects/inheritance/

Neither of these is as unstructured as the silverstripe-gatsby schema though, which is just a glorified JSON dump.

This will likely change. Now that we have a better designed GraphQL API for silverstripe, I think the Gatsby module will just use the generic gatsby-source-graphql plugin and consume the GraphQL API as is, with maybe a few extra frills here and there, but that glorified JSON dump will follow the strong-ish typing outlined above.

@unclecheese
Copy link

unclecheese commented Mar 18, 2021

Update

There is a draft PR that I've opened that changes the chosen _extends approach to something more union/interface oriented. Having worked with it a bit in in real world, particularly with Elemental, it becomes apparent very quickly that _extends is cumbersome to maintain and adds a lot of domain-specific utility code that doesn't need to be there.

Additionally, working more with the Gatsby integration and trying to achieve some cohesion between the SS schema and Gatsby, you definitely see the case for a higher level of abstraction and fewer idiosyncrasies.

So with all that said:

Presenting Option E

This recommendation is probably the most abstract approach suggested so far, which is funny because the option we chose is the least abstract, relying on pseudo everything in the interest of staying concrete, so this is a real shift of extremes.

But bear with me. It comes down to a few things:

There are two kinds of models: Inherited and Standalone

An inherited model is just what it sounds like -- anything that has ancestors or descendants.

A standalone model is one that has no ancestors (parent = DataObject), and no descendants.

Inherited models are always abstractions

This may be the most controversial pillar of this proposal, but I think inherited models should never be expressed as concrete types. Here's why:

  • The polymorphism of a class like SiteTree or Page makes it inherently abstract. It isn't always what it says on the tin.
  • The inherited properties of a class like RedirectorPage makes it inherently abstract. It's a composite of many abstractions.

Of course, using this logic, you can argue that non-inherited (standalone) models are equally abstract because they inherit from DataObject, but I think that's where we can draw the line. The small handful of properties in the base class don't need to be abstracted away rigidly. I mean, who needs this:

query {
  readSiteTrees {
     ... on DataObject { id }
  }
}

We don't need to explicitly cross-pollenate fields in the inheritance chain

Right now, we have this:

type SiteTree {
  id: ID!
  content: String
  urlSegment: String
}

type Page {
  id: ID!
  content: String
  urlSegment: String
  bannerImage: Image
}

type BlogPage {
  id: ID!
  content: String
  urlSegment: String
  bannerImage: Image
  date: String
}

While this type of thinking maintains a high level of fidelity to the ORM we're used to using in PHP, it doesn't hold up well in GraphQL. Browsing the schema documentation is confusing. Fields are copied all over the place, and types are huge, particularly as you get further down the chain, making it very hard to find what you're looking for. As a dev, you're much more used to thinking of BannerImage as a Page than you are thinking of Content as a RedirectorPage field.

A more sensible representation is to isolate all the relevant fields to interfaces.

interface SiteTree {
  content: String
  urlSegment: String
}

interface Page {
  bannerImage: Image
}

interface BlogPage {
  date: String
}

Most queries return a union of abstractions

Imagine this data model:

SiteTree:
  Page:
    BlogPage
    NewsPage:
      EventPage
Product:
  SpecialProduct
  DiscontinuedProduct
Member

This will produce the following unions:

union SiteTreeUnion = SiteTree | Page | BlogPage | NewsPage | EventPage
union PageUnion = SiteTree | Page | BlogPage | NewsPage | EventPage
union BlogPageUnion = SiteTree | Page | BlogPage 
union NewsPageUnion = SiteTree | Page | NewsPage
union EventPageUnion = SiteTree | Page | NewsPage | EventPage
union ProductUnion = Product | SpecialProduct | DiscontinuedProduct
union SpecialProductUnion = Product | SpecialProduct
union DiscontinuedProductUnion = Product |  DiscontinuedProduct
union MemberUnion <---- Doesn't exist

Notice that some of these unions are identical. That happens when a node has only one direct descendant. It's probably not worth fussing over trying to eliminate the duplication.

readSiteTrees: [SiteTreeUnion]
readBlogPages: [BlogPageUnion]
etc..

The queries

query {
  readSiteTrees {
    ... on SiteTree { content }
    ... on Page {
      bannerImage { url }
    }
    ... on BlogPage {
       date
    }
  }
  readProducts {
     ... on SpecialProduct {
       specialField
     }
   }

  readMembers {
      firstName
  }
}

The DataObject interface

In practice, all these interfaces would implement the DataObject interface, which would just be whatever base fields you define in your config (currently just id, but could also include created, lastEdited, etc). As mentioned above, we would explicitly add these to each interface to eliminate the need to actually use DataObject in query language, and rather just keep things tidy by creating a common interface for something that is clearly common to everything.

interface SiteTree implements DataObject {
  id: ID!
  content: String
}

interface Product implements DataObject {
  id: ID!
  content: String
}

About naming conventions

I've deliberately avoided using the Interface suffix here because there's no reason to distinguish between an interface and a type anymore. A model can only be one or the other. It's a binary condition. SiteTree (type) and SiteTree (interface) can no longer coexist. So there's no need to pollute queries and create a confusing dev experience with all kinds of XXXInterface references.

Fragments FTW

In addition to many other benefits, this will really open the door to the use of fragments, which will become a lot more portable when they're only bound to an interface.

fragment navigationFields on SiteTree {
  id
  link
  title
  menuTitle
}

fragment baseFields on DataObject {
  id
  created
  lastEdited
}

Implementation

It's actually just a few minor changes to the PR I've linked above. Most of this work is already done. It's just a matter of reducing clutter.

@bummzack
Copy link
Contributor

What are the key differences between the new option E and option D? E seems to just use interfaces and unions, whereas D uses types and interfaces… is that it and what is being improved with that change?

From a frontend-dev perspective, I'm not too fond of the current implementation with _extend. It can result in fields being populated from another type with the same field name and it's also very verbose, especially when dealing with elemental blocks.

I'd really like to see how a query for a page with elemental blocks would look like with Option E.

@unclecheese
Copy link

Yeah, I think we're building a consensus that the chosen _extends approach is not great, and nothing showcases it better than Elemental, which relies on a single polymorphic query made up entirely of subclasses.

Difference between D and E is two things:

  • With E those types become interfaces.
  • We don't copy fields from one member of the inheritance chain to the next

Elemental queries look like this:

query {
  readOnePage(...) {
    ... on SiteTree {
      title
    }
    ... on Page {
      elementalArea {
        elements {
           ... on BaseElement {
               id
               title
           }
           ... on FileBlock {
               file { url }
           }
           ... on CTABlock {
               callToActionLink
               callToActionText
           }
        }
      }
    }
  }
}

@bummzack
Copy link
Contributor

Yup, I'd much rather prefer the above query/output… purely from a frontend-dev perspective 😅

I'm not very well versed in the whole GraphQL Schema thing, so take this with a grain of salt. Isn't using types (Option D) slightly easier to understand? Let's say you expose a public GraphQL API, but the consumers have no Idea about SilverStripe and the inheritance chain. Do they understand the content structure by just looking at the Schema with Option E?
That being said, it's probably an edge-case… and most likely you'd expose "standalone" Models on a public API.

If Option D and E are equally understandable without lots of background-knowledge about the inner workings of SilverStripe, then I'd probably opt for E. It seems to reduce boilerplate and is composable.

@unclecheese
Copy link

Yeah, that's why I made the point about naming conventions. As long as it's called SiteTree and Page, and BaseElement, etc., I don't think it matters that they're interfaces. If we had both BaseElementInterface and BaseElement, I think that would be a lot more confusing.

I think it's important to remember that most queries are strongly guided by the autocomplete of an IDE, too. Most devs aren't writing queries freeform.

@unclecheese
Copy link

unclecheese commented Mar 18, 2021

One more quick point -- you might be wondering, if queries just return a union of abstractions, then how will graphql know what type to use? Interfaces don't have resolvers. You need a concrete type to do the resolving.

Turns out, in graphql, abstractions and concretions are quite tightly coupled. Concrete types are aware of their interfaces, and interfaces are also aware of their implementations. Interfaces must provide a way of finding out what implementation they're currently on.

In fact, Gatsby will generate queries for your interfaces. query { allSiteTreeInterface }

@unclecheese
Copy link

Well it turns out I was completely wrong about option E. You can't do unions of interfaces. That's one level of abstraction too far for the GraphQL spec.

The pull request is now ready for review: #374

What we ended up with is pretty darn close to Option D, save a few details here and there.

@unclecheese
Copy link

unclecheese commented May 7, 2021

Oh no! Here we go again! This issue never seems to die.

But here's the thing:

There's a good chance we've been WAY overthinking this.

Screen Shot 2021-05-08 at 8 06 38 AM

According to the GraphQL docs, a query that returns an interface doesn't require inline fragments until you're asking for a field that's specific to an implementation (concrete type). This means we can add subclasses without breaking APIs.

Test case

type Query {
  readPages: [PageInterface]
}
models:
  Page:
    fields: '*'
query {
  readPages {
    title
    content
  }
}

Works.

Now I add a subclass:

class TestPage extends Page
{
  private static $db = ['TestField' => 'Varchar'];
}
query {
  readPages {
    title
    content
  }
}

Works. Because the only fields I'm asking for are part of the interface.

Now let's add some TestPage fields:

models:
  Page:
    fields: '*'
  TestPage:
    fields: '*'
query {
  readPages {
    title
    content
  }
}

Still works. Now let's query that field.

query {
  readPages {
    title
    content
    ... on  TestPage {
      testField
    }
  }
}

Still works.

As long as we always return interfaces, polymorphic queries are opt-in, rather than required (as in unions).

Tradeoffs

The dev experience suffers a bit because unlike unions, the intellisense doesn't know every implementation possible (it should, though?), so you don't get autocomplete for the ...on fragment, nor the directly queried interface fields.

@chillu
Copy link
Member Author

chillu commented Jul 15, 2021

@unclecheese You've solved this by merging #393, right?

@unclecheese
Copy link

Yup. Let's close this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants