RFC: Make schema serialisable and cachable #192
Comments
Looking at actual implementations like DataObjectScaffolderExtension in versioned, would we be talking one resolver per field here? They're one-liners, so that's pretty clumsy.
So
Agree, hopefully most of the legwork will be in all the config parsing/merging, ORM introspection and scaffolding, rather than constructing the Schema object.
Detail niggle: I'd prefer if we kept
Yeah you'd need to update all the
No, certainly not. For the resolvers that are simply accessing properties, I think something like the static definition of resolver methods would make sense, but it needs thinking through.
Yes, depending on the performance gains. If caching the manager doesn't make enough of an impact, we might have to look at an alternative solution using something more primitive, perhaps, like a simple array. But I really think the manager is the lion's share of the work here.
Yes, no plans on putting
It's good for rapid prototyping. You can spin up a schema in some very simple procedural code. I'm actually leaning more toward keeping it in the docs as the introduction "look how easy it is" section rather than overwhelming the reader with writing a new single-method class definition.
Benchmarks
There's been previous research on this topic, but for good measure, I wanted to get some numbers using the latest version using multi-schema. I've done some testing on lightly populated and densely populated schemas with varying numbers of types. We clearly have a lot to gain by implementing caching.
Test conditions
Test query
Test results
Explanation
Discussed a bit more offline with Aaron:
On further thought, Something like
Would generating the cache for the schema be a required step in order to use the site? If so, I think it should be bundled into either ?flush or dev/build, otherwise people will forget. If it's not, and it's an opt-in enhancement that you can choose, then sure =)
OK, I have a rough POC of this working. It took on a much different direction than outlined above, but it looks promising.
High level
New Benchmarks
The results for all tests above are ~200ms in a local set up (Vagrant with NFS), and 90ms on an SSP virtual stack.
Ugliness
Pull request
Forthcoming.
This has taken yet another left turn. Serialisation, while effective at reducing schema build time, was not performant enough for the big stage. The new implementation uses code generation to export your schema into a monolithic class where types are registered for the schema to lazy load. Lots to discuss, review, and argue about. Please have your say.
So I noticed one of my data objects with a couple of gridfields was very slow (cf screenshot)... could this be related to this? I've tried the GridFieldLazyLoader but it doesn't seem to help much (actually, it may even be slower because it's splitting the work across multiple requests that end up being slower than one request)
Thanks @lekoala. This is a known issue: silverstripe/silverstripe-admin#700. Those
Ah great, sorry :) So it's only the "types" call that is relevant to this issue? It's really sad to see all these calls being made; it makes the whole back end much slower than it should be. It's certainly one of the common complaints from my customers.
Yes. There's some configuration that can be turned on to cache this response to a file that's served from your assets folder:
The
Also note that sessions in PHP are blocking by default, which is probably why each request takes longer than the previous (assuming they're started at the same time). The actual execution time is probably going to be the time shown minus the previous request.
@ScopeyNZ Ok, I looked into SilverStripe\GraphQL\Controller::$cache_types_in_filesystem and it seems easy enough to enable without too many downsides on a standard setup. When can it be "finicky"? You are quite right about the sessions. Do you have any idea how to close the session early for stateless controller endpoints by default in the CMS? That would indeed help a lot!
This has been merged into GraphQL v4 now (although in quite a different implementation). Rather than serialising the schema, we generate the code for it. |
Pull Requests
Overview
The performance limitations of this module are well documented and long-lived. Attempts to solve the issue have shown limited success. While the recent work to cache the resolver computations has been merged and provides notable benefits, this will never be a scalable offering until we can find a way around building the schema dynamically on every request.
The most obvious and least complex way to do this is to cache the schema and invalidate it when it changes. Those are both hard, because, respectively:
Suggested approach
This can be done with three reasonably-sized changes that would not break any APIs.
1. Remove all anonymous functions from the schema
An inventory of all the anonymous functions being used in type and query creators shows that most of them are paranoid countermeasures against Manager race conditions. Due to the declarative nature of the schema, the manager state is not always ready to define a query, and therefore, wrapping everything in a closure is the easiest safeguard. Most of these functions do not appear to be necessary if we implement lazy loading correctly.
Lazy load types properly
A recent addition to the graphql-php API was the typeLoader property, which allows users to define a global accessor for all types:
The only constraint is that type instances must be singletons.
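A minimal sketch of the singleton constraint, assuming a registry that builds each type on first request and hands back the same instance afterwards. TypeRegistry, MemberType and GroupType are hypothetical names used only for illustration; they are not part of silverstripe-graphql or graphql-php.

```php
<?php
// Hypothetical stand-ins for real type classes.
final class MemberType { public string $name = 'Member'; }
final class GroupType { public string $name = 'Group'; }

// Hypothetical registry: maps type names to class names and
// instantiates each class at most once.
final class TypeRegistry
{
    /** @var array<string, string> class names keyed by type name */
    private array $classes;

    /** @var array<string, object> instances built so far */
    private array $instances = [];

    public function __construct(array $classes)
    {
        $this->classes = $classes;
    }

    // Build a type the first time it is requested, then always
    // return that same instance. Unknown names yield null.
    public function get(string $name): ?object
    {
        if (!isset($this->instances[$name])) {
            if (!isset($this->classes[$name])) {
                return null;
            }
            $class = $this->classes[$name];
            $this->instances[$name] = new $class();
        }
        return $this->instances[$name];
    }

    // Number of types instantiated so far.
    public function builtCount(): int
    {
        return count($this->instances);
    }
}

$registry = new TypeRegistry([
    'Member' => MemberType::class,
    'Group' => GroupType::class,
]);

// With graphql-php, a callable like this could be supplied as the
// schema's 'typeLoader' option, e.g. 'typeLoader' => [$registry, 'get'].
```

Nothing is instantiated until a name is asked for, and the name-to-class map is a plain array of strings, so the registry's configuration contains no closures.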
Eliminate lazy loading of fields
Most of the closures around field definitions are protections against attempting to access type instances too early. A lazy loaded type layer would eliminate the need for most of these workarounds.
Encourage concrete resolvers
Right now, nearly all resolvers that ship with the module are defined at runtime. The benefits of moving these to class definitions would be twofold:
Before:
After:
Note that it is already possible to do this with the scaffolder. You can simply point the resolver at an instance, or a FQCN of a ResolverInterface.
Attempts to serialise a field using a closure as a resolver will throw.
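The difference can be sketched with plain PHP serialisation. This is an illustration under assumptions: TitleResolver and the array shape of the field definition are hypothetical, and the real scaffolder API is not shown.

```php
<?php
// Hypothetical concrete resolver; the class name and method
// signature are assumptions made for this example.
final class TitleResolver
{
    public static function resolve(array $object): string
    {
        return $object['Title'] ?? '';
    }
}

// A field whose resolver is a callable array survives a round trip
// through serialize()/unserialize().
$concrete = ['name' => 'title', 'resolve' => [TitleResolver::class, 'resolve']];
$restored = unserialize(serialize($concrete));
$value = call_user_func($restored['resolve'], ['Title' => 'Home']);

// The same field with a closure as its resolver cannot be serialised.
$anonymous = [
    'name' => 'title',
    'resolve' => fn (array $object) => $object['Title'] ?? '',
];
try {
    serialize($anonymous);
    $closureSerialised = true;
} catch (\Throwable $e) {
    // PHP refuses with "Serialization of 'Closure' is not allowed".
    $closureSerialised = false;
}
```

The callable array is ordinary data, which is why a class-based resolver can be cached while a closure cannot.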
2. Make type and field creators serialisable
In the case of the scaffolders, we're already halfway there. All scaffolders implement the ConfigurationApplier interface, which means they can hydrate themselves from a known array structure. Creating the inverse of applyConfig() -- something like dumpConfig() would suffice, provided reciprocity between the two. dumpConfig() would be a good place to throw if the scaffolder is using anonymous functions as resolvers.
For tidiness, a new interface, ConfigurationConsumer could extend both ConfigurationApplier and ConfigurationExporter without breaking anything.
3. Cache and invalidate
Caching the Schema object itself probably has diminishing returns relative to the benefit of caching the Manager instance, especially considering typeLoader would be incompatible with most caching strategies.
Example:
Invalidating the cache hinges primarily on config. A hash of the entire config would suffice as a cache key for most graphql implementations.
However, TypeCreator and ScaffoldingProvider instances mutate the schema procedurally. Therefore, a more accurate cache key may be something like:
Still, though, there is a chance that some code-level change to a dataobject could alter its effect on the schema -- for instance, a get_extra_config() change.
Hashing all dataobjects and extensions would be too aggressive and still imperfect. With that taken into consideration, schema changes would require a ?flush.
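As a rough sketch of a config-derived key, assuming the merged schema config is available as a nested array: schemaCacheKey() and its optional list of provider class names are hypothetical, chosen only to illustrate the idea of hashing config alongside the classes that mutate the schema.

```php
<?php
// Hypothetical helper: the function name, the shape of $config and
// the $providerClasses argument are assumptions for illustration.
function schemaCacheKey(array $config, array $providerClasses = []): string
{
    // Sort keys recursively so that two configs differing only in
    // key order produce the same hash.
    $normalise = function (array $data) use (&$normalise): array {
        ksort($data);
        foreach ($data as $key => $item) {
            if (is_array($item)) {
                $data[$key] = $normalise($item);
            }
        }
        return $data;
    };

    // Order of the class list should not matter either.
    sort($providerClasses);

    return hash('sha256', serialize([$normalise($config), $providerClasses]));
}
```

A key like this changes whenever the config or the set of contributing classes changes, but not when the code inside those classes changes, which is the gap a ?flush would still have to cover.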
Further thought
asset-admin and versioned-admin