Parser for Language Services and Editor Tooling #1426
Comments
pinging @wincent @asiandrummer due to their work on the graphql-language-service package. |
@bd82 AFAIK they both switched to new projects not related to GraphQL. My 2¢: It would be great to extend the current parser to support these features if you can do that without compromising |
Context
The reason I am asking is because I've recently created a GraphQL sample grammar

Fast
It will be fast, particularly on V8; see the performance benchmark. I don't know yet if it will be as fast as the existing parser, but it should be more than fast enough.

Memory
I am pretty sure the memory usage would be higher, firstly because all the tokens are lexed in advance in a single pass, and secondly because the recommended workflow relies on the automatic creation of an intermediate data structure (Parse Tree) which contains many arrays. It is likely possible to mitigate and optimize memory consumption if this were the only remaining issue.

Easily Portable
I believe this is the biggest potential issue. Let's look at a single parsing rule to get a better understanding:

Definition Rule in the spec.
Definition Rule in Chevrotain:

    $.RULE("Definition", () => {
      $.OR([
        { ALT: () => $.SUBRULE($.ExecutableDefinition) },
        { ALT: () => $.SUBRULE($.TypeSystemDefinition) },
        { ALT: () => $.SUBRULE($.TypeSystemExtension) }
      ])
    })

Definition Rule in the current Parser:

    function parseDefinition(lexer: Lexer<*>): DefinitionNode {
      if (peek(lexer, TokenKind.NAME)) {
        switch (lexer.token.value) {
          case 'query':
          case 'mutation':
          case 'subscription':
          case 'fragment':
            return parseExecutableDefinition(lexer);
          case 'schema':
          case 'scalar':
          case 'type':
          case 'interface':
          case 'union':
          case 'enum':
          case 'input':
          case 'directive':
            return parseTypeSystemDefinition(lexer);
          case 'extend':
            return parseTypeSystemExtension(lexer);
        }
      } else if (peek(lexer, TokenKind.BRACE_L)) {
        return parseExecutableDefinition(lexer);
      } else if (peekDescription(lexer)) {
        return parseTypeSystemDefinition(lexer);
      }
      throw unexpected(lexer);
    }

It is clear that the Chevrotain version is much closer to the spec, as it is more declarative.
I don't see how to resolve this issue. 😢 |
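To make the contrast between the two styles concrete, here is a minimal, self-contained sketch of how a declarative list of alternatives can be turned into the same one-token lookahead dispatch that the hand-written `switch` performs. The `or` helper and the `branch` stubs are hypothetical and written only for this illustration; they are not Chevrotain's API, and the description (string) lookahead from the current parser is left out for brevity.

```javascript
// Hypothetical mini-combinator, NOT Chevrotain's API: each alternative
// declares the tokens that may start it, and `or` picks one by looking
// at the next token only.
function or(alternatives) {
  return function parse(tokens, pos) {
    const next = tokens[pos];
    for (const alt of alternatives) {
      if (alt.first.has(next)) {
        return alt.rule(tokens, pos);
      }
    }
    throw new Error(`Unexpected token: ${String(next)}`);
  };
}

// Stub sub-rules that only report which branch was taken.
const branch = (kind) => (tokens, pos) => ({ kind, start: tokens[pos] });

const parseDefinition = or([
  {
    first: new Set(["query", "mutation", "subscription", "fragment", "{"]),
    rule: branch("ExecutableDefinition"),
  },
  {
    first: new Set([
      "schema", "scalar", "type", "interface",
      "union", "enum", "input", "directive",
    ]),
    rule: branch("TypeSystemDefinition"),
  },
  { first: new Set(["extend"]), rule: branch("TypeSystemExtension") },
]);

console.log(parseDefinition(["type", "Foo"], 0).kind);
```

In this sketch the lookahead sets are spelled out by hand. A parser library can derive them from the grammar, which is part of why the declarative form stays so close to the spec, and also why it is harder to port to a language without an equivalent library.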
Yeah, in this particular case, we can't force people to reimplement That said I see a lot of potential in having Few questions I have:
If
Would be super cool if you can add short descriptions and URLs to the matching paragraphs in http://facebook.github.io/graphql/June2018/#sec-Language
If you have time for that, it would be great to see. We have some basic performance tests here: https://github.com/graphql/graphql-js/blob/master/src/language/__tests__/parser-benchmark.js |
Minified, Chevrotain is around 140k
Yes, unless there is something really unique here, we can build whatever data structure we want
Yes, but that won't automatically solve the problem of tokenizing the entire input in advance. I will comment more later... |
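As a rough illustration of the tokenizing concern, the sketch below contrasts a lexer that materializes every token up front with one that produces tokens on demand. The token pattern is a deliberately simplified assumption (names, integers, and a few punctuators), not the GraphQL lexical grammar, and neither function reflects how Chevrotain or graphql-js is actually implemented.

```javascript
// Lazy lexer: yields one token at a time, so memory use is bounded by
// what the consumer keeps. The pattern is a simplified assumption and
// lexing simply stops at the first character it does not recognize.
function* lexLazily(source) {
  const pattern = /\s*([A-Za-z_]\w*|\d+|[{}():!$@[\]=|&])/y;
  let match;
  while ((match = pattern.exec(source)) !== null) {
    yield match[1];
  }
}

// Eager lexer: same tokens, but the whole array is allocated before
// the parser sees the first one.
function lexEagerly(source) {
  return Array.from(lexLazily(source));
}

const source = "query { user { name } }";

const all = lexEagerly(source); // every token exists in memory at once
const lazy = lexLazily(source);
const first = lazy.next().value; // only one token has been produced so far

console.log(all.length, first);
```

The eager form makes arbitrary lookahead and backtracking simple, at the cost of holding the full token array; the lazy form trades that convenience for a smaller footprint.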
Yes, Chevrotain grammars can be extended using inheritance semantics. There are also some dynamic extension abilities for Lexers,
That is a good idea, I am no good with HTML, but I should be able to modify the rule name headers to include a docs link next to their name. Although that is a lower priority.
I will experiment with this a bit. |
Chiming in just to say I’ve had a pretty good experience with chevrotain. I consider it to be a hidden gem of the JavaScript ecosystem, and would love to see it adopted in a larger project like this. It’s easily one of the best parser libraries in terms of performance, the API is intuitive, its error recovery features can’t be found in (any?) other JavaScript libraries, and the source code is easy to understand and written in TypeScript. If graphql-js decides to use a parser library I’d love to see it using chevrotain! |
@brainkim @bd82 Just to clarify: I personally think current That said, I think it is definitely worth having a separate NPM package with a compatible (API and AST format) parser that supports the advanced features outlined in the initial comment and also allows the community to experiment with GraphQL syntax. |
Yeah, the requirement for easy portability would be mutually exclusive with it.
Should this separate library really produce the same AST? |
Not the same, but would be great if it would be compatible so we reuse
I don't have a lot of experience with parsers myself, but I think the community usually settles on some AST format as a standard e.g. |
I will keep this in mind, but we need to remember that the AST used in graphql-js may not be suitable for all editor purposes (I don't know, I have not checked yet...). In terms of conversions I am not worried because the most basic level structure (Parse Tree) Have a look here: https://sap.github.io/chevrotain/playground/ at the default example,
Yes, but if I end up building such a thing I'll probably start with a structure suitable for my purposes (building editor services) and only consider standardization and alignment at a later time to reduce the scope of the project. |
@IvanGoncharov Where can I get many samples of GraphQL source code? I'd like to increase the quality of the parser I wrote and need samples for it. |
@bd82 Probably the most extensive and up to date source would be lexer/parser tests: |
Update
So I've created a separate project, graphql-advanced-parser. Currently it only produces a low-level Concrete Syntax Tree (no AST yet).

Benchmark
This project also includes a benchmark you can check out and execute. Here are the results on my laptop:
graphql-js is obviously faster, but we are dealing with a ~1,000-line input. Also note that the Chevrotain-based parser has not been optimized (yet).

Next Steps
I guess creating an AST, preferably one that is a strict superset of the graphql-js AST, would be the next step.

Question
I see the graphql-language-service-parser package on npm is fairly popular and used in all sorts of editor tooling around GraphQL. You mentioned earlier that this package's author has moved on to other things. Does this mean that the package is no longer maintained and will slowly deviate from the newest specs? |
In some applications speed is important. AFAIK Relay stores the GraphQL schema as an SDL file, and for some APIs such schemas can be extremely big, e.g. Facebook claims to have more than 15000 types in their schema. But as you initially pointed out, the current parser lacks features that are critical for some applications. It would be ideal to have a parser that is both feature-rich and fast; however, if that is not possible, it would make sense to have two separate ones.
👍 It would be a great step forward.
It's already deviated, e.g. its lexer doesn't support the
Disclaimer: I'm not a Facebook employee, just a very active external contributor with commit rights. It's a complex question: Ideally I would like to add as many features as possible into existing As for replacing |
In such an extreme scenario I would expect to avoid parsing GraphQL text (every time) altogether and instead optimize by serializing the huge schema AST in an easier (faster) to read format (e.g. JSON). Also keep in mind that parsing is just one step in the whole process, so a parsing performance regression does not translate directly into the same percentage of overall performance regression.
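A minimal sketch of that idea follows: parse the text once, keep the AST in serialized JSON form, and deserialize it on later reads instead of parsing again. The `createAstCache` helper and the stand-in `fakeParse` function are hypothetical names introduced for this example; a real setup would pass an actual parser and would likely persist the JSON to disk rather than to an in-memory `Map`. Whether this is actually faster than re-parsing would need to be measured.

```javascript
// Hypothetical helper: wraps any `parse(source) -> AST` function so the
// source text is parsed once and later reads deserialize stored JSON.
function createAstCache(parse) {
  const serialized = new Map(); // source text -> JSON string of its AST
  return function getAst(source) {
    let json = serialized.get(source);
    if (json === undefined) {
      json = JSON.stringify(parse(source));
      serialized.set(source, json);
    }
    return JSON.parse(json);
  };
}

// Stand-in parser (an assumption for the demo) that counts its calls.
let parseCalls = 0;
function fakeParse(source) {
  parseCalls += 1;
  return { kind: "Document", length: source.length };
}

const getAst = createAstCache(fakeParse);
const firstRead = getAst("type Query { hello: String }");
const secondRead = getAst("type Query { hello: String }");

console.log(parseCalls, firstRead.kind, secondRead.kind);
```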
It may be possible to expand the capabilities of the existing parser while avoiding major performance regressions. For example, the TypeScript Parser by Microsoft supports both regular compilation flows and language services flows. There is bound to be some performance cost, but when using Chevrotain there may be additional performance costs due to using a more generic and abstracted tool.
I will have a look at this. |
@bd82: I work at Facebook and have run benchmarks on parsing: the actual parsing of GraphQL text is not much worse than just reading in a JSON AST. It's also a more compact and readable representation, so we often serialize to GraphQL's SDL instead of JSON when we have the choice between the two. But we frequently do operations on thousands of query and fragment definitions, in addition to our ten+ thousand types schema. We will not compromise the speed of |
Interesting that the de-serialization of the GraphQL text is so fast.
I am still a little unsure about the actual speed requirements (parser vs. the rest of the compiler), but I don't have these inputs to bench and profile, and it does not really matter from my perspective, as portability concerns make using Chevrotain as the default parser irrelevant.
I'll be looking into this. If I can create an AST that is a strict superset, it could even be pluggable into the rest of graphql-js. But I worry that this AST would have to support partial results, so the rest of the compiler could break due to assumptions that certain properties always exist. My main focus is to create an AST that can be used for language services such as auto complete / go to definition / find usages ... |
See this example for a very basic plugin system that allows overriding parser (and lexer) rules externally from the parser. |
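The partial-results worry can be shown with a small sketch. The node shapes below are hypothetical; in particular the `incomplete` flag is an assumption made for this example and is not part of the graphql-js AST.

```javascript
// A complete Field node, as a consumer written for fully parsed input
// would expect it.
const completeField = {
  kind: "Field",
  name: { kind: "Name", value: "user" },
};

// What an error-tolerant parser might emit while the user is still
// typing: the node exists, but its name is missing.
// `incomplete` is a hypothetical marker, not a graphql-js property.
const partialField = { kind: "Field", name: undefined, incomplete: true };

// Consumer written under the "name always exists" assumption.
function fieldName(node) {
  return node.name.value;
}

// Consumer that tolerates partial nodes.
function safeFieldName(node) {
  return node.name ? node.name.value : null;
}

console.log(fieldName(completeField), safeFieldName(partialField));
```

Calling `fieldName(partialField)` throws a `TypeError`, which is the kind of breakage a strict-superset AST would have to guard against before being plugged into existing consumers.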
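For readers without access to the linked example, the general shape of such a plugin system might look like the sketch below. `RuleRegistry`, its methods, and the `%`-prefixed experimental name syntax are all invented for this illustration; they do not come from the linked example, from graphql-js, or from Chevrotain.

```javascript
// Hypothetical rule table: rules are looked up by name at call time,
// so code outside the parser can replace or wrap them.
class RuleRegistry {
  constructor() {
    this.rules = new Map();
  }

  define(name, impl) {
    this.rules.set(name, impl);
    return this;
  }

  // `wrap` receives the current implementation and returns a new one.
  override(name, wrap) {
    const original = this.rules.get(name);
    if (!original) throw new Error(`Unknown rule: ${name}`);
    this.rules.set(name, wrap(original));
    return this;
  }

  run(name, input) {
    const rule = this.rules.get(name);
    if (!rule) throw new Error(`Unknown rule: ${name}`);
    return rule(input, this);
  }
}

const parser = new RuleRegistry().define("Name", (input) => {
  if (!/^[A-Za-z_]\w*$/.test(input)) {
    throw new Error(`Invalid name: ${input}`);
  }
  return { kind: "Name", value: input };
});

// External "plugin": accepts an invented `%name` form and otherwise
// defers to the original rule.
parser.override("Name", (original) => (input, registry) =>
  input.startsWith("%")
    ? { kind: "Name", value: input.slice(1), experimental: true }
    : original(input, registry)
);

console.log(parser.run("Name", "user"), parser.run("Name", "%user"));
```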
Hello.
I was wondering if supporting/enabling editor tooling and language services is in the scope of this project.
More specifically I was wondering if having a Parser that has low level capabilities for supporting editor tooling is within the scope of this project.
Such capabilities may include:
I saw there is work done on the graphql-language-service, and there seems to be a separate parser implemented there.
Cheers.
Shahar.