Share query building #520

Closed
aspiwack opened this issue Jun 19, 2023 · 1 comment

@aspiwack

Is your feature request related to a problem? Please describe.

From past conversations, it seems that Query::new() (which interprets the query file that describes our formatter) is where we spend the most time when formatting a file, as #519 reminded us today in spectacular fashion. We should look for ways to mitigate this.

Describe the solution you'd like

In a way, this is to be expected: Tree-sitter is designed to react interactively to changes in the input file. It is therefore quite willing to spend a little time on startup to make subsequent calls to the grammar very snappy.

We should therefore look for a way to emulate this mode of operation ourselves: compile the query once, then reuse it for many calls.

I can think of two alternatives:

  • We could cache the compiled query on disk. We compile the query once and for all, then each call to topiary only needs to read the compiled query from disk. This is close to an ideal solution: it's compatible with every potential usage pattern of Topiary (though it does require us to think about cache eviction), and there is no startup cost except the very first time. The problem is that compiled queries don't seem to have a serialised representation and exist only in memory, so I think it would require changes upstream.
  • A less ambitious alternative is to have the command-line interface take a list of files and format all of them, reusing the query that we've already compiled (if we format files 10 by 10, we'll be roughly 10 times faster, since query compilation dominates). This works fine as long as we don't have to pass --language or --query, which may differ per file.
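
The second alternative can be sketched as follows. This is only an illustration of the compile-once, format-many pattern, not Topiary's actual code: `CompiledQuery`, `format` and `format_all` are hypothetical stand-ins (the real compiled query would come from `Query::new()`), and the "formatter" here merely trims trailing whitespace. A counter records how many times the expensive step runs.

```rust
/// Hypothetical stand-in for a compiled query. In Topiary this would wrap
/// whatever `Query::new()` returns; here it only stores the query's lines.
pub struct CompiledQuery {
    rules: Vec<String>,
}

impl CompiledQuery {
    /// Stand-in for the expensive compilation step. `compilations` counts
    /// how many times it has been performed.
    pub fn compile(source: &str, compilations: &mut usize) -> Self {
        *compilations += 1;
        CompiledQuery {
            rules: source.lines().map(str::to_owned).collect(),
        }
    }

    /// Number of rules in the (placeholder) compiled query.
    pub fn rule_count(&self) -> usize {
        self.rules.len()
    }
}

/// Hypothetical formatter: borrows the compiled query and, as a placeholder
/// for real formatting, trims trailing whitespace on every line.
pub fn format(input: &str, _query: &CompiledQuery) -> String {
    input
        .lines()
        .map(str::trim_end)
        .collect::<Vec<_>>()
        .join("\n")
}

/// Compiles the query once, then reuses it for every input in the list.
pub fn format_all(
    inputs: &[&str],
    query_source: &str,
    compilations: &mut usize,
) -> Vec<String> {
    let query = CompiledQuery::compile(query_source, compilations);
    inputs.iter().map(|input| format(input, &query)).collect()
}
```

However many inputs are passed to `format_all`, the counter ends up at 1, which is the whole point: the cost of compilation is paid once per invocation rather than once per file.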

Something we can probably do immediately as a mitigation is to share the compiled query between the two formatting passes when checking idempotence, which would probably save quite a bit of time.
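
A minimal sketch of that mitigation, assuming the formatter can borrow an already-compiled query. The function `check_idempotence` and its signature are hypothetical, and the formatter is passed in as a closure so the example stays self-contained; in Topiary the query type would be the compiled Tree-sitter query.

```rust
/// Formats `input` twice with the same borrowed query and checks that the
/// second pass leaves the output of the first pass unchanged.
///
/// `Q` is whatever type represents the compiled query (hypothetical here),
/// and `format` is the formatting function that borrows it.
pub fn check_idempotence<Q: ?Sized>(
    input: &str,
    query: &Q,
    format: impl Fn(&str, &Q) -> String,
) -> Result<String, String> {
    // Both passes borrow the same query, so it is compiled only once
    // by the caller, not once per pass.
    let once = format(input, query);
    let twice = format(&once, query);

    if once == twice {
        Ok(once)
    } else {
        Err(format!(
            "formatting is not idempotent: {:?} became {:?}",
            once, twice
        ))
    }
}
```

Because the query is only borrowed, the caller decides when it is compiled, and the same value can also be reused across files.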

@aspiwack

I've just seen issue #443. This issue is largely a duplicate of that one. It's also related to #539.

I'll close this in favour of those.
