Query time is much worse if grammar is generated with tree-sitter-cli > 0.16.9 #109
Comments
Wow. That is quite a query. Is the slowness occurring when constructing the query, or executing it? I would think that the time to construct the query would go up with tree-sitter 0.17, because we’re doing additional analysis of the query during construction. But if you cache the query and run it repeatedly, I would hope that each execution is the same speed as before. Does that match what you’re seeing? There is probably more room for optimization in our up-front query analysis.
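One way to check whether the cost sits in construction or in execution is to time the two phases separately. The following is only a sketch: `time_phases` is a hypothetical helper, and a compiled regular expression stands in for the constructed query, since no particular tree-sitter binding is assumed here.

```python
import re
import time


def time_phases(build, run, repeats=5):
    """Time one call to build() and `repeats` calls to run(built) separately."""
    start = time.perf_counter()
    built = build()
    build_seconds = time.perf_counter() - start

    run_seconds = []
    for _ in range(repeats):
        start = time.perf_counter()
        run(built)
        run_seconds.append(time.perf_counter() - start)
    return build_seconds, run_seconds


# Stand-in workload (an assumption, not tree-sitter): a large compiled regex
# plays the role of the constructed query, and findall() plays the execution.
pattern_source = "|".join(f"ident_{i}\\b" for i in range(500))
text = " ".join(f"ident_{i}" for i in range(0, 2000, 3))

build_seconds, run_seconds = time_phases(
    build=lambda: re.compile(pattern_source),
    run=lambda compiled: compiled.findall(text),
)
print(f"construction: {build_seconds * 1000:.2f} ms")
print(f"executions:   {[f'{s * 1000:.2f} ms' for s in run_seconds]}")
```

If the construction figure dominates and the per-execution figures stay flat, the slowdown would be in the up-front analysis rather than in matching. A CLI invocation pays the construction cost on every run, so it cannot benefit from caching.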
This aligns with what I'm seeing. By caching the query, do you mean constructing it and keeping the reference in memory? If so, that's different from what I'm doing, since I'm running from the CLI; there, the behavior is as described in the OP for repeated runs.
Alternatively, there could be room for optimization or terseness in the patterns I'm using. A lot of the expressions are pretty much the same, only repeated to reach the nested levels. Take this one, for instance:
Ideally I'd want to express it like this
There is no way to achieve that glob "however many levels deep" behavior yet, though I agree that would be a good feature. I will try to see if I can optimize query construction for queries like yours though.
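Until the query language supports matching at arbitrary depth, one workaround is to do the descent in host code instead of repeating the pattern once per nesting level. The sketch below uses a toy `Node` class and made-up node type names; it is not the tree-sitter node API, only an illustration of the recursive walk.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Toy syntax node (hypothetical); not the tree-sitter node type."""
    type: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def descendants_of_type(node: Node, wanted: str) -> Iterator[Node]:
    """Yield every descendant of `node` with type `wanted`, at any depth, in pre-order."""
    for child in node.children:
        if child.type == wanted:
            yield child
        yield from descendants_of_type(child, wanted)


# Hypothetical nested pattern, loosely shaped like { a, b: { c, d: [e] } }
# with the property keys omitted for brevity.
pattern = Node("object_pattern", children=[
    Node("identifier", "a"),
    Node("pair_pattern", children=[
        Node("object_pattern", children=[
            Node("identifier", "c"),
            Node("array_pattern", children=[Node("identifier", "e")]),
        ]),
    ]),
])

names = [n.text for n in descendants_of_type(pattern, "identifier")]
print(names)  # ['a', 'c', 'e']
```

With this split, the query only needs to capture the outermost construct, and the walk collects the identifiers beneath it regardless of nesting depth.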
I've thought about this a bit more and refactored my approach. I was using the query to distinguish identifier declarations from other occurrences in a custom "jump-to-definition" function I have, so I only care about where an identifier is first defined. I realized they can also be distinguished by hierarchically nesting lexical scopes (some blocks introduce scopes where identifiers are only valid inside of them, e.g. a function parameter), then checking whether the identifier occurrence is the first one in an ancestral scope (a lexical scope which contains the nearest parent scope). In other words, I'm no longer bothered by this for the use case I was using the query for, although it might have other applications (e.g. in highlighting queries).
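The scope-based idea can be sketched roughly as follows. This is one possible reading of the approach, not the actual implementation: `Scope` and `find_definition` are hypothetical names, and offsets are plain character positions in the example source.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Scope:
    """Toy lexical scope: identifier occurrences directly inside it, plus its parent."""
    parent: Optional["Scope"] = None
    occurrences: List[Tuple[str, int]] = field(default_factory=list)  # (name, offset)


def find_definition(scope: Scope, name: str, offset: int) -> Optional[int]:
    """Return the offset of the earliest occurrence of `name` visible from `scope`.

    Walks from `scope` outward through its ancestors and keeps the smallest
    offset seen, ignoring occurrences that come after the queried one.
    """
    best: Optional[int] = None
    current: Optional[Scope] = scope
    while current is not None:
        for seen_name, seen_offset in current.occurrences:
            if seen_name == name and seen_offset <= offset:
                if best is None or seen_offset < best:
                    best = seen_offset
        current = current.parent
    return best


# Hypothetical layout for: function f(x) { if (x) { return x } }
# The parameter (11) and the condition (20) live in the function scope,
# the returned `x` (32) lives in the inner block scope.
function_scope = Scope(occurrences=[("x", 11), ("x", 20)])
block_scope = Scope(parent=function_scope, occurrences=[("x", 32)])

print(find_definition(block_scope, "x", 32))     # 11
print(find_definition(function_scope, "x", 11))  # 11
```

An occurrence is treated as the declaration when `find_definition` returns its own offset. This simplification ignores shadowing: an inner redeclaration of the same name would still resolve to the outermost earlier occurrence.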
I am using nvim-treesitter and noticing much worse performance now compared to a week ago. I don't remember which version I was on before updating, but I noticed the slowdown after having to update tree-sitter to 0.19.3. I don't see the same performance issues with other languages like C or Python. When opening a buffer in nvim with TypeScript highlighting from tree-sitter, I am consistently getting 300+ ms for the
I'm opening the issue here, specifically, because I'm not seeing such a drastic slowdown with other grammars (i.e. they apparently have similar query performance regardless of the version).
I have this huge and really repetitive query which I use for querying identifiers inside specific constructs. When generating the grammar with tree-sitter-cli > 0.16.9, the timings get much worse. I've observed that it's not related to the file size, nor to how complex the file is, nor to recent commits in this repository. After experimenting a bit, my gut impression is that "query optimization" got worse after 0.16.9: the timings stay pretty much the same between different versions if I remove the deeply-nested s-expressions.
ts_query
For the following timings, I'm running the above query against the React Typings (source), which has 3169 lines.
If it's generated with tree-sitter 0.16.9 (12341dbbc03075e0b3bdcbf05191efbac78731fe)
If it's generated with tree-sitter 0.17.0 (b6fba7ca4c32207fa9b387b594a8da2ff66ee4be) (and above)
Earlier it was mentioned
On that note, here's a query very similar to the one posted above, but for JavaScript instead
js_query
I did the same experiment running js_query against d3.js' source code (source), which has 19540 lines. In this case, regardless of the tree-sitter-cli version used, the timings are pretty much the same.
To summarize, I'm noticing a strange behavior where (apparently only) this grammar, specifically, runs queries much slower if it's generated with tree-sitter-cli > 0.16.9.