Using tree-sitter queries for language indents instead of TOML #114
Comments
I initially tried doing that, but there are some tradeoffs: We need to traverse the nodes from the innermost one up to the root and check whether each node is considered an indentation level. tree-sitter queries can take a range to scan, but since we go all the way up to root that usually means querying the whole document. We're only able to get an iterator over all the matches on the QueryCursor, so we can't traverse the tree directly the way we can when doing it manually. We will also see nodes that could sit in a different subtree than the one we're looking at. nvim-treesitter circumvents this by holding a map of node ids, with I think the behavior also might not fully line up with nvim-treesitter's implementation, so while their queries are a good start they might require some tweaking. I tried looking at the parsed Query itself to see whether we could pull the scopes out, but that wasn't possible. I considered using the |
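To make the manual traversal described above concrete, here is a minimal sketch in Rust. It assumes a toy arena of nodes with parent links instead of the real tree-sitter API, and the names `Node`, `indent_level`, and `indent_kinds` are hypothetical, chosen only for illustration. It is not the code in helix-core and it ignores details such as several scopes opening on the same line.

```rust
/// A toy syntax node (hypothetical, not the tree-sitter `Node` type).
/// `kind` is the grammar node name; `parent` indexes into the arena slice.
struct Node {
    kind: &'static str,
    parent: Option<usize>,
}

/// Walk from the innermost node up to the root and count every node
/// whose kind is listed as an indentation scope.
fn indent_level(nodes: &[Node], innermost: usize, indent_kinds: &[&str]) -> usize {
    let mut level = 0;
    let mut current = Some(innermost);
    while let Some(idx) = current {
        let node = &nodes[idx];
        // Only nodes on the direct path to the root are ever visited,
        // so nodes in unrelated subtrees cannot affect the result.
        if indent_kinds.iter().any(|kind| *kind == node.kind) {
            level += 1;
        }
        current = node.parent;
    }
    level
}
```

The walk costs one step per ancestor, so it scales with the depth of the tree at the cursor and not with the size of the document. A query-based version would need some way to restrict the matches to that same ancestor path.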
What are the performance implications of that? How large would a code file have to be before it runs into an issue? Maybe it is insignificant; how much memory does it take? |
If doing this on a large file, there will be a lot of nodes. That's why |
Yeah, then we use the iterator. Will that be an issue? |
I think it's worth experimenting with at least. It's a major advantage being able to use the already written |
A cheaper way to detect indentation than traversing tree-sitter is to write a bit of language-agnostic heuristic code. In Led I do this: The basic idea is to track differences in leading whitespace between adjacent lines, and build a histogram of the increases (not decreases). And you essentially just pick the most numerous item from the histogram as your indentation. My implementation is pretty naive, and there's still a lot that can be improved about it (e.g. skipping blank lines, maybe slightly favoring more common indentations, etc.). But it works reliably even as-is, and I've basically had zero issues with it. At the very least, something like this as a fallback for languages that don't have a tree-sitter implementation would be nice. |
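The histogram idea from the comment above can be sketched roughly as follows. This is not Led's actual code, just a minimal illustration in Rust under a few assumptions: `IndentStyle`, `leading`, and `detect_indent` are hypothetical names, blank lines are skipped (one of the improvements suggested above), space increases larger than 8 are ignored, and ties go to the smaller width.

```rust
/// Detected indentation style (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
enum IndentStyle {
    Tabs,
    Spaces(usize),
}

/// Leading whitespace of a line as (tabs, spaces). Spaces are only
/// counted when the line does not start with a tab.
fn leading(line: &str) -> (usize, usize) {
    let tabs = line.chars().take_while(|&c| c == '\t').count();
    let spaces = if tabs == 0 {
        line.chars().take_while(|&c| c == ' ').count()
    } else {
        0
    };
    (tabs, spaces)
}

/// Build a histogram of indentation increases between adjacent
/// non-blank lines and return the most frequent one, if any.
fn detect_indent(text: &str) -> Option<IndentStyle> {
    // Slot 0 counts tab increases; slot n counts increases of n spaces.
    let mut counts = [0usize; 9];
    let mut prev: Option<(usize, usize)> = None;

    for line in text.lines() {
        if line.trim().is_empty() {
            continue; // blank lines carry no indentation information
        }
        let cur = leading(line);
        if let Some((prev_tabs, prev_spaces)) = prev {
            if cur.0 > prev_tabs {
                counts[0] += 1;
            } else if cur.0 == 0 && prev_tabs == 0 && cur.1 > prev_spaces {
                let diff = cur.1 - prev_spaces;
                if diff <= 8 {
                    counts[diff] += 1;
                }
            }
            // Decreases are deliberately ignored.
        }
        prev = Some(cur);
    }

    // Pick the most numerous entry; the first one wins on a tie.
    let mut best = None;
    let mut best_count = 0;
    for (i, &count) in counts.iter().enumerate() {
        if count > best_count {
            best_count = count;
            best = Some(i);
        }
    }
    best.map(|i| if i == 0 { IndentStyle::Tabs } else { IndentStyle::Spaces(i) })
}
```

Calling `detect_indent` on a buffer's text at load time returns `None` when no increase was seen at all, in which case the editor would fall back to a default or a per-language setting.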
In kakoune this is done with https://github.com/mawww/kakoune/blob/master/rc/filetype/rust.kak#L118. Although it can handle most cases, there are still a lot of cases it cannot handle, especially when long lines are wrapped. I think a tree-sitter based approach could solve this issue. |
Yeah I think a general purpose fallback would be useful. I've left a placeholder in the code: helix/helix-core/src/indent.rs Lines 144 to 147 in c754df1
|
Awesome! Mind if I take a crack at implementing that bit? Since I already implemented it in Led, it should be pretty quick. |
Oh, looking at the actual code, that's not what I thought it was. I'm talking about detecting the indentation style of files on load (e.g. tabs vs spaces, and if spaces how many), which then gets stored as a (user-modifiable) setting in the buffer.
(Edit: opened #239) |
Any update on this? This would be really convenient, since a lot of languages would get indent and highlight features |
We already use tree-sitter queries for highlights (check I explained my reasoning above: #114 (comment) Even if the queries were in .scm format they'd still be a separate set of queries from highlights.scm because they query for slightly different sets of nodes. |
This could probably be converted into a separate issue. |
@archseer I believe this can be closed, right? I don't think we will pursue this further. |
neovim solves indentation using indent queries, see Rust's indents.scm for an example. Given we're already leveraging tree-sitter for syntax highlighting, wouldn't it make sense to use it for indentation as well?
A big benefit of doing so would be the ability to re-use the indents.scm queries already written by the nvim-treesitter project.