Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reverse tree-sitter query precedence ordering #9436

Open
the-mikedavis opened this issue Jan 26, 2024 · 16 comments · May be fixed by #9458
Open

Reverse tree-sitter query precedence ordering #9436

the-mikedavis opened this issue Jan 26, 2024 · 16 comments · May be fixed by #9458
Labels
A-language-support Area: Support for programming/text languages A-tree-sitter Area: Tree-sitter C-enhancement Category: Improvements E-medium Call for participation: Experience needed to fix: Medium / intermediate

Comments

@the-mikedavis
Copy link
Member

Currently our query precedence order follows the behavior of tree-sitter-cli (the tree-sitter-highlight crate) and is the reverse of the order followed by Neovim and now Zed. For example with a query like so:

((identifier) @constant
 (#match? @constant "^[A-Z][A-Z\\d_]*$"))
(identifier) @variable

Helix will currently treat the first pattern as a higher precedence than the second, so identifiers starting with capital letters will be highlighted as constants. In Neovim for example though, all identifiers would always be highlighted as variables since the later pattern is higher precedence.

In the next release, tree-sitter-highlight will reverse its ordering so that it follows Neovim / Zed. We should reverse our order as well so that queries are closer to being interoperable between editors. The ordering has always been a snag for language support PRs when translating queries from neovim anyways.

The actual change to helix-core/src/syntax.rs in order to reverse the precedence should be very small, and will probably be almost identical to the changes necessary upstream in tree-sitter (https://github.com/tree-sitter/tree-sitter/pull/2412/files#diff-538bdc046297d41717e7a863a54456d47829fb2e298963198cdfb32684500cdd), but we will need to modify all existing tree-sitter queries (runtime/queries/*/*.scm) to respect the new ordering.

Also see nvim-treesitter/nvim-treesitter#3944 (comment)

@the-mikedavis the-mikedavis added C-enhancement Category: Improvements A-tree-sitter Area: Tree-sitter E-medium Call for participation: Experience needed to fix: Medium / intermediate A-language-support Area: Support for programming/text languages labels Jan 26, 2024
@postsolar
Copy link
Contributor

If someone is willing to submit a patch for a grammar they're specifically accustomed with, when should that happen?

@the-mikedavis the-mikedavis linked a pull request Jan 28, 2024 that will close this issue
@the-mikedavis
Copy link
Member Author

I pushed up a branch in #9458: reverse-query-precedence-ordering. Feel free to PR against that branch for any languages you'd like to update

@postsolar
Copy link
Contributor

Thank you, I sent a PR for PureScript and might look into Haskell a bit later.
Another question though: could you give a link for this bit:

In the next release, tree-sitter-highlight will reverse its ordering so that it follows Neovim / Zed

I couldn't find that announcement in TS repo.

@the-mikedavis
Copy link
Member Author

I'm not sure if there's an announcement in the tree-sitter repo yet. I heard it in a matrix chat we have with some tree-sitter/neovim people

@tgross35
Copy link
Contributor

tgross35 commented Mar 3, 2024

Upstream TS did the release with reversed precedence order this week, at least that is my understanding of https://github.com/tree-sitter/tree-sitter/releases/tag/v0.21.0.

@daedroza
Copy link
Contributor

daedroza commented Mar 4, 2024

Hi @the-mikedavis ,

We should reverse our order as well so that queries are closer to being interoperable between editors. The ordering has always been a snag for language support PRs when translating queries from neovim anyways.

Does this mean that we can finally track upstream nvim-treesitter queries? What benefit are we bringing by doing our thing?

Please note that, there is no serious tone associated with my questions, just curious.

The primary reason I'm asking is there are issues with our limited support on treesitter, for example, make-range is not supported.

1-1 interoperability is important because that allows us to merge upstream and is supported by bigger community.

For example, in Java language,

The native functions definitions aren't considered as functions. Constructors are not considered functions. These two issues are already covered in upstream. But copying them as is not simple because we use inside/around instead of inner/outer. Additionally the make-range flexibility upstream has allows us to be more specific. I've fixed them locally but it sort of defeats the purpose of having OTB experience.

Same for the sticky context PR, writing rules for each language is a gruesome task, using the upstream rules will allow our efforts to help not just helix but every other editor using these rules...

@pascalkuthe
Copy link
Member

pascalkuthe commented Mar 4, 2024

There is some effort to unify but no it won't be a 1-1 correspondence

@tgross35
Copy link
Contributor

tgross35 commented Mar 9, 2024

The actual change to helix-core/src/syntax.rs in order to reverse the precedence should be very small, and will probably be almost identical to the changes necessary upstream in tree-sitter (https://github.com/tree-sitter/tree-sitter/pull/2412/files#diff-538bdc046297d41717e7a863a54456d47829fb2e298963198cdfb32684500cdd), but we will need to modify all existing tree-sitter queries (runtime/queries/*/*.scm) to respect the new ordering.

Where exactly does the change need to be made? It would be great to have before the next release so queries get the chance to work out kinks, I may be able to help.

Does this mean that we can finally track upstream nvim-treesitter queries? What benefit are we bringing by doing our thing?

It is getting significantly better - nvim made the change to use Helix's captures about a month ago nvim-treesitter/nvim-treesitter#5895. If this change to Helix lands then highlights.scm, locals.scm, and injections.scm will be more or less reusable, with a few exceptions (e.g. nvim uses var while helix uses variable)

@the-mikedavis
Copy link
Member Author

See #9458. I would prefer to have the change merged after a release so that the queries can have some time in master for people to catch bugs before a release. If you want to help out, feel free to open up PR(s) against that branch (for example #9476)

@hadronized
Copy link
Contributor

I’m following up on this as I want to benefit from the effort there (context here).

@sharpchen
Copy link

Would be helpful for script-porting colorschemes from neovim to helix I guess?

@the-mikedavis
Copy link
Member Author

This shouldn't have any effect on themes. Neovim uses slightly different captures than us so while this change will make it easier to port queries from Neovim, we won't be able to copy them without changes.

@sharpchen
Copy link

Apologies for the unclear description. What I intended to say is that I've noticed the treesitter scopes often vary from those in neovim at the same position in certain code, resulting in the ported theme appearing different in helix due to the selection of a different scope. Does this issue relate?

@the-mikedavis
Copy link
Member Author

Yeah that's not related: this covers how you write highlights queries (for example in runtime/queries/rust/highlights.scm) and shouldn't have any effect on the highlights we end up producing or the captures / theme keys we use. If this refactor goes well there shouldn't be any difference in the highlights before and after the change.

@sdoerner
Copy link
Contributor

See #9458. I would prefer to have the change merged after a release so that the queries can have some time in master for people to catch bugs before a release. If you want to help out, feel free to open up PR(s) against that branch (for example #9476)

24.07 has been released since. Any thoughts on merging this branch now, or do you want to wait another release?
Background is that I found some issues in master (for protobuf), and it would be nice to only have to fix them once.

@the-mikedavis
Copy link
Member Author

The branch to make this change isn't ready yet as it doesn't cover all languages. The protobuf queries are fairly simple so I think you should send a PR for the fixes you have in mind now. It should be easy to reverse the ordering for those afterwards (and #9458 doesn't reverse protobuf queries yet).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-language-support Area: Support for programming/text languages A-tree-sitter Area: Tree-sitter C-enhancement Category: Improvements E-medium Call for participation: Experience needed to fix: Medium / intermediate
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants