querying 'not in' #9
Comments
Hey, thanks for the detailed report! I was playing with this for a bit and managed to narrow it down. In the end it's not even related to the use of the external scanner, just to the hidden node. I will submit an issue to tree-sitter with a minimal reproduction shortly :)
Reported in tree-sitter/tree-sitter#1441! I actually found a reasonable solution, so will go ahead and use it :)
@the-mikedavis it should work as expected now, let me know if you encounter any issues :)
Beautiful! 🐱
Hi again 👋
So I'm pretty sure this is a bug in the tree-sitter query mechanism, but I can't get the `not in` binary operator to match a query.

I see in the docs (which are lovely, by the way :) that `not in` is parsed with the external scanner, so my guess is that something in the lexer (either in `scanner.cc` or something in tree-sitter) is not getting the full information it needs to understand the `not in` token (byte/codepoint starts/stops maybe?). I also suspect it might be possible to fix this with extra rules in the `grammar.js`.

Comparing the query results from a standard binary_operator like `in` with `not in`, we see:

(The `in.exs` can be replaced by other binary_operators such as `++` to the same effect.)

Looking at the parse trees, there's some peculiar behavior where `not in` doesn't show up but other binary operators do:

And I think it's odd that `not in` is not there between the `<identifier>`s 🤔. That code in the `tree-sitter` CLI I think is this block:

Which is why I suspect there might be some missing information about the start and stop bytes of the `$._not_in` rule. (Although as we'll see below, the parser does seem to know the start/stop bytes when `$._not_in` is changed to be a non-hidden rule.)

a possible but not-great workaround
One workaround which allows that `query.scm` to match (and therefore query/highlight `not in` the same as any other binary_operator) is to change the `grammar.js`'s rule for `$._not_in` to `$.not_in` so as to unhide it. Then we see a parse result of

And the query matches!
$ tree-sitter query query.scm not-in.exs
not-in.exs
  pattern: 0
    capture: operator, start: (0, 2), text: "not in"
It also works as expected with arbitrary whitespace like `a  not    in  b`.

This seems pretty hacky to me, though: it throws an extra operator node in there just for the sake of making this match the query.
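To make the hidden-versus-visible distinction concrete, here is a rough sketch in plain JavaScript. It does not use tree-sitter at all: `isHidden`, `visibleChildren`, `node`, and the node shapes are hypothetical names I made up for illustration, and the only thing it models is the convention that a rule whose name starts with an underscore is left out of the syntax tree. It is not meant to describe how the tree-sitter CLI or query engine is actually implemented.

```javascript
// Toy model only: this is NOT tree-sitter's implementation.
// Modeled convention: a rule whose name begins with "_" is hidden.
function isHidden(ruleName) {
  return ruleName.startsWith("_");
}

// Children as a consumer of the tree would see them: each hidden node
// is replaced by its own visible children, recursively.
function visibleChildren(parent) {
  return parent.children.flatMap((child) =>
    isHidden(child.rule) ? visibleChildren(child) : [child]
  );
}

// Hypothetical node shape: rule name, source text, child nodes.
function node(rule, text, children = []) {
  return { rule, text, children };
}

// `a not in b` where the operator comes from the hidden rule `_not_in`...
const hiddenOp = node("binary_operator", "a not in b", [
  node("identifier", "a"),
  node("_not_in", "not in"),
  node("identifier", "b"),
]);

// ...and the same expression after renaming the rule to `not_in`.
const visibleOp = node("binary_operator", "a not in b", [
  node("identifier", "a"),
  node("not_in", "not in"),
  node("identifier", "b"),
]);

const names = (parent) => visibleChildren(parent).map((child) => child.rule);

console.log(names(hiddenOp));  // [ 'identifier', 'identifier' ]
console.log(names(visibleOp)); // [ 'identifier', 'not_in', 'identifier' ]
```

In this toy version the hidden operator leaves nothing between the two identifiers, which is the shape I'm seeing in the parse output, while the renamed rule gives a node that something like a query capture could attach to.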
I haven't dug too deep into the tree-sitter codebase yet to try to hunt down exactly what's going on. I thought I'd write out my findings here first in hopes that you might already know why this parses strangely and can't be queried.