Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(core): add plaintext matching fallback to tree-sitter matching #4288

Merged
merged 15 commits into from
Jun 5, 2023

Conversation

alevinval
Copy link
Contributor

@alevinval alevinval commented Oct 15, 2022

This patch introduces bracket matching independent of tree-sitter grammar. This matching is introduced as a fallback in cases where the tree-sitter matcher does not match any bracket. The fallback should provide a better experience in cases like the ones reported in #3614

Relates to #3584 as well

@alevinval
Copy link
Contributor Author

Let me know your thoughts about this.

@alevinval alevinval changed the title bracket_matching: add plaintext matching fallback to tree-sitter matching match_brackets: add plaintext matching fallback to tree-sitter matching Oct 15, 2022
@alevinval alevinval force-pushed the issue-3584 branch 2 times, most recently from d446f9c to 2f38d0b Compare October 15, 2022 12:15
@the-mikedavis the-mikedavis added A-core Area: Helix core improvements S-waiting-on-review Status: Awaiting review from a maintainer. labels Oct 15, 2022
@alevinval alevinval changed the title match_brackets: add plaintext matching fallback to tree-sitter matching feat(core): add plaintext matching fallback to tree-sitter matching Oct 22, 2022
@workingj
Copy link
Contributor

hi, any progress here possible?

@alevinval
Copy link
Contributor Author

@archseer thoughts on this PR?

@archseer archseer self-requested a review February 27, 2023 01:49
helix-core/src/match_brackets.rs Outdated Show resolved Hide resolved
helix-core/src/match_brackets.rs Outdated Show resolved Hide resolved
Comment on lines 4091 to 4386
if let Some(pos) = match_brackets::find_matching_bracket_fuzzy(syntax, text, pos)
.or_else(|| match_brackets::find_matching_bracket_current_line_plaintext(text, pos))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be quite what we want. This always perform the plaintext matching if the TS matching doesn't find a match. Meanwhile for plaintext files this won't work as in that case doc.syntax() is None (condition above).

I think we can do better for treesitter than the current implementation (which gets easily broken by comments for example). That solution might partially reuse your solution but would be constrained to not leave the current TS node. You can decide whether you want to do that in this PR or a followup PR, that would be the right fix for #3357

As it's implemented right now I think it's better to only us the plaintext matching for files without a TS grammar (so doc.syntax() would be None)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, this fallback will never run when there's no grammar. However, most of the users of this editor will be coding, and likely have an active TS grammar. My hope is that adding this fallback now already brings improvement, and eventually it will be useless if the TS logic improves.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this makes the worst case (no matching bracket) a lot worse: first the TS lookup will fail, then this will scan backwards through the whole document (!!), failing too.

I think this is a great improvement for languages with no tree sitter grammar though.

I agree with Pascal:

  • find_matching_bracket_current_line_plaintext should only be used if there's no TS grammar available
  • If there's a grammar available, this scan should only happen inside the current TS node to limit how far we're attempting to scan.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should settle it: adf0fbd

@pascalkuthe pascalkuthe added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from a maintainer. labels Feb 27, 2023
@alevinval alevinval force-pushed the issue-3584 branch 2 times, most recently from 50e0366 to a1bf10f Compare February 27, 2023 22:17
@archseer archseer added this to the next milestone Mar 10, 2023
@archseer archseer mentioned this pull request Mar 10, 2023
38 tasks
Comment on lines 4091 to 4386
if let Some(pos) = match_brackets::find_matching_bracket_fuzzy(syntax, text, pos)
.or_else(|| match_brackets::find_matching_bracket_current_line_plaintext(text, pos))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this makes the worst case (no matching bracket) a lot worse: first the TS lookup will fail, then this will scan backwards through the whole document (!!), failing too.

I think this is a great improvement for languages with no tree sitter grammar though.

I agree with Pascal:

  • find_matching_bracket_current_line_plaintext should only be used if there's no TS grammar available
  • If there's a grammar available, this scan should only happen inside the current TS node to limit how far we're attempting to scan.

helix-core/src/match_brackets.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@A-Walrus A-Walrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few style nits, other than that and what has already been said by @pascalkuthe and @archseer looks good!

helix-core/src/match_brackets.rs Outdated Show resolved Hide resolved
helix-core/src/match_brackets.rs Outdated Show resolved Hide resolved
@pascalkuthe pascalkuthe modified the milestones: 23.03, next Mar 27, 2023
alevinval added 8 commits June 1, 2023 15:35
…hing

This patch introduces bracket matching independent of tree-sitter grammar.
For the initial iteration of this feature, only match on the current line.
This matching is introduced as a fallback in cases where the tree-sitter
matcher does not match any bracket.

This fallback should provide a better experience to users that are editing
documents without tree-sitter grammar, but also provides a better experience
in cases like the ones reported in helix-editor#3614

If we find that this feature works well, we could consider extending it
for multi-line matching, but I wanted to keep it small for the first iteration
and gather thoughts beforehand.
- Introduce new variables only when needed
- Avoid calling twice `is_valid_pair` on worst-case scenarios
- Remove checking if we've closed all brackets on each iteration, this
  check only makes sense whenever we encouter a closing bracket.
Less overlapping variable names, now it should be even more clear
to read. Also added some extra testing, specially to verify that
this feature only works on the current line.
Move the decrement operation *after* checking if this is the last
bracket to close. This avoids a useless operation. When `is_valid_pair`
is entered, we already know we are closing a bracket, when there's only
1 bracket pending to be closed, we already know we're going to return
immediately anyway, no need to decrement.
- Remove constraint to work only on current line
- Improve iteration by properly using ropes
@alevinval
Copy link
Contributor Author

Folks! I've been out of the loop for some time. I got back into this, I've rebased latest master to make sure everything kept working, the pending feedback that was requested has been applied in: adf0fbd 715b928 and f5cb382

@archseer archseer requested review from pascalkuthe and archseer June 2, 2023 08:01
Copy link
Member

@pascalkuthe pascalkuthe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor thing the rest lgtm

helix-term/src/commands.rs Outdated Show resolved Hide resolved
pascalkuthe
pascalkuthe previously approved these changes Jun 5, 2023
Copy link
Member

@pascalkuthe pascalkuthe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@pascalkuthe pascalkuthe added S-waiting-on-review Status: Awaiting review from a maintainer. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 5, 2023
dead10ck
dead10ck previously approved these changes Jun 5, 2023
Copy link
Member

@dead10ck dead10ck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks for your contribution!


let mut open_cnt = 1;

for (i, candidate) in chars_iter.take(MAX_PLAINTEXT_SCAN).enumerate() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor thing I'm noticing is that this doesn't limit the search to the current line only, as it advertises; it simply limits the scan to a max number of chars. I actually think this is fine, because limiting matching to the current line only seems unnecessarily restrictive to me, but we should probably update the function name and doc comment.

Copy link
Member

@pascalkuthe pascalkuthe Jun 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it was changed to a fixed limit (from only matching on the current line) after feedback from @archseer and me. I missed that the doc comment wasn't updated good catch

Copy link
Contributor Author

@alevinval alevinval Jun 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you folks use something to embed a constant inside a Doc? I wanted to do this:

diff --git a/helix-core/src/match_brackets.rs b/helix-core/src/match_brackets.rs
index 0ec040e5..07eb261a 100644
--- a/helix-core/src/match_brackets.rs
+++ b/helix-core/src/match_brackets.rs
@@ -73,9 +73,9 @@ fn find_pair(syntax: &Syntax, doc: &Rope, pos: usize, traverse_parents: bool) ->
     }
 }
 
-/// Returns the position of the matching bracket under cursor, the search
-/// is limited to the current line. This function works on plain text, it
-/// ignores tree-sitter grammar.
+/// Returns the position of the matching bracket under cursor.
+/// This function works on plain text and ignores tree-sitter grammar.
+/// The search is limited to [`MAX_PLAINTEXT_SCAN`] characters.
 ///
 /// If the cursor is on the opening bracket, the position of
 /// the closing bracket is returned. If the cursor on the closing

But that would require making the constant public for the docs to link to it correctly. If you are okay with that anyway (it works, but does not link) then I can push this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems fine to me 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just remove the angle brackets SL that cargo xtask docgen doesn't complain. The link doesn't matter much. The docs are not build anyway.

@alevinval alevinval dismissed stale reviews from dead10ck and pascalkuthe via 5fefbe7 June 5, 2023 13:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-core Area: Helix core improvements S-waiting-on-review Status: Awaiting review from a maintainer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants