Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree-sitter rolling fixes: 1.115 edition #941

Merged

Conversation

savetheclocktower
Copy link
Contributor

@savetheclocktower savetheclocktower commented Mar 2, 2024

A bit late getting this one in! It's not that we're all good and there's nothing to fix — just that I've had a busy couple of weeks. More to come.

Changelog entries

  • Improved accuracy of indentation hinting in modern Tree-sitter grammars, especially in multi-cursor scenarios.
  • Improved the ability of the user to opt into a specific kind of grammar for a specific language.
  • Changed the behavior of the grammar-selector package so that it will show the user's preferred grammar for a specific language.
  • Updated to version 0.20.9 of web-tree-sitter. (Unsure if this is worth mentioning in the changelog; I'll leave it up to others.)
  • Improved syntax highlighting, indentation, and code folding in various languages, including TypeScript, shell scripts, Ruby, and C/C++.

@savetheclocktower savetheclocktower marked this pull request as draft March 2, 2024 01:16
…at ends of transactions — instead of falling back to a transaction-wide auto-indent quite so easily.
…but only while we can afford to spend time on re-parsing.

Synchronous indentation hinting is clearly the best choice, but it can theoretically be very costly. So we'll set a time budget per transaction — currently 10ms, but could be adjusted up or down. We'll start out doing synchronous indentation, but flip to async indentation if we run out of time in a given transaction. At the end of a transaction, the time budget resets.

This allows us to balance indentation accuracy with editor responsiveness, and would even allow us to expose this tradeoff as a setting in the future. The current threshold, 10ms, would probably result in one dropped frame if exceeded, but not two.

Right now, any one parse can exceed the budget — because we don't set a timeout on the parse the way we do with an async parse. But this could be changed in the future. The main goal here is to prevent a catastrophic scenario where a complex transaction locks up the editor.
…if the user has opted into them via _only_ a scope-specific setting.
* When set to `true`, `hideDuplicateTextMateGrammars` will hide all grammars except whichever one the user has indicated a preference for — via `useTreeSitterParsers` and `useLegacyTreeSitter` settings, whether global or scope-specific.

* When set to `false`, `hideDuplicateTextMateGrammars` will show all grammars, even Legacy Tree-sitter.
…to reflect new behavior of `GrammarRegistry#getGrammars`.
…when the user's preferred type of grammar isn't available for a particular language.
@savetheclocktower
Copy link
Contributor Author

I needed to revisit some changes I’d made to the grammar-selector package and wanted to explain them just to make sure we’re everybody’s cool with them.

When the old Tree-sitter setting (core.useExperimentalTreeSitter) was replaced with the new setting (core.useLegacyTreeSitter), I tried to invert how that logic affected grammar-selector. In effect, though, I made it so that legacy Tree-sitter grammars rarely got shown at all unless the user had them enabled globally.

I’d also been telling users that they could opt into a legacy Tree-sitter grammar on a language-specific basis, but we didn’t have enough testing around this, and this wasn’t working in most cases because we weren’t even trying to load the legacy Tree-sitter grammars unless core.useLegacyTreeSitter was enabled globally. (That’s now fixed in this PR as well.)

So let me show some screenshots to explain the before-and-after here.

Before: showing only one grammar by default

The grammar-selector.hideDuplicateTextMateGrammars setting made it so that the grammar selector showed only one grammar per language, and it was the Tree-sitter grammar if that grammar existed. Otherwise it was the TextMate grammar.

Screenshot 2024-03-03 at 11 48 37 AM

You can see that the Tree-sitter grammars have a badge marking them as such — even though there’s no real reason to show that badge when this setting is enabled, because it’s not needed to distinguish the grammar from anything.

After: showing only one grammar by default

The big change in this PR is that grammar-selector now respects your scope-specific settings. Suppose you’ve still got modern Tree-sitter grammars enabled globally, but you’ve added this block to your config:

".source.c":
  core:
    useLegacyTreeSitter: true

When compiling a list of grammars to show you, grammar-selector will now order them in terms of your stated preferences. If hideDuplicateTextMateGrammars is checked, only the one you’ve opted into will be shown.

So in this list…

Screenshot 2024-03-03 at 11 49 39 AM

  • the “C” entry in the list will refer to the legacy Tree-sitter grammar, and
  • all other entries in the list will refer to the modern Tree-sitter grammar (if it exists) or the TextMate grammar.

The other big change here is that there are no badges. Since we’re showing only one grammar for each language, the user can infer that it’s the entry for the grammar they’ve opted into, either globally or scope-specifically.

And since we have three types of grammars now instead of just two, I think we’re past due to rename the hideDuplicateTextMateGrammars setting. Except we won’t rename the setting itself — too much hassle! — but will instead give it a simpler name in the settings: “Hide Duplicate Grammars.”

Before: showing all grammars

When hideDuplicateTextMateGrammars is unchecked right now, we’re doing something that I initially thought was a good idea — but I’ve now changed my mind.

Screenshot 2024-03-03 at 11 49 00 AM

You can see here that the grammar selector shows two C grammars. But we have three such grammars, don’t we? Except that the legacy Tree-sitter grammar is not being shown.

I think my rationale was that this flipped the logic we had before — when modern Tree-sitter grammars were opt-in, we didn’t show them to you unless you told us you wanted to use them. So now that legacy Tree-sitter is opt-in, we should hide them unless you tell us you want to use them, right?

Well, no. I no longer feel this way. “Experimental” and “deprecated” are not equivalent in this comparison. And if someone reports a bug with a modern Tree-sitter grammar, I want to be able to tell them to uncheck hideDuplicateTextMateGrammars, choose a different grammar type in the selector, and show me whether the problem exists in that grammar.

After: showing all grammars

In this PR, if a user says they want to see all grammars, we’ll show them all grammars:

Screenshot 2024-03-03 at 11 49 59 AM

Right now, that means two different kinds of Tree-sitter grammars alongside the TextMate grammars, ordered the way the user has indicated through their preferences.

The screenshot above doesn’t show this, but if I had opted into the legacy Tree-sitter grammar for C files with the configuration snippet above, then that grammar would be listed first out of the three C grammars in the list.

The change I made above means grammar-selector hardly cares about the user's global values for core.useTreeSitterParsers and core.useLegacyTreeSitter, but this screenshot illustrates the one place it still matters. If you've got core.useLegacyTreeSitter enabled globally, then the “Legacy Tree-sitter” badge will be green (i.e., your UI theme’s “success” color), and the “Modern Tree-sitter” badge will be yellow/orange (i.e., your UI theme's “warning” color). When it's disabled, the colors are inverted. I felt like it was good to be consistent with this choice across all items in the list rather than change colors based on the user's preference for a particular language.

When we get rid of legacy Tree-sitter grammars for good, this logic will be a bit simpler.

Takeaway

Since the default value for hideDuplicateTextMateGrammars is true, here’s what most users will notice:

  • The grammar selector list won’t have any badges that say “Tree-sitter” anymore.

If anyone thinks this is a problem, or has concerns about other decisions I’ve made here, I’m happy to discuss them.

@mauricioszabo
Copy link
Contributor

I don't think that showing the badges matter for most users - they want a grammar, they get a grammar, and that's fine.

I might add that I liked what you did in the "show duplicated grammars". A final thing, do you believe it's a good idea to also add a badge with "TextMate" too? I think it might help (I never really understood why, TreeSitter being the default, Atom decided to just show badges for it instead of TextMate), but it's only a suggestion, really :)

@savetheclocktower
Copy link
Contributor Author

Yeah, it can certainly be argued that we should add a badge for TextMate. I wonder if they might not have added one because they didn't expect the average user to know or care what a “TextMate grammar” is, but you could argue the same thing for Tree-sitter.

Ideally, the average user doesn't care what any of them are. But perhaps it's worth adding a badge when there's a scope-specific override, as @Daeraxa suggested on Discord.

@mauricioszabo
Copy link
Contributor

I think it's worth for a different reason: I suppose the person that toggles the "show duplicated grammars" actually cares about what they are, so that's why a badge would be useful.

So, by default, we have an "user-friendly experience" (it uses the only grammar that is available on that scope) and if someone wants to really control which grammar they will use, then we show all the info we have on the grammar. WDYT?

…when `hideDuplicateTextMateGrammars` is `false`.
@savetheclocktower
Copy link
Contributor Author

Sold. Just pushed a commit that adds badges to all kinds of grammars if “Hide Duplicate Grammars” is unchecked.

…when `prefillSelectedText` is `true` and the editor has selected text when a `SelectListView` is opened.
@savetheclocktower savetheclocktower marked this pull request as ready for review March 13, 2024 23:58
* GOTO labels
* Members like `foo.bar`
* Marking of constants in C++
Copy link
Member

@confused-Techie confused-Techie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While my review here is quick, and doesn't really touch the schema changes, the grammar-selector changes and wasm-tree-sitter-language-grammar changes look fantastic!

I love how verbose the comments are in the later file, and how clear everything is explained. Truly a very helpful resource.

Otherwise for the grammar-selector package I'm very glad we are now labeling all grammars as that's a fantastic suggestion made here, and helps make things very clear.

Overall, great tests, code, and docs. So seems like a no brainer to me, awesome work!

@@ -287,10 +299,10 @@ describe('GrammarSelector', () => {
}
}

expect(cppCount).toBe(2); // ensure we actually saw two grammars
expect(cppCount).toBe(3); // ensure we actually saw two grammars
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
expect(cppCount).toBe(3); // ensure we actually saw two grammars
expect(cppCount).toBe(3); // ensure we actually saw three grammars

Small I know, but nice to keep things consistent

@savetheclocktower savetheclocktower merged commit 1481d39 into pulsar-edit:master Mar 20, 2024
7 checks passed
@DeeDeeG DeeDeeG mentioned this pull request Mar 22, 2024
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants