Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow defining alignment in indent queries #5355

Merged
merged 5 commits into from
Aug 11, 2023

Conversation

Triton171
Copy link
Contributor

Add an @align indent capture that aligns everything inside some node to a given @anchor. I added indent queries for aligning function parameters in C/C++ and python. In C, this makes the following possible (in python, it's essentially the same):

void func_with_many_args(int first_argument, int second_argument, |int third_argument);
// Pressing Enter results in the following, correctly aligned code:
void func_with_many_args(int first_argument, int second_argument,
                         int third_argument);

// The correct indentation is also computed when the last argument hasn't been added yet.
// Pressing Enter here results in the same indentation:
void func_with_many_args(int first_argument, int second_argument,|);

This should (partially) resolve #3196. I tried using the @align capture to write indent queries for haskell but this is a lot more complicated than for other languages. It should be possible without adding anything else to the indentation system but there's still some work to do. It's probably best to do that in a separate PR.

I also tweaked the rust indent queries a bit so that they now correctly guess the indentation for helix-term/src/commands.rs (which is nice since that's a large and fairly complicated file). It doesn't seem to require any alignment any more, probably due commands.rs having been changed since I last tested the indent queries on it. The changes to the queries are not too complicated but if it makes it easier to review, I can of course split them into a separate PR.

@kirawi kirawi added A-tree-sitter Area: Tree-sitter S-waiting-on-review Status: Awaiting review from a maintainer. labels Jan 3, 2023
@teburd
Copy link
Contributor

teburd commented Jan 4, 2023

I guess my questions here are...

can this be used to align struct members in C, escapes in multiline macros, and not indent ifdefs and such.

Some complex C macros confuse even emacs which is annoying, for example:
https://github.com/zephyrproject-rtos/zephyr/blob/main/include/zephyr/rtio/rtio.h#L426

But otherwise alignment of not only the first item on the line but also alignment of the escape for multi-line macros is quite nice to have (githubs viewer fails to show reality, but this macro has its backslashes aligned given an appropriate tab spacing)

https://github.com/zephyrproject-rtos/zephyr/blob/main/include/zephyr/rtio/rtio.h#L410

@Triton171
Copy link
Contributor Author

Aligning struct members should work; for example with the following query:

(struct_specifier
    (field_declaration_list
        .
        (field_declaration) @anchor)) @align

Setting the indent for ifdefs to 0 is not yet possible. The issue is that (in this implementation) we can only align to other tree-sitter nodes. One could write queries that align preproc_ifdef to its surrounding translation_unit but this would have to be done for every nesting level, so it would fail for a sufficiently deep syntax tree. We could of course allow to specify some constant alignment to solve this issue, or maybe only add some special capture for zero alignment.

Aligning escapes in multiline macros is also not possible and definitely the trickiest to implement: Currently the indentation system only handles leading whitespace in lines, so we would need to add a way to specify alignment inside a line (probably a new capture type). The issue is that at the moment the indentation queries are only run when inserting a new line. To allow for alignment, they would have to be run every time the syntax tree changes in a significant way. I've thought about doing something that in order to retroactively update incorrect indentation but there's some challenges to this. At the moment, it would probably be easier to implement the alignment of macro escapes in an external formatter.

@archseer archseer added this to the next milestone Jan 7, 2023
@Triton171 Triton171 force-pushed the indent-align branch 2 times, most recently from c93883a to 7f8e826 Compare March 25, 2023 13:29
@pascalkuthe pascalkuthe modified the milestones: 23.03, next Mar 29, 2023
@archseer archseer requested review from pascalkuthe and archseer June 15, 2023 05:50
helix-core/src/indent.rs Outdated Show resolved Hide resolved
helix-core/src/indent.rs Outdated Show resolved Hide resolved
helix-core/src/indent.rs Outdated Show resolved Hide resolved
helix-core/src/indent.rs Outdated Show resolved Hide resolved
helix-core/src/indent.rs Outdated Show resolved Hide resolved
@Triton171
Copy link
Contributor Author

text_to_visual_pos would definitely be a simpler solution. I decided against it because it breaks alignment for different tab-widths:

class foo {
    // Assume that this is indented with a single tab
    void bar(int parm1,
             int parm2);
}

We want int parm1 and int parm2 to align here. If someone with a different (visual) tab width setting edits the code, it should still be aligned for them (if we assume that everyone always uses the same tab width, we might as well use spaces for indentation). This means that in this case the whitespace before int parm2 should consist of a single tab, followed by 9 spaces.

If we also want this to work when the previous line starts with an arbitrary sequence of tabs and spaces, we could change the type of Indentation::align to Option<(String, usize)>: The string simply stores the leading whitespace of the line that we want to align to and the usize stores the number of spaces we have to add after it. To make it even simpler, it could also just be an Option<String> where the spaces are already included. It's a bit less efficient since we need to allocate a String but that shouldn't really be a problem (we only allocate a single string if we want to align to something, which isn't the case for most indent queries).

@pascalkuthe
Copy link
Member

Ah then I would go with a String you have to create a String when inserting the edit anyway so you need to alloc either way

@Triton171
Copy link
Contributor Author

The new commits should now align correctly even for arbitrary sequences of tabs & spaces. This requires us to take Indentation by value (to avoid cloning the string) and IndentCaptureType is no longer Copy for the same reason. Because we only iterate over each Capture once, it's not really an issue.

@pascalkuthe
Copy link
Member

except for one minor comment this looks pretty good now to me 👍

Btw. I am a bit hesitant about including copies of commands.rs and similar as testcases. Couldn't we just check that we indent the entire helix codebase correctly without keeping a copy of that file? RA has a similar testcases where they make sure that RA cad parse itself.

@Triton171
Copy link
Contributor Author

Btw. I am a bit hesitant about including copies of commands.rs and similar as testcases. Couldn't we just check that we indent the entire helix codebase correctly without keeping a copy of that file?

Both approaches have advantages and disadvantages imo. Copying the file is of course a bit redundant because we store almost the same file twice. On the other hand, if we use files for testing without copying them, this means that anyone changing these files can get test failures if the indent queries are incorrect on the new version (even if their changes are perfectly reasonable). Since we'll probably never get 100% accuracy on the indent queries, this is likely to happen at some point.

In fact, this has already happened since there are some test failures in the new version of commands.rs. I'll try to fix the queries but I'm still not sure if we should really always use the latest version of these files for indent testing.

@pascalkuthe
Copy link
Member

Btw. I am a bit hesitant about including copies of commands.rs and similar as testcases. Couldn't we just check that we indent the entire helix codebase correctly without keeping a copy of that file?

Both approaches have advantages and disadvantages imo. Copying the file is of course a bit redundant because we store almost the same file twice. On the other hand, if we use files for testing without copying them, this means that anyone changing these files can get test failures if the indent queries are incorrect on the new version (even if their changes are perfectly reasonable). Since we'll probably never get 100% accuracy on the indent queries, this is likely to happen at some point.

In fact, this has already happened since there are some test failures in the new version of commands.rs. I'll try to fix the queries but I'm still not sure if we should really always use the latest version of these files for indent testing.

Yeah that's definitely a concern too. Since we have git as a built requirement you could instead read the files from git:

Simply read the stdout from git cat-file blob <commit-hash>:<path>.I think running on the entire codebase would in fact be very useful. For example it looks like the new test case actually caught a regression in the rust query changes you made (at least the line that shows up in the test failure indents correctly on master).

Having a large test case like this that runs on a large evolving real-world codebase has the potential to nicely flush out edge cases. We can bump the pinned commit from time to time and if there are some code that truly can't be indented correctly we could probably add a way some known-bad lines

@pascalkuthe pascalkuthe added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from a maintainer. labels Jul 15, 2023
pascalkuthe
pascalkuthe previously approved these changes Aug 4, 2023
Copy link
Member

@pascalkuthe pascalkuthe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really great, thanks for going the extra mile with the testing! Looks like you found quite a few edgecases in the rust queries

@pascalkuthe pascalkuthe linked an issue Aug 4, 2023 that may be closed by this pull request
@pascalkuthe pascalkuthe added S-waiting-on-review Status: Awaiting review from a maintainer. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. A-tree-sitter Area: Tree-sitter labels Aug 4, 2023
@pascalkuthe pascalkuthe added C-enhancement Category: Improvements E-medium Call for participation: Experience needed to fix: Medium / intermediate A-indent Area: Indentation labels Aug 4, 2023
@Triton171
Copy link
Contributor Author

Thanks for the reviews 😄. The combination of call_expression and field_expression still needs a bit of work for some rare edge cases, but other than that the rust queries seem pretty robust now.

I just saw #7817 and added a commit that fixes it, since it needed only one simple pattern.

pascalkuthe
pascalkuthe previously approved these changes Aug 4, 2023
@Triton171
Copy link
Contributor Author

Rebased it on top of #6768 to fix the conflicts in helix-core/src/indent.rs, all other files had no/only trivial conflicts. I also cleaned up my commit history a bit while I was at it, so this should be ready to merge now.

@pascalkuthe
Copy link
Member

It looks like some of the yaml tests are still failing in CI

Copy link
Member

@archseer archseer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! LGTM

@archseer archseer merged commit ee3171c into helix-editor:master Aug 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-indent Area: Indentation C-enhancement Category: Improvements E-medium Call for participation: Experience needed to fix: Medium / intermediate S-waiting-on-review Status: Awaiting review from a maintainer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

continuation line under-indented issue in python Add "same" or "until here" indent capture
5 participants