Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial work on document formatting support for PICO-8 Lua files #27

Merged
merged 24 commits into from
Nov 15, 2022

Conversation

beetrootpaul
Copy link
Contributor

Based on top of cherry-picked 6195b55

In context of #26

@beetrootpaul
Copy link
Contributor Author

It works in general, but there are some bugs to be fixed and features missing. See https://github.com/beetrootpaul/pico8-ls/blob/17e3ce399720b1bbb696ab4b9434468dc850bfcb/server/src/server.ts#L684-L695 for more info

@beetrootpaul
Copy link
Contributor Author

beetrootpaul commented Oct 17, 2022

(answering #26 (comment) and the next one there)

I'm not sure how best to commit exactly to your branch, since your branch is a fork of mine, so for now I figure we can
just cherry-pick commits back and forth as needed.

I also have no idea how to collaborate effectively across 2 repos, so yes, this makes sense 😄

Here's the commit I made today: 3aae15c

great, I will take a look soon! 🙂

If you want the same behavior you can install the ESLint VSCode extension, and it should format all the files you work on according to the formatting rules defined for this project in .eslintrc.js.

will do 👍 (although in WebStorm, but yes, can attach ESLint as well)

Here's my progress: (…)

great! 🎉

Still needs work: (…)

thx for info, I will take a look as well 👍

Btw, you don't have to ping me on each comment

thx for info 🙂 So many platforms, so many different logic regarding notifying 😄

I also eventually want the formatter to be smart about preserving the original author's style.

Up to you OFC and in general I think same, but let me share my 2 cents 🙂

If it is possible to not use formatter by default (I think it is, I didn't have it run automatically on save by default or something like that), then I would opt for a release which has a minimal set of features to give any added value for some people. Then improve later to make it better for more people. In other words: let's force always spaces around operators or always no spaces around them, then improve to give a choice.

I think I might be less inclined for smart features, because I got used to Prettier in JS world. Before Prettier I was 100% for customisation, then Prettier got hyped and included in projects I was working on, then I finally got used to the fact that I have little control over style and the main value is in having it consistent.

On the other hand… PICO-8 has limit of characters. If not that, I would for example opt for multiline table syntax as implemented first (and only?), but I know from my own experience such code style hits char limits. In the end I use Lua minified in my own PICO-8 games which allows me to write code which takes a lot of space, both vertically and horizontally. But it would be not nice to expect other users of pico8-ls to minimise their code as well.

OK, that was me sharing thoughts about level of strictness vs flexibility for code formatting 😄 In the end I think I agree with you, but also am aware that the lower the scope of something releasable, the easier it is for me to maintain motivation 😅

@beetrootpaul
Copy link
Contributor Author

beetrootpaul commented Oct 18, 2022

I just treated my entire project with current formatting, it's available on a commit beetrootpaul/dart-07@cb810b0 .

Some examples of errors I found:

  1. Missing parentheses: (_vs - 29 * 2) / 2 👉 _vs - 29 * 2 / 2
  2. Missing parentheses: (selected_mission - 1) % 4 👉 selected_mission - 1 % 4
  3. Missing "local": local function draw_controls(base_x, base_y) within another function got formatted into function draw_controls(base_x, base_y)
  4. Missing parentheses: (mission_number - 1) * 16 👉 mission_number - 1 * 16
  5. Missing parentheses: withinin table literal property assignment xy = start_xy.plus(\nmagnitude * (rnd() - .5),\nmagnitude * (rnd() - .5)\n), 👉 xy = start_xy.plus(magnitude * rnd() - .5, magnitude * rnd() - .5),
  6. Missing parentheses, case of calling a function chosen with or: (on_started or _noop)() 👉 on_started or _noop()
  7. Missing parentheses: invincible_after_damage_timer and invincible_after_damage_timer.ttl % (2 * invincibility_flash_duration) < invincibility_flash_duration 👉 invincible_after_damage_timer and invincible_after_damage_timer.ttl % 2 * invincibility_flash_duration < invincibility_flash_duration
  8. Missing parentheses: ceil((r_max + 4 * r_step) / speed) 👉 ceil(r_max + 4 * r_step / speed)
  9. Missing parentheses: 1 - (t-1)^2 👉 1 - t - 1 ^ 2
  10. Missing parentheses, case of accessing a property of … and … or … result: (game.health >= segment and health_bar_segment_full or health_bar_segment_empty)._draw(xy.minus(0, 4 + segment * 6)) 👉 game.health >= segment and health_bar_segment_full or health_bar_segment_empty._draw(xy.minus(0, 4 + segment * 6))

ℹ️ I want to try fix some of those calculations before you do @japhib , since it will give me opportunity to learn codebase better 😄


Other than that there were some issues with comments' placing we are aware of. Some examples on screenshots below:

screenshot 2022-10-18 at 10 59 08

screenshot 2022-10-18 at 10 59 38

screenshot 2022-10-18 at 10 59 59

screenshot 2022-10-18 at 11 04 23


One annoying (but expected) thing was that when I had a function called with 4 args, each one in a separate line because of how long it was, it got joined into a single super long line. The more interesting variant of it was when callback function got preserved multiline, but args got joined after them 😄

See screenshots below:

screenshot 2022-10-18 at 11 00 18

screenshot 2022-10-18 at 10 53 43

screenshot 2022-10-18 at 10 54 34

@beetrootpaul
Copy link
Contributor Author

and here is a commit which fixes my game's codebase for me: beetrootpaul/dart-07@af490aa (game still does not work, since wrong calculations are there due to missing parentheses, but this set of changes makes luamin not crash and makes me able to run a cart )

@beetrootpaul
Copy link
Contributor Author

I added some failing tests for errors I found

@beetrootpaul
Copy link
Contributor Author

and fixed formatting of local functions

@beetrootpaul
Copy link
Contributor Author

FYI I am implementing a fix for applying parentheses in expressions like 1 - (t - 2) ^ 3

@japhib
Copy link
Owner

japhib commented Oct 19, 2022

What is your approach for applying parentheses? Are you just going to add logic so the parser remembers which expressions have parentheses around them? That's probably the most straightforward way to do it.

@beetrootpaul
Copy link
Contributor Author

I am implementing it based on what I am used to from other formatters, which is: formatter knows better where parentheses are required and where are not. Therefore my work-in-progress fix so far bases on operator precedence:

screenshot 2022-10-20 at 11 52 07

Let me know if you see this path as not the best one 😄

BTW I pushed the wip commit to GitHub right now, but please be aware I will most likely amend it later (in other words: do not rely on referring to it by its commit SHA)

@beetrootpaul
Copy link
Contributor Author

BTW I thought that it might be a good idea to have one extra test which operates on an assumption that AST of formatted code should be same as before formatting. I know it would duplicate a lot of other unit tests, so my motivation is only to have it just expressed in form of just a single test. From a very high level integration test level point of view.

Something like:

  • take very very very complex PICO-8 Lua file fixture
  • parse it with a same function server uses, to obtain 1st AST which we will use later for assertions
  • format the file with same function server uses
  • now parse the formatted file to obtain 2nd AST
  • assert 1st and 2nd ASTs are same except loc fields (those might change and it is OK)
  • format again
  • parse again to obtain 3rd AST
  • assert 2nd and 3rd ASTs are exactly same (including loc fields)

And, if introducing at some point a separation of AST (Abstract Syntax Tree) from a FST (Full Syntax Tree), assertions would differ in a way that:

  • all 3 AST should be exactly same
  • only 2nd and 3rd FST should be exactly same

Even one more step further: have a lot of test Lua files for that test (added over time) and have this test in form of "for each file from dir". So it would be kind of a final safety net to catch that "something is wrong, it's not as useful as unit tests since assertion is super broad, but we have to dive deeper now and understand what is happening and write that potentially missing one more unit test to cover the edge case".

WDYT @japhib ?

@beetrootpaul
Copy link
Contributor Author

And for a moment back to parentheses: do I understand correctly that what you are suggesting is to for example pass an extra "isSurroundedWithParentheses = true" param to this.parseExpectedExpression(flowContext) in this place in the code? 👉

} else if (this.lexer.consume('(')) {
base = this.parseExpectedExpression(flowContext);
this.lexer.expect(')');

@beetrootpaul
Copy link
Contributor Author

beetrootpaul commented Oct 20, 2022

  • I just tested adding an extra boolean about parentheses to parsed BinaryExpression and using that boolean in formatter's visitor for BinaryExpression, my related test pass green 👍
  • I see you already have some operator priority related code in parser, so maybe I could reuse it if going forward with "let formatter decide" solution:
    binaryPrecedence(operator: string): number {
    const charCode = operator.charCodeAt(0);
    const length = operator.length;
    if (1 === length) {
    switch (charCode) {
    case 94: return 12; // ^
    case 42: case 47: case 37: case 92: return 10; // * / % \
    case 43: case 45: return 9; // + -
    case 38: return 6; // & (bitwise AND)
    case 124: return 4; // | (bitwise OR)
    case 60: case 62: return 3; // < >
    }
    } else if (2 === length) {
    switch (charCode) {
    case 46: return 8; // ..
    case 60: case 62:
    if('<<' === operator || '>>' === operator) {
    return 7;
    } // << >>
    return 3; // <= >=
    case 33: case 61: case 126: return 3; // == ~= !=
    case 111: return 1; // or
    case 94: return 5; // ^^ (bitwise XOR, pico-8 lua uses the normal bitwise XOR ~ as bitwise NOT)
    }
    } else if (3 === length) {
    switch (operator) {
    case '>>>': case '<<>': case '>><': return 7;
    case 'and': return 2;
    }
    }
    return 0;
    }

So my main question to you @japhib is whether you prefer us to follow a path where:

  1. we let formatter decide where to put parentheses (and then if you see it viable to externalise and reuse binaryPrecedence function for formatter to know where to put parentheses), or
  2. we preserve in AST info about parentheses put by user, and formatter just follows that info.

@japhib
Copy link
Owner

japhib commented Oct 20, 2022

Implementing it based on operator precedence

That's great, definitely a smarter approach. However, rather than hard-coding all the operators inside visitBinaryExpression, you should use the precedence function that already exists in parser.ts: binaryPrecedence(operator: string): number. You can pass in the current operator and the parent operator and then compare the numbers it returns to see which one is higher, and therefore whether parentheses are needed.

Right now, binaryPrecedence() is an instance method on the Parser class, but Formatter doesn't have a reference to a parser. However, that method doesn't need a reference to any of the internal state of the Parser. So I think you'd be safe to pull it out of the class, to be a standalone function.

I prefer this method rather than doing the other thing I said with adding an extra "isSurroundedWithParentheses = true" param.

Extra test comparing the AST's

That seems great! I doubt we'll implement an FST ever (I'm not sure exactly what that would entail anyways), but just testing it on the AST's for now seems great. You can probably use the deepEquals() function inside of test-utils.ts for the comparison.

@beetrootpaul
Copy link
Contributor Author

you should use the precedence function that already exists

will do exactly that! 🙂

doubt we'll implement an FST ever (I'm not sure exactly what that would entail anyways)

If I understand correctly, FST would include a lot of stylistic info (is there a new line? Is there an extra parenthesis pair? Is there a space after an operator? Is there a comment?) while AST is meant to represent bare bone functionality.

I suppose we can put everything we need to AST and it won't hurt much? I imagine it might just make some operations unnecessary complex in the future, like "hey, I want to know if both parsed trees are equivalent functionality-wise, but I need to know how to distinguish between what is important and what is not" (like my example of a test assertion which should ignore everything that is not pure functionality).

But I believe this is not something necessary for now nor for a long time 🙂

@beetrootpaul beetrootpaul force-pushed the formatter branch 4 times, most recently from cc3dd39 to 057e9cf Compare October 23, 2022 11:59
@japhib
Copy link
Owner

japhib commented Oct 29, 2022

Looking good. I'm working on adding comments inside various statement types, as well as preserving multiline table constructors/function calls.

@beetrootpaul
Copy link
Contributor Author

beetrootpaul commented Oct 29, 2022

Looking good. I'm working on adding comments inside various statement types, as well as preserving multiline table constructors/function calls.

Oh, good to know! Please take a look at my last 2 commits where I already fixed some comment insertions @japhib ! (and added more tests over last several commits). Would be good to avoid difficult merge conflicts 😄

@japhib
Copy link
Owner

japhib commented Oct 30, 2022

Nice! FYI I pushed a couple more commits for adding comments within expressions like return expressions and some other things. There's a merge commit in there so it's probably going to be hard to cherry-pick the commits. Instead, I'd recommend adding this original repo as another remote, and then you can pull directly from my branch:

# with your `formatter` branch checked out
git remote add upstream [email protected]:japhib/pico8-ls.git
git pull upstream brp-formatter

The only failing unit test now is the one with comments inside a table constructor.

Note: inside several files in this repo, there are some switch statements that look like this:
image
Please don't re-format these switch statements so that each case takes up multiple lines. When each case is functionally so similar, I find it's more readable this way.

Also, I found this comment of yours:

// TODO: move formatter (in a separate PR) to its own folder, parallel to parser,
//       then move shared statements and expressions outside parser as well.
//       Remember to update test files' locations as well.

Perhaps I don't understand what you mean, but I'm not sure this is a good idea. For the most part, I like the file structure how it is now. But, maybe you can elaborate on what you mean?

@beetrootpaul
Copy link
Contributor Author

There's a merge commit in there so it's probably going to be hard to cherry-pick the commits. Instead, I'd recommend adding this original repo as another remote, and then you can pull directly from my branch

In the end I just reset my branch on yours @japhib and added some tests from my non-merged commits. There was no reason to merge formatter changes from my commits, since I see your implementation already solves comments preservation for most of tests 🙂 FYI I was not analysing your changes nor compared whether I like my fixes for comments better or not, settling on just choosing your way as the preferred one 🙂

screenshot 2022-11-05 at 23 20 58

Please don't re-format these switch statements so that each case takes up multiple lines. When each case is functionally so similar, I find it's more readable this way.

No problem! In fact, I wanted to keep it as they are to minimise my changes, but I haven't found an IDE's formatter config for this thing in particular. And since I keep formatting files without thinking (a so called muscle memory), it was tedious to revert that change again and again. Sorry!

Also, I found this comment of yours: (…) For the most part, I like the file structure how it is now. But, maybe you can elaborate on what you mean?

Thanks for asking about that TODO marker! 👍

I left it there to bring it later to discussion, which is what happens right now 👍 So, my reasoning is:

  • I used to treat parsing separate from formatting. For example because code analysis based on parsing is a separate VS Code action and code formatting, which uses both parser and then formatter, is another thing. It feels natural to me to treat both concepts as equal and separate (but I am aware this is biased, just the way I see it).
  • Moreover, Formatter is something that is exported to be used on its own as new Formatter(…).
  • Therefore I see it misleading to have formatter under src/parser/. I would rather see formatter under src/formatter/ while parser still under src/parser/.
  • If to do that, then it would be good to extract some common shared concepts as well, to not make them part of one of another. I mean, for example AST with its nodes, operators, expressions etc. Like: src/ast/ keeps the most important code abstraction we manipulate on, then src/formatter/ and src/parser depends on it.
  • Last but not least: if to continue towards modularisation (for purpose of Expose a parser as a separate Node.js package to enable 3rd party tools to use it #29), we would probably need to keep parser separate, so another formatter could make use of it (OFC we could bundle default formatter together in a same package as parser, I am just talking about what I see more clean)

So, you see my reasoning, but I have no strong opinion here, especially that it is your project 😄 Just sharing my thoughts 🙂

Since the topic was touched, I removed that comment for now 🙂

…arenthesesIfOnTheRightSide in order to avoid confusion about associativity in terms of math/programming
@beetrootpaul
Copy link
Contributor Author

(there is my previous comment above, answering some of your questions, and now in a more general manner)

@japhib , how do you want to proceed with this PR in order to make it good enough to be merged?

Some of my thoughts:

  1. Seems like I won't be writing another PICO-8 game soon (want to do some Godot learning now), therefore I probably won't use pico8-ls this year. Which means moving PR towards merge has higher priority for me (in order to build my GitHub profile) than having all formatter features I would like to. (OFC I won't push for merge against your will 😄 Am just explaining my context).
  2. Just in case it helps you with anything, here is a diff of my game codebase, formatted with the most recent pico8-ls commit 1f77390 👉 beetrootpaul/dart-07@c0f3f65 I don't want to specify
  3. I think I can still add some integration tests I was talking about before. I mean like: parse -> format -> parse again -> check if ASTs are equivalent. I feel comfortable with implementing this 🙂
  4. But I do not feel comfortable with working on missing features like comments preservation in table constructors or single empty line preservation between lines of code. It would require to much of my time to go deep enough to deliver anything of value.

@japhib
Copy link
Owner

japhib commented Nov 6, 2022

RE: the formatting thing - no worries, I understand! Since it's being done automatically, you can feel free to leave it however your IDE is formatting it, and I'll fox the formatting as I like it at some later point when I merge this stuff in.

I see it misleading to have formatter under src/parser/. I would rather see formatter under src/formatter/ while parser still under src/parser/.

That makes sense - I had actually forgotten that all this stuff was under the parser folder. My main point in doing that was to have it separate from server.ts itself. However, there are only a couple files under server/src, and several of the files under server/src/parser also do not strictly belong to parsing. So with that in mind, I think some sort of re-organization makes sense. However, if the parser is going to eventually be extracted into its own package, let's wait until then to determine the final organization.

how do you want to proceed with this PR in order to make it good enough to be merged?

You can step out any time you like! I appreciate you giving me some motivation to move this feature forwards, and we've made a lot of good progress. I'll keep (probably pretty slowly) moving the branch forward and developing on top of your stuff, and then once all the issues & TODOs we've discussed are resolved, I'll merge it into master. I'll be sure to give you credit in the readme/changelog when the formatter is released!

I think I can still add some integration tests I was talking about before. I mean like: parse -> format -> parse again -> check if ASTs are equivalent. I feel comfortable with implementing this

This would be awesome. I'd also check that all the comments are preserved, via Chunk.comments -- their location may have changed, but there should be the same number of comments, and they should be in the same order as before. I'm a little worried about my implementation swallowing comments inside certain types of expressions, so this type of test would be helpful in quieting those fears.

@beetrootpaul
Copy link
Contributor Author

You can step out any time you like!

I have to admit I really enjoyed the no-pressure async approach we had here @japhib 🙂 👍

I appreciate you giving me some motivation to move this feature forwards, and we've made a lot of good progress. I'll keep (probably pretty slowly) moving the branch forward and developing on top of your stuff, and then once all the issues & TODOs we've discussed are resolved, I'll merge it into master. I'll be sure to give you credit in the readme/changelog when the formatter is released!

💯 👍

This would be awesome.

In commit 4328064 :

  • I added 2 tests (multiplied by N test files):
    • one to check if AST of a formatted doc seems roughly same (here I needed to ignore loc and nodes of type Comment, since comments are not there yet before formatting happens; sadly, the latter means the test does not check for comments preservation).
    • another one to check if code formatted for the 2nd time is exactly same as after 1st format
    • I added low.lua to test files: I saw you refer to low.p8 and low.lua in some ignored tests. I created low.lua by duplicating low.p8 and removing all non-Lua parts from it. Feel free to modify/remove it according to what you originally wanted.
  • I added some sample files from my projects
    • sadly, tests are running very long for some of them. But for some files they run quickly, so you have to comment out some filenames and check by yourself which ones are timing out. It might indicate some performance issues somewhere in parser or formatter or test code 🤔 And… in the end it might be even stranger, because it seems like those tests run slowly in my IDE, but they have no such issue when running from a console 🤷‍♂️

BTW In a commit d8bf459 I also added tests for how I imagine empty line preservation could work. It's just my suggestion of how formatter should work, but feel free to remove them 🙂 👍

@beetrootpaul beetrootpaul marked this pull request as ready for review November 8, 2022 14:15
@beetrootpaul beetrootpaul changed the title Add document formatting support for PICO-8 Lua files I itial work on document formatting support for PICO-8 Lua files Nov 8, 2022
@beetrootpaul beetrootpaul changed the title I itial work on document formatting support for PICO-8 Lua files Initial work on document formatting support for PICO-8 Lua files Nov 8, 2022
@beetrootpaul
Copy link
Contributor Author

beetrootpaul commented Nov 8, 2022

One more thing, apart from the message I wrote above:

What do you think about merging the PR with formatting capability disabled @japhib ?

I modified name of this PR to not indicate release ready implementation and I added commit 57e8e88 in which I turned formatting off, which means VS Code won't use formatter which is not ready for release yet (I tested if locally, that one changed line result with VS Code complaining there is no extension which can format pico-8-lua file type). Also, in commit d7c324c I ignored formatter tests that are failing, so in result merge of this PR could result with a nice non-failing build of a the plugin, which has formatter support kinda hidden behind a feature flag.

I see 2 benefits of it:

  • to prevent branch from "decaying" over time (I mean to avoid difficult to resolve merge conflicts if branch gets merged months from now and other work is made on the main branch; like, already there are some changes to AST you made on a brp-formatter branch)
  • for me to put a closure on the topic 😄

WDYT?

@japhib
Copy link
Owner

japhib commented Nov 15, 2022

Hm, I don't love the idea of merging it "just because" when it's still in an unfinished state. There's not work happening on any other branches, including the main branch, so you don't have to worry about having merge conflicts. I'll see if I can get this into a good spot soon and then merge it.

@japhib japhib self-assigned this Nov 15, 2022
@japhib
Copy link
Owner

japhib commented Nov 15, 2022

Actually I think you're right, if I'm going to be the one continuing to work on this feature, it's easier if it's contained in the repo itself. I'll go ahead and merge it.

@japhib japhib merged commit 870228a into japhib:master Nov 15, 2022
@beetrootpaul
Copy link
Contributor Author

OK, thx! 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants