diff --git a/README.md b/README.md index d861790a0ff..f67fddce059 100644 --- a/README.md +++ b/README.md @@ -48,13 +48,15 @@ ## Please read first -Our repositories on GitHub are primarily for development of the project and tracking active issues. Most of the information you will find here pertains to setting up the project for development purposes and is not relevant to the end-user. +> [!IMPORTANT] +> Our repositories on GitHub are primarily for development of the project and tracking active issues. Most of the information you will find here pertains to setting up the project for development purposes and is not relevant to the end-user. For questions or additional information pertaining to the project, we have a Discord for discussion here: https://discord.gg/VZbXMHXzWv Additionally, you can find further documentation and answers to **frequently asked questions** on the project's main website: https://opengoal.dev -**Do not use this decompilation project without providing your own legally purchased copy of the game.** +> [!WARNING] +> **Do not use this decompilation project without providing your own legally purchased copy of the game.** ### Quick Start diff --git a/common/formatter/README.md b/common/formatter/README.md new file mode 100644 index 00000000000..3664753d52b --- /dev/null +++ b/common/formatter/README.md @@ -0,0 +1,186 @@ +# OpenGOAL Formatter Documentation + +> [!IMPORTANT] +> This is still in a highly experimental state, adjust your expectations accordingly. + +Despite LISP being very simple, and taking the stance of a highly opinionated formatter that doesn't concern itself with line-length in most situations writing a code formatter is still an incredibly annoying and difficult thing to do. This documentation serves as a place for me to collect my thoughts and explain how it works (or more likely, how it does not). + +Also in general I found it _really_ hard to find modern and easy to understand articles on how to get started (with the exception of zprint's usage docs, but that doesn't really cover implementation ideas). This area is either dominated by 40 year old journal articles, or discussions around really difficult formatting projects. There isn't a whole lot in the vain of _here's how I made a decent, simple formatter_, so maybe it'll help someone else (even if it's teaching them potentially what not to do!) + +## Architecture / Overview + +Atleast for me, it helps to understand the path code takes as it flows through the formatting process. Originally I tried to do as much at once for sake of efficiency but this just makes things incredibly hard to reason about and fix bugs. So like any problem that is complex, break it down, the formatter goes through many distinct phases chipping away at the problem: + +```mermaid +%%{init: {'theme': 'dark', "flowchart" : { "curve" : "basis" } } }%% +flowchart TB + subgraph top1 [Build a Formatting Tree] + direction TB + a1(AST Parse) --> a2(Consolidate AST Nodes) + a1 --> a3(Collect Metadata) + end + + subgraph top2 [Apply Formatting Configuration] + direction TB + b1(Recursively Iterate each Node) --> b2{Form head has\na predefined config?} + b2 --> |yes|b3(Fetch it and set the config on the node) + b2 --> |no|b4(Apply a default configuration) + end + + subgraph top3 [Apply Formatting] + direction TB + c1(For each node, recursively build up a\nlist of source code lines, no indentation) --> c2(Consolidate the first two lines if hanging\nor if the config demands it) + c2 --> c3(Append indentation to each line as appropriate\nand surround each form with relevant parens) + end + + subgraph top4 [Finalize Lines] + direction TB + d1(Concatenate all lines with whatever\nfile ending the file started with) --> d2(If we're formatting a file append a trailing new-line\n otherwise we are done) + end + + top1-->|FormatterTree|top2 + top2-->|FormatterTree with Configurations|top3 + top3-->|String List|top4 +``` + +### Build a Formatting Tree + +At the end of the day, formatting is about changing the location and amounts of white-space in a file. Meaning it would be ideal if we could start with a representation that is whitespace-agnostic. Some formatters (like cljfmt) choose to use the original whitespace as an input into the formatting process, essentially meaning that your code should already be somewhat-formatted. This is somewhat advantageous as it can simplify the formatter's implementation or be a bit more flexible as to what the original programmer intended (assuming they knew what they were doing). But of course it has one fatal flaw -- the code has to already be well formatted. + +For our formatter I wanted it to be able to take in horribly, ugly, minified code and output an optimal result -- especially since so much of our code gets spat out by another tool. To do that, we need to be able to format the code without any existing whitespace helping us, adding in the whitespace from the ground up. + +The easiest way to do this is to construct an AST of the language which has a lot of other benefits for building language features but for here all we care about is getting rid of the whitespace and having some basic interpretation of the source code structure. The easiest way now-a-days to make an AST for a language is to create a [treesitter grammar](https://github.com/open-goal/tree-sitter-opengoal). + +#### Consolidate AST Nodes and Collect Metadata + +One downside with using an AST, atleast with the way the grammar is currently implemented is it creates a somewhat difficult tree to work with from a formatting perspective. For example in a LISP, everything is a list. In our AST it does not look like how you may intuitively expect: + +`(+ 1 2)` + +```mermaid +flowchart TB + a("source\n(+ 1 2)") --> b("list_lit\n") + b --> c("(") --> ca("(") + b --> d("sym_lit\n+") --> da("+") + b --> e("sym_lit\n1") --> ea("1") + b --> f("sym_lit\n2") --> fa("2") + b --> g(")") --> ga(")") +``` + +vs what you would probably prefer: + +```mermaid +flowchart TB + a("top-level") --> b("list\n- surrounded in ()") + b --> da("+") + b --> ea("1") + b --> fa("2") +``` + +As you can see from even this simple example, the AST representation can be very verbose and in a sense has duplication. Each node is aware of it's original location in the parsed source-code, but it's the leaf-node tokens that we care about for formatting and perhaps the form types they were a part of (ie. were they comments or in a list, etc). + +So to keep things brief, the AST representation is a bit too cumbersome and superfluous to use for formatting directly. So the first step is recurse through it and optimize it, removing unneeded height and capturing crucial metadata that is actually relevant for formatting. + +We build a `FormatterTree` which consists of `FormatterNode`s. Each `FormatterNode` has either a single token, or a list of `refs`. A `ref` is just another `FormatterNode`. Each node also has a bunch of metadata and configuration on how it should be formatted (just initialized for now). + +Having a dead-simple representation of the code like this is really only possible because it's a LISP, but it's also future proofed for us. If OpenGOAL ever decides to support different brackets -- for example in Clojure a vector uses `[]` or a map uses `{}` it would be easy for update the formatter to wrap each list node in the right bracket type. + +### Applying Formatting Configuration + +Now that we have our simplified tree, we can get to actually formatting. The first step is to figure out _how_ we intend to format things. By separating this from the actual formatting it's a lot easier to reason about and find bugs. You can also do potentially interesting things, where you do a full traversal of your source-code tree to figure out an optimal decision without trying to build up the final formatted string at the same time. + +It is lists, or `form`s that we care about figuring out how to format. Each `form` has a head element (unless it's empty) that we can use to specify how it would be formatted. For example we cannot specify the formatting for every single function name, but we can format a `defun` or a `let` in a special way. With this mechanism, the formatter is easily extensible and configurable. There is even the ability to specify formatting configuration for a list's children by index -- for example you may want to format the first element in a `let` (the binding list) differently than the rest of the code. + +There are also some forms that don't have a head-item, that we still identify and set some default formatting rules: +- The top-level node - indentation is disabled +- Lists of only literals, called `constant lists` - These are indented with `1` space instead of the default `2` for a `flow`'d form. + - If these lists literals are paired with keywords we call these `constant pairs` - We strive to put the pairs on the same line, if they are too long we spill to the next with `2` space indentation +- TODO - cond/case pairs (similar to constant pairs) + +This brings up two important terms that you will see in the formatter implementation a lot -- `flow` and `hang`. The difference is as such: + +```clj +(println "this" + "form" + "is" + "hung / hanged") +``` + +```clj +(println + "this" + "form" + "is" + "flowed") +``` + +The difference between the two other than visually is that hanging always results in 1 less line spent _because we don't care about line-length_. This is great because it means we can always default to hanging unless overridden. Flowing trades off that 1 vertical line for horizontal space and looks great for block-style forms. Like the top-level of a `defun` or the content of a `let`. + +### Applying Formatting + +The final real-step in the process, actually using the configuration and generating the formatting code. This phase is actually broken up into 3 sub-phases as we recurse through the `FormatterTree` + +#### Build Lines + +First we build the source code lines without indentation, it's a lot easier to figure out the indentation after all tokens are on the correct lines instead of doing it in tandem. Most tokens are given their own line, but in the case of a hung form we combine the first two elements on the first line. Other forms are formatted in even more complicated ways, or comments that were originally inline should remain next to that token. + +Secondly, once we have all our consolidated lines we can apply the indentation. This is most often based on the configuration of the list node itself. For example a `hung` form must be indented the width of a `(` and the length of the head-form, plus the whitespace separator -- conversely a `flowed` form only needs to be indented by `2` spaces, regardless of token length. Because all of these algorithms are recursive we can hyper-focus on getting the indentation and formatting correct from the bottom-up. In otherwords, we don't care if a form is 200 spaces indented into a function, once the lines are returned recursively, they will get another layer of indentation added on until we hit the top-level node. + +### Finalize Lines + +Finally, we concatenate our list of strings, respecting the original file's line-endings. + +As it is best practice, if we are formatting an entire file we should terminate with a trailing new-line. + +Both of these items are technically a TODO. + +## Configuration + +TODO + +## Reference + +This formatter is highly influenced by a lot of other existing formatters and writings. The most significant of which is Clojure's [zprint](https://github.com/kkinnear/zprint/tree/main/doc/options) which has incredible documentation around it's capabilities and why it chooses what to do. + +The formatter I've built basically copies zprint in terms of basic functionality, but simplifies things by being even more opinionated (for example, doing far less in terms of trying to adhere to a line-length limit). + +### Other references + +- [The Hardest Program I’ve Ever Written - Dart Formatter](https://journal.stuffwithstuff.com/2015/09/08/the-hardest-program-ive-ever-written/) +- [cljfmt](https://github.com/weavejester/cljfmt) + +### Why not respect a line-length limit + +The OpenGOAL formatter will respect line-length in only very specific situations: +- When determining if a form can be fit entirely on a single line +- Probably in the future for multi-line comments + +Besides it adding a massive amount of complexity, even zprint explains that a line-length limit won't always be followed + +> zprint will work very hard to fit the formatted output into this width, though there are limits to its effort. For instance, it will not reduce the minimum indents in order to satisfy a particular width requirement. + +Let's illustrate with an example: + +```clj +(defun some-func + (+ 123 + 456 + (+ 123 + 456 + ;; imagine this form exceeds our width, how can you "fix" this without + ;; messing up the indentation levels of the entire expression + ;; - you can't inline it (you're already past the width) + ;; - you can't insert a line break because it'll still be at the same horizontal level + (+ 123 + 456 + 789)))) +``` + +In LISP code, we don't have many ideal options to deal with this, all options are syntactically correct, but in terms of conventional readability they are bad because when reading LISP code you want to use the indentation as a guide (which is why new-line parens are considered ugly by basically all LISP style guides) + +The other reason which is equally relevant is it is hard, it means the formatter would have to try to find an optimal solution by exploring different solutions until it finds the one that minimizes lines. + +But with an opinionated formatting style you can avoid this complexity, if you want your code to look nice don't write code like the decompiler sometimes outputs when it encounters something truly cursed: + +![](./docs/triangle-of-death.png) +> `game-save.gc` about 200+ nested `let`s! diff --git a/common/formatter/docs/triangle-of-death.png b/common/formatter/docs/triangle-of-death.png new file mode 100644 index 00000000000..f17b18f196d Binary files /dev/null and b/common/formatter/docs/triangle-of-death.png differ diff --git a/common/formatter/formatter.cpp b/common/formatter/formatter.cpp index 5eec750e4c7..f1b97d8d35c 100644 --- a/common/formatter/formatter.cpp +++ b/common/formatter/formatter.cpp @@ -16,124 +16,303 @@ extern "C" { extern const TSLanguage* tree_sitter_opengoal(); } -std::string apply_formatting( - const FormatterTreeNode& curr_node, - std::string output, - std::optional form_element_config) { - using namespace formatter_rules; - if (!curr_node.token && curr_node.refs.empty()) { - return output; +int hang_indentation_width(const FormatterTreeNode& curr_node) { + if (curr_node.token || curr_node.refs.empty()) { + return 0; } - std::string curr_form = ""; - // Print the token - if (curr_node.token) { - curr_form += curr_node.token.value(); - return curr_form; + // Get the first element of the form + const auto& first_elt = curr_node.refs.at(0); + if (first_elt.token) { + return first_elt.token->length() + + 2; // +2 because the opening paren and then the following space } - if (!curr_node.metadata.is_top_level) { - curr_form += "("; - } - // Iterate the form - - bool inline_form = false; - // Also check if the form should be constant-paired - const bool constant_pair_form = constant_pairs::form_should_be_constant_paired(curr_node); - if (!constant_pair_form) { - // Determine if the form should be inlined or hung/flowed - // TODO - this isn't entirely accurate, needs current cursor positioning (which is tricky - // because recursion!) - inline_form = indent::form_can_be_inlined(curr_form, curr_node); - } - const bool flowing = indent::should_form_flow(curr_node, inline_form); - std::optional form_config; - if (!curr_node.refs.empty() && curr_node.refs.at(0).token) { + // Otherwise, continue nesting + return 1 + hang_indentation_width(first_elt); +} + +// TODO - compute length of each node and store it +void apply_formatting_config( + FormatterTreeNode& curr_node, + std::optional> + config_from_parent = {}) { + using namespace formatter_rules; + // node is empty, base-case + if (curr_node.token || curr_node.refs.empty()) { + return; + } + // first, check to see if this form already has a predefined formatting configuration + // if it does, that simplifies things because there is only 1 way of formatting the form + std::optional predefined_config; + if (!config_from_parent && !curr_node.refs.empty() && curr_node.refs.at(0).token) { const auto& form_head = curr_node.refs.at(0).token; if (form_head && config::opengoal_form_config.find(form_head.value()) != config::opengoal_form_config.end()) { - form_config = config::opengoal_form_config.at(form_head.value()); + predefined_config = config::opengoal_form_config.at(form_head.value()); + curr_node.formatting_config = predefined_config.value(); } + } else if (config_from_parent) { + predefined_config = *config_from_parent.value(); + curr_node.formatting_config = predefined_config.value(); } - // TODO - might want to make some kind of per-form config struct, simplify the passing around of - // info below - for (int i = 0; i < (int)curr_node.refs.size(); i++) { - const auto& ref = curr_node.refs.at(i); - // Figure out if the element should be inlined or not - bool inline_element = inline_form; - if (indent::inline_form_element(curr_node, i)) { - inline_element = indent::inline_form_element(curr_node, i).value(); + // In order to keep things simple, as well as because its ineffectual in lisp code (you can only + // enforce it so much without making things unreadable), line width will not matter for deciding + // whether or not to hang or flow the form + // + // This means that a hang would ALWAYS win, because it's 1 less line break. Therefore this + // simplifies our approach there is no need to explore both braches to see which one would be + // preferred. + // + // Instead, we either use the predefined configuration (obviously) or we do some checks for some + // outlier conditions to see if things should be formatted differently + // + // Otherwise, we always default to a hang. + // + // NOTE - any modifications here to child elements could be superseeded later in the recursion + // in order to maintain your sanity, only modify things here that _arent_ touched by default + // configurations. These are explicitly prepended with `parent_mutable_` + if (!predefined_config) { + if (curr_node.metadata.is_top_level) { + curr_node.formatting_config.indentation_width = 0; + curr_node.formatting_config.hang_forms = false; + } else if (constant_list::is_constant_list(curr_node)) { + // - Check if the form is a constant list (ie. a list of numbers) + curr_node.formatting_config.indentation_width = 1; + curr_node.formatting_config.hang_forms = false; + curr_node.formatting_config.has_constant_pairs = + constant_pairs::form_should_be_constant_paired(curr_node); + // If applicable, iterate through the constant pairs, since we can potentially pair up + // non-constant second elements in a pair (like a function call), there is the potential that + // they need to spill to the next line and get indented in extra. This is an exceptional + // circumstance, we do NOT do this sort of thing when formatting normal forms (cond/case pairs + // are another similar situation) + if (curr_node.formatting_config.has_constant_pairs) { + for (int i = 0; i < curr_node.refs.size(); i++) { + auto& child_ref = curr_node.refs.at(i); + const auto type = child_ref.metadata.node_type; + if (constant_types.find(type) == constant_types.end() && + constant_pairs::is_element_second_in_constant_pair(curr_node, child_ref, i)) { + child_ref.formatting_config.parent_mutable_extra_indent = 2; + } + } + } + + } else if (curr_node.formatting_config.hang_forms && curr_node.refs.size() > 1 && + curr_node.refs.at(1).metadata.is_comment) { + // - Check if the second argument is a comment, it looks better if we flow instead + curr_node.formatting_config.hang_forms = false; } - // Append a newline if needed - // TODO - cleanup / move - bool is_binding_list = false; - bool force_newline = false; - bool override_force_flow = false; - if (form_config) { - force_newline = std::find(form_config->force_newline_at_indices.begin(), - form_config->force_newline_at_indices.end(), - i) != form_config->force_newline_at_indices.end(); - // Check if it's a small enough binding list, if so we don't force a newline if the element - // can be inlined - if (inline_element && i > 0 && form_config->bindings_at_index == i - 1 && - curr_node.refs.at(i - 1).refs.size() < form_config->allow_inlining_if_size_less_than) { - force_newline = false; - override_force_flow = true; + } + // If we are hanging, lets determine the indentation width since it is based on the form itself + if (curr_node.formatting_config.hang_forms) { + curr_node.formatting_config.indentation_width = hang_indentation_width(curr_node); + } + // iterate through the refs + for (int i = 0; i < curr_node.refs.size(); i++) { + auto& ref = curr_node.refs.at(i); + if (!ref.token) { + // If the child has a pre-defined configuration at that index, we pass it along + if (predefined_config && + predefined_config->index_configs.find(i) != predefined_config->index_configs.end()) { + apply_formatting_config(ref, predefined_config->index_configs.at(i)); + } else { + apply_formatting_config(ref); } - is_binding_list = form_config->bindings_at_index == i; } + } +} + +int get_total_form_inlined_width(const FormatterTreeNode& curr_node) { + if (curr_node.token) { + return curr_node.token->length(); + } + int width = 1; + for (const auto& ref : curr_node.refs) { + width += get_total_form_inlined_width(ref); + } + return width + 1; +} + +bool form_contains_comment(const FormatterTreeNode& curr_node) { + if (curr_node.metadata.is_comment) { + return true; + } + for (const auto& ref : curr_node.refs) { + const auto contains_comment = form_contains_comment(ref); + if (contains_comment) { + return true; + } + } + return false; +} - if (!curr_node.metadata.is_top_level && - (!inline_element || is_binding_list || force_newline || - (form_element_config && form_element_config->force_flow))) { - indent::append_newline(curr_form, ref, curr_node, i, flowing, constant_pair_form, - (form_element_config && form_element_config->force_flow)); +bool form_contains_node_that_prevents_inlining(const FormatterTreeNode& curr_node) { + if (curr_node.formatting_config.should_prevent_inlining(curr_node.formatting_config, + curr_node.refs.size())) { + return true; + } + for (const auto& ref : curr_node.refs) { + const auto prevents_inlining = form_contains_node_that_prevents_inlining(ref); + if (prevents_inlining) { + return true; } - // TODO - indent the line (or don't) - // Either print the element's token, or recursively format it as well + } + return false; +} + +bool can_node_be_inlined(const FormatterTreeNode& curr_node, int cursor_pos) { + using namespace formatter_rules; + // First off, we cannot inline the top level + if (curr_node.metadata.is_top_level) { + return false; + } + // If the config explicitly prevents inlining, or it contains a sub-node that prevents inlining + if (curr_node.formatting_config.prevent_inlining || + form_contains_node_that_prevents_inlining(curr_node)) { + return false; + } + // nor can we inline something that contains a comment in the middle + if (form_contains_comment(curr_node)) { + return false; + } + // constant pairs are not inlined! + if (curr_node.formatting_config.has_constant_pairs) { + return false; + } + // If this is set in the config, then the form is intended to be partially inlined + if (curr_node.formatting_config.inline_until_index != -1) { + return false; + } + // let's see if we can inline the form all on one line to do that, we recursively explore + // the form to find the total width + int line_width = cursor_pos + get_total_form_inlined_width(curr_node); + return line_width <= indent::line_width_target; // TODO - comments +} + +std::vector apply_formatting(const FormatterTreeNode& curr_node, + std::vector output = {}, + int cursor_pos = 0) { + using namespace formatter_rules; + if (!curr_node.token && curr_node.refs.empty()) { + return output; + } + + // If its a token, just print the token and move on + if (curr_node.token) { + return {curr_node.token.value()}; + } + + bool inline_form = can_node_be_inlined(curr_node, cursor_pos); + // TODO - also if the form is inlinable, we can skip all the complication below and just...inline + // it! + // TODO - should figure out the inlining here as well, instead of the bool above + + // Iterate the form, building up a list of the final lines but don't worry about indentation + // at this stage. Once the lines are finalized, it's easy to add the indentation later + // + // This means we may combine elements onto the same line in this step. + std::vector form_lines = {}; + for (int i = 0; i < curr_node.refs.size(); i++) { + const auto& ref = curr_node.refs.at(i); + // Add new line entry if (ref.token) { - // TODO depth hard-coded to 1, i think this can be removed, since - // forms are always done bottom-top recursively, they always act - // independently as if it was the shallowest depth - if (!inline_element || force_newline) { - indent::indent_line(curr_form, ref, curr_node, 1, i, flowing); - } - if (ref.metadata.node_type == "comment" && ref.metadata.is_inline) { - curr_form += " " + ref.token.value(); - } else if (ref.metadata.node_type == "block_comment") { - curr_form += comments::format_block_comment(ref.token.value()); - } else { - curr_form += ref.token.value(); - } - if (!curr_node.metadata.is_top_level) { - curr_form += " "; + // Cleanup block-comments + std::string val = ref.token.value(); + if (ref.metadata.node_type == "block_comment") { + // TODO - change this sanitization to return a list of lines instead of a single new-lined + // line + val = comments::format_block_comment(ref.token.value()); } + form_lines.push_back(val); } else { - // See if the item at this position has specific formatting - std::optional config = {}; - std::string formatted_form; - if (form_config && form_config->index_configs.find(i) != form_config->index_configs.end()) { - formatted_form = apply_formatting(ref, "", *form_config->index_configs.at(i)); - } else { - formatted_form = apply_formatting(ref, "", {}); + // If it's not a token, we have to recursively build up the form + // TODO - add the cursor_pos here + const auto& lines = apply_formatting(ref, {}, cursor_pos); + for (int i = 0; i < lines.size(); i++) { + const auto& line = lines.at(i); + form_lines.push_back(fmt::format( + "{}{}", str_util::repeat(ref.formatting_config.parent_mutable_extra_indent, " "), + line)); } - // TODO - align inner lines only - if (!curr_node.metadata.is_top_level) { - indent::align_lines( - formatted_form, ref, curr_node, constant_pair_form, flowing, - (!override_force_flow && form_config && i >= form_config->start_flow_at_index), - inline_element); + } + // If we are hanging forms, combine the first two forms onto the same line + if (i == curr_node.refs.size() - 1 && form_lines.size() > 1 && + (curr_node.formatting_config.hang_forms || + curr_node.formatting_config.combine_first_two_lines)) { + form_lines.at(0) += fmt::format(" {}", form_lines.at(1)); + form_lines.erase(form_lines.begin() + 1); + } else if ((i + 1) < curr_node.refs.size()) { + const auto& next_ref = curr_node.refs.at(i + 1); + // combine the next inline comment or constant pair + if ((next_ref.metadata.node_type == "comment" && next_ref.metadata.is_inline) || + (curr_node.formatting_config.has_constant_pairs && + constant_pairs::is_element_second_in_constant_pair(curr_node, next_ref, i + 1))) { + if (next_ref.token) { + form_lines.at(form_lines.size() - 1) += fmt::format(" {}", next_ref.token.value()); + i++; + } else if (can_node_be_inlined(next_ref, cursor_pos)) { + const auto& lines = apply_formatting(next_ref, {}, cursor_pos); // TODO - cursor pos + for (const auto& line : lines) { + form_lines.at(form_lines.size() - 1) += fmt::format(" {}", line); + } + i++; + } } - curr_form += formatted_form; - if (!curr_node.metadata.is_top_level) { - curr_form += " "; + } + // If we are at the top level, potential separate with a new line + if (blank_lines::should_insert_blank_line(curr_node, ref, i)) { + form_lines.at(form_lines.size() - 1) += "\n"; + } + } + + // Consolidate any lines if the configuration requires it + if (curr_node.formatting_config.inline_until_index != -1) { + std::vector new_form_lines = {}; + for (int i = 0; i < form_lines.size(); i++) { + if (i < curr_node.formatting_config.inline_until_index) { + if (new_form_lines.empty()) { + new_form_lines.push_back(form_lines.at(i)); + } else { + new_form_lines.at(0) += fmt::format(" {}", form_lines.at(i)); + } + } else { + new_form_lines.push_back(form_lines.at(i)); } } - // Handle blank lines at the top level, skip if it's the final element - blank_lines::separate_by_newline(curr_form, curr_node, ref, i); + form_lines = new_form_lines; } + + // Apply necessary indentation to each line and add parens if (!curr_node.metadata.is_top_level) { - curr_form = str_util::rtrim(curr_form) + ")"; + std::string form_surround_start = "("; + std::string form_surround_end = ")"; + form_lines[0] = fmt::format("{}{}", form_surround_start, form_lines[0]); + form_lines[form_lines.size() - 1] = + fmt::format("{}{}", form_lines[form_lines.size() - 1], form_surround_end); + } + std::string curr_form = ""; + if (curr_node.formatting_config.parent_mutable_extra_indent > 0) { + curr_form += str_util::repeat(curr_node.formatting_config.parent_mutable_extra_indent, " "); + } + if (inline_form) { + form_lines = {fmt::format("{}", fmt::join(form_lines, " "))}; + } else { + for (int i = 0; i < form_lines.size(); i++) { + if (i > 0) { + auto& line = form_lines.at(i); + line = fmt::format("{}{}", + str_util::repeat(curr_node.formatting_config.indentation_width_for_index( + curr_node.formatting_config, i), + " "), + line); + } + } } - return curr_form; + return form_lines; +} + +std::string join_formatted_lines(const std::vector lines) { + // TODO - respect original file line endings? + return fmt::format("{}", fmt::join(lines, "\n")); } std::optional formatter::format_code(const std::string& source) { @@ -155,9 +334,22 @@ std::optional formatter::format_code(const std::string& source) { } try { - const auto formatting_tree = FormatterTree(source, root_node); - std::string formatted_code = apply_formatting(formatting_tree.root, "", {}); - return formatted_code; + // There are three phases of formatting + // 1. Simplify the AST down to something that is easier to work on from a formatting perspective + // this also gathers basic metadata that can be done at this stage, like if the token is a + // comment or if the form is on the top-level + auto formatting_tree = FormatterTree(source, root_node); + // 2. Recursively iterate through this simplified FormatterTree and figure out what rules + // need to be applied to produce an optimal result + apply_formatting_config(formatting_tree.root); + // 3. Use this updated FormatterTree to print out the final source-code, while doing so + // we may deviate from the optimal result to produce something even more optimal by inlining + // forms that can fit within the line width. + const auto formatted_lines = apply_formatting(formatting_tree.root); + // 4. Now we joint he lines together, it's easier when formatting to leave all lines independent + // so adding indentation is easier + const auto formatted_source = join_formatted_lines(formatted_lines); + return formatted_source; } catch (std::exception& e) { lg::error("Unable to format code - {}", e.what()); } diff --git a/common/formatter/formatter_tree.cpp b/common/formatter/formatter_tree.cpp index 4bfbc28bcdd..8df85f9a08d 100644 --- a/common/formatter/formatter_tree.cpp +++ b/common/formatter/formatter_tree.cpp @@ -96,6 +96,11 @@ void FormatterTree::construct_formatter_tree_recursive(const std::string& source // formatting So for strings, we treat them as if they should be a single token tree_node.refs.push_back(FormatterTreeNode(source, curr_node)); return; + } else if (curr_node_type == "quoting_lit") { + // same story for quoted symbols + // TODO - expect to have to add more here + tree_node.refs.push_back(FormatterTreeNode(source, curr_node)); + return; } for (size_t i = 0; i < ts_node_child_count(curr_node); i++) { const auto child_node = ts_node_child(curr_node, i); diff --git a/common/formatter/formatter_tree.h b/common/formatter/formatter_tree.h index c3eefd4cdd2..df0ef13b0d5 100644 --- a/common/formatter/formatter_tree.h +++ b/common/formatter/formatter_tree.h @@ -5,6 +5,7 @@ #include #include +#include "rules/rule_config.h" #include "tree_sitter/api.h" // Treesitter is fantastic for validating and parsing our code into a structured tree format without @@ -39,11 +40,13 @@ class FormatterTreeNode { // eventually token node refs std::optional token; + formatter_rules::config::FormFormattingConfig formatting_config; + FormatterTreeNode() = default; FormatterTreeNode(const std::string& source, const TSNode& node); FormatterTreeNode(const Metadata& _metadata) : metadata(_metadata){}; - bool is_list() const { return token.has_value(); } + bool is_list() const { return !token.has_value(); } }; // A FormatterTree has a very simple and crude tree structure where: diff --git a/common/formatter/rules/formatting_rules.cpp b/common/formatter/rules/formatting_rules.cpp index bc737f120e8..f1564862eee 100644 --- a/common/formatter/rules/formatting_rules.cpp +++ b/common/formatter/rules/formatting_rules.cpp @@ -14,27 +14,41 @@ namespace formatter_rules { // differentiate between a quoted symbol and a quoted form const std::set constant_types = {"kwd_lit", "num_lit", "str_lit", "char_lit", "null_lit", "bool_lit"}; + +namespace constant_list { +bool is_constant_list(const FormatterTreeNode& node) { + if (!node.is_list() || node.refs.empty()) { + return false; + } + const auto& type = node.refs.at(0).metadata.node_type; + return constant_types.find(type) != constant_types.end(); +} +} // namespace constant_list + namespace blank_lines { -void separate_by_newline(std::string& curr_text, - const FormatterTreeNode& containing_node, - const FormatterTreeNode& node, - const int index) { - // We only are concerned with top level forms or elements - // Skip the last element, no trailing new-lines (let the editors handle this!) - // Also peek ahead to see if there was a comment on this line, if so don't separate things! - if (!containing_node.metadata.is_top_level || index >= (int)containing_node.refs.size() - 1 || - (containing_node.refs.at(index + 1).metadata.is_comment && - containing_node.refs.at(index + 1).metadata.is_inline)) { - return; + +bool should_insert_blank_line(const FormatterTreeNode& containing_node, + const FormatterTreeNode& node, + const int index) { + // We only do this at the top level and don't leave a trailing new-line + if (!containing_node.metadata.is_top_level || index >= (int)containing_node.refs.size() - 1) { + return false; } - curr_text += "\n"; // If it's a comment, but has no following blank lines, dont insert a blank line if (node.metadata.is_comment && node.metadata.num_blank_lines_following == 0) { - return; + return false; + } + // If the next form is a comment and is inline, don't insert a comment + if ((index + 1) < containing_node.refs.size() && + containing_node.refs.at(index + 1).metadata.is_comment && + containing_node.refs.at(index + 1).metadata.is_inline) { + return false; } - // Otherwise, add only 1 blank line - curr_text += "\n"; + + // TODO - only if the form doesn't fit on a single line + return true; } + } // namespace blank_lines namespace comments { @@ -70,6 +84,7 @@ std::string format_block_comment(const std::string& comment) { namespace constant_pairs { +// TODO - remove index, not needed, could just pass in the previous node bool is_element_second_in_constant_pair(const FormatterTreeNode& containing_node, const FormatterTreeNode& node, const int index) { @@ -85,11 +100,7 @@ bool is_element_second_in_constant_pair(const FormatterTreeNode& containing_node // not be paired return false; } - // Check the type of the element - if (constant_types.find(node.metadata.node_type) != constant_types.end()) { - return true; - } - return false; + return true; } bool form_should_be_constant_paired(const FormatterTreeNode& node) { @@ -118,276 +129,4 @@ bool form_should_be_constant_paired(const FormatterTreeNode& node) { } // namespace constant_pairs -namespace indent { - -int cursor_pos(const std::string& curr_text) { - if (curr_text.empty()) { - return 0; - } - // Get the last line of the text (which is also the line we are on!) - int pos = 0; - for (int i = curr_text.size() - 1; i >= 0; i--) { - const auto& c = curr_text.at(i); - if (c == '\n') { - break; - } - pos++; - } - return pos; -} - -int compute_form_width_after_index(const FormatterTreeNode& node, - const int index, - const int depth = 0) { - if (node.refs.empty()) { - if (node.token) { - return node.token->size(); - } else { - return 0; - } - } - int form_width = 0; - for (int i = 0; i < (int)node.refs.size(); i++) { - const auto& ref = node.refs.at(i); - if (depth == 0 && i < index) { - continue; - } - if (ref.token) { - form_width += ref.token->size() + 1; - } else { - form_width += compute_form_width_after_index(ref, index, depth + 1) + 1; - } - } - return form_width; -} - -bool form_exceed_line_width(const std::string& curr_text, - const FormatterTreeNode& containing_node, - const int index) { - // Compute length from the current cursor position on the line as this check is done for every - // element of the form and not in advance - // - // This is for a good reason, intermediate nodes may override this styling and force to be - // formatted inline - // - // We early out as soon as we exceed the width - int curr_line_pos = cursor_pos(curr_text); - if (curr_line_pos >= line_width_target) { - return true; - } - int remaining_width_required = compute_form_width_after_index(containing_node, index); - if (curr_line_pos + remaining_width_required >= line_width_target) { - return true; - } - return false; -} - -bool form_contains_comment(const FormatterTreeNode& node) { - if (node.metadata.is_comment) { - return true; - } - for (const auto& ref : node.refs) { - if (ref.metadata.is_comment) { - return true; - } else if (!node.refs.empty()) { - if (form_contains_comment(ref)) { - return true; - } - } - } - return false; -} - -bool form_can_be_inlined(const std::string& curr_text, const FormatterTreeNode& list_node) { - // is the form too long to fit on a line TODO - increase accuracy here - if (form_exceed_line_width(curr_text, list_node, 0)) { - return false; - } - // are there any comments? (inlined or not, doesn't matter) - if (form_contains_comment(list_node)) { - return false; - } - return true; -} - -bool should_form_flow(const FormatterTreeNode& list_node, const bool inlining_form) { - if (form_contains_comment(list_node)) { - return true; - } - // does the form begin with a constant (a list of content elements) - if (!inlining_form && !list_node.refs.empty() && - constant_types.find(list_node.refs.at(0).metadata.node_type) != constant_types.end()) { - return true; - } - - // TODO - make a function to make grabbing this metadata easier... - // TODO - honestly should just have an is_list metadata - if (!list_node.refs.empty() && !list_node.refs.at(0).token) { - // if the first element is a comment, force a flow - if (list_node.refs.size() > 1 && list_node.refs.at(1).metadata.is_comment) { - return true; - } - const auto& form_head = list_node.refs.at(0).token; - // See if we have any configuration for this form - if (form_head && config::opengoal_form_config.find(form_head.value()) != - config::opengoal_form_config.end()) { - const auto& form_config = config::opengoal_form_config.at(form_head.value()); - return form_config.force_flow; - } - } - - // TODO - cleanup, might be inside a let - /*if (!containing_form.refs.empty() && containing_form.refs.at(0).token) { - const auto& form_head = containing_form.refs.at(0).token; - if (form_head && config::opengoal_form_config.find(form_head.value()) != - config::opengoal_form_config.end()) { - const auto& form_config = config::opengoal_form_config.at(form_head.value()); - if (form_config.force_flow) { - return true; - } - } - }*/ - - return false; -} - -std::optional inline_form_element(const FormatterTreeNode& list_node, const int index) { - // TODO - honestly should just have an is_list metadata - if (list_node.refs.empty() || !list_node.refs.at(0).token) { - return std::nullopt; - } - const auto& form_head = list_node.refs.at(0).token; - // See if we have any configuration for this form - if (form_head && - config::opengoal_form_config.find(form_head.value()) != config::opengoal_form_config.end()) { - const auto& form_config = config::opengoal_form_config.at(form_head.value()); - if (form_config.inline_until_index != -1) { - return index < form_config.inline_until_index; - } - } - return std::nullopt; -} - -void append_newline(std::string& curr_text, - const FormatterTreeNode& node, - const FormatterTreeNode& containing_node, - const int index, - const bool flowing, - const bool constant_pair_form, - const bool force_newline) { - if ((force_newline && index >= 1) || (node.metadata.is_comment && !node.metadata.is_inline)) { - curr_text = str_util::rtrim(curr_text) + "\n"; - return; - } - if (index <= 0 || containing_node.metadata.is_top_level || - (node.metadata.is_comment && node.metadata.is_inline) || (!flowing && index <= 1)) { - return; - } - // Check if it's a constant pair - if (constant_pair_form && - constant_pairs::is_element_second_in_constant_pair(containing_node, node, index)) { - return; - } - curr_text = str_util::rtrim(curr_text) + "\n"; -} - -void indent_line(std::string& curr_text, - const FormatterTreeNode& node, - const FormatterTreeNode& containing_node, - const int depth, - const int index, - const bool flowing) { - if (node.metadata.is_top_level || (node.metadata.is_inline && node.metadata.is_comment)) { - return; - } - // If the element is the second element in a constant pair, that means we did not append a - // new-line before hand so we require no indentation (it's inline with the previous element) - if (constant_pairs::is_element_second_in_constant_pair(containing_node, node, index)) { - return; - } - // If the first element in the list is a constant, we only indent with 1 space instead - if (index > 0 && - constant_types.find(containing_node.refs.at(0).metadata.node_type) != constant_types.end()) { - curr_text += str_util::repeat(depth, " "); - } else if (index > 0 && flowing) { - curr_text += str_util::repeat(depth, " "); - } else if (index > 1 && !flowing) { - curr_text += str_util::repeat(containing_node.refs.at(0).token.value().length() + 2, " "); - } -} - -// Recursively iterate through the node until we hit a token -int length_to_hang(const FormatterTreeNode& node, int length) { - if (node.token || node.refs.at(0).token) { - return length; - } - return length_to_hang(node.refs.at(0), length + 1); -} - -void align_lines(std::string& text, - const FormatterTreeNode& node, - const FormatterTreeNode& containing_node, - const bool constant_pair_form, - const bool flowing, - const bool force_flow, - const bool inline_element) { - const auto lines = str_util::split(text); - int start_index = 0; - if (inline_element) { - start_index = 1; - } - int alignment_width = 2; - if (force_flow) { - start_index = 0; - } else if (constant_pair_form && - constant_types.find(containing_node.refs.at(0).metadata.node_type) != - constant_types.end()) { - start_index = 0; - alignment_width = 3; - } else if (!flowing) { - // If the form has a token (it's a normal list) - if (containing_node.refs.at(0).token) { - alignment_width = length_to_hang(containing_node.refs.at(1), - containing_node.refs.at(0).token.value().length()) + - 1; - if (!node.token) { - alignment_width++; - } - } else { - // otherwise, it's a list of lists - alignment_width = 1; - } - } else if (!node.token) { - // If it's a list of lists - alignment_width = 1; - } - std::string aligned_form = ""; - for (size_t i = 0; i < lines.size(); i++) { - if ((int)i >= start_index) { - aligned_form += str_util::repeat(alignment_width, " "); - } - aligned_form += lines.at(i); - if (i != lines.size() - 1) { - aligned_form += "\n"; - } - } - if (!aligned_form.empty()) { - text = aligned_form; - } -} -} // namespace indent -namespace let { - -bool can_be_inlined(const FormatterTreeNode& form) { - // Check a variety of things specific to `let` style forms (ones with bindings) - // - does the binding list have more than one binding? - const auto& bindings = form.refs.at(1); // TODO - assuming - if (bindings.refs.size() > 1) { - return false; - } - return true; -} - -} // namespace let - } // namespace formatter_rules diff --git a/common/formatter/rules/formatting_rules.h b/common/formatter/rules/formatting_rules.h index 3331cfa9123..248f74e6882 100644 --- a/common/formatter/rules/formatting_rules.h +++ b/common/formatter/rules/formatting_rules.h @@ -1,12 +1,19 @@ #pragma once +#include #include #include "common/formatter/formatter_tree.h" namespace formatter_rules { + +extern const std::set constant_types; +namespace constant_list { +bool is_constant_list(const FormatterTreeNode& node); +} + // The formatter will try to collapse as much space as possible in the top-level, this means -// separating forms by a single empty blank line +// separating forms by a single empty blank line // // The exception are comments, top level comments will retain their following blank lines from the // original source @@ -18,11 +25,10 @@ namespace formatter_rules { // // Reference - https://github.com/kkinnear/zprint/blob/main/doc/options/blank.md namespace blank_lines { -void separate_by_newline(std::string& curr_text, - const FormatterTreeNode& containing_node, - const FormatterTreeNode& node, - const int index); -} +bool should_insert_blank_line(const FormatterTreeNode& containing_node, + const FormatterTreeNode& node, + const int index); +} // namespace blank_lines // TODO: // - align consecutive comment lines @@ -93,35 +99,6 @@ bool form_should_be_constant_paired(const FormatterTreeNode& node); namespace indent { const static int line_width_target = 120; -bool form_can_be_inlined(const std::string& curr_text, const FormatterTreeNode& curr_node); - -// TODO - right now this is very primitive in that it only checks against our hard-coded config -// eventually make this explore both routes and determine which is best -// Also factor in distance from the gutter (theres some zprint rationale somewhere on this) -bool should_form_flow(const FormatterTreeNode& list_node, const bool inlining_form); -std::optional inline_form_element(const FormatterTreeNode& list_node, const int index); - -void append_newline(std::string& curr_text, - const FormatterTreeNode& node, - const FormatterTreeNode& containing_node, - const int index, - const bool flowing, - const bool constant_pair_form, - const bool force_newline); -void indent_line(std::string& curr_text, - const FormatterTreeNode& node, - const FormatterTreeNode& containing_node, - const int depth, - const int index, - const bool flowing); -void align_lines(std::string& text, - const FormatterTreeNode& node, - const FormatterTreeNode& containing_node, - const bool constant_pair_form, - const bool flowing, - const bool force_flow, - const bool inline_element); - } // namespace indent // Let forms fall into two main categories @@ -138,8 +115,5 @@ void align_lines(std::string& text, // - forms inside the let binding are flowed // // Reference - https://github.com/kkinnear/zprint/blob/main/doc/options/let.md -namespace let { -// TODO - like above, factor in current cursor position -bool can_be_inlined(const FormatterTreeNode& form); -} // namespace let +namespace let {} // namespace let } // namespace formatter_rules diff --git a/common/formatter/rules/rule_config.cpp b/common/formatter/rules/rule_config.cpp index 9a7e8d206a5..08d9920e8ce 100644 --- a/common/formatter/rules/rule_config.cpp +++ b/common/formatter/rules/rule_config.cpp @@ -1,5 +1,7 @@ #include "rule_config.h" +#include "common/formatter/formatter_tree.h" + namespace formatter_rules { namespace config { @@ -8,27 +10,40 @@ namespace config { // TODO - this could be greatly simplified with C++20's designated initialization FormFormattingConfig new_flow_rule(int start_index) { FormFormattingConfig cfg; - cfg.force_flow = true; - cfg.start_flow_at_index = start_index; + cfg.hang_forms = false; cfg.inline_until_index = start_index; return cfg; } FormFormattingConfig new_binding_rule() { FormFormattingConfig cfg; - cfg.start_flow_at_index = 2; - cfg.bindings_at_index = 1; - cfg.force_flow = true; - cfg.force_newline_at_indices = {2}; - cfg.allow_inlining_if_size_less_than = 2; + cfg.hang_forms = false; + cfg.combine_first_two_lines = true; auto binding_list_config = std::make_shared(); - binding_list_config->force_flow = true; + binding_list_config->hang_forms = false; + binding_list_config->indentation_width = 1; + binding_list_config->indentation_width_for_index = [](FormFormattingConfig cfg, int index) { + if (index == 0) { + return 0; + } + return 4; + }; + binding_list_config->should_prevent_inlining = [](FormFormattingConfig config, int num_refs) { + // Only prevent inlining a binding list, if there are more than 1 bindings + if (num_refs > 1) { + return true; + } + return false; + }; + binding_list_config->prevent_inlining = + true; // TODO - we only want to prevent inlining if there are more than 2 elements cfg.index_configs.emplace(1, binding_list_config); return cfg; } const std::unordered_map opengoal_form_config = { {"defun", new_flow_rule(3)}, + {"defmethod", new_flow_rule(4)}, {"let", new_binding_rule()}}; } // namespace config } // namespace formatter_rules diff --git a/common/formatter/rules/rule_config.h b/common/formatter/rules/rule_config.h index fc225256a0c..90140e2b64c 100644 --- a/common/formatter/rules/rule_config.h +++ b/common/formatter/rules/rule_config.h @@ -1,25 +1,31 @@ #pragma once +#include #include +#include #include #include #include -#include "common/formatter/rules/formatting_rules.h" - namespace formatter_rules { namespace config { struct FormFormattingConfig { - bool force_hang = false; - bool force_flow = false; - std::optional allow_inlining_if_size_less_than = {}; - int start_hang_at_index = 0; - int start_flow_at_index = 0; + // new + bool hang_forms = true; // TODO - remove this eventually, it's only involved in setting the + // indentation width, which we can do via the indentation_width function + int indentation_width = + 2; // 2 for a flow // TODO - also remove this, prefer storing the first node's width in the + // metadata on the first pass, that's basically all this does + std::function indentation_width_for_index = + [](FormFormattingConfig config, int index) { return config.indentation_width; }; + bool combine_first_two_lines = + false; // NOTE - basically hang, but will probably stick around after hang is gone int inline_until_index = -1; - std::optional bindings_at_index = {}; - std::optional skip_newlines_until_index = {}; - std::vector force_newline_at_indices = {}; - bool bindings_force_newlines = false; + bool has_constant_pairs = false; + bool prevent_inlining = false; + std::function should_prevent_inlining = + [](FormFormattingConfig config, int num_refs) { return config.prevent_inlining; }; + int parent_mutable_extra_indent = 0; std::unordered_map> index_configs = {}; }; diff --git a/test/common/formatter/corpus/blank-lines.test.gc b/test/common/formatter/corpus/blank-lines.test.gc index 6198cd2703b..dc321569500 100644 --- a/test/common/formatter/corpus/blank-lines.test.gc +++ b/test/common/formatter/corpus/blank-lines.test.gc @@ -1,5 +1,5 @@ === -Separate Forms +Separate Top Level === (println "test") diff --git a/test/common/formatter/corpus/constant-pairs.test.gc b/test/common/formatter/corpus/constant-pairs.test.gc index 9f175c86d3a..9894e91cf2a 100644 --- a/test/common/formatter/corpus/constant-pairs.test.gc +++ b/test/common/formatter/corpus/constant-pairs.test.gc @@ -41,7 +41,7 @@ Four Pairs :doit 789) === -Not a Valid Constant +Function Call Pair - Inlinable === (:hello @@ -53,11 +53,33 @@ Not a Valid Constant --- +(:hello "world" + :world 123 + :test 456 + :not (println "hello world") + :doit 789) + +=== +Function Call Pair - Too Long and Multiline +=== + +(:hello + +"world" :world 123 +:test 456 +:not (println "hello world" "hello world" "hello world" (println "hello world hello world hello world hello world hello world hello world hello world hello world")) +:doit 789) + +--- + (:hello "world" :world 123 :test 456 :not - (println "hello world") + (println "hello world" + "hello world" + "hello world" + (println "hello world hello world hello world hello world hello world hello world hello world hello world")) :doit 789) === diff --git a/test/common/formatter/corpus/functions.test.gc b/test/common/formatter/corpus/functions.test.gc new file mode 100644 index 00000000000..8bbff533dfc --- /dev/null +++ b/test/common/formatter/corpus/functions.test.gc @@ -0,0 +1,36 @@ +=== +Decent size and nesting +=== + +(defun can-display-query? ((arg0 process) (arg1 string) (arg2 float)) + (let ((a1-3 (gui-control-method-12 + *gui-control* + arg0 + (gui-channel query) + (gui-action play) + (if arg1 + arg1 + (symbol->string (-> arg0 type symbol)) + ) + 0 + arg2 + (new 'static 'sound-id) + ) + ) + ) + (= (get-status *gui-control* a1-3) (gui-status active)) + ) + ) + +--- + +(defun can-display-query? ((arg0 process) (arg1 string) (arg2 float)) + (let ((a1-3 (gui-control-method-12 *gui-control* + arg0 + (gui-channel query) + (gui-action play) + (if arg1 arg1 (symbol->string (-> arg0 type symbol))) + 0 + arg2 + (new 'static 'sound-id)))) + (= (get-status *gui-control* a1-3) (gui-status active)))) diff --git a/test/common/formatter/corpus/indent.test.gc b/test/common/formatter/corpus/indent.test.gc index 26ff87e5b51..676b26b4a94 100644 --- a/test/common/formatter/corpus/indent.test.gc +++ b/test/common/formatter/corpus/indent.test.gc @@ -11,6 +11,19 @@ Basic Nested Form "world2" "very-long-formvery-long-formvery-long-formvery-long-formvery-long-formvery-long-formvery-long-form")) +=== +Basic Nested Form Reversed +=== + +(println (println "world" "world2" "very-long-formvery-long-formvery-long-formvery-long-formvery-long-formvery-long-formvery-long-form") "hello") + +--- + +(println (println "world" + "world2" + "very-long-formvery-long-formvery-long-formvery-long-formvery-long-formvery-long-formvery-long-form") + "hello") + === Multiple Top Level Forms === diff --git a/test/common/formatter/corpus/symbols.test.gc b/test/common/formatter/corpus/symbols.test.gc new file mode 100644 index 00000000000..85106be8f1d --- /dev/null +++ b/test/common/formatter/corpus/symbols.test.gc @@ -0,0 +1,9 @@ +=== +Quoted Symbols +=== + +(new 'static 'sound-id) + +--- + +(new 'static 'sound-id) diff --git a/test/common/formatter/corpus/top-level.test.gc b/test/common/formatter/corpus/top-level.test.gc deleted file mode 100644 index ddde173a972..00000000000 --- a/test/common/formatter/corpus/top-level.test.gc +++ /dev/null @@ -1,14 +0,0 @@ -=== -Top Level Elements -=== - -(println "test") -(println "test") (println "test") - ---- - -(println "test") - -(println "test") - -(println "test") diff --git a/test/common/formatter/test_formatter.cpp b/test/common/formatter/test_formatter.cpp index 7ff3b5eeb1c..78b5911d67a 100644 --- a/test/common/formatter/test_formatter.cpp +++ b/test/common/formatter/test_formatter.cpp @@ -90,6 +90,8 @@ bool has_important_tests(const fs::path& file_path) { return false; } +// TODO - consider adding a test that auto-formats all of goal_src (there should be no errors) + bool run_tests(const fs::path& file_path, const bool only_important_tests) { const auto& tests = get_test_definitions(file_path); // Run the tests, report successes and failures @@ -101,6 +103,9 @@ bool run_tests(const fs::path& file_path, const bool only_important_tests) { continue; } const auto formatted_result = formatter::format_code(test.input); + if (formatted_result && str_util::starts_with(test.name, "!?")) { + fmt::print("FORMATTED RESULT:\n\n{}\n\n", formatted_result.value()); + } if (!formatted_result) { // Unable to parse, was that expected? if (test.output == "__THROWS__") { @@ -122,6 +127,7 @@ bool run_tests(const fs::path& file_path, const bool only_important_tests) { } bool find_and_run_tests() { + // TODO - fails when it finds no tests try { // Enumerate test files const auto test_files = file_util::find_files_recursively(