diff --git a/CHANGELOG.md b/CHANGELOG.md index 035fc2d7..d9427d8b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -47,10 +47,11 @@ This name should be decided amongst the team before the release. ### Added - [#705](https://github.com/tweag/topiary/pull/705) Added support for Nickel 1.7 extended pattern formatting - [#737](https://github.com/tweag/topiary/pull/737) Added the `prefetch` command, that prefetches and caches all grammars in the current configuration +- [#755](https://github.com/tweag/topiary/pull/755) Introduce measuring scopes, which can be used in conjunction with regular scopes to add expressivity to the formatting queries. ### Fixed - [#720](https://github.com/tweag/topiary/pull/720) [#722](https://github.com/tweag/topiary/pull/722) [#723](https://github.com/tweag/topiary/pull/723) [#724](https://github.com/tweag/topiary/pull/724) [#735](https://github.com/tweag/topiary/pull/735) -[#738](https://github.com/tweag/topiary/pull/738) [#739](https://github.com/tweag/topiary/pull/739) [#745](https://github.com/tweag/topiary/pull/745) Various OCaml improvements +[#738](https://github.com/tweag/topiary/pull/738) [#739](https://github.com/tweag/topiary/pull/739) [#745](https://github.com/tweag/topiary/pull/745) [#755](https://github.com/tweag/topiary/pull/755) Various OCaml improvements - [#744](https://github.com/tweag/topiary/pull/744) Nickel: fix the indentation of `in` for annotated multiline let-bindings ### Changed diff --git a/README.md b/README.md index bb94e9cb..b0ddfb0d 100644 --- a/README.md +++ b/README.md @@ -1225,6 +1225,88 @@ containing the matched nodes are is single-line (resp. multi-line). ) ``` +### `@prepend_begin_measuring_scope` / `@append_begin_measuring_scope` / `@prepend_end_measuring_scope` / `@append_end_measuring_scope` + +Sometimes, custom scopes are not enough: we may want to format a node depending on the multi-line-ness of a piece of code that does not include the node in question. For instance, consider this function application in OCaml: +```ocaml +foo bar (fun x -> qux) +``` +We may also want to format it as any of the following two, depending on the actual length of `foo`, `bar`, and `qux`: +```ocaml +foo bar (fun x -> + qux +) +``` +```ocaml +foo + bar + (fun x -> + qux + ) +``` +Consider the indentation of `(fun x -> qux)`: if `foo bar` is single-line, we don't want to indent it. But if `foo bar` is multi-line, we do want to indent it. + +Because custom scopes can only impact the behaviour of nodes inside the scope, we can't use them to solve this issue. This is why we need `measuring_scope`. + +Measuring scopes are opened/closed with a similar syntax as "regular" custom scopes, with any of the following tags, in conjunction with the `#scope_id!` predicate: + +* `@prepend_begin_measuring_scope` +* `@append_begin_measuring_scope` +* `@prepend_end_measuring_scope` +* `@prepend_begin_measuring_scope` + +Measuring scopes behave as follows: +* A measuring scope must always be contained in a regular custom scope with the same `#scope_id!`. There can't be two measuring scopes with the same `#scope_id!` inside the same regular custom scope. +* If a regular custom scope contains a measuring scope, then all tags contained in the regular scope that depend on its multi-line-ness will instead depend on the multi-line-ness of the measuring scope (hence the name: the inner, measuring scope measures the multi-line-ness of the outer, regular scope). + +#### Example + +The example below solves the problem of indenting function application in OCaml stated above, using measuring scopes. +```scheme +(application_expression + . + (_) @append_indent_start @prepend_begin_scope @prepend_begin_measuring_scope + (#scope_id! "function_application") + (_) @append_end_scope + . +) +; The end of the measuring scope depends on the last argument: if it's a function, +; end it before the function, otherwise end it after the last argument. In that case, +; it's the same as the regular scope. +(application_expression + (#scope_id! "function_application") + (_ + [ + (fun_expression) + (function_expression) + ]? @do_nothing + ) @append_end_measuring_scope + . +) +(application_expression + (#scope_id! "function_application") + (_ + [ + (fun_expression) + (function_expression) + ] @prepend_end_measuring_scope + ) + . +) +; If the measuring scope is single-line, end indentation _before_ the last node. +; Otherwise, end the indentation after the last node. +(application_expression + (#multi_line_scope_only! "function_application") + (_) @append_indent_end + . +) +(application_expression + (#single_line_scope_only! "function_application") + (_) @prepend_indent_end + . +) +``` + ## Suggested workflow In order to work productively on query files, the following is one diff --git a/topiary-cli/tests/samples/expected/ocaml.ml b/topiary-cli/tests/samples/expected/ocaml.ml index cb1be63e..bc4fd90d 100644 --- a/topiary-cli/tests/samples/expected/ocaml.ml +++ b/topiary-cli/tests/samples/expected/ocaml.ml @@ -1187,8 +1187,8 @@ let () = let () = foo x (fun y -> - zzzzzzzzzz - ) + zzzzzzzzzz + ) let foo x = function | y -> zzzzzzzzzz @@ -1206,9 +1206,9 @@ let () = let () = foo x (function - | y -> zzzzzzzzzz - | u -> vvvvvvvvv - ) + | y -> zzzzzzzzzz + | u -> vvvvvvvvv + ) (* #727 proper formatting of multi-lined typed function argument *) let foo diff --git a/topiary-core/src/atom_collection.rs b/topiary-core/src/atom_collection.rs index 56d1e00c..a3a71a9c 100644 --- a/topiary-core/src/atom_collection.rs +++ b/topiary-core/src/atom_collection.rs @@ -1,5 +1,6 @@ use std::{ borrow::Cow, + cmp::Ordering, collections::{HashMap, HashSet}, mem, ops::Deref, @@ -307,6 +308,34 @@ impl AtomCollection { predicates, ); } + "prepend_begin_measuring_scope" => { + self.prepend( + Atom::MeasuringScopeBegin(scope_information_prepend()?), + node, + predicates, + ); + } + "append_begin_measuring_scope" => { + self.append( + Atom::MeasuringScopeBegin(scope_information_append()?), + node, + predicates, + ); + } + "prepend_end_measuring_scope" => { + self.prepend( + Atom::MeasuringScopeEnd(scope_information_prepend()?), + node, + predicates, + ); + } + "append_end_measuring_scope" => { + self.append( + Atom::MeasuringScopeEnd(scope_information_append()?), + node, + predicates, + ); + } // Scoped softlines "append_empty_scoped_softline" => { let id = self.next_id(); @@ -404,10 +433,29 @@ impl AtomCollection { pub fn apply_prepends_and_appends(&mut self) { let mut expanded: Vec = Vec::new(); + // We sort the prepends/appends so that: + // * BeginScope(s) will always be the first element(s) + // * EndScope(s) will always be the last element(s) + // This permits proper processing of measuring scopes and scoped atoms + // that are added at the same place as Begin/EndScopes. + fn compare_atoms(a1: &Atom, a2: &Atom) -> Ordering { + match (a1, a2) { + (Atom::ScopeBegin(_), Atom::ScopeBegin(_)) => Ordering::Equal, + (Atom::ScopeBegin(_), _) => Ordering::Less, + (_, Atom::ScopeBegin(_)) => Ordering::Greater, + (Atom::ScopeEnd(_), Atom::ScopeEnd(_)) => Ordering::Equal, + (Atom::ScopeEnd(_), _) => Ordering::Greater, + (_, Atom::ScopeEnd(_)) => Ordering::Less, + (_, _) => Ordering::Equal, + } + } + for atom in &mut self.atoms { if let Atom::Leaf { id, .. } = atom { let prepends = self.prepend.entry(*id).or_default(); + prepends.sort_by(compare_atoms); let appends = self.append.entry(*id).or_default(); + appends.sort_by(compare_atoms); // Rather than cloning the atom from the old vector, we // simply take it. This will leave a default (empty) atom @@ -605,10 +653,16 @@ impl AtomCollection { type ScopeId = String; type LineIndex = u32; type ScopedNodeId = usize; - // `opened_scopes` maintains stacks of opened scopes, - // the line at which they started, - // and the list of `ScopedSoftline` they contain. - let mut opened_scopes: HashMap<&ScopeId, Vec<(LineIndex, Vec<&Atom>)>> = HashMap::new(); + type OpenedScopeInfo<'a> = (LineIndex, Vec<&'a Atom>, Option); + // `opened_scopes` maintains stacks of opened scopes. + // For each scope, we record: + // * the line at which they started (LineIndex), + // * the list of `ScopedSoftline` and `ScopedConditional` they contain (Vec<&Atom>), + // * if they contain a measuring scope, whether it is multi-line (Option). + let mut opened_scopes: HashMap<&ScopeId, Vec> = HashMap::new(); + // `opened_measuring_scopes` maintains stacks of opened measuring scopes, + // and the line at which they started. + let mut opened_measuring_scopes: HashMap<&ScopeId, Vec> = HashMap::new(); // We can't process `ScopedSoftline` in-place as we encounter them in the list of // atoms: we need to know when their encompassing scope ends to decide what to // replace them with. Instead of in-place modifications, we associate a replacement @@ -631,16 +685,20 @@ impl AtomCollection { opened_scopes .entry(scope_id) .or_default() - .push((*line_start, Vec::new())); + .push((*line_start, Vec::new(), None)); } else if let Atom::ScopeEnd(ScopeInformation { line_number: line_end, scope_id, }) = atom { - if let Some((line_start, atoms)) = + if let Some((line_start, atoms, measuring_scope)) = opened_scopes.get_mut(scope_id).and_then(Vec::pop) { - let multiline = line_start != *line_end; + let multiline = if let Some(mult) = measuring_scope { + mult + } else { + line_start != *line_end + }; for atom in atoms { if let Atom::ScopedSoftline { id, spaced, .. } = atom { let new_atom = if multiline { @@ -671,9 +729,58 @@ impl AtomCollection { log::warn!("Closing unopened scope {scope_id:?}"); force_apply_modifications = true; } + // Open measuring scope + } else if let Atom::MeasuringScopeBegin(ScopeInformation { + line_number: line_start, + scope_id, + }) = atom + { + if opened_scopes.entry(scope_id).or_default().is_empty() { + log::warn!( + "Opening measuring scope with no associated regular scope {scope_id:?}" + ); + force_apply_modifications = true; + } else { + opened_measuring_scopes + .entry(scope_id) + .or_default() + .push(*line_start) + } + // Close measuring scope and register multi-line-ness in the appropriate regular scope + } else if let Atom::MeasuringScopeEnd(ScopeInformation { + line_number: line_end, + scope_id, + }) = atom + { + if let Some(line_start) = + opened_measuring_scopes.get_mut(scope_id).and_then(Vec::pop) + { + let multi_line = line_start != *line_end; + if let Some((regular_line_start, vec, measuring_scope)) = + opened_scopes.get_mut(scope_id).and_then(Vec::pop) + { + if measuring_scope.is_none() { + opened_scopes.entry(scope_id).or_default().push(( + regular_line_start, + vec, + Some(multi_line), + )); + } else { + log::warn!("Found several measuring scopes in a single regular scope {scope_id:?}"); + force_apply_modifications = true; + } + } else { + log::warn!("Found measuring scope outside of regular scope {scope_id:?}"); + force_apply_modifications = true; + } + } else { + log::warn!("Closing unopened measuring scope {scope_id:?}"); + force_apply_modifications = true; + } // Register the ScopedSoftline in the correct scope } else if let Atom::ScopedSoftline { scope_id, .. } = atom { - if let Some((_, vec)) = opened_scopes.get_mut(&scope_id).and_then(|v| v.last_mut()) + if let Some((_, vec, _)) = + opened_scopes.get_mut(&scope_id).and_then(|v| v.last_mut()) { vec.push(atom); } else { @@ -682,7 +789,8 @@ impl AtomCollection { } // Register the ScopedConditional in the correct scope } else if let Atom::ScopedConditional { scope_id, .. } = atom { - if let Some((_, vec)) = opened_scopes.get_mut(&scope_id).and_then(|v| v.last_mut()) + if let Some((_, vec, _)) = + opened_scopes.get_mut(&scope_id).and_then(|v| v.last_mut()) { vec.push(atom); } else { @@ -691,7 +799,7 @@ impl AtomCollection { } } } - let still_opened: Vec<&String> = opened_scopes + let mut still_opened: Vec<&String> = opened_scopes .into_iter() .filter_map(|(scope_id, vec)| if vec.is_empty() { None } else { Some(scope_id) }) .collect(); @@ -699,13 +807,25 @@ impl AtomCollection { log::warn!("Some scopes have been left opened: {:?}", still_opened); force_apply_modifications = true; } + still_opened = opened_measuring_scopes + .into_iter() + .filter_map(|(scope_id, vec)| if vec.is_empty() { None } else { Some(scope_id) }) + .collect(); + if !still_opened.is_empty() { + log::warn!( + "Some measuring scopes have been left opened: {:?}", + still_opened + ); + force_apply_modifications = true; + } // Remove scopes from the atom list for atom in &mut self.atoms { match atom { - Atom::ScopeBegin(_) | Atom::ScopeEnd(_) => { - *atom = Atom::Empty; - } + Atom::ScopeBegin(_) + | Atom::ScopeEnd(_) + | Atom::MeasuringScopeBegin(_) + | Atom::MeasuringScopeEnd(_) => *atom = Atom::Empty, _ => {} } } @@ -954,7 +1074,8 @@ fn collapse_spaces_before_antispace(v: &mut [Atom]) { antispace_mode = true; } else if *a == Atom::Space && antispace_mode { *a = Atom::Empty; - } else { + } else if *a != Atom::Empty { + // Don't change mode when encountering Empty antispace_mode = false; } } diff --git a/topiary-core/src/lib.rs b/topiary-core/src/lib.rs index 99a18e31..cea567fe 100644 --- a/topiary-core/src/lib.rs +++ b/topiary-core/src/lib.rs @@ -92,20 +92,30 @@ pub enum Atom { /// Indicates the end of a scope, use in combination with the /// ScopedSoftlines and ScopedConditionals below. ScopeEnd(ScopeInformation), + // Indicates the beginning of a *measuring* scope, that must be related to a "normal" one. + // Used in combination with ScopedSoftlines and ScopedConditionals below. + MeasuringScopeBegin(ScopeInformation), + // Indicates the end of a *measuring* scope, that must be related to a "normal" one. + // Used in combination with ScopedSoftlines and ScopedConditionals below. + MeasuringScopeEnd(ScopeInformation), /// Scoped commands - // ScopedSoftline works together with the @{prepend,append}_begin_scope and - // @{prepend,append}_end_scope query tags. To decide if a scoped softline + // ScopedSoftline works together with the @{prepend,append}_begin[_measuring]_scope and + // @{prepend,append}_end[_measuring]_scope query tags. To decide if a scoped softline // must be expanded into a hardline, we look at the innermost scope having // the corresponding `scope_id`, that encompasses it. We expand the softline - // if that scope is multi-line. The `id` value is here for technical - // reasons, it allows tracking of the atom during post-processing. + // if that scope is multi-line. + // If that scope contains a *measuring* scope with the same `scope_id`, we expand + // the node iff that *measuring* scope is multi-line. + // The `id` value is here for technical reasons, + // it allows tracking of the atom during post-processing. ScopedSoftline { id: usize, scope_id: String, spaced: bool, }, - /// Represents an atom that must only be output if the associated scope meets the condition - /// (single-line or multi-line) + /// Represents an atom that must only be output if the associated scope + /// (or its associated measuring scope, see above) meets the condition + /// (single-line or multi-line). ScopedConditional { id: usize, scope_id: String, diff --git a/topiary-queries/queries/ocaml.scm b/topiary-queries/queries/ocaml.scm index be3bab0f..9d95ec02 100644 --- a/topiary-queries/queries/ocaml.scm +++ b/topiary-queries/queries/ocaml.scm @@ -1153,19 +1153,33 @@ ; long_argument_3 ; long_argument_4 ; -; When the last argument is a (parenthesized) function application, end the scope +; When the last argument is a (parenthesized) function application, end the indentation ; _before_ the application. This allows the following to be formatted as such: ; let () = ; foo bar (fun x -> -; something horrible onto x +; something horrible onto x +; ) +; But when the function application minus the last argument is multiline, +; the whole scope still must be indented: +; let () = +; foo +; bar +; (fun x -> +; x ; ) +; +; Because of these constraints, we must use measuring scopes here: the position of the +; indent_end depends on the multi-line-ness of a subsection of the whole scope. (application_expression . - (_) @append_indent_start @prepend_begin_scope + (_) @append_indent_start @prepend_begin_scope @prepend_begin_measuring_scope (#scope_id! "function_application") - (_) @append_indent_end + (_) @append_end_scope . ) +; The end of the measuring scope depends on the last argument: if it's a function, +; end it before the function, otherwise end it after the last argument. In that case, +; it's the same as the regular scope. (application_expression (#scope_id! "function_application") (_ @@ -1173,7 +1187,7 @@ (fun_expression) (function_expression) ]? @do_nothing - ) @append_end_scope + ) @append_end_measuring_scope . ) (application_expression @@ -1182,10 +1196,24 @@ [ (fun_expression) (function_expression) - ] @prepend_end_scope + ] @prepend_end_measuring_scope ) . ) +; If the measuring scope is single-line, end indentation _before_ the last node. +; Otherwise, end the indentation after the last node. +(application_expression + (#multi_line_scope_only! "function_application") + (_) @append_indent_end + . +) +(application_expression + (#single_line_scope_only! "function_application") + (_) @prepend_indent_end + . +) +; The node to which we apply append_spaced_scoped_softline will always +; be in both scopes, regular and measuring. (application_expression (_) @append_spaced_scoped_softline .