Added an option to tidy_source() to enforce a strict maximum line length. #71

krivit · 2017-08-04T00:26:29Z

This patch adds another option, width.strict= to tidy_source(), defaulting to FALSE. If TRUE, instead of calling deparse() with width.cutoff= directly, it calls a new function, strict_deparse(..., width.max), which performs a binary search for the highest width.cutoff= value to pass to deparse() such that the longest line in its output is, after stripping trailing whitespace, at most width.max= in length.
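The search described above can be sketched roughly as follows. This is a minimal illustration, not the code from this patch: the name `strict_deparse_sketch` and the `lo`/`hi` bounds are assumptions made for the example (20 and 500 are the limits that `deparse()` documents for `width.cutoff`).

```r
# Hypothetical sketch of a strict deparse; not the function added by this PR.
strict_deparse_sketch = function(expr, width.max, lo = 20L, hi = 500L) {
  # longest output line for a given cutoff, ignoring trailing whitespace
  longest = function(w) {
    max(nchar(sub('\\s+$', '', deparse(expr, width.cutoff = w))))
  }
  best = NA_integer_
  while (lo <= hi) {
    mid = (lo + hi) %/% 2L
    if (longest(mid) <= width.max) {
      best = mid     # mid fits; look for a larger cutoff
      lo = mid + 1L
    } else {
      hi = mid - 1L  # mid overflows; look for a smaller cutoff
    }
  }
  if (is.na(best)) stop('no width.cutoff in range satisfies width.max')
  deparse(expr, width.cutoff = best)
}

expr = quote(
  result <- some_function(alpha = 1, beta = 2, gamma = 3, delta = 4, epsilon = 5)
)
out = strict_deparse_sketch(expr, width.max = 40)
cat(out, sep = '\n')
```

The sketch assumes the longest line grows with the cutoff, which is what makes a binary search applicable in the first place.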

It seems to work quite well in most situations, though it still doesn't handle inline comments very well: the magic constant %InLiNe_IdEnTiFiEr% counts towards the line length, though it shouldn't. A possible solution might be to replace this magic constant with something much shorter, perhaps involving non-ASCII characters (which are legal in variable names in R, as far as I know) to reduce chances of a collision.
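To illustrate why the placeholder matters, here is a small sketch. It assumes that an inline comment is masked as a `%...%` operator followed by the comment as a string, so that it survives `parse()` and `deparse()`; the exact masking and unmasking code in formatR may differ.

```r
# Assumption: an inline comment is masked as <code> %marker% "<comment>".
marker = '%InLiNe_IdEnTiFiEr%'
masked_src = paste('x = 1', marker, '"# note"')

# deparse() treats the marker and the quotes as part of the line
masked = deparse(parse(text = masked_src, keep.source = FALSE)[[1]])

# unmask: drop the marker and the quotes to restore the comment
unmasked = sub(paste0(' ', marker, ' "(.*)"$'), ' \\1', masked)

nchar(masked)    # width that a line-length check on deparse() output measures
nchar(unmasked)  # width that ends up in the final output
```

The difference between the two widths is the marker plus a space and two quotes, which is why a shorter marker reduces the error.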

krivit · 2017-08-04T04:20:48Z

The behaviour with inline comments is now a little better. Unfortunately, if deparse() wraps a line just before a comment, it will get "merged" back in the unmasking code, potentially overflowing.

I've tried changing that code, but it doesn't work very well in knitr when output is interspersed with lines of code: the inline comment ends up going after the output.

yihui · 2017-08-04T06:17:36Z

Much appreciated! I'll review this tomorrow.

pablo14 · 2018-02-13T18:06:16Z

I wrote a simpler (and less efficient) hack for the tidy_block function. It tries successively smaller widths until every line fits within the desired width; otherwise, it returns the original value.

It works in most cases for the PDF book I'm writing. I don't know whether this approach conflicts with other use cases.

PS: bookdown + formatR are awesome!

# Wrapper around parse() and deparse(): re-deparse each expression with a
# decreasing width until every output line fits within `width`.
tidy_block = function(text, width = getOption('width'), arrow = FALSE) {
  exprs = parse_only(text)
  if (length(exprs) == 0) return(character(0))
  exprs = if (arrow) replace_assignment(exprs) else as.list(exprs)

  # TRUE if every deparsed line fits in `width`; leading and trailing
  # whitespace is stripped first, so indentation is not counted
  fits = function(lines) all(nchar(trimws(lines, which = 'both')) <= width)

  vapply(exprs, function(e) {
    original = base::deparse(e, width)
    if (fits(original)) return(paste(original, collapse = '\n'))

    # try progressively smaller widths; deparse() only accepts
    # width.cutoff in [20, 500], so stop at 20
    new_width = width - 1
    while (new_width >= 20) {
      candidate = base::deparse(e, new_width)
      if (fits(candidate)) return(paste(candidate, collapse = '\n'))
      new_width = new_width - 1
    }

    # nothing fit: fall back to the original deparsed value
    paste(original, collapse = '\n')
  }, character(1), USE.NAMES = FALSE)
}

krivit · 2018-03-09T13:16:00Z

Just FYI, I submitted an Enhancement request to implement this at the deparse() level. (https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17390)

krivit · 2018-04-03T05:48:51Z

The R Bugzilla (https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17390) request was closed, though they are open to the possibility of a patch.

krivit · 2018-04-09T01:21:54Z

@yihui , what about using r-lib/styler? It looks like its API might allow line width enforcement in some form. Also, I think it might handle comments more elegantly than formatR.

lorenzwalthert · 2018-08-12T20:22:16Z

For the record: styler (as of version 1.0.2) does not currently support line width enforcement, but there is WIP to achieve this (contributed by @krivit, will be finalized by @lorenzwalthert), see r-lib/styler#414.

yihui

@krivit I made a couple of changes in your PR:

- Instead of introducing a new width.strict argument, I'm treating I(width) as the maximum width. This saves one argument, but I'm open to adding a new argument if you don't like the I()dea.
- The binary search is actually a little tricky here, because the deparsed line width is not a monotonic function of the specified width. Sometimes the deparsed width can decrease as the desired width increases, e.g.,
```
f = function(text, width) {
  unlist(lapply(width, function(w) {
    max(nchar(deparse(parse(text = text, keep.source = FALSE)[[1]], width.cutoff = w)))
  }))
}
x = paste(c(rep('1', 20), '1234567890'), collapse = '+')
i = 20:100
plot(i, f(x, i), type = 'b', pch = 20)
```
So when the binary search fails, I fall back to brute force and try all the remaining possible widths.
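A brute-force fallback of this kind might look like the sketch below. The helper names `longest_line` and `brute_force_cutoff` are made up for illustration and are not the functions in formatR; the scan simply tries each candidate cutoff, from `width.max` down to 20 (the smallest value `deparse()` accepts), and returns the first one that fits.

```r
# longest deparsed line (trailing whitespace ignored) for a given cutoff
longest_line = function(expr, w) {
  max(nchar(sub('\\s+$', '', deparse(expr, width.cutoff = w))))
}

# Hypothetical helper: linear scan over candidate cutoffs, largest first
brute_force_cutoff = function(expr, width.max) {
  candidates = if (width.max >= 20L) seq(width.max, 20L) else integer(0)
  for (w in candidates) {
    if (longest_line(expr, w) <= width.max) return(w)
  }
  NA_integer_  # no candidate satisfies the bound; a caller would warn here
}

# the same test expression as in the plot above
x = paste(c(rep('1', 20), '1234567890'), collapse = '+')
expr = parse(text = x, keep.source = FALSE)[[1]]
w = brute_force_cutoff(expr, 40L)
```

Unlike the binary search, the scan does not rely on monotonicity, at the cost of calling `deparse()` once per candidate width.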

Please let me know if you have any feedback. I'm going to merge this PR for now, but open to suggestions as I said earlier. Thanks a lot!

krivit · 2021-03-17T22:08:25Z

R/tidy.R

+#'
+#' If the value of the argument \code{width.cutoff} is wrapped in
+#' \code{\link{I}()} (e.g., \code{I(60)}), it will be treated as the \emph{upper
+#' bound} on the line width, but this upper bound may not be satisfied. In this
+#' case, the function will perform a binary search for a width value that can
+#' make \code{deparse()} return code with line width smaller than or equal to
+#' the \code{width.cutoff} value. If the search fails to find such a value, it
+#' will emit a warning, which can be suppressed by the global option
+#' \code{options(formatR.width.warning = FALSE)}.


Thanks for merging this PR! It will be great to have this Just Work.

I think this paragraph isn't 100% clear, because it can be read as the upper bound potentially not being satisfied after I() is used. Suggested edit:

#' If the value of the argument \code{width.cutoff} is wrapped in
#' \code{\link{I}()} (e.g., \code{I(60)}), it will be treated as the
#' \emph{upper bound} on the line width. The corresponding argument to
#' \code{deparse()} is actually a lower bound, and so the function
#' will perform a binary search for a width value that can make
#' \code{deparse()} return code with line width smaller than or equal
#' to the \code{width.cutoff} value. If the search fails to find such
#' a value, it will emit a warning, which can be suppressed by
#' the global option \code{options(formatR.width.warning = FALSE)}.

yihui · 2021-03-18

Thanks for the suggestion! I'll commit that later.
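For readers unfamiliar with the I() convention discussed in this review: `I()` adds the class `"AsIs"` to its argument, so a function can tell `60` and `I(60)` apart. The helper below is a hypothetical sketch of that check, not formatR's internal code.

```r
# Hypothetical helper: split a width argument into its numeric value and a
# flag saying whether it was wrapped in I() (i.e., is a strict upper bound)
resolve_width = function(width.cutoff) {
  list(
    value = as.integer(unclass(width.cutoff)),
    strict = inherits(width.cutoff, 'AsIs')
  )
}

plain = resolve_width(60)      # ordinary cutoff, passed on to deparse()
strict = resolve_width(I(60))  # treated as the maximum line width
```

With the merged change, the user-facing form is a call such as `tidy_source(..., width.cutoff = I(60))`, as described in the documentation excerpt above.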

krivit added 3 commits August 4, 2017 10:06

Added an option to tidy_source() to enforce a strict maximum line length.

79b9cc7

In strict_deparse(), renamed max.width= to width.max= for consistency with other functions.

58106bb

Replaced "%InLiNe_IdEnTiFiEr%" with "%\u1d166%" (Unicode MUSICAL SYMBOL COMBINING SPRECHGESANG STEM character) to save space.

b3ab028

pablo14 mentioned this pull request Feb 18, 2018

width.cutoff doesn't work (problem in deparse function) #77

Closed

Merge branch 'master' into hard_deparse

9ebc560

yihui added 5 commits March 16, 2021 13:52

Merge commit '437668a46ce38642096250c80cb63598cf5b597c'

ff5d419

Conflicts: DESCRIPTION R/tidy.R man/tidy_source.Rd

try the zero-width space instead of the musical symbol, since the latter may not work on Windows

779f3e0

the zero-width space doesn't work on Windows, either; try \b

3859e32

treat I(width) as the maximum width, and use brute-force if the binary search for the optimal width fails

c5b4a40

mention the global option formatR.width in NEWS and vignette

2cbd377

yihui approved these changes Mar 17, 2021

add @krivit to the list of ctb

a06d431

yihui merged commit 3064d8d into yihui:master Mar 17, 2021

yihui added a commit that referenced this pull request Mar 18, 2021

incorporate the doc suggestion from #71

6b8366d

krivit deleted the hard_deparse branch March 19, 2021 12:46
