Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added an option to tidy_source() to enforce a strict maximum line length. #71

Merged
merged 10 commits into from
Mar 17, 2021

Conversation

krivit
Copy link
Contributor

@krivit krivit commented Aug 4, 2017

This patch adds another option, width.strict= to tidy_source(), defaulting to FALSE. If TRUE, instead of calling deparse() with width.cutoff= directly, it calls a new function, strict_deparse(..., width.max), which performs a binary search for the highest width.cutoff= value to pass to deparse() such that the longest line in its output is, after stripping trailing whitespace, at most width.max= in length.

It seems to work quite well in most situations, though it still doesn't handle inline comments very well: the magic constant %InLiNe_IdEnTiFiEr% counts towards the line length, though it shouldn't. A possible solution might be to replace this magic constant with something much shorter, perhaps involving non-ASCII characters (which are legal in variable names in R, as far as I know) to reduce chances of a collision.

@krivit
Copy link
Contributor Author

krivit commented Aug 4, 2017

The behaviour with inline comments is now a little better. Unfortunately, if deparse() wraps a line just before a comment, it will get "merged" back in the unmasking code, potentially overflowing.

I've tried changing that code, but it doesn't work very well in knitr when output is interspersed with lines of code: the inline comment ends up going after the output.

@yihui
Copy link
Owner

yihui commented Aug 4, 2017

Much appreciated! I'll review this tomorrow.

@pablo14
Copy link

pablo14 commented Feb 13, 2018

I did a simpler (and less efficient) hack to tidy_block function. It tries different width until the desired sentence width is reached. Otherwise, it returns the original value.

It works for the pdf book I'm writing in most of the cases. I don't know if this approach may conflict with other use-cases.

PS: bookdown + formatR are awesome!

# wrapper around parse() and deparse(), iterativelly it tries to re-write every line with the correct width
tidy_block = function(text, width = getOption('width'), arrow = FALSE) {
  exprs = parse_only(text)
  if (length(exprs) == 0) return(character(0))
  exprs = if (arrow) replace_assignment(exprs) else as.list(exprs)
  expr_2=sapply(exprs, function(e) paste(base::deparse(e, width), collapse = '\n'))

  expr_2ver=NULL

  # for each overflow expr, iterate until it reaches the condition
  for(i in 1:length(expr_2))
  {
    expr = parse_only(expr_2[i])

    # flagging if exceeds max
    if(nchar(expr)>width)
    {
      new_width=width-1

      flag_end_while=T
      while(flag_end_while)
      {
        #new_expr=paste(base::deparse(expr, new_width), collapse = '\n')
        new_expr=sapply(expr, function(e) paste(base::deparse(e, new_width), collapse = '\n'))

        # split in several sentences
        new_expr_split=strsplit(new_expr, "\n")[[1]]

        # trim left spaces
        new_expr_split_t=trimws(new_expr_split, which = "both")

        # calculate sentence length
        expr_len=nchar(new_expr_split_t)

        # if all expr are under original width, then add expr to ok
        if(all(expr_len <= width))
        {
          expr_2ver=rbind(expr_2ver, new_expr)
          flag_end_while=F
        }

        new_width=new_width-1

        # seting min condition
        if(new_width<10)
        {
          new_expr=sapply(expr, function(e) paste(base::deparse(e, width), collapse = '\n'))
          expr_2ver=rbind(expr_2ver, new_expr)
          flag_end_while=F
        }

      } # end while
    } else {
      new_expr=sapply(expr, function(e) paste(base::deparse(e, width), collapse = '\n'))
      expr_2ver=rbind(expr_2ver, new_expr)
    }# end if width ok
  } # end for

  rownames(expr_2ver)=NULL
  return(expr_2ver[,1])
}

@krivit
Copy link
Contributor Author

krivit commented Mar 9, 2018

Just FYI, I submitted an Enhancement request to implement this at the deparse() level. (https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17390)

@krivit
Copy link
Contributor Author

krivit commented Apr 3, 2018

The R Bugzilla (https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17390) request was closed, though they are open to the possibility of a patch.

@krivit
Copy link
Contributor Author

krivit commented Apr 9, 2018

@yihui , what about using r-lib/styler? It looks like its API might allow line width enforcement in some form. Also, I think it might handle comments more elegantly than formatR.

@lorenzwalthert
Copy link

lorenzwalthert commented Aug 12, 2018

For the record: styler (as of version 1.0.2) does not currently support line width enforcement, but there is WIP to achieve this (contributed by @krivit, will be finalized by @lorenzwalthert), see r-lib/styler#414.

Copy link
Owner

@yihui yihui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@krivit I made a couple of changes in your PR:

  1. Instead of introducing a new width.strict argument, I'm treating I(width) as the maximum width. This saves one argument, but I'm open to adding a new argument if you don't like the I()dea.

  2. The binary search is actually a little tricky here, because the deparsed line width is not a monotonic function of the specified width. Sometimes the deparsed width can decrease as the desired width increases, e.g.,

    f = function(text, width) {
      unlist(lapply(width, function(w) {
        max(nchar(deparse(parse(text = text, keep.source = FALSE)[[1]], width.cutoff = w)))
      }))
    }
    x = paste(c(rep('1', 20), '1234567890'), collapse = '+')
    i = 20:100
    plot(i, f(x, i), type = 'b', pch = 20)

    image

    So when the binary search fails, I use the brute-force to try all the rest of possible widths.

Please let me know if you have any feedback. I'm going to merge this PR for now, but open to suggestions as I said earlier. Thanks a lot!

@yihui yihui merged commit 3064d8d into yihui:master Mar 17, 2021
Comment on lines +8 to +16
#'
#' If the value of the argument \code{width.cutoff} is wrapped in
#' \code{\link{I}()} (e.g., \code{I(60)}), it will be treated as the \emph{upper
#' bound} on the line width, but this upper bound may not be satisfied. In this
#' case, the function will perform a binary search for a width value that can
#' make \code{deparse()} return code with line width smaller than or equal to
#' the \code{width.cutoff} value. If the search fails to find such a value, it
#' will emit a warning, which can be suppressed by the global option
#' \code{options(formatR.width.warning = FALSE)}.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for merging this PR! It will be great to have this Just Work.

I think this paragraph isn't 100% clear, because it can be read as the upper bound potentially not being satisfied after I() is used. Suggested edit:

#' If the value of the argument \code{width.cutoff} is wrapped in
#' \code{\link{I}()} (e.g., \code{I(60)}), it will be treated as the
#' \emph{upper bound} on the line width. The corresponding argument to
#' \code{deparse()} is actually a lower bound, and so the function
#' will perform a binary search for a width value that can make
#' \code{deparse()} return code with line width smaller than or equal
#' to the \code{width.cutoff} value. If the search fails to find such
#' a value, it will emit a warning, which can be suppressed by
#' the global option \code{options(formatR.width.warning = FALSE)}.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! I'll commit that later.

yihui added a commit that referenced this pull request Mar 18, 2021
@krivit krivit deleted the hard_deparse branch March 19, 2021 12:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants