Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add argument to suppress whitespace #131

Merged
merged 12 commits into from
Jul 24, 2019
Merged

Add argument to suppress whitespace #131

merged 12 commits into from
Jul 24, 2019

Conversation

jcheng5
Copy link
Member

@jcheng5 jcheng5 commented Jun 11, 2019

This PR adds a .noWS option to the tag() function and to all the predefined tags. This parameter can be used to suppress whitespace around a particular tag. One such example:

> tags$span("Click my '", tags$a(href="https://rstudio.com", "link"), "'.")
<span>
  Click my '
  <a href="https://rstudio.com">link</a>
  '.
</span>

Note all the extra newlines and whitespace that got inserted around these tag elements. This would adversely affect how the tags get displayed in the browser. As of this PR, you can now do:

> tags$span("Click my '", tags$a(href="https://rstudio.com", "link", .noWS="outside"), "'.", .noWS=c("after-begin", "before-end"))
<span>Click my '<a href="https://rstudio.com">link</a>'.</span>

to render everything on one line. Note that the option is not inherited by child tags -- it only impacts the tag for which it's specified.

Valid .noWS options include:

  • before - before the opening tag
  • after - after the closing tag
  • after-begin - after the opening tag
  • before-end - before the closing tag
  • outside - before the opening tag AND after the closing tag

TODOs

  • Need to make .noWS a formal argument instead of an undocumented attribute
  • Need to document the enumerated values for .noWS
  • Validate .noWS arguments to ensure that nothing's misspelled
  • Update NEWS, bump version?
  • Why is .noWS whitespace-delimted? Shouldn't it just be a regular character vector?

Testing Notes

(Note that testing of this PR and #132 should be done at once -- no need for separate validation.)

Create tags or reference existing tags with the .noWS option specified and watch the whitespace get removed!

Try out all of the available .noWS options listed above to confirm that the behavior is as described.

Printing out code like tags$span("Click my '", tags$a(href="https://rstudio.com", "link", .noWS="outside"), "'.", on the R command line is sufficient to confirm that the whitespace is properly getting suppressed as it should.

Use the `.noWS` attribute to strip whitespace from:

- `before` the tag
- `after` the tag
- `after-begin` of the tag (no whitespace between begin tag and the
  first child)
- `before-end` of the tag (no whitespace between last child and the
  beginning of the end tag)

The .noWS attribute is valid on all tags and is a space-separated
string containing any combination of the above four directives.

Open issues/tasks:
- Needs unit tests
- API is gross
- Need to decide on default behavior (automatic based on block/inline?)
- savePosition() maybe should be done implicitly (every write() calls
  savePosition(), every writeWS() does not?)
Separated the logic of bookmarking/suppressing, and how those
features are invoked, into two distinct R6 classes that are
each easy to reason about independently. Also, the API surface
area that the (already complicated) tagWrite function needs to
worry about has been greatly decreased. Now, tagWrite just
needs to call write() or writeWS(), and call eatWS() to eat all
recent/upcoming whitespace.
Copy link
Contributor

@trestletech trestletech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly notes to myself but a few actual questions and opining about the API.

R/tags.R Show resolved Hide resolved
R/utils.R Show resolved Hide resolved
R/utils.R Outdated Show resolved Hide resolved
R/utils.R Show resolved Hide resolved
R/utils.R Outdated Show resolved Hide resolved
R/utils.R Show resolved Hide resolved
R/utils.R Show resolved Hide resolved
R/utils.R Show resolved Hide resolved
tests/testthat/test-textwriter.r Show resolved Hide resolved
tests/testthat/test-textwriter.r Show resolved Hide resolved
R/tags.R Show resolved Hide resolved
@trestletech
Copy link
Contributor

I did some quick performance tests and it looks like this slows things down quite a bit. I haven't profiled to see where the time is going, but I'm seeing 60-100% extra time taken relative to master:

size <- 10000
system.time(l <- tagList(lapply(1:size, function(x){htmltools::tags$option(value=x, .noWS="outside before")})))
system.time(s <- as.character(l))

### .noWS branch
#> size <- 10000
#> system.time(l <- tagList(lapply(1:size, function(x){htmltools::tags$option(value=x)})))
#user  system elapsed 
#0.253   0.001   0.254 
#> system.time(s <- as.character(l))
#user  system elapsed 
#2.423   0.169   2.596

### .noWS branch with `.noWS="outside before"`
# > size <- 10000
# > system.time(l <- tagList(lapply(1:size, function(x){htmltools::tags$option(value=x, .noWS="outside before")})))
# user  system elapsed 
# 0.250   0.001   0.251 
# > system.time(s <- as.character(l))
# user  system elapsed 
# 2.774   0.179   2.957 

### master
#> size <- 10000
#> system.time(l <- tagList(lapply(1:size, function(x){htmltools::tags$option(value=x)})))
#user  system elapsed 
#0.246   0.005   0.252 
#> system.time(s <- as.character(l))
#user  system elapsed 
#1.574   0.014   1.592

@trestletech trestletech force-pushed the joe/feature/nows branch 2 times, most recently from c323515 to 7c0d985 Compare July 9, 2019 15:01
@trestletech
Copy link
Contributor

After a bit of Travis thrashing followed by a force-push-fest to clean up the commit mess... ready for rereview, @jcheng5!

The one outstanding concern for me is the performance issue described in the comment above. I'm wondering if we should hold this PR back until we take on some of the C++ performance improvements. I think I could start on that work whether or not this is on master so long as we're not anticipating any serious conflicts to emerge in the coming couple of weeks.

Runtime w/o .noWS specified improves ~15%
Runtime w/ .noWS specified improves ~9%

Still <= 70% slower than the original implementation w/o whitespace considerations.
@trestletech
Copy link
Contributor

trestletech commented Jul 10, 2019

I'm doing some experimentation on performance here. Raw performance notes from running this code:

size <- 1000
l <- tagList(lapply(1:size, function(x){htmltools::tags$option(value=x)}))
bench::mark(min_time=5,{
  s <- as.character(l)
})

By the end, it looks like we have an approach that will get us back to parity with master if we clean up that branch.

master

# A tibble: 1 x 13
  expression                 min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory time 
  <bch:expr>               <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list> <lis>
1 { s <- as.character(l) } 152ms  173ms      5.71     430KB     7.38    58    75      10.2s <chr … <df[,… <bch…
# … with 1 more variable: gc <list>

binary connection-based

# A tibble: 1 x 13
  expression                 min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
  <bch:expr>               <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>
1 { s <- as.character(l) } 213ms  239ms      4.20     508KB     8.80    21    44
# … with 5 more variables: total_time <bch:tm>, result <list

buffer-based instead of connection (b09ee)

# A tibble: 1 x 13
  expression                 min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
  <bch:expr>               <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>
1 { s <- as.character(l) } 295ms  326ms      3.05     129MB     11.3    16    59
# … with 5 more variables: total_time <bch:tm>, result <list>, memory <list>,
#   time <list>, gc <list>

buffer-based with periodic string collection (345eff0)

# A tibble: 1 x 13
  expression                 min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
  <bch:expr>               <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>
1 { s <- as.character(l) } 220ms  235ms      4.28    4.39MB     9.15    22    47
# … with 5 more variables: total_time <bch:tm>, result <list>, memory <list>,
#   time <list>, gc <list>

buffer-based with string collection and double-assign to avoid copies

# A tibble: 1 x 13
  expression                 min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
  <bch:expr>               <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>
1 { s <- as.character(l) } 200ms  221ms      4.51    40.1MB     9.42    23    48
# … with 5 more variables: total_time <bch:tm>, result <list>, memory <list>,
#   time <list>, gc <list>

Same with portable=FALSE (aa8c3ad)

# A tibble: 1 x 13
  expression                 min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
  <bch:expr>               <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm>
1 { s <- as.character(l) } 193ms  210ms      4.74     293KB     10.1    24    51      5.07s
# … with 4 more variables: result <list>, memory <list>, time <list>, gc <list>

Same after eliminating TextConnection and just using WS directly (de64260)

# A tibble: 1 x 13
  expression                 min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory time 
  <bch:expr>               <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list> <lis>
1 { s <- as.character(l) } 159ms  181ms      5.35     656KB     7.43    54    75      10.1s <chr … <df[,… <bch…
# … with 1 more variable: gc <list>

after removing R6 and using a closure directly (5e5bab0)

# A tibble: 1 x 13
  expression                 min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result memory time 
  <bch:expr>               <bch> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list> <list> <lis>
1 { s <- as.character(l) } 142ms  166ms      5.74     271KB     7.72    58    78      10.1s <chr … <df[,… <bch…
# … with 1 more variable: gc <list>

R/tags.R Outdated Show resolved Hide resolved
R/tags.R Outdated Show resolved Hide resolved
R/tags.R Outdated
@@ -356,7 +366,8 @@ tag <- function(`_tag_name`, varArgs) {
structure(
list(name = `_tag_name`,
attribs = attribs,
children = children),
children = children,
.noWS = .noWS),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try omitting the .noWS totally if NULL and see if that hashes in the test revert so we don't change everything.

@amrrs
Copy link

amrrs commented Jan 13, 2020

@jcheng5 @wch could it be used with htmlTemplate(). It's still adding linebreaks

@jcheng5
Copy link
Member Author

jcheng5 commented Jan 13, 2020

@amrrs Oh, that's a good point. I think we'd actually need to add this feature to HTML() as well. This code:

div(HTML("a"), HTML("b"), .noWS = c("after-begin", "before-end"))

renders like:

<div>a
  b</div>

There's not currently a way AFAIK to remove the whitespace between a and b, and that's basically what we'd need to make htmlTemplate eat the whitespace.

@jcubic
Copy link

jcubic commented Jan 14, 2020

What about space around {{ ?? }}?

@jcheng5
Copy link
Member Author

jcheng5 commented Jan 15, 2020

@jcubic That's what I was referring to in my previous comment, sorry it wasn't clear. htmlTemplate is essentially implemented by stringing together HTML() segments, as soon as we support whitespace eating around those segments then {{ ... }} can be fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants