A better str for simple objects #2

hadley · 2016-03-10T16:15:40Z

Starting to noodle on this idea because I think it will be useful for the data structures chapter in R4DS

# Atomic vectors ----------------------------------------------------------

# Compactly displays type and length
somethingstr(1:100)
#> int[100]
somethingstr(letters)
#> chr[26]

# Also displays attributes. 
# Class gets special handling
somethingstr(factor(letters))
#> int[26] <factor>
#> @ levels: chr[26]

# (I imagine this being used once you've taught basic data structures
# and purrr, so it's useful to see the details, instead of the helpful
# lies that str() tells you.)

somethingstr(Sys.time())
#> dbl[1] <POSIXct, POSIXt>
#> @ tzone: chr[1]  

# Lists -------------------------------------------------------------------

# Shows hierarchy
x <- list(
  list(
    1, 
    2
  ),
  list(
    3,
    4
  )
)
somethingstr(x)
#> list[2]
#> - 1: list[2]
#>    - 1: dbl[1]
#>    - 2: dbl[1]
#> - 2: list[2]
#>    - 1: dbl[1]
#>    - 2: dbl[1]

# Very long lists are truncated
x <- replicate(100, list(runif(5)))
somethingstr(x)

#> list[100]
#> - 1: dbl[5]
#> - 2: dbl[5]
#> - 3: dbl[5]
#> - 4: dbl[5]
#> - 5: dbl[5]
#> ...

# So are very deep lists
x <- list()
for (i in 1:100) x <- list(x)
somethingstr(x)

#> list[1]
#> 1. list[1]
#>    1. list[1]
#>       1. list[1]
#>          1. ...

# And length and depth interplay in some complicated way. Maybe the way
# to think about it is that you want to (say) print at most 100 lines.  
# How should you allocate those lines to best display the structure of
# the object? I don't think simple cut-offs for length vs. depth will
# work in general. 

# Think about something() on a data frame containing models etc.
# Maybe can assume unnamed lists are generally homogeneous?

# Names get special treatment
somethingstr(mtcars)
#> list[11] <data.frame>
#> $ mpg : dbl[32]
#> $ cyl : dbl[32]
#> $ disp: dbl[32]
#> $ hp  : dbl[32]
#> $ drat: dbl[32]
#> $ wt  : dbl[32]
#> $ qsec: dbl[32]
#> $ vs  : dbl[32]
#> $ am  : dbl[32]
#> $ gear: dbl[32]
#> $ carb: dbl[32]
#> @ row.names: chr[32]

# Very long names get truncated
x <- list(this_is_a_very_very_very_very_long_name = 1:10)
somethingstr(x)
#> list[1]
#> $ this_is_a_very_...: int[10]


# Environments ----------------------------------------------------------------

# Need some way to control recursion into environments. Probably don't
# want it on by default because there are too many objects that have 
# (possibly big) environments attached (e.g. formulas)
somethingstr(globalenv())
#> env[2] [R_GlobalEnv]

somethingstr(globalenv(), show_env = 1L)
#> env[2] [R_GlobalEnv]
#> $ df: list[1] <data.frame>
#>       $x: int[100]
#>       @row.names: int[1]
#> $ i:  int[10]
#> @parent.env: env[10] [tools:rstudio]

# show_env = 2L would also show the contents of parent.env.

# Functions ---------------------------------------------------------------

somethingstr(function(x = 1:10, y = x) {})
#> func[2] 
#>   $x: `1:10
#>   $y: `x
#> * env: env[4] [R_GlobalEnv]

cc @jennybc, @lionel-
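
For illustration, the one-line summaries above (type, length, class, then attributes) could be produced roughly as follows. This is only a sketch: `type_abbr()`, `summarise_one()` and `summarise_attrs()` are hypothetical names, not part of any package, and the choice to hide `names` and `class` from the attribute list is an assumption.

```r
# Hypothetical helper: abbreviate the base type, e.g. "integer" -> "int".
type_abbr <- function(x) {
  switch(typeof(x),
    logical = "lgl", integer = "int", double = "dbl", character = "chr",
    complex = "cpl", raw = "raw", list = "list", environment = "env",
    closure = , builtin = , special = "func",
    typeof(x)  # fall back to the raw type name
  )
}

# One-line summary such as "int[26] <factor>".
summarise_one <- function(x) {
  out <- sprintf("%s[%d]", type_abbr(x), length(x))
  cls <- attr(x, "class")  # explicit class attribute only, not the implicit class
  if (!is.null(cls)) {
    out <- paste0(out, " <", paste(cls, collapse = ", "), ">")
  }
  out
}

# One "@ name: summary" line per attribute, skipping class and names.
summarise_attrs <- function(x) {
  attrs <- attributes(x)
  attrs <- attrs[setdiff(names(attrs), c("class", "names"))]
  vapply(names(attrs), function(nm) {
    sprintf("@ %s: %s", nm, summarise_one(attrs[[nm]]))
  }, character(1), USE.NAMES = FALSE)
}

summarise_one(factor(letters))
summarise_attrs(factor(letters))
```

Using `typeof()` rather than `class()` for the leading label is what makes a factor show up as `int[26]`, which matches the intent of showing the underlying structure.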

lionel- · 2016-03-10T17:11:55Z

Possible short name: info()

> Also displays attributes.
> Class gets special handling

Really nice!

> So are very deep lists [truncated]

It would be cool to have a more flexible control of the levels to display than str()'s max.level argument. For instance specify the depth from the bottom of the list, or with a range:

info(deep, 3)        # 3 first levels
info(deep, -3)       # 3 last levels
info(deep, c(5, 10)) # levels in the range [5, 10]
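
One way the three forms of that argument might be resolved into a set of levels to print, assuming the maximum depth of the object is already known. `resolve_levels()` is a hypothetical helper, and clamping the result to `1..depth` is an assumption rather than something specified here.

```r
# Hypothetical: turn a level spec into the integer levels to display.
#   n > 0    -> the first n levels
#   n < 0    -> the last |n| levels
#   c(a, b)  -> levels a through b
resolve_levels <- function(spec, depth) {
  stopifnot(is.numeric(spec), length(spec) %in% 1:2, depth >= 1)
  if (length(spec) == 2) {
    lo <- spec[1]
    hi <- spec[2]
  } else if (spec > 0) {
    lo <- 1
    hi <- spec
  } else {
    lo <- depth + spec + 1
    hi <- depth
  }
  # Clamp to the levels that actually exist
  lo <- max(lo, 1)
  hi <- min(hi, depth)
  if (lo > hi) integer() else seq.int(lo, hi)
}

resolve_levels(3, 10)         # levels 1 to 3
resolve_levels(-3, 10)        # levels 8 to 10
resolve_levels(c(5, 10), 7)   # levels 5 to 7, clamped to the depth
```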

> And length and depth interplay in some complicated way.

Right, it really depends what the user is looking for. Maybe reduce the amount of information displayed as we go down the hierarchy? Then we can use the level argument to get more complete info about deeper levels.

#> list[1]
#> 1. list[1]
#>    1. list[1]
#>       1. list[1]
#>          1. list ...(5-96)  # Some kind of hint about remaining depth?

> Need some way to control recursion into environments.

Maybe treat first level objects specially. So info(env) gives the full info while info(list(env)) doesn't. If the user specifically wants to investigate the environment, she probably wants to have the details. Linked to the idea of reducing the amount of info as we go down the hierarchy.

lionel- · 2016-03-10T17:33:15Z

Ideas for displaying functions:

Display environment if it has a name (e.g. a namespace or the globalenv)
Display only parameter names, not default arguments.
Use 4 dots to truncate long parameter lists

l <- list(data, write.table, list)

str(l)
#> List of 3
#>  $ :function (..., list = character(), package = NULL, lib.loc = NULL, verbose = getOption("verbose"), 
#>     envir = .GlobalEnv)
#>  $ :function (x, file = "", append = FALSE, quote = TRUE, sep = " ", eol = "\n", 
#>     na = "NA", dec = ".", row.names = TRUE, col.names = TRUE, qmethod = c("escape", 
#>         "double"), fileEncoding = "")
#>  $ :function (...)
#>   ..- attr(*, "class")= chr [1:2] "namespace" "roclet"

info(l)
#> list[3]
#> - 1: function[utils]
#>      @args: ..., list, package, lib.loc, ....
#> - 2: function[utils]
#>      @args: x, file, append, quote, ....
#> - 3: function[base] <namespace, roclet>
#>      @args: ...
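
A rough sketch of how those three rules could be implemented with base R. `summarise_fun()` and its `max_args` argument are hypothetical, and labelling primitives (which have no environment) as `base` is an assumption.

```r
# Hypothetical: summarise a function by its environment name and argument names.
summarise_fun <- function(f, max_args = 4) {
  stopifnot(is.function(f))
  # args() also works for most primitives, whose formals() would be NULL
  arg_names <- names(formals(args(f)))
  if (length(arg_names) > max_args) {
    # Four dots mark truncation, so they can't be confused with a `...` argument
    arg_names <- c(arg_names[seq_len(max_args)], "....")
  }
  env <- environment(f)
  env_name <- if (is.null(env)) "base" else environmentName(env)
  # Only show the environment when it has a name
  header <- if (nzchar(env_name)) sprintf("function[%s]", env_name) else "function"
  c(header, paste0("  @args: ", paste(arg_names, collapse = ", ")))
}

summarise_fun(utils::write.table)
summarise_fun(base::list)
```

`environmentName()` returns an empty string for anonymous environments, so a closure created inside another function would print as a bare `function` with no bracketed name.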

jennybc · 2016-03-11T04:49:38Z

Long live somethingstr()!

> And length and depth interplay in some complicated way. Maybe the way
> to think about it is that you want to (say) print at most 100 lines.
> How should you allocate those lines to best display the structure of
> the object? I don't think simple cut-offs for length vs. depth will
> work in general.

This rings very true. max.level and especially list.len feel like very blunt instruments. I always thought I wanted list.len to be vectorized, but maybe you're right that that wouldn't really solve the problem.

If you knew that the list had repeated structure, then you want to see detail on one element, presumably the first, and then just a note that there are 99 more things of a similar nature. But that ties back to the separate problem of detecting repeated structure. Maybe it would be good to record when you know a list has repeated structure by its very construction (i.e. df %>% group_by() %>% nest() ?%>% mutate()? or lst %>% map()). You get those for free! These sorts of lists remind me of short tandem repeats in a genome.
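
When the construction history isn't recorded, repeated structure could be detected after the fact with a shallow check. The sketch below treats elements as "similar" if they share base type, class, and names; that definition and the helper names `signature_of()` and `is_homogeneous()` are assumptions made for illustration.

```r
# Hypothetical: a cheap, shallow fingerprint of one list element.
signature_of <- function(el) {
  paste(c(typeof(el), class(el), names(el)), collapse = "/")
}

# TRUE if every element of the list has the same fingerprint.
is_homogeneous <- function(x) {
  stopifnot(is.list(x))
  sigs <- vapply(x, signature_of, character(1))
  length(unique(sigs)) <= 1
}

is_homogeneous(lapply(1:100, function(i) runif(5)))  # same type throughout
is_homogeneous(list(1, "a"))                         # mixed types
is_homogeneous(list(list(a = 1), list(a = 2)))       # records with the same names
```

The check only looks one level down, so two elements with the same names but differently shaped children would still count as similar.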

hadley · 2016-03-11T14:13:16Z

@jennybc I think that circles back to the purrr issue. I think we need a "homogeneous" list class that just asserts that all the elements of the list are of the same type.

@lionel- what do you imagine looking at the last 3 levels in a list would look like?

For a bit more context, I'm imagining that in the near future (< 12 months) RStudio will gain an interactive widget that lets you drill down iteratively into a deeply nested list. So this function doesn't need to solve every deeply nested navigation problem, it just needs to give a decent textual output.

lionel- · 2016-03-11T15:10:38Z

> what do you imagine looking at the last 3 levels in a list would look like?

The function would figure out the depth of each branch, and only show the branches deep enough:

l <-
  list(
    list(
      list(
        list(
          4
        ),
        list(
          list(
            5
          )
        )
      )
    ),
    list(
      2
    )
  )

info(l, -3)
#> ---levels 3 to 5---
#> list[2]
#> - 1: list[1]
#>    - 1: dbl[1]
#> - 2: list[1]
#>    - 1: list[1]
#>       - 1: dbl[1]
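
The "figure out the depth of each branch" step could look something like this. `list_depth()` is a hypothetical helper that counts nested list levels only, so an atomic leaf contributes 0.

```r
# Hypothetical: number of nested list levels, counting the list itself.
list_depth <- function(x) {
  if (!is.list(x)) return(0L)      # atomic leaves add no list level
  if (length(x) == 0) return(1L)   # an empty list is still one level
  1L + max(vapply(x, list_depth, integer(1)))
}

l <- list(
  list(list(list(4), list(list(5)))),
  list(2)
)

list_depth(l)                        # overall depth of the example list
vapply(l, list_depth, integer(1))    # depth of each top-level branch
```

Under this counting the example list has five list levels, so the last three are levels 3 to 5. The second top-level branch is only one level deep, which is why it never reaches level 3 and is left out of the output.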

hadley · 2016-03-21T22:34:20Z

Challenging list from @jennybc at https://gist.github.com/jennybc/12d75a88edf37cc996eb

jennybc · 2016-03-28T15:44:32Z

Another good example list: foo from foo <- test_dir("tests/testthat/"). It's (arguably?) a homogeneous list with a couple levels of nesting and simply playing with str(..., max.level = ?) doesn't produce great results.

hadley · 2018-04-03T18:09:52Z

Two recent ideas:

We could call this function rts(), short for restrained tree structure (and also str() in reverse).
For unnamed lists we could display the first element recursively, the 2nd-5th elements in summary form, and then display "... and x more" for the rest. I think this might be a reasonable heuristic for long lists.
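
A minimal sketch of that heuristic, to make the proposal concrete. `brief()` and `sketch_list()` are hypothetical names, the summary uses raw `typeof()` labels rather than abbreviations, and there is no depth cut-off, so a very deep list would still print in full along its first branch.

```r
# Hypothetical one-line summary: base type and length.
brief <- function(x) sprintf("%s[%d]", typeof(x), length(x))

# First element shown recursively, elements 2..n_summary in summary form,
# and everything after that collapsed into a single "... and x more" line.
sketch_list <- function(x, indent = 0, n_summary = 5) {
  pad <- strrep(" ", indent)
  lines <- paste0(pad, brief(x))
  for (i in seq_along(x)) {
    if (i > n_summary) {
      lines <- c(lines, sprintf("%s... and %d more", pad, length(x) - n_summary))
      break
    }
    el <- x[[i]]
    if (i == 1 && is.list(el)) {
      # Recurse into the first element, then relabel its header line
      child <- sketch_list(el, indent + 3, n_summary)
      child[1] <- sprintf("%s- 1: %s", pad, brief(el))
      lines <- c(lines, child)
    } else {
      lines <- c(lines, sprintf("%s- %d: %s", pad, i, brief(el)))
    }
  }
  lines
}

x <- lapply(1:100, function(i) runif(5))
writeLines(sketch_list(x))
```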

hadley · 2018-04-24T14:48:50Z

Also need to think this through post-BigQuery insights - is a list-col an array or a record or a repeated record? I think we could have a method to impute the "type" of a list column, and then display arrays, records, and repeated records in different ways.

hadley · 2018-05-30T14:39:39Z

This is particularly important for list-columns since there's no good way to see them currently.

wch · 2021-03-04T18:46:38Z

Here's a screenshot of the printing of a nested list structure, from something I'm working on. Some of the ideas may be useful here:

Some notes about it:

Named lists are indicated with {}, and unnamed lists are indicated with []
The tree diagram makes it easier to see what's connected to what.
For both named and unnamed lists, it shows the names/indexes of children. This makes it easy to traverse nested objects to get to the object that you want. For example, it's easy to tell what x[[2]]$c[[3]] refers to.
For lists that have a small number of atomic children, it prints it all on one line.
Atomic types are printed in a different color.
The S3 class of the objects is in the braces/brackets (like {Block}). This is useful for my use case, but I'm not sure this makes a lot of sense in general.
One thing that may be confusing is that the named entries t and c sometimes appear on the same line (when they are both atomic), but if c is another list, it is displayed on a new branch going down.

timelyportfolio mentioned this issue Mar 17, 2016

How can I save the modifications ? timelyportfolio/listviewer#5

Open

krlmlr mentioned this issue May 11, 2016

Idea: Limit height of trunc_mat() output tidyverse/tibble#73

Closed

hadley added the feature a feature request or enhancement label Dec 20, 2018

hadley mentioned this issue Apr 6, 2021

Add tree function to print nested structure list-like objects #56

Merged
