fix: add pretty = TRUE
Layalchristine24 committed Jun 1, 2024
1 parent c4d6f5b commit 08bd24b
Showing 2 changed files with 4 additions and 4 deletions.
@@ -1,8 +1,8 @@
{
-"hash": "ddfa2080c82650a8e6fa48a3a62d62f7",
+"hash": "aa967c72db2dc1199c8add9f44362403",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: What is the difference between (un)packing and (un)nesting a tibble?\nauthor:\n - name:\n given: Layal Christine\n family: Lettry\n orcid: 0009-0008-6396-0523\n affiliations:\n - id: cynkra\n - name: cynkra GmbH\n city: Zurich\n state: CH\n - id: unifr\n - name: University of Fribourg, Dept. of Informatics, ASAM Group\n city: Fribourg\n state: CH\ndate: 2024-05-30\ncategories: [nest, unnest, pack, unpack, tidyr, constructive]\nimage: image.jpg\ncitation: \n url: https://rdiscovery.netlify.app/posts/2024-05-30_pack-nest/\nformat:\n html:\n toc: true\n toc-depth: 6\n toc-title: Contents\n toc-location: right\n number-sections: false\neditor_options: \n chunk_output_type: console\n---\n\n\n*Does a nested tibble have the same structure as a packed tibble?*\n\n# Initial object\n\nLet's assume that we have the object `my_tib` which is a nested tibble containing a list, namely `my_values`, with another tibble where the variables are `my_ints` and `my_chars`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_tib <-\n tibble::tibble(\n my_values = list(tibble::tibble(\n my_ints = 1L:5L,\n my_chars = LETTERS[my_ints]\n ))\n )\nconstructive::construct(my_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntibble::tibble(\n my_values = list(tibble::tibble(my_ints = 1:5, my_chars = c(\"A\", \"B\", \"C\", \"D\", \"E\"))),\n)\n```\n\n\n:::\n:::\n\n\nWe could also use `tidyr::nest()` to create `my_tib` (please refer to [this article](https://tidyr.tidyverse.org/articles/nest.html) for more info).\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_nested_tib <-\n tibble::tribble(\n ~my_ints, ~my_chars,\n 1L, \"A\",\n 2L, \"B\",\n 3L, \"C\",\n 4L, \"D\",\n 5L, \"E\"\n ) |>\n tidyr::nest(my_values = c(my_ints, my_chars))\n\nconstructive::construct(my_nested_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntibble::tibble(\n my_values = list(tibble::tibble(my_ints = 1:5, my_chars = c(\"A\", \"B\", \"C\", \"D\", \"E\"))),\n)\n```\n\n\n:::\n:::\n\n\nAs you 
can see, there is no difference between `my_tib` and `my_nested_tib`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nwaldo::compare(my_tib, my_nested_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n✔ No differences\n```\n\n\n:::\n:::\n\n\n# What is the difference between a nested and a packed tibble?\n\nTo obtain a packed tibble, we should pack the variables `my_ints` and `my_chars` together so that we have a tibble in another tibble instead of a list with an element that is a tibble.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_packed_tib <-\n tibble::tribble(\n ~my_ints, ~my_chars,\n 1L, \"A\",\n 2L, \"B\",\n 3L, \"C\",\n 4L, \"D\",\n 5L, \"E\"\n ) |>\n tidyr::pack(my_values = c(my_ints, my_chars))\nconstructive::construct(my_packed_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntibble::tibble(\n my_values = tibble::tibble(my_ints = 1:5, my_chars = c(\"A\", \"B\", \"C\", \"D\", \"E\")),\n)\n```\n\n\n:::\n:::\n\n\nWe can assess the difference between `my_nested_tib` and `my_packed_tib` with `waldo::compare()`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nwaldo::compare(my_nested_tib, my_packed_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n`attr(old, 'row.names')`: 1 \n`attr(new, 'row.names')`: 1 2 3 4 5\n\n`old$my_values` is a list\n`new$my_values` is an S3 object of class <tbl_df/tbl/data.frame>, a list\n```\n\n\n:::\n:::\n\n\nThis tells us that `my_nested_tib` has only one row and contains the variable `my_values` that is a list, whereas `my_packed_tib` has 5 rows and is constituted by the variable `my_values` that has, in this case, the class `data.frame`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(my_packed_tib$my_values)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"tbl_df\" \"tbl\" \"data.frame\"\n```\n\n\n:::\n:::\n\n\nFor the record, a data frame is a special list where every element has the same length.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntypeof(my_packed_tib$my_values)\n```\n\n::: {.cell-output 
.cell-output-stdout}\n\n```\n[1] \"list\"\n```\n\n\n:::\n:::\n\n\n# How to unnest or unpack a tibble?\n\nTo get a tibble without any variable that is a list or a tibble, we should unnest and, respectively, unpack our nested/packed tibble.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_unnested_tib <-\n my_nested_tib |>\n tidyr::unnest(my_values)\n\nconstructive::construct(my_unnested_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntibble::tibble(my_ints = 1:5, my_chars = c(\"A\", \"B\", \"C\", \"D\", \"E\"))\n```\n\n\n:::\n:::\n\n\nNow, we have a simple tibble with two variables instead of one single variable that is a list.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_unpacked_tib <-\n my_packed_tib |>\n tidyr::unpack(my_values)\n\nconstructive::construct(my_unpacked_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntibble::tibble(my_ints = 1:5, my_chars = c(\"A\", \"B\", \"C\", \"D\", \"E\"))\n```\n\n\n:::\n:::\n\n\nHere again, we obtain a simple tibble with two variables instead of one single variable that has the class `data.frame`.\n\n# What do the packed tibble and nested tibble look like in a JSON format?\n\nThe main difference is that the instances of the variable `my_values` of the nested tibble will be written between extra square brackets to represent the `list` class of `my_values`, whereas those of the packed tibble will only be displayed between curly brackets given that `my_values` has the class `data.frame` in the packed case. 
\n\n\n::: {.cell}\n\n```{.r .cell-code}\njsonlite::toJSON(my_nested_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[{\"my_values\":[{\"my_ints\":1,\"my_chars\":\"A\"},{\"my_ints\":2,\"my_chars\":\"B\"},{\"my_ints\":3,\"my_chars\":\"C\"},{\"my_ints\":4,\"my_chars\":\"D\"},{\"my_ints\":5,\"my_chars\":\"E\"}]}] \n```\n\n\n:::\n\n```{.r .cell-code}\njsonlite::toJSON(my_packed_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[{\"my_values\":{\"my_ints\":1,\"my_chars\":\"A\"}},{\"my_values\":{\"my_ints\":2,\"my_chars\":\"B\"}},{\"my_values\":{\"my_ints\":3,\"my_chars\":\"C\"}},{\"my_values\":{\"my_ints\":4,\"my_chars\":\"D\"}},{\"my_values\":{\"my_ints\":5,\"my_chars\":\"E\"}}] \n```\n\n\n:::\n:::\n",
"markdown": "---\ntitle: What is the difference between (un)packing and (un)nesting a tibble?\nauthor:\n - name:\n given: Layal Christine\n family: Lettry\n orcid: 0009-0008-6396-0523\n affiliations:\n - id: cynkra\n - name: cynkra GmbH\n city: Zurich\n state: CH\n - id: unifr\n - name: University of Fribourg, Dept. of Informatics, ASAM Group\n city: Fribourg\n state: CH\ndate: 2024-05-30\ncategories: [nest, unnest, pack, unpack, tidyr, json, constructive]\nimage: image.jpg\ncitation: \n url: https://rdiscovery.netlify.app/posts/2024-05-30_pack-nest/\nformat:\n html:\n toc: true\n toc-depth: 6\n toc-title: Contents\n toc-location: right\n number-sections: false\neditor_options: \n chunk_output_type: console\n---\n\n\n*Does a nested tibble have the same structure as a packed tibble?*\n\n# Initial object\n\nLet's assume that we have the object `my_tib` which is a nested tibble containing a list, namely `my_values`, with another tibble where the variables are `my_ints` and `my_chars`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_tib <-\n tibble::tibble(\n my_values = list(tibble::tibble(\n my_ints = 1L:5L,\n my_chars = LETTERS[my_ints]\n ))\n )\nconstructive::construct(my_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntibble::tibble(\n my_values = list(tibble::tibble(my_ints = 1:5, my_chars = c(\"A\", \"B\", \"C\", \"D\", \"E\"))),\n)\n```\n\n\n:::\n:::\n\n\nWe could also use `tidyr::nest()` to create `my_tib` (please refer to [this article](https://tidyr.tidyverse.org/articles/nest.html) for more info).\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_nested_tib <-\n tibble::tribble(\n ~my_ints, ~my_chars,\n 1L, \"A\",\n 2L, \"B\",\n 3L, \"C\",\n 4L, \"D\",\n 5L, \"E\"\n ) |>\n tidyr::nest(my_values = c(my_ints, my_chars))\n\nconstructive::construct(my_nested_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntibble::tibble(\n my_values = list(tibble::tibble(my_ints = 1:5, my_chars = c(\"A\", \"B\", \"C\", \"D\", \"E\"))),\n)\n```\n\n\n:::\n:::\n\n\nAs 
you can see, there is no difference between `my_tib` and `my_nested_tib`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nwaldo::compare(my_tib, my_nested_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n✔ No differences\n```\n\n\n:::\n:::\n\n\n# What is the difference between a nested and a packed tibble?\n\nTo obtain a packed tibble, we should pack the variables `my_ints` and `my_chars` together so that we have a tibble in another tibble instead of a list with an element that is a tibble.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_packed_tib <-\n tibble::tribble(\n ~my_ints, ~my_chars,\n 1L, \"A\",\n 2L, \"B\",\n 3L, \"C\",\n 4L, \"D\",\n 5L, \"E\"\n ) |>\n tidyr::pack(my_values = c(my_ints, my_chars))\nconstructive::construct(my_packed_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntibble::tibble(\n my_values = tibble::tibble(my_ints = 1:5, my_chars = c(\"A\", \"B\", \"C\", \"D\", \"E\")),\n)\n```\n\n\n:::\n:::\n\n\nWe can assess the difference between `my_nested_tib` and `my_packed_tib` with `waldo::compare()`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nwaldo::compare(my_nested_tib, my_packed_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n`attr(old, 'row.names')`: 1 \n`attr(new, 'row.names')`: 1 2 3 4 5\n\n`old$my_values` is a list\n`new$my_values` is an S3 object of class <tbl_df/tbl/data.frame>, a list\n```\n\n\n:::\n:::\n\n\nThis tells us that `my_nested_tib` has only one row and contains the variable `my_values` that is a list, whereas `my_packed_tib` has 5 rows and is constituted by the variable `my_values` that has, in this case, the class `data.frame`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(my_packed_tib$my_values)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"tbl_df\" \"tbl\" \"data.frame\"\n```\n\n\n:::\n:::\n\n\nFor the record, a data frame is a special list where every element has the same length.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntypeof(my_packed_tib$my_values)\n```\n\n::: {.cell-output 
.cell-output-stdout}\n\n```\n[1] \"list\"\n```\n\n\n:::\n:::\n\n\n# How to unnest or unpack a tibble?\n\nTo get a tibble without any variable that is a list or a tibble, we should unnest and, respectively, unpack our nested/packed tibble.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_unnested_tib <-\n my_nested_tib |>\n tidyr::unnest(my_values)\n\nconstructive::construct(my_unnested_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntibble::tibble(my_ints = 1:5, my_chars = c(\"A\", \"B\", \"C\", \"D\", \"E\"))\n```\n\n\n:::\n:::\n\n\nNow, we have a simple tibble with two variables instead of one single variable that is a list.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_unpacked_tib <-\n my_packed_tib |>\n tidyr::unpack(my_values)\n\nconstructive::construct(my_unpacked_tib)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\ntibble::tibble(my_ints = 1:5, my_chars = c(\"A\", \"B\", \"C\", \"D\", \"E\"))\n```\n\n\n:::\n:::\n\n\nHere again, we obtain a simple tibble with two variables instead of one single variable that has the class `data.frame`.\n\n# What do the packed tibble and nested tibble look like in a JSON format?\n\nThe main difference is that the instances of the variable `my_values` of the nested tibble will be written between extra square brackets to represent the `list` class of `my_values`. On the contrary, each row of the variable `my_values` of the packed tibble will be displayed separately between curly brackets given that `my_values` has the class `data.frame` in the packed case. 
\n\n\n::: {.cell}\n\n```{.r .cell-code}\njsonlite::toJSON(my_nested_tib, pretty = TRUE)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[\n {\n \"my_values\": [\n {\n \"my_ints\": 1,\n \"my_chars\": \"A\"\n },\n {\n \"my_ints\": 2,\n \"my_chars\": \"B\"\n },\n {\n \"my_ints\": 3,\n \"my_chars\": \"C\"\n },\n {\n \"my_ints\": 4,\n \"my_chars\": \"D\"\n },\n {\n \"my_ints\": 5,\n \"my_chars\": \"E\"\n }\n ]\n }\n] \n```\n\n\n:::\n\n```{.r .cell-code}\njsonlite::toJSON(my_packed_tib, pretty = TRUE)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[\n {\n \"my_values\": {\n \"my_ints\": 1,\n \"my_chars\": \"A\"\n }\n },\n {\n \"my_values\": {\n \"my_ints\": 2,\n \"my_chars\": \"B\"\n }\n },\n {\n \"my_values\": {\n \"my_ints\": 3,\n \"my_chars\": \"C\"\n }\n },\n {\n \"my_values\": {\n \"my_ints\": 4,\n \"my_chars\": \"D\"\n }\n },\n {\n \"my_values\": {\n \"my_ints\": 5,\n \"my_chars\": \"E\"\n }\n }\n] \n```\n\n\n:::\n:::\n",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
4 changes: 2 additions & 2 deletions posts/2024-05-30_pack-nest/index.qmd
@@ -135,8 +135,8 @@ Here again, we obtain a simple tibble with two variables instead of one single v
The main difference is that the instances of the variable `my_values` of the nested tibble will be written between extra square brackets to represent the `list` class of `my_values`. On the contrary, each row of the variable `my_values` of the packed tibble will be displayed separately between curly brackets given that `my_values` has the class `data.frame` in the packed case.

```{r}
-jsonlite::toJSON(my_nested_tib)
+jsonlite::toJSON(my_nested_tib, pretty = TRUE)
-jsonlite::toJSON(my_packed_tib)
+jsonlite::toJSON(my_packed_tib, pretty = TRUE)
```
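
For context on what this change does, here is a minimal sketch, assuming only that the `jsonlite` package is installed. The toy data frame `df` is hypothetical and not part of the post. It illustrates that `pretty = TRUE` should only affect whitespace and indentation of the JSON text, not the data it encodes.

```r
library(jsonlite)

# Hypothetical toy data, used only for illustration (not from the post).
df <- data.frame(my_ints = 1:2, my_chars = c("A", "B"))

# Default output: a single compact line.
compact <- jsonlite::toJSON(df)

# With pretty = TRUE: the same JSON, indented over several lines.
pretty <- jsonlite::toJSON(df, pretty = TRUE)

# Both strings decode back to the same data frame.
same_data <- identical(jsonlite::fromJSON(compact), jsonlite::fromJSON(pretty))
print(same_data)
```

Because only the layout changes, the rendered post shows the nesting of square and curly brackets on separate lines, which makes the nested and packed structures easier to compare by eye.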
