differences for PR #21

carpentries-incubator · Aug 26, 2024 · 50e5dd7 · 50e5dd7
1 parent cc4aaa2
commit 50e5dd7

Showing 94 changed files with 115 additions and 135 deletions.
diff --git a/cache/unnamed-chunk-10_769b59c55406ecf609cb3503c3e1bc19.RData b/cache/unnamed-chunk-10_769b59c55406ecf609cb3503c3e1bc19.RData
diff --git a/cache/unnamed-chunk-10_769b59c55406ecf609cb3503c3e1bc19.rdx b/cache/unnamed-chunk-10_769b59c55406ecf609cb3503c3e1bc19.rdx
diff --git a/cache/unnamed-chunk-10_cdac59bf4769caa1fa7ec2f02cba14df.RData b/cache/unnamed-chunk-10_cdac59bf4769caa1fa7ec2f02cba14df.RData
diff --git a/...k-10_769b59c55406ecf609cb3503c3e1bc19.rdb → ...k-10_cdac59bf4769caa1fa7ec2f02cba14df.rdb b/...k-10_769b59c55406ecf609cb3503c3e1bc19.rdb → ...k-10_cdac59bf4769caa1fa7ec2f02cba14df.rdb
diff --git a/cache/unnamed-chunk-10_cdac59bf4769caa1fa7ec2f02cba14df.rdx b/cache/unnamed-chunk-10_cdac59bf4769caa1fa7ec2f02cba14df.rdx
diff --git a/cache/unnamed-chunk-11_01d9df065464f06ddbf37ba637f82d4e.RData b/cache/unnamed-chunk-11_01d9df065464f06ddbf37ba637f82d4e.RData
diff --git a/cache/unnamed-chunk-11_01d9df065464f06ddbf37ba637f82d4e.rdb b/cache/unnamed-chunk-11_01d9df065464f06ddbf37ba637f82d4e.rdb
diff --git a/cache/unnamed-chunk-11_01d9df065464f06ddbf37ba637f82d4e.rdx b/cache/unnamed-chunk-11_01d9df065464f06ddbf37ba637f82d4e.rdx
diff --git a/cache/unnamed-chunk-11_472a2b0a162ec1c79e8b2a15e0621b23.RData b/cache/unnamed-chunk-11_472a2b0a162ec1c79e8b2a15e0621b23.RData
diff --git a/...k-12_9581ab659d351417e31eb14a19035196.rdb → ...k-11_472a2b0a162ec1c79e8b2a15e0621b23.rdb b/...k-12_9581ab659d351417e31eb14a19035196.rdb → ...k-11_472a2b0a162ec1c79e8b2a15e0621b23.rdb
diff --git a/cache/unnamed-chunk-11_472a2b0a162ec1c79e8b2a15e0621b23.rdx b/cache/unnamed-chunk-11_472a2b0a162ec1c79e8b2a15e0621b23.rdx
diff --git a/cache/unnamed-chunk-12_9581ab659d351417e31eb14a19035196.RData b/cache/unnamed-chunk-12_9581ab659d351417e31eb14a19035196.RData
diff --git a/cache/unnamed-chunk-12_9581ab659d351417e31eb14a19035196.rdx b/cache/unnamed-chunk-12_9581ab659d351417e31eb14a19035196.rdx
diff --git a/cache/unnamed-chunk-12_a5e616dee08d04f21f12d3c94ad38ffc.RData b/cache/unnamed-chunk-12_a5e616dee08d04f21f12d3c94ad38ffc.RData
diff --git a/...k-14_99d8d86ad8cf0b18fc63559237f3f006.rdb → ...k-12_a5e616dee08d04f21f12d3c94ad38ffc.rdb b/...k-14_99d8d86ad8cf0b18fc63559237f3f006.rdb → ...k-12_a5e616dee08d04f21f12d3c94ad38ffc.rdb
diff --git a/cache/unnamed-chunk-12_a5e616dee08d04f21f12d3c94ad38ffc.rdx b/cache/unnamed-chunk-12_a5e616dee08d04f21f12d3c94ad38ffc.rdx
diff --git a/cache/unnamed-chunk-13_6da850b073bf362ae75b35f9051c4104.RData b/cache/unnamed-chunk-13_6da850b073bf362ae75b35f9051c4104.RData
diff --git a/cache/unnamed-chunk-13_6da850b073bf362ae75b35f9051c4104.rdb b/cache/unnamed-chunk-13_6da850b073bf362ae75b35f9051c4104.rdb
diff --git a/cache/unnamed-chunk-13_6da850b073bf362ae75b35f9051c4104.rdx b/cache/unnamed-chunk-13_6da850b073bf362ae75b35f9051c4104.rdx
diff --git a/cache/unnamed-chunk-13_de3b4aebd10aef75a5e23afe6bd416aa.RData b/cache/unnamed-chunk-13_de3b4aebd10aef75a5e23afe6bd416aa.RData
diff --git a/cache/unnamed-chunk-13_de3b4aebd10aef75a5e23afe6bd416aa.rdb b/cache/unnamed-chunk-13_de3b4aebd10aef75a5e23afe6bd416aa.rdb
diff --git a/cache/unnamed-chunk-13_de3b4aebd10aef75a5e23afe6bd416aa.rdx b/cache/unnamed-chunk-13_de3b4aebd10aef75a5e23afe6bd416aa.rdx
diff --git a/cache/unnamed-chunk-14_34e672abd1b8e6e536c25eae9907537c.RData b/cache/unnamed-chunk-14_34e672abd1b8e6e536c25eae9907537c.RData
diff --git a/cache/unnamed-chunk-14_34e672abd1b8e6e536c25eae9907537c.rdb b/cache/unnamed-chunk-14_34e672abd1b8e6e536c25eae9907537c.rdb
diff --git a/cache/unnamed-chunk-14_34e672abd1b8e6e536c25eae9907537c.rdx b/cache/unnamed-chunk-14_34e672abd1b8e6e536c25eae9907537c.rdx
diff --git a/cache/unnamed-chunk-14_99d8d86ad8cf0b18fc63559237f3f006.RData b/cache/unnamed-chunk-14_99d8d86ad8cf0b18fc63559237f3f006.RData
diff --git a/cache/unnamed-chunk-14_99d8d86ad8cf0b18fc63559237f3f006.rdx b/cache/unnamed-chunk-14_99d8d86ad8cf0b18fc63559237f3f006.rdx
diff --git a/cache/unnamed-chunk-15_98ce553190603201eaebedf5c122745b.RData b/cache/unnamed-chunk-15_98ce553190603201eaebedf5c122745b.RData
diff --git a/...k-15_d6c9b46960dfe6c8676cc7dd2b3bec08.rdb → ...k-15_98ce553190603201eaebedf5c122745b.rdb b/...k-15_d6c9b46960dfe6c8676cc7dd2b3bec08.rdb → ...k-15_98ce553190603201eaebedf5c122745b.rdb
diff --git a/cache/unnamed-chunk-15_98ce553190603201eaebedf5c122745b.rdx b/cache/unnamed-chunk-15_98ce553190603201eaebedf5c122745b.rdx
diff --git a/cache/unnamed-chunk-15_d6c9b46960dfe6c8676cc7dd2b3bec08.rdx b/cache/unnamed-chunk-15_d6c9b46960dfe6c8676cc7dd2b3bec08.rdx
diff --git a/cache/unnamed-chunk-16_3b6565705686b6519d81ab63bc64dc2b.rdx b/cache/unnamed-chunk-16_3b6565705686b6519d81ab63bc64dc2b.rdx
diff --git a/cache/unnamed-chunk-16_ff85c572ec1739f9049843e423c6aa1f.RData b/cache/unnamed-chunk-16_ff85c572ec1739f9049843e423c6aa1f.RData
diff --git a/cache/unnamed-chunk-16_ff85c572ec1739f9049843e423c6aa1f.rdb b/cache/unnamed-chunk-16_ff85c572ec1739f9049843e423c6aa1f.rdb
diff --git a/cache/unnamed-chunk-16_ff85c572ec1739f9049843e423c6aa1f.rdx b/cache/unnamed-chunk-16_ff85c572ec1739f9049843e423c6aa1f.rdx
diff --git a/cache/unnamed-chunk-17_4b7cfdfb37db6c0bbc316dca1014516c.RData b/cache/unnamed-chunk-17_4b7cfdfb37db6c0bbc316dca1014516c.RData
diff --git a/...k-16_3b6565705686b6519d81ab63bc64dc2b.rdb → ...k-17_4b7cfdfb37db6c0bbc316dca1014516c.rdb b/...k-16_3b6565705686b6519d81ab63bc64dc2b.rdb → ...k-17_4b7cfdfb37db6c0bbc316dca1014516c.rdb
diff --git a/cache/unnamed-chunk-17_4b7cfdfb37db6c0bbc316dca1014516c.rdx b/cache/unnamed-chunk-17_4b7cfdfb37db6c0bbc316dca1014516c.rdx
diff --git a/cache/unnamed-chunk-17_6431eda34873d1ee7185f2085ed1bcbb.rdx b/cache/unnamed-chunk-17_6431eda34873d1ee7185f2085ed1bcbb.rdx
diff --git a/cache/unnamed-chunk-18_afa8529d80cace9c28a142258afe79cf.RData b/cache/unnamed-chunk-18_afa8529d80cace9c28a142258afe79cf.RData
diff --git a/cache/unnamed-chunk-18_afa8529d80cace9c28a142258afe79cf.rdx b/cache/unnamed-chunk-18_afa8529d80cace9c28a142258afe79cf.rdx
diff --git a/...15_d6c9b46960dfe6c8676cc7dd2b3bec08.RData → ...18_f495c4ecb065091935596fe793ef7faf.RData b/...15_d6c9b46960dfe6c8676cc7dd2b3bec08.RData → ...18_f495c4ecb065091935596fe793ef7faf.RData
diff --git a/...k-17_6431eda34873d1ee7185f2085ed1bcbb.rdb → ...k-18_f495c4ecb065091935596fe793ef7faf.rdb b/...k-17_6431eda34873d1ee7185f2085ed1bcbb.rdb → ...k-18_f495c4ecb065091935596fe793ef7faf.rdb
diff --git a/cache/unnamed-chunk-18_f495c4ecb065091935596fe793ef7faf.rdx b/cache/unnamed-chunk-18_f495c4ecb065091935596fe793ef7faf.rdx
diff --git a/...16_3b6565705686b6519d81ab63bc64dc2b.RData → ...19_824a3a515c391049151ecca054c7dd10.RData b/...16_3b6565705686b6519d81ab63bc64dc2b.RData → ...19_824a3a515c391049151ecca054c7dd10.RData
diff --git a/...k-18_afa8529d80cace9c28a142258afe79cf.rdb → ...k-19_824a3a515c391049151ecca054c7dd10.rdb b/...k-18_afa8529d80cace9c28a142258afe79cf.rdb → ...k-19_824a3a515c391049151ecca054c7dd10.rdb
diff --git a/cache/unnamed-chunk-19_824a3a515c391049151ecca054c7dd10.rdx b/cache/unnamed-chunk-19_824a3a515c391049151ecca054c7dd10.rdx
diff --git a/cache/unnamed-chunk-19_ecbfbc2d305eb3bffcc1b3af1760e0bf.RData b/cache/unnamed-chunk-19_ecbfbc2d305eb3bffcc1b3af1760e0bf.RData
diff --git a/cache/unnamed-chunk-19_ecbfbc2d305eb3bffcc1b3af1760e0bf.rdx b/cache/unnamed-chunk-19_ecbfbc2d305eb3bffcc1b3af1760e0bf.rdx
diff --git a/...17_6431eda34873d1ee7185f2085ed1bcbb.RData → ...20_0a7523b916b4425a4c7a589e509759e7.RData b/...17_6431eda34873d1ee7185f2085ed1bcbb.RData → ...20_0a7523b916b4425a4c7a589e509759e7.RData
diff --git a/...k-19_ecbfbc2d305eb3bffcc1b3af1760e0bf.rdb → ...k-20_0a7523b916b4425a4c7a589e509759e7.rdb b/...k-19_ecbfbc2d305eb3bffcc1b3af1760e0bf.rdb → ...k-20_0a7523b916b4425a4c7a589e509759e7.rdb
diff --git a/cache/unnamed-chunk-20_0a7523b916b4425a4c7a589e509759e7.rdx b/cache/unnamed-chunk-20_0a7523b916b4425a4c7a589e509759e7.rdx
diff --git a/cache/unnamed-chunk-20_d04e4387374b53df6bb31b98a1772586.RData b/cache/unnamed-chunk-20_d04e4387374b53df6bb31b98a1772586.RData
diff --git a/cache/unnamed-chunk-20_d04e4387374b53df6bb31b98a1772586.rdb b/cache/unnamed-chunk-20_d04e4387374b53df6bb31b98a1772586.rdb
diff --git a/cache/unnamed-chunk-20_d04e4387374b53df6bb31b98a1772586.rdx b/cache/unnamed-chunk-20_d04e4387374b53df6bb31b98a1772586.rdx
diff --git a/cache/unnamed-chunk-21_88ed595b4f2e99f73a5cb533dbae5dad.RData b/cache/unnamed-chunk-21_88ed595b4f2e99f73a5cb533dbae5dad.RData
diff --git a/cache/unnamed-chunk-21_88ed595b4f2e99f73a5cb533dbae5dad.rdx b/cache/unnamed-chunk-21_88ed595b4f2e99f73a5cb533dbae5dad.rdx
diff --git a/cache/unnamed-chunk-21_8ad7b5653727d8b378d7320fd23d921e.RData b/cache/unnamed-chunk-21_8ad7b5653727d8b378d7320fd23d921e.RData
diff --git a/...k-21_88ed595b4f2e99f73a5cb533dbae5dad.rdb → ...k-21_8ad7b5653727d8b378d7320fd23d921e.rdb b/...k-21_88ed595b4f2e99f73a5cb533dbae5dad.rdb → ...k-21_8ad7b5653727d8b378d7320fd23d921e.rdb
diff --git a/cache/unnamed-chunk-21_8ad7b5653727d8b378d7320fd23d921e.rdx b/cache/unnamed-chunk-21_8ad7b5653727d8b378d7320fd23d921e.rdx
diff --git a/cache/unnamed-chunk-22_3e1954ec5a3b38494687d1afac6f6988.RData b/cache/unnamed-chunk-22_3e1954ec5a3b38494687d1afac6f6988.RData
diff --git a/cache/unnamed-chunk-22_3e1954ec5a3b38494687d1afac6f6988.rdb b/cache/unnamed-chunk-22_3e1954ec5a3b38494687d1afac6f6988.rdb
diff --git a/cache/unnamed-chunk-22_3e1954ec5a3b38494687d1afac6f6988.rdx b/cache/unnamed-chunk-22_3e1954ec5a3b38494687d1afac6f6988.rdx
diff --git a/cache/unnamed-chunk-22_d02a9b46b3bf3aa15717a853466b4905.RData b/cache/unnamed-chunk-22_d02a9b46b3bf3aa15717a853466b4905.RData
diff --git a/...nk-7_56a8346d969d9b6eba82e87328f198c6.rdb → ...k-22_d02a9b46b3bf3aa15717a853466b4905.rdb b/...nk-7_56a8346d969d9b6eba82e87328f198c6.rdb → ...k-22_d02a9b46b3bf3aa15717a853466b4905.rdb
diff --git a/cache/unnamed-chunk-22_d02a9b46b3bf3aa15717a853466b4905.rdx b/cache/unnamed-chunk-22_d02a9b46b3bf3aa15717a853466b4905.rdx
diff --git a/...-4_da52fb8fdce3ea47a598b2bb07a594f2.RData → ...-4_3cc6a4046ec691fc7e2d75c33a8d712d.RData b/...-4_da52fb8fdce3ea47a598b2bb07a594f2.RData → ...-4_3cc6a4046ec691fc7e2d75c33a8d712d.RData
diff --git a/cache/unnamed-chunk-4_3cc6a4046ec691fc7e2d75c33a8d712d.rdb b/cache/unnamed-chunk-4_3cc6a4046ec691fc7e2d75c33a8d712d.rdb
diff --git a/cache/unnamed-chunk-4_3cc6a4046ec691fc7e2d75c33a8d712d.rdx b/cache/unnamed-chunk-4_3cc6a4046ec691fc7e2d75c33a8d712d.rdx
diff --git a/cache/unnamed-chunk-4_da52fb8fdce3ea47a598b2bb07a594f2.rdb b/cache/unnamed-chunk-4_da52fb8fdce3ea47a598b2bb07a594f2.rdb
diff --git a/cache/unnamed-chunk-4_da52fb8fdce3ea47a598b2bb07a594f2.rdx b/cache/unnamed-chunk-4_da52fb8fdce3ea47a598b2bb07a594f2.rdx
diff --git a/cache/unnamed-chunk-6_7cd4b3f7a7e62299f7d52d9fc921d97d.RData b/cache/unnamed-chunk-6_7cd4b3f7a7e62299f7d52d9fc921d97d.RData
diff --git a/...nk-8_c3a089d5488b785d7e54e3cd850e6b29.rdb → ...nk-6_7cd4b3f7a7e62299f7d52d9fc921d97d.rdb b/...nk-8_c3a089d5488b785d7e54e3cd850e6b29.rdb → ...nk-6_7cd4b3f7a7e62299f7d52d9fc921d97d.rdb
diff --git a/cache/unnamed-chunk-6_7cd4b3f7a7e62299f7d52d9fc921d97d.rdx b/cache/unnamed-chunk-6_7cd4b3f7a7e62299f7d52d9fc921d97d.rdx
diff --git a/cache/unnamed-chunk-6_9b06138eed16ba76f5ac3bda54fad9ae.RData b/cache/unnamed-chunk-6_9b06138eed16ba76f5ac3bda54fad9ae.RData
diff --git a/cache/unnamed-chunk-6_9b06138eed16ba76f5ac3bda54fad9ae.rdb b/cache/unnamed-chunk-6_9b06138eed16ba76f5ac3bda54fad9ae.rdb
diff --git a/cache/unnamed-chunk-6_9b06138eed16ba76f5ac3bda54fad9ae.rdx b/cache/unnamed-chunk-6_9b06138eed16ba76f5ac3bda54fad9ae.rdx
diff --git a/cache/unnamed-chunk-7_56a8346d969d9b6eba82e87328f198c6.RData b/cache/unnamed-chunk-7_56a8346d969d9b6eba82e87328f198c6.RData
diff --git a/cache/unnamed-chunk-7_56a8346d969d9b6eba82e87328f198c6.rdx b/cache/unnamed-chunk-7_56a8346d969d9b6eba82e87328f198c6.rdx
diff --git a/...-8_c3a089d5488b785d7e54e3cd850e6b29.RData → ...-7_b18a3dafe0e13a0027092799a35f45c4.RData b/...-8_c3a089d5488b785d7e54e3cd850e6b29.RData → ...-7_b18a3dafe0e13a0027092799a35f45c4.RData
diff --git a/...nk-9_1ff173186cd0b159304cfe10fe6614f7.rdb → ...nk-7_b18a3dafe0e13a0027092799a35f45c4.rdb b/...nk-9_1ff173186cd0b159304cfe10fe6614f7.rdb → ...nk-7_b18a3dafe0e13a0027092799a35f45c4.rdb
diff --git a/cache/unnamed-chunk-7_b18a3dafe0e13a0027092799a35f45c4.rdx b/cache/unnamed-chunk-7_b18a3dafe0e13a0027092799a35f45c4.rdx
diff --git a/cache/unnamed-chunk-8_6cdcbb9af61a9b029a57679ea2a3e87a.RData b/cache/unnamed-chunk-8_6cdcbb9af61a9b029a57679ea2a3e87a.RData
diff --git a/cache/unnamed-chunk-8_6cdcbb9af61a9b029a57679ea2a3e87a.rdb b/cache/unnamed-chunk-8_6cdcbb9af61a9b029a57679ea2a3e87a.rdb
diff --git a/cache/unnamed-chunk-8_6cdcbb9af61a9b029a57679ea2a3e87a.rdx b/cache/unnamed-chunk-8_6cdcbb9af61a9b029a57679ea2a3e87a.rdx
diff --git a/cache/unnamed-chunk-8_c3a089d5488b785d7e54e3cd850e6b29.rdx b/cache/unnamed-chunk-8_c3a089d5488b785d7e54e3cd850e6b29.rdx
diff --git a/cache/unnamed-chunk-9_1ff173186cd0b159304cfe10fe6614f7.RData b/cache/unnamed-chunk-9_1ff173186cd0b159304cfe10fe6614f7.RData
diff --git a/cache/unnamed-chunk-9_1ff173186cd0b159304cfe10fe6614f7.rdx b/cache/unnamed-chunk-9_1ff173186cd0b159304cfe10fe6614f7.rdx
diff --git a/cache/unnamed-chunk-9_ab716a9d771322fb32f97aa5807ae452.RData b/cache/unnamed-chunk-9_ab716a9d771322fb32f97aa5807ae452.RData
diff --git a/cache/unnamed-chunk-9_ab716a9d771322fb32f97aa5807ae452.rdb b/cache/unnamed-chunk-9_ab716a9d771322fb32f97aa5807ae452.rdb
diff --git a/cache/unnamed-chunk-9_ab716a9d771322fb32f97aa5807ae452.rdx b/cache/unnamed-chunk-9_ab716a9d771322fb32f97aa5807ae452.rdx
diff --git a/hca.md b/hca.md
@@ -46,28 +46,15 @@ https://chanzuckerberg.github.io/cellxgene-census/.
 
 ## The CuratedAtlasQueryR Project
 
-To systematically characterize the immune system across tissues, demographics
-and multiple studies, single cell transcriptomics data was harmonized from the
-CELLxGENE database. Data from 28,975,366 cells that cover 156 tissues (excluding
-cell cultures), 12,981 samples, and 324 studies were collected. The metadata was
-standardized, including sample identifiers, tissue labels (based on anatomy) and
-age. Also, the gene-transcript abundance of all samples was harmonized by
-putting values on the positive natural scale (i.e. non-logarithmic).
-
-To model the immune system across studies, we adopted a consistent immune
-cell-type ontology appropriate for lymphoid and non-lymphoid tissues. We applied
-a consensus cell labeling strategy between the Seurat blueprint and Monaco[^1]
-to minimize biases in immune cell classification from
-study-specific standards.
+`CuratedAtlasQueryR` is an alternative package for accessing the CELLxGENE data from R through a tidy API. The data have been harmonized, curated, and re-annotated across studies.
 
 `CuratedAtlasQueryR` supports data access and programmatic exploration of the
 harmonized atlas. Cells of interest can be selected based on ontology, tissue of
 origin, demographics, and disease. For example, the user can select CD4 T helper
 cells across healthy and diseased lymphoid tissue. The data for the selected
-cells can be downloaded locally into popular single-cell data containers. Pseudo
+cells can be downloaded locally into SingleCellExperiment objects. Pseudo
 bulk counts are also available to facilitate large-scale, summary analyses of
-transcriptional profiles. This platform offers a standardized workflow for
-accessing atlas-level datasets programmatically and reproducibly.
+transcriptional profiles. 
 
 <img src="https://raw.githubusercontent.com/ccb-hms/osca-workbench/main/episodes/figures/curatedAtlasQuery.png" style="display: block; margin: auto;" />
 
@@ -109,7 +96,8 @@ allows us to get a small and quick subset of the available metadata.
 
 
 ``` r
-metadata <- get_metadata(remote_url = CuratedAtlasQueryR::SAMPLE_DATABASE_URL)
+metadata <- get_metadata(remote_url = CuratedAtlasQueryR::SAMPLE_DATABASE_URL) |> 
+  collect()
 ```
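
The added `collect()` call turns the lazy database table into an ordinary in-memory tibble, which is why the outputs further down change from `# Source: SQL` headers to `# A tibble`. A minimal sketch of the same distinction, assuming the `dbplyr` and `RSQLite` packages are installed; the table `lazy_tbl` and its values are made up for illustration and do not come from `CuratedAtlasQueryR`:

```r
library(dplyr)
library(dbplyr)

# Hypothetical toy table stored in a temporary in-memory SQLite database.
lazy_tbl <- memdb_frame(cyl = c(4, 6, 6, 8), disp = c(108, 160, 258, 360))

# A lazy table is a query description, not a data frame.
is.data.frame(lazy_tbl)

# collect() runs the query and returns the rows as a local tibble.
local_tbl <- collect(lazy_tbl)
is.data.frame(local_tbl)
names(local_tbl)
```

Working on the collected tibble means later calls such as `names()` see the actual columns, at the cost of holding the whole table in memory.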
 
 Get a view of the first 10 columns in the metadata with `glimpse`
@@ -142,28 +130,31 @@ $ `_sample_name`                    <chr> "BPH340PrSF_Via___transition zone of
 The vignette materials provided by `CuratedAtlasQueryR` show the use of the
 'native' R pipe (available since R version `4.1.0`). For those not familiar
 with the pipe operator (`|>`), it allows you to chain functions by passing the
-left-hand side (LHS) to the first input (typically) on the right-hand side
-(RHS). 
+left-hand side as the first argument to the function on the right-hand side. It is used extensively in the `tidyverse` dialect of R, especially within the [`dplyr` package](https://dplyr.tidyverse.org/).
 
-In this example, we are extracting the `iris` data set from the `datasets`
-package and 'then' taking a subset where the sepal lengths are greater than 5
-and 'then' summarizing the data for each level in the `Species` variable with a
-`mean`. The pipe operator can be read as 'then'.
+The pipe operator can be read as "and then". Thankfully, R doesn't care about whitespace, so it's common to start a new line after a pipe. Together these points enable users to "chain" complex sequences of commands into readable blocks.
 
+In this example, we start with the built-in `mtcars` dataset and then filter to rows where `cyl` is not equal to 4, and then compute the mean `disp` value by each unique `cyl` value.
 
-``` r
-data("iris", package = "datasets")
 
-iris |>
-  subset(Sepal.Length > 5) |>
-  aggregate(. ~ Species, data = _, mean)
+``` r
+mtcars |> 
+  filter(cyl != 4) |> 
+  summarise(avg_disp = mean(disp),
+            .by = cyl)
 ```
 
 ``` output
-     Species Sepal.Length Sepal.Width Petal.Length Petal.Width
-1     setosa     5.313636    3.713636     1.509091   0.2772727
-2 versicolor     5.997872    2.804255     4.317021   1.3468085
-3  virginica     6.622449    2.983673     5.573469   2.0326531
+  cyl avg_disp
+1   6 183.3143
+2   8 353.1000
+```
+
+This command is equivalent to the following:
+
+
+``` r
+summarise(filter(mtcars, cyl != 4), avg_disp = mean(disp), .by = cyl)
 ```
 
 ## Summarizing the metadata
@@ -179,21 +170,20 @@ metadata |>
 ```
 
 ``` output
-# Source:   SQL [?? x 2]
-# Database: DuckDB v0.10.2 [unknown@Linux 6.5.0-1021-azure:R 4.4.0/:memory:]
-   tissue                          n
-   <chr>                       <dbl>
- 1 heart left ventricle            7
- 2 bone marrow                     4
- 3 lung                            4
- 4 renal medulla                   6
- 5 caecum                          1
- 6 ileum                           1
- 7 lymph node                      2
- 8 transition zone of prostate     2
- 9 peripheral zone of prostate     2
-10 fovea centralis                 1
-# ℹ more rows
+# A tibble: 33 × 2
+   tissue                             n
+   <chr>                          <int>
+ 1 adrenal gland                      1
+ 2 axilla                             1
+ 3 blood                             17
+ 4 bone marrow                        4
+ 5 caecum                             1
+ 6 caudate lobe of liver              1
+ 7 cortex of kidney                   7
+ 8 dorsolateral prefrontal cortex     1
+ 9 epithelium of esophagus            1
+10 fovea centralis                    1
+# ℹ 23 more rows
 ```
 
 ## Columns available in the metadata
@@ -204,9 +194,24 @@ head(names(metadata), 10)
 ```
 
 ``` output
-[1] "src"        "lazy_query"
+ [1] "cell_"                             "sample_"                          
+ [3] "cell_type"                         "cell_type_harmonised"             
+ [5] "confidence_class"                  "cell_annotation_azimuth_l2"       
+ [7] "cell_annotation_blueprint_singler" "cell_annotation_monaco_singler"   
+ [9] "sample_id_db"                      "_sample_name"                     
+```
+
+:::: challenge
+
+Glance over the full list of metadata column names. Do any other metadata columns jump out as interesting to you for your work?
+
+
+``` r
+metadata |> names() |> sort()
 ```
 
+::::
+
 ## Available assays
 
 
@@ -217,21 +222,21 @@ metadata |>
 ```
 
 ``` output
-# Source:   SQL [?? x 2]
-# Database: DuckDB v0.10.2 [unknown@Linux 6.5.0-1021-azure:R 4.4.0/:memory:]
-   assay           n
-   <chr>       <dbl>
- 1 10x 3' v3      21
- 2 Slide-seq       4
- 3 sci-RNA-seq     1
- 4 10x 3' v1       1
- 5 Smart-seq2      1
- 6 10x 5' v2       2
- 7 scRNA-seq       4
- 8 Seq-Well        2
- 9 Drop-seq        1
-10 10x 3' v2      27
-# ℹ more rows
+# A tibble: 12 × 2
+   assay                              n
+   <chr>                          <int>
+ 1 10x 3' v1                          1
+ 2 10x 3' v2                         27
+ 3 10x 3' v3                         21
+ 4 10x 5' v1                          7
+ 5 10x 5' v2                          2
+ 6 Drop-seq                           1
+ 7 Seq-Well                           2
+ 8 Slide-seq                          4
+ 9 Smart-seq2                         1
+10 Visium Spatial Gene Expression     7
+11 scRNA-seq                          4
+12 sci-RNA-seq                        1
 ```
 
 ## Available organisms
@@ -244,10 +249,9 @@ metadata |>
 ```
 
 ``` output
-# Source:   SQL [1 x 2]
-# Database: DuckDB v0.10.2 [unknown@Linux 6.5.0-1021-azure:R 4.4.0/:memory:]
+# A tibble: 1 × 2
   organism         n
-  <chr>        <dbl>
+  <chr>        <int>
 1 Homo sapiens    63
 ```
 
@@ -258,18 +262,25 @@ by the `assays` argument in the `get_single_cell_experiment()` function. By
 default, the `SingleCellExperiment` provided will contain only the 'counts'
 data.
 
-#### Query raw counts
+For the sake of demonstration, we'll focus on this small subset of samples:
 
 
 ``` r
-single_cell_counts <- 
-    metadata |>
-    dplyr::filter(
+sample_subset <- metadata |>
+    filter(
         ethnicity == "African" &
         stringr::str_like(assay, "%10x%") &
         tissue == "lung parenchyma" &
         stringr::str_like(cell_type, "%CD4%")
-    ) |>
+    )
+```
+
+
+#### Query raw counts
+
+
+``` r
+single_cell_counts <- sample_subset |>
     get_single_cell_experiment()
 
 single_cell_counts
@@ -297,13 +308,7 @@ across samples.
 
 
 ``` r
-metadata |>
-  dplyr::filter(
-      ethnicity == "African" &
-      stringr::str_like(assay, "%10x%") &
-      tissue == "lung parenchyma" &
-      stringr::str_like(cell_type, "%CD4%")
-  ) |>
+sample_subset |>
   get_single_cell_experiment(assays = "cpm")
 ```
 
@@ -326,14 +331,7 @@ altExpNames(0):
 
 
 ``` r
-single_cell_counts <-
-    metadata |>
-    dplyr::filter(
-        ethnicity == "African" &
-        stringr::str_like(assay, "%10x%") &
-        tissue == "lung parenchyma" &
-        stringr::str_like(cell_type, "%CD4%")
-    ) |>
+single_cell_counts <- sample_subset |>
     get_single_cell_experiment(assays = "cpm", features = "PUM1")
 
 single_cell_counts
@@ -362,14 +360,7 @@ cells you are requesting.
 
 
 ``` r
-single_cell_counts <-
-    metadata |>
-    dplyr::filter(
-        ethnicity == "African" &
-        stringr::str_like(assay, "%10x%") &
-        tissue == "lung parenchyma" &
-        stringr::str_like(cell_type, "%CD4%")
-    ) |>
+single_cell_counts <- sample_subset |>
     get_seurat()
 
 single_cell_counts
@@ -422,8 +413,7 @@ numerous type of cells?
 
 ``` r
 metadata |>
-    group_by(tissue, cell_type) |>
-    count() |>
+    count(tissue, cell_type) |>
     arrange(-n)
 ```
 :::::::::::::::::::::::
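
The simplification in this solution relies on `count(a, b)` being shorthand for grouping by `a` and `b` and then counting rows. A small sketch with the built-in `mtcars` data, assuming `dplyr` is installed; the object names `short` and `long` are illustrative:

```r
library(dplyr)

# Short form: count rows for each combination of cyl and gear.
short <- mtcars |>
  count(cyl, gear)

# Long form: group first, count, then drop the grouping.
long <- mtcars |>
  group_by(cyl, gear) |>
  count() |>
  ungroup()

# Compare the counts; as.data.frame() strips the tibble class.
all.equal(as.data.frame(short), as.data.frame(long))
```

The short form also returns an ungrouped result, so no `ungroup()` is needed before further steps such as `arrange()`.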
@@ -470,7 +460,7 @@ possible.
 
 ``` r
 metadata |> 
-    dplyr::filter(
+    filter(
         sex == "female" &
         age_days > 80 * 365 &
         stringr::str_like(assay, "%10x%") &
@@ -506,4 +496,4 @@ altExpNames(0):
 
 ::::::::::::::::::::::::::::::::::::::::::::::::
 
-[^1]: [Monaco 2019](learners/reference.md#litref)
+
diff --git a/large_data.md b/large_data.md
@@ -420,19 +420,19 @@ table(exact = colLabels(sce), approx = clusters)
 ``` output
      approx
 exact   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
-   1   90   0   0   0   2   0   0   0   1   0   0   0   0   0   0
-   2    0  86  57   0   0   0   0   0   0   0   0   0   0   0   1
-   3    0   0  76   0   0   0   0   0   0   0   0   0   0   0   0
+   1   93   0   0   0   0   0   0   0   0   0   0   0   0   0   0
+   2    0 143   0   0   0   0   0   0   0   0   0   0   0   0   1
+   3    0   0  75   0   0   0   0   0   0   0   0   0   0   0   0
    4    0   0   0 341   0   0   0   0   0   0   0   0   0   0   0
-   5    0   0   1   0 350   0   0   0   0   0   0   2   0   0   0
-   6    0   0   0   0   0 203   1   0   0   0   1   0   0   0   0
-   7    0   0   0   0   0   6 244   0   0   1   0   0   0   0   0
-   8    0   0   0   0   0   0   0  95   0   0   0   0   0   0   0
-   9    1   0   0   0   1   0   0   0 106   0   0   0   0   0   0
-   10   0   0   0   0  43   0   0   0   0 113   0   1   0   0   0
-   11   0   0   0   0   0   0   0   0   0   0 153   0   0   0   0
-   12   0   0   0   0   2   0   0   0   0   0   0 211   0   0   0
-   13   0   0   0   0   0   0   0   0   0   0   0   0 146   0   0
+   5    2   0   2   0 391   0   0   1   0   0   0   0   0   0   0
+   6    0   0   0   0   0 205   5   0   0   0   0   0   0   0   0
+   7    0   0   0   0   0   0 246   0   0   0   0   0   0   0   0
+   8    0   0   0   0   2   0   0  93   0   0   0   0   0   0   0
+   9    0   0   0   0   0   0   0   0 108   0   0   0   0   0   0
+   10   0   0   0   0   0   0   0   0   0 117   0   0   0   0   0
+   11   0   0   0   0   0   1   0   0   0   3 139   0   0   0   0
+   12   0   0   0   0   9   0   0   0   0   0   0 206   0   0   0
+   13   0   0   0   0   0   0   1   0   0   0   0   0 150   0   1
    14   0   0   0   0   0   0   0   0   0   0   0   0   0  20   0
    15   0   0   0   0   0   0   0   0   0   0   0   0   0   0  56
 ```
@@ -978,43 +978,33 @@ Use the function `system.time` to obtain the runtime of each job.
 
 :::::::::::::: solution
 
-TODO
-:::::::::::::::::::::::
-
-:::::::::::::::::::::::::::::::::::::::::::::
-
-:::::::::::::::::::::::::::::::::: challenge
 
-#### Exercise 3: Conversion to Seurat
+``` r
+sce.brain <- logNormCounts(sce.brain)
 
-The [scRNAseq](https://bioconductor.org/packages/scRNAseq)
-package provides gene-level counts for a collection of public scRNA-seq datasets,
-stored as `SingleCellExperiment` objects with annotated cell- and gene-level metadata.
-Consult the vignette of the [scRNAseq](https://bioconductor.org/packages/scRNAseq)
-package to inspect all available datasets and select a dataset of your choice.
-Convert the chosen dataset to a Seurat object and produce a PCA plot with cells
-colored by a cell metadata column of your choice.
+system.time({i.out <- runPCA(sce.brain, ncomponents = 20, 
+                             BSPARAM = ExactParam(),
+                             BPPARAM = SerialParam())})
 
-:::::::::::::: hint
+system.time({i.out <- runPCA(sce.brain, ncomponents = 20, 
+                             BSPARAM = ExactParam(),
+                             BPPARAM = MulticoreParam(workers = 2))})
 
-Use Seurat's `DimPlot` function.
+system.time({i.out <- runPCA(sce.brain, ncomponents = 20, 
+                             BSPARAM = ExactParam(),
+                             BPPARAM = MulticoreParam(workers = 3))})
+```
 
 :::::::::::::::::::::::
 
-:::::::::::::: solution
-
-Use Seurat's `DimPlot` function.
-
-:::::::::::::::::::::::
+:::::::::::::::::::::::::::::::::::::::::::::
 
-:::::::::::::::::::::::
 
 :::::::::::::: checklist
 ## Further Reading
 
 * OSCA book, [Chapter 14](https://bioconductor.org/books/release/OSCA.advanced/dealing-with-big-data.html): Dealing with big data 
 * The `BiocParallel` [intro vignette](https://bioconductor.org/packages/3.19/BiocParallel/vignettes/Introduction_To_BiocParallel.html). 
-
 ::::::::::::::
 
 ::::::::::::::::::::::::::::::::::::: keypoints