Commit

Merge pull request #45 from ccb-hms/hca_refresh
hca_refresh question titles
andrewGhazi authored Sep 30, 2024
2 parents 97900c7 + 7675dab commit cd3de4f
Showing 1 changed file with 20 additions and 12 deletions.
32 changes: 20 additions & 12 deletions episodes/hca.Rmd
@@ -110,7 +110,7 @@ metadata |>
glimpse()
```

-## A note on the piping operator
+## A note on the pipe operator

The vignette materials provided by `CuratedAtlasQueryR` show the use of the
'native' R pipe (implemented after R version `4.1.0`). For those not familiar
@@ -182,9 +182,9 @@ For the sake of demonstration, we'll focus this small subset of samples:
sample_subset = metadata |>
filter(
ethnicity == "African" &
-stringr::str_like(assay, "%10x%") &
+grepl("10x", assay) &
tissue == "lung parenchyma" &
-stringr::str_like(cell_type, "%CD4%")
+grepl("CD4", cell_type)
)
```
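
A note on the `str_like()` to `grepl()` swap in this hunk: `grepl()` treats its pattern as a regular expression and is case-sensitive by default, so it needs no SQL-style `%` wildcards. The following is a minimal base R sketch of that behaviour. The `assay` values are made up for illustration and are not taken from the atlas metadata.

```r
# Hypothetical assay labels (illustrative only, not real atlas values)
assay <- c("10x 3' v3", "10X 5' v2", "Smart-seq2", NA)

# Substring match anywhere in the string; case-sensitive by default
case_sensitive <- grepl("10x", assay)

# Opt in to case-insensitive matching when labels vary in capitalisation
case_insensitive <- grepl("10x", assay, ignore.case = TRUE)

# grepl() returns FALSE (not NA) for missing values, so filter() drops those rows
print(case_sensitive)
print(case_insensitive)
```

If the real `assay` column mixes `10x` and `10X`, passing `ignore.case = TRUE` is one way to keep both spellings in the subset.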

@@ -246,7 +246,7 @@ single_cell_counts |> saveHDF5SummarizedExperiment("single_cell_counts")

:::::::::::::::::::::::::::::::::: challenge

-#### Exercise 1
+#### Exercise 1: Basic counting + piping

Use `count` and `arrange` to get the number of cells per tissue in descending
order.
@@ -264,26 +264,28 @@ metadata |>

:::::::::::::::::::::::::::::::::: challenge

-#### Exercise 2
+#### Exercise 2: Tissue & type counting

-Use `dplyr`-isms to group by `tissue` and `cell_type` and get a tally of the
-highest number of cell types per tissue combination. What tissue has the most
-numerous type of cells?
+`count()` can group by multiple factors by simply adding another grouping column
+as an additional argument. Get a tally of the highest number of cell types per
+tissue combination. What tissue has the most numerous type of cells?

:::::::::::::: solution

```{r,eval=FALSE}
metadata |>
count(tissue, cell_type) |>
-arrange(-n)
+arrange(-n) |>
+head(n = 1)
```
:::::::::::::::::::::::

:::::::::::::::::::::::::::::::::::::::::::::
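
The reworded Exercise 2 relies on `count()` accepting several grouping columns. As a rough sketch, assuming `dplyr` is installed and using a made-up toy data frame (not real atlas metadata), the `count()` form gives the same tallies as the longer `group_by()` plus `summarise()` form:

```r
library(dplyr)

# Hypothetical toy metadata: one row per cell (illustrative values only)
toy_metadata <- data.frame(
  tissue    = c("lung", "lung", "lung", "blood", "blood", "kidney"),
  cell_type = c("T cell", "T cell", "B cell", "T cell", "B cell", "T cell")
)

# Short form: count() groups by every column passed to it
by_count <- toy_metadata |>
  count(tissue, cell_type) |>
  arrange(desc(n))

# Long form: explicit grouping followed by a per-group row count
by_group <- toy_metadata |>
  group_by(tissue, cell_type) |>
  summarise(n = n(), .groups = "drop") |>
  arrange(desc(n))

# The top row is the most numerous tissue / cell type combination
top_combination <- head(by_count, n = 1)
print(top_combination)
```

For a numeric column such as `n`, `arrange(-n)` and `arrange(desc(n))` sort rows in the same order.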

:::::::::::::::::::::::::::::::::: challenge

-#### Exercise 3
+#### Exercise 3: Comparing metadata categories

Spot some differences between the `tissue` and `tissue_harmonised` columns.
Use `count` to summarise.
@@ -299,6 +301,10 @@ metadata |>
count(tissue_harmonised) |>
arrange(-n)
```

+For example, you can see that `tissue_harmonised` merges the `cortex of kidney`
+and `kidney` groups in `tissue`.

To see the full list of curated columns in the metadata, see the Details section
in the `?get_metadata` documentation page.

@@ -308,7 +314,7 @@ in the `?get_metadata` documentation page.

:::::::::::::::::::::::::::::::::: challenge

-#### Exercise 4
+#### Exercise 4: Highly specific cell groups

Now that we are a little familiar with navigating the metadata, let's obtain
a `SingleCellExperiment` of 10X scRNA-seq counts of `cd8 tem` `lung` cells for
@@ -322,13 +328,15 @@ metadata |>
filter(
sex == "female" &
age_days > 80 * 365 &
-stringr::str_like(assay, "%10x%") &
+grepl("10x", assay) &
disease == "COVID-19" &
tissue_harmonised == "lung" &
cell_type_harmonised == "cd8 tem"
) |>
get_single_cell_experiment()
```

+You can see we don't get very many cells given the strict set of conditions we used.
:::::::::::::::::::::::

:::::::::::::::::::::::::::::::::::::::::::::