markdown source builds
Auto-generated via {sandpaper}
Source  : ce47918
Branch  : main
Author  : Andrew Ghazi <[email protected]>
Time    : 2024-09-09 03:05:01 +0000
Message : Merge pull request #27 from ccb-hms/add_exercises

Add exercises
actions-user committed Sep 9, 2024
1 parent 976e4c9 commit 6e0421b
Showing 39 changed files with 64 additions and 33 deletions.
Binary file not shown.
16 changes: 0 additions & 16 deletions hca.md
@@ -239,22 +239,6 @@ metadata |>
12 sci-RNA-seq 1
```

## Available organisms


``` r
metadata |>
distinct(organism, dataset_id) |>
count(organism)
```

``` output
# A tibble: 1 × 2
organism n
<chr> <int>
1 Homo sapiens 63
```

### Download single-cell RNA sequencing counts

The data can be provided as either "counts" or counts per million "cpm" as given
77 changes: 62 additions & 15 deletions large_data.md
@@ -64,7 +64,9 @@ set, as provided by the

``` r
library(TENxBrainData)

sce.brain <- TENxBrainData20k()

sce.brain
```

@@ -150,7 +152,9 @@ new file at every operation, which would unnecessarily require time-consuming di

``` r
tmp <- counts(sce.brain)

tmp <- log2(tmp + 1)

tmp
```

@@ -183,8 +187,11 @@ function that we used in the other workflows.

``` r
library(scater)

is.mito <- grepl("^mt-", rowData(sce.brain)$Symbol)

qcstats <- perCellQCMetrics(sce.brain, subsets = list(Mt = is.mito))

qcstats
```
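
As a small illustration of the pattern match used to flag mitochondrial genes in the block above, the sketch below applies the same regular expression to a few made-up gene symbols. The `symbols` vector is hypothetical and not taken from the dataset. It assumes mouse-style lowercase `mt-` prefixes; human-style `MT-` symbols would only match with a different pattern or with `ignore.case = TRUE`.

```r
# Hypothetical gene symbols, for illustration only (not from the dataset).
symbols <- c("mt-Nd1", "mt-Co1", "Actb", "Mtor", "MT-ND1")

# "^mt-" anchors the match at the start of the symbol and is case-sensitive.
is.mito <- grepl("^mt-", symbols)
is.mito

# A case-insensitive variant also catches upper-case "MT-" prefixes.
is.mito.any.case <- grepl("^mt-", symbols, ignore.case = TRUE)
is.mito.any.case
```

Note that `"Mtor"` is not flagged in either case, because the pattern requires the hyphen directly after `mt`.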

@@ -253,9 +260,10 @@ by indicating the `BPPARAM` argument in `bplapply`.

``` r
param <- MulticoreParam(workers = 1)

bplapply(
X = c(4, 9, 16, 25),
- FUN = function(x) { sqrt(x) },
+ FUN = sqrt,
BPPARAM = param
)
```
@@ -286,10 +294,15 @@ calculations on a Unix system:

``` r
library(MouseGastrulationData)

library(scran)

sce <- WTChimeraData(samples = 5, type = "processed")

sce <- logNormCounts(sce)

dec.mc <- modelGeneVar(sce, BPPARAM = MulticoreParam(2))

dec.mc
```

@@ -342,6 +355,7 @@ details).
``` r
# 2 hours, 8 GB, 1 CPU per task, for 10 tasks.
rs <- list(walltime = 7200, memory = 8000, ncpus = 1)

bpp <- BatchtoolsParam(10, cluster = "slurm", resources = rs)
```

@@ -393,7 +407,9 @@ graph-based clustering using the Louvain algorithm for community detection:

``` r
library(bluster)

sce <- runPCA(sce)

colLabels(sce) <- clusterCells(sce, use.dimred = "PCA",
BLUSPARAM = NNGraphParam(cluster.fun = "louvain"))
```
@@ -410,37 +426,41 @@ approximation can be largely ignored.

``` r
library(scran)

library(BiocNeighbors)

clusters <- clusterCells(sce, use.dimred = "PCA",
BLUSPARAM = NNGraphParam(cluster.fun = "louvain",
BNPARAM = AnnoyParam()))

table(exact = colLabels(sce), approx = clusters)
```

``` output
approx
exact 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1 90 0 0 0 0 0 0 0 1 0 0 0 0 0 0
- 2 0 143 0 1 0 0 0 0 0 0 0 0 0 0 0
+ 2 0 143 0 0 0 0 0 0 0 0 0 0 0 0 1
3 0 0 75 0 0 0 0 0 0 0 0 0 0 0 0
- 4 0 0 0 253 0 0 0 0 0 0 0 0 144 0 0
- 5 0 0 2 0 391 1 0 0 0 1 0 3 0 0 0
- 6 0 0 0 0 0 206 51 0 0 0 1 0 0 0 0
- 7 0 0 0 0 0 3 194 0 0 1 0 0 0 0 0
- 8 0 0 0 0 2 0 0 91 0 0 0 2 0 0 0
+ 4 0 0 0 342 0 0 0 0 0 0 0 0 0 0 55
+ 5 0 0 0 0 74 0 0 0 0 0 0 198 0 0 0
+ 6 0 0 0 0 0 210 0 0 0 0 0 0 0 0 0
+ 7 0 0 0 0 0 0 245 0 0 1 0 0 0 0 0
+ 8 0 0 0 0 1 0 0 95 0 0 0 0 0 0 0
9 1 0 0 0 1 0 0 0 106 0 0 0 0 0 0
- 10 0 0 0 0 0 0 0 0 0 113 8 0 0 0 0
- 11 0 0 0 0 0 0 0 0 0 0 144 0 0 0 0
- 12 0 0 0 0 2 0 0 0 0 15 0 199 0 0 0
- 13 0 0 0 0 0 0 0 0 0 0 0 0 0 146 0
- 14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 20
+ 10 0 0 0 0 0 0 0 0 0 113 0 16 0 0 0
+ 11 0 0 0 0 0 0 0 0 0 0 153 0 0 0 0
+ 12 0 0 2 0 321 0 0 0 0 1 0 0 0 0 0
+ 13 0 0 0 0 0 0 0 0 0 0 0 0 146 0 0
+ 14 0 0 0 0 0 0 0 0 0 0 0 0 0 20 0
```

The similarity of the two clusterings can be quantified by calculating the pairwise Rand index:


``` r
rand <- pairwiseRand(colLabels(sce), clusters, mode = "index")

stopifnot(rand > 0.8)
```
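
To make the quantity concrete, here is a minimal base-R sketch of the plain (unadjusted) Rand index: the fraction of cell pairs on which two clusterings agree, either by placing both cells in the same cluster or by separating them. The helper `rand_index` and the toy label vectors are hypothetical and written only for illustration; `pairwiseRand` may apply a chance adjustment depending on its arguments, so its value need not equal this one.

```r
# Hypothetical helper: plain Rand index between two label vectors.
rand_index <- function(x, y) {
  stopifnot(length(x) == length(y))
  # Enumerate every unordered pair of observations once.
  pairs <- combn(length(x), 2)
  same.x <- x[pairs[1, ]] == x[pairs[2, ]]
  same.y <- y[pairs[1, ]] == y[pairs[2, ]]
  # A pair counts as agreement if both clusterings treat it the same way.
  mean(same.x == same.y)
}

# Toy labels for six cells (made up for this example).
exact.labels  <- c(1, 1, 1, 2, 2, 3)
approx.labels <- c(1, 1, 2, 2, 2, 3)

ri <- rand_index(exact.labels, approx.labels)
ri
```

For these toy labels, 11 of the 15 pairs are treated consistently, so the index is 11/15. Enumerating all pairs with `combn` is only practical for small inputs; it is used here for clarity rather than speed.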

@@ -455,11 +475,17 @@ the biological conclusions.

``` r
set.seed(1000)

y1 <- matrix(rnorm(50000), nrow = 1000)

y2 <- matrix(rnorm(50000), nrow = 1000)

Y <- rbind(y1, y2)

exact <- findKNN(Y, k = 20)

approx <- findKNN(Y, k = 20, BNPARAM = AnnoyParam())

mean(exact$index != approx$index)
```
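
For intuition about what an exact nearest-neighbor search returns, the following base-R sketch finds neighbors by brute force from the full distance matrix. It is an illustrative stand-in under stated assumptions, not how `findKNN` is implemented; the names `pts`, `k.nn`, and `knn.index` are hypothetical.

```r
set.seed(42)

# Small made-up dataset: 20 points in 10 dimensions.
pts <- matrix(rnorm(200), nrow = 20)
k.nn <- 3

# Full pairwise Euclidean distance matrix.
d.mat <- as.matrix(dist(pts))

# Exclude each point from its own neighbor list.
diag(d.mat) <- Inf

# For each row, keep the indices of the k.nn smallest distances.
knn.index <- t(apply(d.mat, 1, function(d) order(d)[seq_len(k.nn)]))
dim(knn.index)
```

Comparing two such index matrices element-wise, as in `mean(exact$index != approx$index)`, gives the fraction of neighbor slots where the two searches disagree. The brute-force approach needs memory quadratic in the number of points, which is one reason approximate methods are attractive for large datasets.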

@@ -487,7 +513,9 @@ library(BiocSingular)

# As the name suggests, it is random, so we need to set the seed.
set.seed(101000)

r.out <- runPCA(sce, ncomponents = 20, BSPARAM = RandomParam())

str(reducedDim(r.out, "PCA"))
```

@@ -506,7 +534,9 @@ str(reducedDim(r.out, "PCA"))

``` r
set.seed(101001)

i.out <- runPCA(sce, ncomponents = 20, BSPARAM = IrlbaParam())

str(reducedDim(i.out, "PCA"))
```

@@ -546,7 +576,9 @@ This code block calculates the exact PCA coordinates. Another thing to note: PC

``` r
set.seed(123)

e.out <- runPCA(sce, ncomponents = 20, BSPARAM = ExactParam())

str(reducedDim(e.out, "PCA"))
```
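
As a rough base-R illustration of what an exact PCA computes, the sketch below derives principal component scores from a singular value decomposition of the centered matrix and compares them with `prcomp`. The simulated matrix `x.sim` is hypothetical. Because the sign of each component is arbitrary, the comparison is made on absolute values.

```r
set.seed(1)

# Made-up data: 50 observations, 10 variables.
x.sim <- matrix(rnorm(500), nrow = 50)

# Center each column, then take the SVD of the centered matrix.
x.centered <- scale(x.sim, center = TRUE, scale = FALSE)
sv <- svd(x.centered)

# PC scores are the left singular vectors scaled by the singular values.
scores.svd <- sv$u %*% diag(sv$d)

# Reference scores from prcomp (centers by default, no scaling).
scores.prcomp <- prcomp(x.sim)$x

# Largest discrepancy after ignoring the arbitrary sign of each component.
max.diff <- max(abs(abs(scores.svd) - abs(scores.prcomp)))
max.diff
```

The discrepancy should be at the level of floating-point error. Approximate methods trade a small loss in this accuracy for speed by computing only the leading components.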

@@ -661,6 +693,7 @@ We then proceed by loading all required packages and installing the PBMC dataset

``` r
library(SeuratData)

InstallData("pbmc3k")
```

@@ -671,9 +704,13 @@ We then load the dataset as an `SeuratObject` and convert it to a
``` r
# Use PBMC3K from SeuratData
pbmc <- LoadData(ds = "pbmc3k", type = "pbmc3k.final")

pbmc <- UpdateSeuratObject(pbmc)

pbmc

pbmc.sce <- as.SingleCellExperiment(pbmc)

pbmc.sce
```

@@ -683,8 +720,11 @@ we demonstrate this here on the wild-type chimera mouse gastrulation dataset.

``` r
sce <- WTChimeraData(samples = 5, type = "processed")

assay(sce) <- as.matrix(assay(sce))

sce <- logNormCounts(sce)

sce
```

@@ -694,7 +734,9 @@ the `as.Seurat` function.

``` r
sobj <- as.Seurat(sce)

Idents(sobj) <- "celltype.mapped"

sobj
```

@@ -734,6 +776,7 @@ package.
``` r
example_h5ad <- system.file("extdata", "krumsiek11.h5ad",
package = "zellkonverter")

readH5AD(example_h5ad)
```

@@ -758,6 +801,7 @@ chimera mouse gastrulation dataset.

``` r
out.file <- tempfile(fileext = ".h5ad")

writeH5AD(sce, file = out.file)
```

@@ -985,15 +1029,18 @@ Use the function `system.time` to obtain the runtime of each job.
``` r
sce.brain = logNormCounts(sce.brain)

- system.time({i.out <- runPCA(sce.brain, ncomponents = 20,
+ system.time({i.out <- runPCA(sce.brain,
+ ncomponents = 20,
BSPARAM = ExactParam(),
BPPARAM = SerialParam())})

- system.time({i.out <- runPCA(sce.brain, ncomponents = 20,
+ system.time({i.out <- runPCA(sce.brain,
+ ncomponents = 20,
BSPARAM = ExactParam(),
BPPARAM = MulticoreParam(workers = 2))})

- system.time({i.out <- runPCA(sce.brain, ncomponents = 20,
+ system.time({i.out <- runPCA(sce.brain,
+ ncomponents = 20,
BSPARAM = ExactParam(),
BPPARAM = MulticoreParam(workers = 3))})
```
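
For readers unfamiliar with `system.time`, this minimal base-R sketch shows how to capture and read its result. The workload (a sum of square roots over a made-up vector `x.big`) is hypothetical and merely stands in for a longer computation such as the PCA calls above.

```r
# Hypothetical workload: sum of square roots over a large vector.
x.big <- seq_len(1e6)

# system.time() evaluates the expression and returns a "proc_time" object.
timing <- system.time({
  total <- sum(sqrt(x.big))
})

# The object holds user, system, and elapsed (wall-clock) times in seconds.
timing[["elapsed"]]
```

When comparing serial and parallel runs, the elapsed time is usually the relevant figure, since user time can add up the CPU time spent across worker processes.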
4 changes: 2 additions & 2 deletions md5sum.txt
@@ -8,8 +8,8 @@
"episodes/eda_qc.Rmd" "1e88f395d30778f4526532deea43eb03" "site/built/eda_qc.md" "2024-09-06"
"episodes/cell_type_annotation.Rmd" "66af56b730aaa88e937bc1743afb471a" "site/built/cell_type_annotation.md" "2024-09-08"
"episodes/multi-sample.Rmd" "2d38d9903358ea8a8067abd82a1f1f54" "site/built/multi-sample.md" "2024-09-08"
- "episodes/large_data.Rmd" "bbe443f474a0823122658effa2beb57e" "site/built/large_data.md" "2024-09-06"
- "episodes/hca.Rmd" "6db220495ae4ae56d33e4ca5b5f9b8ae" "site/built/hca.md" "2024-09-06"
+ "episodes/large_data.Rmd" "b9710492c6792ea435778c4e42f27e02" "site/built/large_data.md" "2024-09-09"
+ "episodes/hca.Rmd" "e01d3fd1e07f158bed08b72d657ae1d1" "site/built/hca.md" "2024-09-09"
"instructors/instructor-notes.md" "cae72b6712578d74a49fea7513099f8c" "site/built/instructor-notes.md" "2024-09-06"
"learners/reference.md" "40fc1d0be2412d2d9d434a5bc84e4de8" "site/built/reference.md" "2024-09-06"
"learners/setup.md" "25772142a26fe3c0cebbe650f5683269" "site/built/setup.md" "2024-09-06"
