diff --git a/aio.html b/aio.html
index f45d2bb..a7585f8 100644
--- a/aio.html
+++ b/aio.html
@@ -1149,7 +1149,7 @@
Whenever your code involves the generation of random numbers, it’s a good practice to set the random seed in R with
@@ -1457,7 +1457,7 @@
You set nmads = 4 like so:
modelGeneVar() can take a block argument.
Use the block argument in the call to modelGeneVar() like so:
The first few lines here read the data from ExperimentHub and the mitochondrial genes are identified by gene symbols in the row data.
@@ -2416,7 +2416,7 @@
Spike-ins are deliberately-introduced exogenous RNA from an exotic or synthetic source at a known concentration. This provides a known
@@ -2460,7 +2460,7 @@
Mathematically, this would require the data to fall on a two-dimensional plane (for linear methods like PCA) or a smooth 2D
@@ -2734,7 +2734,7 @@
We see in the help documentation for ?clusterCells that all of the clustering algorithm details are handled through the
@@ -2931,7 +2931,7 @@
You can see that at least among the top markers, cluster 6 (pale green) tends to have the least separation from cluster 1.
@@ -3229,7 +3229,7 @@
The NNGraphParam constructor has an argument cluster.args. This allows you to specify arguments passed on to
@@ -3660,7 +3660,7 @@
One important reason is that averages over all other clusters can be sensitive to the cell type composition. If a rare cell type shows
@@ -3806,7 +3806,7 @@
Use BiocParallel and the BPPARAM argument! This example will set it to use four cores on your laptop, but you can
@@ -3848,7 +3848,7 @@
The example that jumps out most strongly to the eye is ExE endoderm, which doesn’t show clear separate modes. Simultaneously, Endothelium
@@ -4141,7 +4141,7 @@
Samples 5 and 6 were from the same “pool” of cells. Looking at the documentation for the dataset under ?WTChimeraData we see
@@ -4206,7 +4206,7 @@
False. Batch-level data can be retained through confounding with experimental factors or poor ability to distinguish experimental effects
@@ -4659,7 +4659,7 @@
“logFC” stands for log fold-change. edgeR uses a log2 convention. Rather than reporting e.g. a 5-fold increase, it’s better to
@@ -4985,7 +4985,7 @@
You can simply hand pheatmap() a matrix as its only argument. pheatmap() has a million options you can tweak,
@@ -4999,7 +4999,7 @@
After running the second pseudobulk DGE, you can join the two DataFrames of Erythroid3 statistics using the
@@ -5049,7 +5049,7 @@
It’s important to have multiple samples within each experimental group because it helps the batch effect correction algorithm distinguish
@@ -5209,7 +5209,7 @@
Content from Working with large data
-Last updated on 2024-10-02 |
+Last updated on 2024-10-04 |
From ?MulticoreParam:
@@ -5700,21 +5700,22 @@ROUTPUT
      approx
-exact   1   2   3   4   5   6   7   8   9  10  11  12  13  14
-    1  90   0   0   0   0   0   0   0   1   0   0   0   0   0
-    2   0 143   0   1   0   0   0   0   0   0   0   0   0   0
-    3   0   0  77   0   0   0   0   0   0   0   0   0   0   0
-    4   0   0   0 397   0   0   0   0   0   0   0   0   0   0
-    5   0   0   0   0 393   0   0   2   0   0   0   5   0   0
-    6   0   0   0   0   0 204   6   0   0   0   1   0   0   0
-    7   0   0   0   0   0   0 245   0   0   1   0   0   0   0
-    8   0   0   0   0   1   0   0  93   0   0   0   0   0   0
-    9   1   0   0   0   1   0   0   0 106   0   0   0   0   0
-   10   0   0   0   0   0   0   0   0   0 116   2   1   0   0
-   11   0   0   0   0   0   0   0   0   0   2 139   0   6   0
-   12   0   0   0   0   1   0   0   0   0   0   0 210   0   0
-   13   0   0   0   0   0   0   0   0   0   0   0   0 146   0
-   14   0   0   0   0   0   0   0   0   0   0   0   0   0  20
+exact   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
+    1  88   0   0   0   0   0   0   2   0   0   0   1   0   0   0
+    2   0 143   0   1   0   0   0   0   0   0   0   0   0   0   0
+    3   0   0  75   0   0   0   0   0   0   0   0   2   0   0   0
+    4   0   0   0 198   0   0   0   0   0   0   0   0 143   0   0
+    5   0   0   0   0  71   0   0   0   1   0   3 322   0   0   0
+    6   0   0   0   0   0 210   0   0   0   0   0   0   0   0   0
+    7   0   0   0   0   0   0 245   0   1   0   0   0   0   0   0
+    8   0   0   0   0  95   0   0   0   0   0   0   0   0   0   0
+    9   1   0   0   0   0   0   0 106   0   0   0   1   0   0   0
+   10   0   0   0   0   0   0   0   0 112   0  10   0   0   0   0
+   11   0   0   0   0   0   0   0   0   0 153   0   0   0   0   0
+   12   0   0   0   0   8   0   0   0   0   0 195   1   0   0   0
+   13   0   0   0   0   0   0   0   0   1   0   0   0   0 146   0
+   14   0   0   0   0   0   0   0   0   0   0   0   0   0   0  20
+   15   0   0   0  56   0   0   0   0   0   0   0   0   0   0   0
The similarity of the two clusterings can be quantified by calculating the pairwise Rand index:
@@ -5845,12 +5846,12 @@
-The uncertainty from approximation error is sometimes psychologically
-objectionable. “Why can’t my computer just give me the right answer?”
-One way to alleviate this feeling is to quantify the approximation error
-on a small test set like the sce we have here. Using the
-ExactParam() class, visualize the error in PC1 coordinates
-compared to the RSVD results.
+The uncertainty from approximation error is sometimes aggravating.
+“Why can’t my computer just give me the right answer?” One way to
+alleviate this feeling is to quantify the approximation error on a small
+test set like the sce we have here. Using the ExactParam()
+class, visualize the error in PC1 coordinates compared to the RSVD
+results.
This code block calculates the exact PCA coordinates. Another thing to note: PC vectors are only identified up to a sign flip. We can see
@@ -6270,7 +6271,7 @@
See the HDF5Array function for reading from HDF5 and the writeHDF5Array function for writing to HDF5 from the HDF5Array
@@ -6284,7 +6285,7 @@
wt_out <- tempfile(fileext = ".h5")
-wt_counts <- counts(WTChimeraData())
-Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'counts': failed to load resource
- name: EH2973
- title: WT chimera processed counts (sample 9)
- reason: 1 resources failed to download
-
-writeHDF5Array(wt_counts,
+wt_counts <- counts(WTChimeraData())
+
+writeHDF5Array(wt_counts,
name = "wt_counts",
file = wt_out)
-Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'is_sparse': object 'wt_counts' not found
-
-oom_wt <- HDF5Array(wt_out, "wt_counts")
-Error in file_path_as_absolute(path): file '/tmp/Rtmp6Q9gmL/file1e892f562866.h5' does not exist
+<29453 x 30703> sparse HDF5Matrix object of type "double":
+ cell_1 cell_2 cell_3 ... cell_30702 cell_30703
+ENSMUSG00000051951 0 0 0 . 0 0
+ENSMUSG00000089699 0 0 0 . 0 0
+ENSMUSG00000102343 0 0 0 . 0 0
+ENSMUSG00000025900 0 0 0 . 0 0
+ENSMUSG00000025902 0 0 0 . 0 0
+ ... . . . . . .
+ENSMUSG00000095041 0 1 2 . 0 0
+ENSMUSG00000063897 0 0 0 . 0 0
+ENSMUSG00000096730 0 0 0 . 0 0
+ENSMUSG00000095742 0 0 0 . 0 0
+ tomato-td 1 0 1 . 0 0
-object.size(wt_counts)
+oom_wt <- HDF5Array(wt_out, "wt_counts")
+
+object.size(wt_counts)
-Error in eval(expr, envir, enclos): object 'wt_counts' not found
+1520366960 bytes
object.size(oom_wt)
-Error in eval(expr, envir, enclos): object 'oom_wt' not found
+2488 bytes
Use the function system.time to obtain the runtime of each job.
diff --git a/cell_type_annotation.html b/cell_type_annotation.html
index c1aedd0..7d15de7 100644
--- a/cell_type_annotation.html
+++ b/cell_type_annotation.html
@@ -516,7 +516,7 @@ Challenge
-
+
We see in the help documentation for ?clusterCells that all of the clustering algorithm details are handled through the
@@ -710,7 +710,7 @@ Challenge
-
+
You can see that at least among the top markers, cluster 6 (pale green) tends to have the least separation from cluster 1.
@@ -1005,7 +1005,7 @@ Challenge
-
+
R
@@ -1268,7 +1268,7 @@ Challenge
-
+
R
@@ -1413,7 +1413,7 @@ Exercise 1: Clustering
-
+
The NNGraphParam constructor has an argument cluster.args. This allows you to specify arguments passed on to
@@ -1430,7 +1430,7 @@ Give me a hint
-
+
R
@@ -1468,7 +1468,7 @@ Exercise 2: Reference marker genes
-
+
R
@@ -1541,7 +1541,7 @@ Extension Challenge 1: Group pair comparisons
-
+
One important reason is that averages over all other clusters can be sensitive to the cell type composition. If a rare cell type shows
@@ -1575,7 +1575,7 @@ Extension Challenge 2: Parallelizing SingleR
-
+
Use BiocParallel and the BPPARAM argument! This example will set it to use four cores on your laptop, but you can
@@ -1617,7 +1617,7 @@ Extension Challenge 3: Critical inspection of diagnost
-
+
The example that jumps out most strongly to the eye is ExE endoderm, which doesn’t show clear separate modes. Simultaneously, Endothelium
diff --git a/eda_qc.html b/eda_qc.html
index a378a95..6615786 100644
--- a/eda_qc.html
+++ b/eda_qc.html
@@ -497,7 +497,7 @@ Challenge
-
+
R
@@ -578,7 +578,7 @@ OUTPUT
-
+
Whenever your code involves the generation of random numbers, it’s a good practice to set the random seed in R with
@@ -797,7 +797,7 @@ Challenge
-
+
You set nmads = 4 like so:
R
@@ -1069,7 +1069,7 @@ Show me the solution
-
+
R
@@ -1181,7 +1181,7 @@ Challenge
-
+
modelGeneVar() can take a block argument.
@@ -1194,7 +1194,7 @@ Give me a hint
-
+
Use the block argument in the call to modelGeneVar() like so:
@@ -1405,7 +1405,7 @@ Challenge
-
+
R
@@ -1569,7 +1569,7 @@ Exercise 1: Normalization
-
+
R
@@ -1628,7 +1628,7 @@ Exercise 2: PBMC Data
-
+
The first few lines here read the data from ExperimentHub and the mitochondrial genes are identified by gene symbols in the row data.
@@ -1736,7 +1736,7 @@ Extension challenge 1: Spike-ins
-
+
Spike-ins are deliberately-introduced exogenous RNA from an exotic or synthetic source at a known concentration. This provides a known
@@ -1780,7 +1780,7 @@ Extension challenge 3: Reduced dimensionality represen
-
+
Mathematically, this would require the data to fall on a two-dimensional plane (for linear methods like PCA) or a smooth 2D
diff --git a/index.html b/index.html
index 4755282..9d519b2 100644
--- a/index.html
+++ b/index.html
@@ -372,7 +372,7 @@ If you already have R and RStudio installed
-
+
- Open RStudio, and click on “Help” > “Check for updates”. If a new version is available, quit RStudio, and download the latest version for
@@ -398,7 +398,7 @@ If you don’t have R and RStudio installed
-
+
- Download R from the CRAN website.
@@ -429,7 +429,7 @@ If you already have R and RStudio installed
-
+
- Open RStudio, and click on “Help” > “Check for updates”. If a new version is available, quit RStudio, and download the latest version for
@@ -453,7 +453,7 @@ If you don’t have R and RStudio installed
-
+
- Download R from the CRAN website.
- Select the .pkg file for the latest R version
@@ -486,7 +486,7 @@ Install R using your package manager and RStudio
-
+
- Follow the instructions for your distribution from CRAN, they provide information to get the most recent version of R for common
diff --git a/instructor/aio.html b/instructor/aio.html
index 933d53d..8120207 100644
--- a/instructor/aio.html
+++ b/instructor/aio.html
@@ -1153,7 +1153,7 @@ Challenge
-
+
R
@@ -1235,7 +1235,7 @@ OUTPUT
-
+
Whenever your code involves the generation of random numbers, it’s a good practice to set the random seed in R with
@@ -1461,7 +1461,7 @@ Challenge