From d85b18720f1f5cfef26b683e188cc21c88ed6046 Mon Sep 17 00:00:00 2001
From: Artur Sannikov <40318410+artur-sannikov@users.noreply.github.com>
Date: Mon, 29 Jul 2024 21:02:27 +0000
Subject: [PATCH] Remove trailing whitespaces (#588)

---
 DESCRIPTION                                   |   7 +-
 inst/pages/acknowledgments.qmd                |  18 +-
 inst/pages/agglomeration.qmd                  |  20 +-
 inst/pages/alpha_diversity.qmd                |  18 +-
 inst/pages/beta_diversity.qmd                 | 102 +++----
 inst/pages/clustering.qmd                     | 167 ++++++------
 inst/pages/containers.qmd                     |  38 +--
 inst/pages/convert.qmd                        |  10 +-
 inst/pages/correlation.qmd                    |   9 +-
 inst/pages/cross_correlation.qmd              |  22 +-
 inst/pages/differential_abundance.qmd         | 100 +++----
 inst/pages/exercises.qmd                      | 248 +++++++++---------
 inst/pages/extra_material.qmd                 |  56 ++--
 inst/pages/import.qmd                         |  32 +--
 inst/pages/integrated_learner.qmd             | 129 ++++-----
 inst/pages/intro.qmd                          |   4 +-
 inst/pages/introductory_workflow.qmd          |  54 ++--
 .../introductory_workflow_dutch_version.qmd   |  24 +-
 .../introductory_workflow_french_version.qmd  |  24 +-
 inst/pages/machine_learning.qmd               |   8 +-
 inst/pages/miaverse.qmd                       |   8 +-
 inst/pages/mmuphin_meta_analysis.qmd          |  76 +++---
 inst/pages/msea.qmd                           | 112 ++++----
 inst/pages/multiassay_ordination.qmd          |  12 +-
 inst/pages/network_comparison.qmd             |  64 ++---
 inst/pages/network_learning.qmd               | 115 ++++----
 inst/pages/quality_control.qmd                |  76 +++---
 inst/pages/resources.qmd                      |  76 +++---
 inst/pages/session_info.qmd                   |  12 +-
 inst/pages/taxonomy.qmd                       |  10 +-
 inst/pages/training.qmd                       |   8 +-
 inst/pages/transformation.qmd                 |  25 +-
 inst/pages/visualization.qmd                  |  62 ++---
 inst/pages/wrangling.qmd                      |  38 +--
 34 files changed, 902 insertions(+), 882 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index 7ffe63013..260799172 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -2,7 +2,7 @@ Package: OMA
 Title: Orchestrating Microbiome Analysis with Bioconductor
 Version: 0.98.25
 Date: 2024-04-26
-Authors@R: 
+Authors@R:
     c(person("Leo", "Lahti", role = c("aut"),
              comment = c(ORCID = "0000-0001-5537-637X")),
       person(given = "Tuomas", family = "Borman", role = c("aut", "cre"),
@@ -11,11 +11,11 @@ Authors@R:
       person("Felix GM", "Ernst", email = "felix.gm.ernst@outlook.com",
              role = c("aut"),
              comment = c(ORCID = "0000-0001-5064-0928")),
-      person("and others", "(see the full list of contributors)", 
+      person("and others", "(see the full list of contributors)",
              role = c("ctb"))
          )
 Description:
-    This is a reference cookbook for **Microbiome Data Science** with 
+    This is a reference cookbook for **Microbiome Data Science** with
     R and Bioconductor.
 License: CC BY-NC-SA 4.0
 Encoding: UTF-8
@@ -59,6 +59,7 @@ Suggests:
     glmnet,
     glue,
     grid,
+    gtools,
     gsEasy,
     igraph,
     kableExtra,
diff --git a/inst/pages/acknowledgments.qmd b/inst/pages/acknowledgments.qmd
index 5860eb504..c85b281d8 100644
--- a/inst/pages/acknowledgments.qmd
+++ b/inst/pages/acknowledgments.qmd
@@ -31,7 +31,7 @@ coordinated by:
    and runs regular training workshops in microbiome data science.
 
 - *Tuomas Borman*, PhD researcher and the lead developer of OMA/mia at
-   the Department of Computing, University of Turku. 
+   the Department of Computing, University of Turku.
 
 ### Contributors {-}
 
@@ -70,7 +70,7 @@ and the OMA book
 
 - *Matti Ruuskanen, PhD*, added machine learning techniques for
    microbiome analysis
-   
+
 - *Stefanie Peschel* has contributed chapters on the construction, analysis,
 and comparison of microbial association networks.
 
@@ -80,15 +80,15 @@ and a Professor for Biomedical Statistics and Data Science at
 [LMU Munich](https://www.en.statistik.uni-muenchen.de/index.html). He assisted
 in writing the chapters on network learning and comparison.
 
-- *Shigdel Rajesh, PhD* 
+- *Shigdel Rajesh, PhD*
 
-- *Artur Sannikov* 
+- *Artur Sannikov*
 
-- *Akewak Jeba* 
+- *Akewak Jeba*
 
 - *Himmi Lindgren*
 
-- *Lu Yang* 
+- *Lu Yang*
 
 - *Katariina Pärnänen*
 
@@ -114,8 +114,8 @@ in writing the chapters on network learning and comparison.
 
 -*Matteo Calgaro*
 
-- *Jacques Serizay* converted the _OMA_ book to the `BiocBook` format. This 
-allows the _OMA_ book to be built and distributed by Bioconductor. 
+- *Jacques Serizay* converted the _OMA_ book to the `BiocBook` format. This
+allows the _OMA_ book to be built and distributed by Bioconductor.
 
 - *Himel Mallick, PhD, FASA*, principal investigator and tenure-track faculty
 at Cornell University’s Department of Population Health Sciences and an adjunct
@@ -124,7 +124,7 @@ Information Science. He contributed to the chapters on meta-analyses, microbe
 set enrichment analysis (MSEA) and multi-omics prediction and classification.
 
 - *Yihan Liu*, assisted Dr. Mallick in writing the chapters on meta-anlayses,
-MSEA and multi-omics prediction and classification. 
+MSEA and multi-omics prediction and classification.
 
 ### Acknowledgments {-}
 
diff --git a/inst/pages/agglomeration.qmd b/inst/pages/agglomeration.qmd
index 2553b84f7..9ba96a97a 100644
--- a/inst/pages/agglomeration.qmd
+++ b/inst/pages/agglomeration.qmd
@@ -21,7 +21,7 @@ with counts aggregated from the lower-level taxa associated with them.
 ## Agglomerate data to certain rank {#sec-data-agglomeration}
 
 One of the main applications of taxonomic information in regards to count data
-is to agglomerate count data on taxonomic levels and track the influence of 
+is to agglomerate count data on taxonomic levels and track the influence of
 changing conditions through these levels. For this `mia` contains the
 `agglomerateByRank` function. The ideal location to store the agglomerated data
 is as an alternative experiment.
@@ -45,12 +45,12 @@ assayNames(altExp(tse, "Family"))
 ```{r}
 assay(altExp(tse, "Family"), "relabundance")[1:5, 1:7]
 ```
-  
+
 ```{r taxinfo_altexp_example}
 assay(altExp(tse, "Family"), "counts")[1:5, 1:7]
 ```
 
-`altExpNames` now consists of `Family` level data. This can be extended to use 
+`altExpNames` now consists of `Family` level data. This can be extended to use
 any taxonomic level listed in `taxonomyRanks(tse)`.
 
 We can also aggregate the data across all available ranks in one step using
@@ -102,18 +102,18 @@ dim(tse_sub)
 
 ## Agglomerate based on prevalence
 
-Rare taxa can also be aggregated into a single group "Other" instead of 
+Rare taxa can also be aggregated into a single group "Other" instead of
 filtering them out. A suitable function for this is `agglomerateByPrevalence`.
-The number of rare taxa is higher on the species level, which causes the need 
+The number of rare taxa is higher on the species level, which causes the need
 for data agglomeration by prevalence.
 
 ```{r}
 altExp(tse, "Species_byPrevalence") <- agglomerateByPrevalence(
-    tse, 
-    rank = "Species", 
-    other.label = "Other", 
-    prevalence = 5 / 100, 
-    detection = 1 / 100, 
+    tse,
+    rank = "Species",
+    other.label = "Other",
+    prevalence = 5 / 100,
+    detection = 1 / 100,
     as.relative = TRUE)
 altExp(tse, "Species_byPrevalence")
 
diff --git a/inst/pages/alpha_diversity.qmd b/inst/pages/alpha_diversity.qmd
index 381bddf30..7e61b5f13 100644
--- a/inst/pages/alpha_diversity.qmd
+++ b/inst/pages/alpha_diversity.qmd
@@ -53,7 +53,7 @@ indices of dominance or rarity:
   a high share of the total species abundance in the community. Note
   that dominance indices are generally inversely correlated with other
   alpha diversity indices.
-  
+
 **Rarity** indices characterize the concentration of species at low
   abundance.  Prevalence and detection thresholds determine rare
   species whose total concentration will determine the value of a
@@ -112,11 +112,11 @@ type and final barcode).
 ```{r plot-div-obs, message=FALSE, fig.cap="Shannon diversity estimates plotted grouped by sample type with colour-labeled barcode.", cache=TRUE}
 library(scater)
 plotColData(
-    tse, 
-    "observed", 
-    "SampleType", 
+    tse,
+    "observed",
+    "SampleType",
     colour_by = "Final_Barcode") +
-    theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
+    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
     labs(expression(Richness[Observed]))
 ```
 
@@ -187,7 +187,7 @@ plots <- lapply(
 
 # Fine-tune visual appearance
 plots <- lapply(
-    plots, "+", 
+    plots, "+",
     theme(axis.text.x = element_blank(),
           axis.title.x = element_blank(),
           axis.ticks.x = element_blank()))
@@ -197,7 +197,7 @@ wrap_plots(plots, ncol = 1) +
   plot_layout(guides = "collect")
 ```
 
-## Visualizing significance in group-wise comparisons 
+## Visualizing significance in group-wise comparisons
 
 Next, let's compare the Shannon index between sample groups and visualize the
 statistical significance. Using the `stat_compare_means` function from the
@@ -233,8 +233,8 @@ pvals <- reshape(
     pvals,
     direction = "long",
     varying = colnames(pvals)[ !colnames(pvals) %in% varname ],
-    times = colnames(pvals)[ !colnames(pvals) %in% varname ], 
-    v.names = "p", 
+    times = colnames(pvals)[ !colnames(pvals) %in% varname ],
+    v.names = "p",
     timevar = "group2",
     idvar = "group1"
     ) |>
diff --git a/inst/pages/beta_diversity.qmd b/inst/pages/beta_diversity.qmd
index 270e5ae5d..dbdebb56c 100644
--- a/inst/pages/beta_diversity.qmd
+++ b/inst/pages/beta_diversity.qmd
@@ -34,7 +34,7 @@ terms dissimilarity and beta diversity are preferred.
 
 | Method description          | Assay type          | Beta diversity metric |
 |:---------------------------:|:-------------------:|:---------------------:|
-| Quantitative profiling      | Absolute counts     | Bray-Curtis           | 
+| Quantitative profiling      | Absolute counts     | Bray-Curtis           |
 | Relative profiling          | Relative abundances | Bray-Curtis           |
 | Aitchison distance          | Absolute counts     | Aitchison             |
 | Aitchison distance          | clr                 | Euclidean             |
@@ -43,9 +43,9 @@ terms dissimilarity and beta diversity are preferred.
 | Presence/Absence similarity | Absolute counts     | Jaccard               |
 | Phylogenetic distance       | Rarefied counts     | Unifrac               |
 
-In practice, beta diversity is usually represented as a `dist` object. Such an 
+In practice, beta diversity is usually represented as a `dist` object. Such an
 object is a triangular matrix where the distance between each pair of samples
-is encoded by a specific cell. 
+is encoded by a specific cell.
 This distance matrix can then undergo ordination, which is an
 important ecological tool to reduce the dimensionality of data for a more
 efficient analysis and visualization. Ordination techniques aim to capture as
@@ -56,7 +56,7 @@ similarities in an optimal way, which is defined in different ways by different
 methods.
 
 Based on the type of algorithm, ordination methods in microbiome research can
-be generally divided in two categories: 
+be generally divided in two categories:
 
 - unsupervised ordination
 - supervised ordination
@@ -75,7 +75,7 @@ library(mia)
 data("GlobalPatterns", package = "mia")
 tse <- GlobalPatterns
 
-# Beta diversity metrics like Bray-Curtis are often 
+# Beta diversity metrics like Bray-Curtis are often
 # applied to relabundances
 tse <- transformAssay(
     tse, assay.type = "counts", method = "relabundance")
@@ -242,11 +242,11 @@ plotReducedDim(tse, "Unifrac", colour_by = "Group")
 ### Rarefaction to mitigate impacts of uneven sequencing effort
 
 The sequencing depth of a sample refers to the number of metagenomic reads
-obtained from the sequencing process. A variation in sequencing depth across the 
-samples of a study may impact the calculation of alpha and beta diversity 
-metrics [@Schloss2023]. 
+obtained from the sequencing process. A variation in sequencing depth across the
+samples of a study may impact the calculation of alpha and beta diversity
+metrics [@Schloss2023].
 It is common to find significant variation in sequencing depth between samples.
-For instance, the samples of the *TreeSummarizedExperiment* dataset 
+For instance, the samples of the *TreeSummarizedExperiment* dataset
 `GlobalPatterns` show up to a 40-fold difference in the number of metagenomic
 reads.
 
@@ -255,36 +255,36 @@ reads.
 
 # Calculate the list of sequencing depths across samples
 sequencing_depths <- colSums(assay(tse))
-# Calculate variation between highest and lowest sequencing depth 
+# Calculate variation between highest and lowest sequencing depth
 depth_variation <- max(sequencing_depths)/min(sequencing_depths)
 depth_variation
 ```
 
-To address uneven sequencing effort, rarefaction aims to normalize metagenomic 
-reads counts using subsampling. 
-First, the user chooses the rarefaction depth and a number of iterations N. All 
+To address uneven sequencing effort, rarefaction aims to normalize metagenomic
+reads counts using subsampling.
+First, the user chooses the rarefaction depth and a number of iterations N. All
 the samples with metagenomic reads count below the rarefaction depth are removed
 and metagenomic reads are randomly drawn from the samples left to get subsamples
-fitting the rarefaction depth. Then a beta diversity metric is calculated from 
+fitting the rarefaction depth. Then a beta diversity metric is calculated from
 those subsamples and the process is iterated N times. Finally, beta diversity is
-estimated with the mean of all beta diversity values calculated on subsampled 
+estimated with the mean of all beta diversity values calculated on subsampled
 data.
 
 There has been a long-lasting controversy surrounding the use of rarefaction in
 microbial ecology. The main concern is that rarefaction would omit data
 [@mcmurdie2014waste] [@Schloss2024rarefaction2].
-However, if the subsampling process is repeated a sufficient number of times, 
-and if the rarefaction depth is set to the lowest metagenomic reads count found 
-across all samples, no data will be omitted. 
-Moreover, Patrick Schloss has demonstrated that rarefaction is "the only method 
-that could control for variation in uneven sequencing effort when measuring 
+However, if the subsampling process is repeated a sufficient number of times,
+and if the rarefaction depth is set to the lowest metagenomic reads count found
+across all samples, no data will be omitted.
+Moreover, Patrick Schloss has demonstrated that rarefaction is "the only method
+that could control for variation in uneven sequencing effort when measuring
 commonly used alpha and beta diversity metrics" [@Schloss2023].
 
-Before applying rarefaction, selecting the most variable features can help 
+Before applying rarefaction, selecting the most variable features can help
 minimize variation caused by random subsampling. These features have the highest
-read counts, while rare features tend to increase sampling variation. This 
-approach facilitates comparison of results between non-rarefied and rarefied 
-distance calculations later on.  
+read counts, while rare features tend to increase sampling variation. This
+approach facilitates comparison of results between non-rarefied and rarefied
+distance calculations later on.
 
 ```{r}
 #| label: rarefaction2
@@ -294,7 +294,7 @@ ntop <- 5
 tse_sub <- tse[head(rev(order(rowSds(assay(tse, "counts")))), ntop), ]
 ```
 
-Let us first convert the count assay to centered log ratio (clr) assay and 
+Let us first convert the count assay to centered log ratio (clr) assay and
 calculate Aitchison distance without rarefaction:
 
 ```{r}
@@ -304,7 +304,7 @@ calculate Aitchison distance without rarefaction:
 
 tse_sub <- transformAssay(
   tse_sub,
-  assay.type = "counts",		      
+  assay.type = "counts",
   method = "clr",
   pseudocount = 1
   )
@@ -331,11 +331,11 @@ clr <- function (x) {
 }
 
 # Run transformation after rarefactions before calculating the beta diversity..
-tse_sub <- runMDS(tse_sub, 
+tse_sub <- runMDS(tse_sub,
   assay.type="counts",
-  ntop = nrow(tse_sub), 
+  ntop = nrow(tse_sub),
   FUN = vegan::avgdist, # Custom beta diversity function that includes rarefaction
-  sample = min(colSums(assay(tse_sub, "counts"))), # rarefaction depth 
+  sample = min(colSums(assay(tse_sub, "counts"))), # rarefaction depth
   distfun = vegan::vegdist, # Use vegdist beta diversity function
   dmethod = "euclidean", # euclidean beta diversity
   iterations = 10,
@@ -346,7 +346,7 @@ tse_sub <- runMDS(tse_sub,
 
 plotReducedDim(tse_sub, "MDS_aitchison_rarefied")
 
-# Generate plots for non-rarefied and rarefied Bray-Curtis distances scaled by 
+# Generate plots for non-rarefied and rarefied Bray-Curtis distances scaled by
 # MDS
 plots <- lapply(
   c("MDS_aitchison", "MDS_aitchison_rarefied"),
@@ -371,22 +371,22 @@ print(all.equal(
     tolerance = 0.05))
 ```
 
-`ntop` is a `runMDS` option. Here it is set to total number of features in 
-`GlobalPatterns` dataset so that they are not filtered. If ntop was not set to 
+`ntop` is a `runMDS` option. Here it is set to total number of features in
+`GlobalPatterns` dataset so that they are not filtered. If ntop was not set to
 the total number of features, only the ntop features with highest variance would
 be used for dimensionality reduction and the other features would be filtered.
 
-The argument `sample` is set to the smallest metagenomic reads count across 
-all samples. This ensures that no sample will be removed during the rarefaction 
-process. 
+The argument `sample` is set to the smallest metagenomic reads count across
+all samples. This ensures that no sample will be removed during the rarefaction
+process.
 
-The argument `iterations` is by default set to 100 but 10 iterations is often 
+The argument `iterations` is by default set to 100 but 10 iterations is often
 sufficient for beta diversity calculations.
 
-To use transformations after the rarefaction of the samples and before the beta 
+To use transformations after the rarefaction of the samples and before the beta
 diversity calculation, `vegan::avgdist` has the argument `transf`.
 
-We can also plot the correlation between principal coordinates PCx and PCy for 
+We can also plot the correlation between principal coordinates PCx and PCy for
 both the rarefied and non-rarefied distance calculations:
 
 ```{r}
@@ -407,7 +407,7 @@ for (k in 1:2) {
 ```
 
 The mia package includes the rarefaction function `rarefyAssay` that randomly
-subsamples a given assay of a `TreeSummarizedExperiment` dataset. However, the 
+subsamples a given assay of a `TreeSummarizedExperiment` dataset. However, the
 iterations and beta diversity calculations are not included.
 
 ### Other ordination methods {#sec-other-ord-methods}
@@ -469,7 +469,7 @@ better if smaller. This can be calculated as shown below.
 
 # Quantify dissimilarities in the original feature space
 # Pick relabunance assay separately
-x <- assay(tse, "relabundance") 
+x <- assay(tse, "relabundance")
 d0 <- as.matrix(vegan::vegdist(t(x), "bray"))
 
 # PCoA Ordination
@@ -479,7 +479,7 @@ names(pcoa) <- c("PCoA1", "PCoA2")
 # Quantify dissimilarities in the ordination space
 dp <- as.matrix(dist(pcoa))
 
-# Calculate stress i.e. relative difference 
+# Calculate stress i.e. relative difference
 # in the original and projected dissimilarities
 stress <- sum((dp - d0)^2) / sum(d0^2)
 ```
@@ -494,10 +494,10 @@ df <- data.frame(d0 = as.vector(d0)[ord],
 
 ggplot(df, aes(x = d0, y = dmds)) +
   geom_smooth() +
-  geom_point() +    
+  geom_point() +
   labs(title = "Shepard plot",
        x = "Original distance",
-       y = "MDS distance",   
+       y = "MDS distance",
        subtitle = paste("Stress:", round(stress, 2))) +
   theme_bw()
 ```
@@ -513,7 +513,7 @@ between the supervised and unsupervised ordination methods.
 |                          | supervised ordination  | unsupervised ordination  |
 |:------------------------:|:----------------------:|:------------------------:|
 | Euclidean distance       | RDA                    | PCA                      |
-| non-Euclidean distance   | dbRDA                  | PCoA/MDS, NMDS, UMAP     | 
+| non-Euclidean distance   | dbRDA                  | PCoA/MDS, NMDS, UMAP     |
 
 In summary, the dbRDA is the more general method that allows a wider
 variety dissimilarity, or beta diversity, indices. This method is
@@ -631,11 +631,11 @@ ggplot(df, aes(x = x, y = y)) +
 ```
 
 In the example above, the largest differences between the two groups
-can be attributed to _`r names(sort(abs(top_coef), decreasing = TRUE))[[1]]`_ 
+can be attributed to _`r names(sort(abs(top_coef), decreasing = TRUE))[[1]]`_
 and
 _`r names(sort(abs(top_coef), decreasing = TRUE))[[2]] `_.
 
-### Checking the homogeneity condition 
+### Checking the homogeneity condition
 
 It is important to note that the application of PERMANOVA assumes
 homogeneous group dispersions (variances). This can be tested with the
@@ -757,11 +757,11 @@ percent_FALSE <- round(freq_FALSE / sum(freq_FALSE) * 100, 1)
 plotReducedDim(
     tse_genus[ , colData(tse_genus)$Group == TRUE], "PCoA_BC",
     colour_by = "most_abundant") +
-    
+
     scale_colour_manual(
         values = my_colors,
         labels = paste0(names(percent_TRUE), "(", percent_TRUE, "%)")) +
-  
+
   labs(
       x = paste("PC 1 (", round(var_explained[1], 1), "%)"),
       y = paste("PC 2 (", round(var_explained[2], 1), "%)"),
@@ -770,11 +770,11 @@ plotReducedDim(
 plotReducedDim(
     tse_genus[ , colData(tse_genus)$Group == FALSE], "PCoA_BC",
     colour_by = "most_abundant") +
-    
+
     scale_colour_manual(
         values = my_colors,
         labels = paste0(names(percent_FALSE), "(", percent_FALSE, "%)")) +
-    
+
     labs(
         x = paste("PC 1 (", round(var_explained[1], 1), "%)"),
         y = paste("PC 2 (", round(var_explained[2], 1), "%)"),
@@ -808,6 +808,6 @@ and `runTSNE`
 
 For more information on sample clustering, you can refer to:
 
-* [How to extract information from clusters](http://bioconductor.org/books/release/OSCA/clustering.html)  
+* [How to extract information from clusters](http://bioconductor.org/books/release/OSCA/clustering.html)
 * Chapter [@sec-clustering] on community typing
 :::
diff --git a/inst/pages/clustering.qmd b/inst/pages/clustering.qmd
index 152fdfaf7..3a26f7e29 100644
--- a/inst/pages/clustering.qmd
+++ b/inst/pages/clustering.qmd
@@ -5,36 +5,36 @@ library(rebook)
 chapterPreamble()
 ```
 
-Community typing in microbial ecology involves identifying distinct microbial 
-communities by recognizing patterns in the data. The community types are 
+Community typing in microbial ecology involves identifying distinct microbial
+communities by recognizing patterns in the data. The community types are
 described based on taxa characteristics, representing each community.
-The samples are described 
-by community type assignments, which are defined by the ecosystem features in 
+The samples are described
+by community type assignments, which are defined by the ecosystem features in
 the sample. Community typing techniques can typically be divided
-into unsupervised 
-clustering and dimensionality reduction techniques, where clustering is more 
+into unsupervised
+clustering and dimensionality reduction techniques, where clustering is more
 commonly in use.
 
 In this chapter, we first walk you through some common clustering schemes in use
 for microbial community ecology, and then focus in more detailed community
 typing techniques based on data dimensionality reduction.
 
-## Clustering 
+## Clustering
 
 Clustering techniques aim to find groups, called clusters, that share a pattern
-in the data. In the microbiome context, clustering techniques are included in 
-microbiome community typing methods. For example, clustering allow samples to be 
+in the data. In the microbiome context, clustering techniques are included in
+microbiome community typing methods. For example, clustering allow samples to be
 distinguished from each other based on their microbiome community composition.
-Clustering scheme consists of two steps, the first is to compute the sample 
-dissimilarities with a given distance metrics, and the second is to form the 
+Clustering scheme consists of two steps, the first is to compute the sample
+dissimilarities with a given distance metrics, and the second is to form the
 clusters based on the dissimilarity matrix. The data can be clustered either
-based on features or samples. The examples below 
+based on features or samples. The examples below
 are focused on sample clustering.
 
-There are multiple clustering algorithms available. *bluster* is a 
-Bioconductor package providing tools for clustering data in the 
-`SummarizedExperiment` container. It offers multiple algorithms such 
-as hierarchical clustering, DBSCAN, and K-means. 
+There are multiple clustering algorithms available. *bluster* is a
+Bioconductor package providing tools for clustering data in the
+`SummarizedExperiment` container. It offers multiple algorithms such
+as hierarchical clustering, DBSCAN, and K-means.
 
 ```{r load_bluster}
 # Load dependencies
@@ -43,11 +43,11 @@ library(kableExtra)
 ```
 
 In the examples of this chapter we use *peerj13075* data for
-microbiota community 
-typing. This chapter illustrates how different results can be obtained depending 
-on the choice of the algorithm. To reduce calculation time we decided to 
-agglomerate taxa onto 'Class' level and filter out the least prevalent taxa as 
-well as less commonly detected within a sample resulting in 25 features in the 
+microbiota community
+typing. This chapter illustrates how different results can be obtained depending
+on the choice of the algorithm. To reduce calculation time we decided to
+agglomerate taxa onto 'Class' level and filter out the least prevalent taxa as
+well as less commonly detected within a sample resulting in 25 features in the
 data.
 
 ```{r load-pkg-data1}
@@ -64,23 +64,23 @@ altExp(tse, "prevalent") <- agglomerateByPrevalence(
 
 The hierarchical clustering aims to find hierarchy between
 samples/features. There are two approaches: agglomerative ("bottom-up")
-and divisive ("top-down"). In agglomerative approach, each observation is first 
-in a unique cluster, after which the algorithm continues to agglomerate similar 
-data points into clusters. The divisive approach, instead, starts with one 
-cluster that contains all observations. Clusters are split recursively into 
-clusters that differ the most. The clustering can be continued until each 
+and divisive ("top-down"). In agglomerative approach, each observation is first
+in a unique cluster, after which the algorithm continues to agglomerate similar
+data points into clusters. The divisive approach, instead, starts with one
+cluster that contains all observations. Clusters are split recursively into
+clusters that differ the most. The clustering can be continued until each
 cluster contains only one observation.
 
-In this example we use `addCluster` function from mia to  cluster the data. 
-`addCluster` function allows to choose a clustering algorithm and offers multiple 
-parameters to shape the result. `HclustParam()` parameter is chosen for hierarchical 
-clustering. `HclustParam()` parameter itself has parameters on its own 
+In this example we use `addCluster` function from mia to  cluster the data.
+`addCluster` function allows to choose a clustering algorithm and offers multiple
+parameters to shape the result. `HclustParam()` parameter is chosen for hierarchical
+clustering. `HclustParam()` parameter itself has parameters on its own
 [HclustParam documentation](https://rdrr.io/github/LTLA/bluster/man/HclustParam-class.html).
-A parameter, `by` defines whether to cluster features or samples. Here we 
-cluster counts data, for which we compute the dissimilarities with the 
-Bray-Curtis distance. Here, again, we use `ward.d2` metohod.  Returning 
-statistical information can chosen by using 
-`full` parameter. Finally, the `clust.col` parameter allows us to choose the 
+A parameter, `by` defines whether to cluster features or samples. Here we
+cluster counts data, for which we compute the dissimilarities with the
+Bray-Curtis distance. Here, again, we use `ward.d2` metohod.  Returning
+statistical information can chosen by using
+`full` parameter. Finally, the `clust.col` parameter allows us to choose the
 name of the column in the colData (default name is `clusters`).
 
 ```{r hclust1}
@@ -100,8 +100,8 @@ table(altExp(tse, "prevalent")$hclust)
 
 Hierarchical clustering suggests six clusters to describe the data.
 
-Now, we visualize the hierarchical structure of the clusters with a dendrogram 
-tree. In dendrograms, the tree is split where the branch length is the largest. 
+Now, we visualize the hierarchical structure of the clusters with a dendrogram
+tree. In dendrograms, the tree is split where the branch length is the largest.
 In each splitting point, the tree is divided into two clusters leading to the
 hierarchy. In this example, each sample is labelled by their dominant taxon
 to visualize ecological differences between the clusters.
@@ -133,8 +133,8 @@ par(mar=c(9, 3, 0.5, 0.5))
 dend %>% set("labels_cex", 0.8) %>% plot()
 ```
 
-We can also visualize the clusters by by projecting the data onto
-two-dimensional 
+We can also visualize the clusters by projecting the data onto
+two-dimensional
 surface that captures the most variability in the data. In this example, we use
 multi-dimensional scaling (MDS).
 
@@ -144,23 +144,22 @@ library (scater)
 # Add the MDS dimensions for plotting
 altExp(tse, "prevalent") <- runMDS(
     altExp(tse, "prevalent"),
-    assay.type = "counts", 
-    FUN = vegan::vegdist, 
+    assay.type = "counts",
+    FUN = vegan::vegdist,
     method = "bray")
 ```
 
 ```{r hclust4, fig.width=6, fig.height=4}
 plotReducedDim(
     altExp(tse, "prevalent"), "MDS", colour_by = "hclust",
-    theme_size = 13, point_size = 4) + 
-    
+    theme_size = 13, point_size = 4) +
     labs(color = "HClust", title = "Hclust") + theme_minimal()
 ```
 
 ### Agglomeration based on clusters {#sec-taxa-clustering}
 
 Another way to agglomerate the data is to cluster the taxa. To do so,
-we usually start by doing a compositionality aware transformation such as CLR, 
+we usually start by doing a compositionality aware transformation such as CLR,
 followed by the application of a standard clustering method.
 
 Here is an example that does a CLR transformation followed by the hierarchical
@@ -184,14 +183,14 @@ tse <- addCluster(
     tse,
     assay.type = "standardize",
     clust.col = "hclustKendall",
-    MARGIN = "rows", 	       
+    MARGIN = "rows",
     HclustParam(dist.fun = kendall_dissimilarity, method = "ward.D2"))
 ```
 
-Results are stored in the `rowData` column specified 
+Results are stored in the `rowData` column specified
 with the `clust.col` parameter.
 
-To better visualize the results and the distribution of the clusters, we can 
+To better visualize the results and the distribution of the clusters, we can
 plot the histogram of the clusters.
 
 ```{r taxa_clustering_histogram}
@@ -220,14 +219,14 @@ You can find more information on agglomeration from [@sec-agglomeration].
 ### Dirichlet Multinomial Mixtures (DMM)
 
 This section focuses on
-[Dirichlet-Multinomial MixturenModel](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030126) 
-analysis. It is a Bayesian clustering technique that allows to search for sample 
+[Dirichlet-Multinomial MixturenModel](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030126)
+analysis. It is a Bayesian clustering technique that allows to search for sample
 patterns that reflect sample similarity in the data using both prior information
 and the data.
 
-In this example, we cluster the data with DMM clustering. In the example below, 
+In this example, we cluster the data with DMM clustering. In the example below,
 we calculate model fit using Laplace approximation ro reduce the
-calculation time. 
+calculation time.
 The cluster information is added in the metadata with an
 optional `name`. In this
 example, we use a prior assumption that the optimal number of clusters to
@@ -237,21 +236,21 @@ describe the data would be in the range from 1 to 6.
 # Run the model and calculates the most likely number of clusters from 1 to 6
 set.seed(1495723)
 altExp(tse, "prevalent") <- addCluster(
-    altExp(tse, "prevalent"), 
-    assay.type = "counts", 
-    name = "DMM", 
-    DmmParam(k = 1:6, type = "laplace"), 
-    MARGIN = "samples", 
-    full = TRUE, 
+    altExp(tse, "prevalent"),
+    assay.type = "counts",
+    name = "DMM",
+    DmmParam(k = 1:6, type = "laplace"),
+    MARGIN = "samples",
+    full = TRUE,
     clust.col = "dmmclust")
 ```
 
-The plot below represents the Laplace approximation to the model evidence for 
-each of the six models. We can see that the best number of clusters is two, as 
+The plot below represents the Laplace approximation to the model evidence for
+each of the six models. We can see that the best number of clusters is two, as
 the minimum value of Laplace approximation to the negative log model
-evidence for 
-DMM models as a function of k, determines the optimal k. The optimal k suggests 
-to fit a model with k mixtures of Dirichlet distributions as a prior. 
+evidence for
+DMM models as a function of k, determines the optimal k. The optimal k suggests
+to fit a model with k mixtures of Dirichlet distributions as a prior.
 
 ```{r dmm2, fig.width=5, fig.height=3}
 library(miaViz)
@@ -259,7 +258,7 @@ p <- plotDMNFit(altExp(tse, "prevalent"), type = "laplace", name = "DMM")
 p + theme_minimal(base_size = 11)
 ```
 
-The plot above suggests that two clusters describe the data the best. Now 
+The plot above suggests that two clusters describe the data the best. Now
 we can plot, which taxa define the key cluster features.
 
 ```{r dmm3, fig.width=6, fig.height=2}
@@ -273,45 +272,45 @@ drivers$class <- rownames(drivers)
 for (i in 1:2) {
   # Transfer to relative to optimize plotting
   drivers[i] <- drivers[i]/sum(drivers[i])
-  
+
   drivers <- drivers[order(drivers[[i]], decreasing = TRUE),]
   p <- ggplot(head(drivers, 5), aes(
       x = reorder(head(class, 5), + head(drivers[[i]], 5)),
       y = head(drivers[[i]], 5))) +
-      
+
       geom_bar(stat = "identity", fill = scales::hue_pal()(1), alpha = 0.6) +
-      coord_flip() + 
+      coord_flip() +
       labs(title = paste("Top phyla in group", i)) +
-      theme_minimal(base_size = 11) + 
-      labs(x="", y="") + 
+      theme_minimal(base_size = 11) +
+      labs(x="", y="") +
       scale_y_continuous(limits=c(0,1))
   print(p)
 }
 ```
 
-As well as in hierarchical clustering, we can also visualize the clusters by 
+As well as in hierarchical clustering, we can also visualize the clusters by
 projecting the data onto two-dimensional surface calculated using MDS. The plot
 rather clearly demonstrates the cluster division.
 
 ```{r dmm4}
 # DMM clusters
-plotReducedDim(altExp(tse, "prevalent"), "MDS", theme_size = 13) + 
-              geom_point(size = 3, alpha = 0.6, 
+plotReducedDim(altExp(tse, "prevalent"), "MDS", theme_size = 13) +
+              geom_point(size = 3, alpha = 0.6,
                       aes(color = altExp(tse, "prevalent")$dmmclust)) +
               labs(color = "DMMclust", title = "DMM") + theme_minimal()
 ```
 
 ## Dimensionality reduction
 
-Dimensionality reduction can be considered as a part of community typing, 
-where the aim is to find underlying structure in the data. Dimensionality 
+Dimensionality reduction can be considered as a part of community typing,
+where the aim is to find underlying structure in the data. Dimensionality
 reduction relies in the idea that the data can be describe in a
-lower rank representation, where latent features with different loads of the 
-observed ones describe the sample set. This representation stores the 
-information about sample dissimilarity with significantly reduced amount of 
-data dimensions. Unlike clustering, where each sample is assigned into one 
-cluster, dimensionality reduction allows continuous memberships for all samples 
-into all community types. Dimensionality reduction techniques are explained in 
+lower rank representation, where latent features with different loads of the
+observed ones describe the sample set. This representation stores the
+information about sample dissimilarity with significantly reduced amount of
+data dimensions. Unlike clustering, where each sample is assigned into one
+cluster, dimensionality reduction allows continuous memberships for all samples
+into all community types. Dimensionality reduction techniques are explained in
 chapter [@sec-community-similarity] in more detail.
 
 ### Non-negative matrix factorization
@@ -320,7 +319,7 @@ In this section, we will particularly focus on Non-negative Matrix Factorization
 (NMF). NMF decomposes the original data *X* into two lower rank  matrices: *H*
 representing key taxa characteristics of each
 community type and *W* representing
-continuous sample memberships across all community types. NMF algorithm 
+continuous sample memberships across all community types. NMF algorithm
 calculates the minimum distance between the original data and it's approximation
 using Kullback-Leibler divergence.
 
@@ -359,7 +358,7 @@ for each sample onto a two-dimensional surface.
 plotReducedDim(
     altExp(tse, "prevalent"), "MDS", colour_by = "NMFcomponent",
     point_size = 3, theme_size = 13) +
-    
+
     labs(color = "NMFcomponent", title = "NMF") + theme_minimal()
 ```
 
@@ -421,7 +420,7 @@ library(patchwork)
 # Elbow method
 p1 <- fviz_nbclust(assay(tse, "standardize"), kmeans, method = "wss")
 # Silhouette method
-p2 <- fviz_nbclust(assay(tse, "standardize"), kmeans, method = "silhouette") 
+p2 <- fviz_nbclust(assay(tse, "standardize"), kmeans, method = "silhouette")
 
 p1 + p2
 ```
@@ -440,7 +439,7 @@ colData(tse)[["clusters"]]
 
 Now we can create a heatmap with additional annotations.
 
-```{r pheatmap8}
+```{r pheatmap8, fig.width=13, fig.height=7}
 # Order the data based on clusters
 tse <- tse[
     order(rowData(tse)[["clusters"]]),
@@ -461,7 +460,7 @@ breaks <- seq(
 colors <- colorRampPalette(c("darkblue", "blue", "white", "red", "darkred"))(length(breaks)-1)
 
 pheatmap(
-    mat, annotation_row = taxa_clusters, 
+    mat, annotation_row = taxa_clusters,
     annotation_col = sample_data,
     cluster_cols = FALSE, cluster_rows = FALSE,
     breaks = breaks,
diff --git a/inst/pages/containers.qmd b/inst/pages/containers.qmd
index fdf530667..2b0642b28 100644
--- a/inst/pages/containers.qmd
+++ b/inst/pages/containers.qmd
@@ -69,8 +69,8 @@ An assay is a way of measuring the presence and abundance of different types
 of microbes in a sample. For example, if you want to know how many bacteria of
 a certain type are in your gut, you can use an assay to measure this. When
 storing assays, the original data is count-based. However, the original
-count-based taxonomic abundance tables may undergo different 
-transformations, such as logarithmic, Centered Log-Ratio (CLR), or relative 
+count-based taxonomic abundance tables may undergo different
+transformations, such as logarithmic, Centered Log-Ratio (CLR), or relative
 abundance. These are typically stored in _**assays**_. See
 [@sec-assay-transform] for more information on transformations.
 
@@ -90,7 +90,7 @@ assay(tse, "counts") |> head()
 So, in summary, in the world of microbiome analysis, an assay is essentially
 a way to quantify and understand the composition of microbes in a given sample,
 which is super important for all kinds of research, ranging from human health
-to environment studies. 
+to environment studies.
 
 Furthermore, to illustrate the use of multiple assays, we can create an empty
 matrix and add it to the object.
@@ -101,7 +101,7 @@ assay(tse, "empty_table", withDimnames=FALSE) <- mat
 assays(tse)
 ```
 
-Now there are two assays available in the `tse` object, `counts` and 
+Now there are two assays available in the `tse` object, `counts` and
 `empty_table`.
 
 ```{r}
@@ -140,13 +140,13 @@ microbiome composition and various environmental or health factors.
 rowData(tse)
 ```
 
-## rowTree  
+## rowTree
 
-Phylogenetic trees also play an important role in the microbiome field. The 
+Phylogenetic trees also play an important role in the microbiome field. The
 `TreeSE` class can keep track of features and node
 relations via two functions, `rowTree` and `rowLinks`.
 
-A tree can be accessed via `rowTree` as `phylo` object.      
+A tree can be accessed via `rowTree` as `phylo` object.
 
 ```{r rowtree}
 rowTree(tse)
@@ -158,7 +158,7 @@ The links to the individual features are available through `rowLinks`.
 rowLinks(tse)
 ```
 
-Please note that there can be a 1:1 relationship between tree nodes and 
+Please note that there can be a 1:1 relationship between tree nodes and
 features, but this is not a must-have. This means there can be features that
 are not linked to nodes, and nodes that are not linked to features. To change
 the links in an existing object, the `changeTree` function is available.
@@ -177,9 +177,9 @@ sequencing. Another common use case is including abundance tables for
 different taxonomic ranks. Such alternative experiments concerning the
 same set of samples can be stored as
 
-1. Separate _assays_ assuming that the taxonomic information can be mapped 
-between features directly 1:1; or 
-2. Data in the _altExp_ slot of the `TreeSE`, if the feature 
+1. Separate _assays_ assuming that the taxonomic information can be mapped
+between features directly 1:1; or
+2. Data in the _altExp_ slot of the `TreeSE`, if the feature
 dimensions differ. Each element of the _altExp_ slot is a `SE`
 or an object from a derived class with independent feature data.
 
@@ -227,14 +227,14 @@ _**Multiple experiments**_ relate to complementary measurement types,
 such as transcriptomic or metabolomic profiling of the microbiome or
 the host. Multiple experiments can be represented using the same
 options as alternative experiments, or by using the
-`MAE` class [@Ramos2017]. Depending on how the 
+`MAE` class [@Ramos2017]. Depending on how the
 datasets relate to each other the data can be stored as:
 
 1.  _altExp_ if the samples can be matched directly 1:1; or
 2. As `MAE` objects, in which the connections between
 samples are defined through a `sampleMap`. Each element on the
 `ExperimentList` of an `MAE` is `matrix` or
-`matrix`-like objects, including `SE` objects, and 
+`matrix`-like objects, including `SE` objects, and
 the number of samples can differ between the elements.
 
 The `MAE` object can handle more complex relationships between experiments.
@@ -254,7 +254,7 @@ important bookkeeper, maintaining the information about which samples are
 associated with which experiments. This ensures that data linkages are
 correctly managed and preserved across different types of experiments.
 
-In fact, we can have 
+In fact, we can have
 
 ```{r}
 #| label: show_mae2
@@ -287,16 +287,16 @@ samples belong to which patient.
 
 For information have a look at the
 [intro vignette](https://bioconductor.org/packages/release/bioc/vignettes/MultiAssayExperiment/inst/doc/MultiAssayExperiment.html)
-of the `MultiAssayExperiment` package.  
+of the `MultiAssayExperiment` package.
 
 ::: {.callout-tip}
 ## Recommended options for storing multiple data tables in microbiome studies
 
-   Option   Rows (features)    Cols (samples)               Recommended  
+   Option   Rows (features)    Cols (samples)               Recommended
 ---------   --------------    ---------------  ------------------------
-   assays  	         match              match       Data transformations  
-   altExp             free              match    Alternative experiments  
-MultiAssay            free      free (mapping)    Multi-omic experiments    
+   assays  	         match              match       Data transformations
+   altExp             free              match    Alternative experiments
+MultiAssay            free      free (mapping)    Multi-omic experiments
 
 :
 The _assays_ are best suited for data transformations (one-to-one match between
diff --git a/inst/pages/convert.qmd b/inst/pages/convert.qmd
index bc8fd6551..5686771e6 100644
--- a/inst/pages/convert.qmd
+++ b/inst/pages/convert.qmd
@@ -19,14 +19,14 @@ Biom, and DADA2).
 library(mia)
 
 # phyloseq example data
-data(GlobalPatterns, package = "phyloseq") 
+data(GlobalPatterns, package = "phyloseq")
 GlobalPatterns_phyloseq <- GlobalPatterns
 GlobalPatterns_phyloseq
 ```
 
 ```{r, message=FALSE}
 # convert phyloseq to TSE
-GlobalPatterns_TSE <- convertFromPhyloseq(GlobalPatterns_phyloseq) 
+GlobalPatterns_TSE <- convertFromPhyloseq(GlobalPatterns_phyloseq)
 GlobalPatterns_TSE
 ```
 
@@ -38,7 +38,7 @@ instance when additional methods are available for `phyloseq`.
 
 ```{r, message=FALSE}
 # convert TSE to phyloseq
-GlobalPatterns_phyloseq2 <- convertToPhyloseq(GlobalPatterns_TSE) 
+GlobalPatterns_phyloseq2 <- convertToPhyloseq(GlobalPatterns_TSE)
 GlobalPatterns_phyloseq2
 ```
 
@@ -69,7 +69,7 @@ such as Julia or Python. For information, have a look at the
 #| label: write_feather
 #| eval: false
 
-data(GlobalPatterns, package = "mia") 
+data(GlobalPatterns, package = "mia")
 tse <- GlobalPatterns
 
 molten_tse <- meltSE(
@@ -86,7 +86,7 @@ write_feather(molten_tse, path)
 
 Another way could be using a CSV file. This works  the same as for a feather
 file, make sure you have converted your `TreeSE` data container as a dataframe
-Here note that you can decide whether you want to write the row names or not. 
+Here note that you can decide whether you want to write the row names or not.
 
 ```{r}
 #| label: write_csv
diff --git a/inst/pages/correlation.qmd b/inst/pages/correlation.qmd
index 0b894a0a5..0c8157187 100644
--- a/inst/pages/correlation.qmd
+++ b/inst/pages/correlation.qmd
@@ -49,7 +49,8 @@ library(qgraph)
 # Create correlation network plot
 qgraph(
     res$cor, layout = "spring", labels = colnames(res$cor),
-    label.cex = 3, theme = "colorblind")
+    label.cex = 1.2, theme = "colorblind",
+    node.width = 1.5, node.height = 2)
 ```
 
 You can find more on networks from [@sec-network-learning].
@@ -80,7 +81,7 @@ Below, we present the results using a heatmap visualization.
 #| fig-width: 8
 #| fig-height: 8
 
-library(ComplexHeatmap) 
+library(ComplexHeatmap)
 library(shadowtext)
 
 # Function for marking significant correlations with "X"
@@ -133,7 +134,7 @@ p
 ```
 
 ::: {.callout-tip}
-## Cross-correlation
+## Cross-association
 
-See [@sec-cross-correlation] for further information on correlation analyses.
+See [@sec-cross-correlation] for further information on correlation and association analyses.
 :::
diff --git a/inst/pages/cross_correlation.qmd b/inst/pages/cross_correlation.qmd
index fa448c3aa..6f66fdf79 100644
--- a/inst/pages/cross_correlation.qmd
+++ b/inst/pages/cross_correlation.qmd
@@ -31,24 +31,24 @@ already in `MAE` format which is tailored for multi-assay analyses
 (see [@sec-containers]). The dataset includes three different experiments:
 microbial abundance data, metabolite concentrations, and data about
 different biomarkers. If you like to construct the same data object from the
-original files instead, [here]([@sec-loading-experimental-microbiome-data]) 
+original files instead, [here]([@sec-loading-experimental-microbiome-data])
 you can find help for importing data into an SE object.
 
-## Cross-correlation Analysis {#sec-cross-correlation}
+## Cross-association Analysis {#sec-cross-correlation}
 
-Cross-correlation analysis is a straightforward approach that can
+Cross-association analysis is a straightforward approach that can
 reveal strength and type of associations between data sets. For instance,
 we can analyze if higher presence of a specific taxon relates to higher
 levels of a biomolecule. Correlation analyses within dataset were already
 discussed in [@sec-correlation].
 
-```{r cross-correlation1}
+```{r cross-association1}
 # Load the data
 data(HintikkaXOData, package = "mia")
 mae <- HintikkaXOData
 ```
 
-```{r cross-correlation2}
+```{r cross-association2}
 library(stringr)
 # Drop those bacteria that do not include information in Phylum or lower levels
 mae[[1]] <- mae[[1]][!is.na(rowData(mae[[1]])$Phylum), ]
@@ -67,17 +67,17 @@ experiments(mae)
 getWithColData(mae, "microbiota")
 ```
 
-```{r cross-correlation3}
+```{r cross-association3}
 # Metabolite data
 getWithColData(mae, "metabolites")
 ```
 
-```{r cross-correlation4}
+```{r cross-association4}
 # Biomarker data
 getWithColData(mae, "biomarkers")
 ```
 
-Next we can perform a cross-correlation analysis. Let us analyze if
+Next we can perform a cross-association analysis. Let us analyze if
 individual bacteria genera are correlated with concentrations of
 individual metabolites. This helps to answer the following question: "If
 bacterium X is present, is the concentration of metabolite Y lower or higher"?
@@ -93,10 +93,10 @@ rownames(mae[[1]]) <- getTaxonomyLabels(mae[[1]])
 
 # Cross correlates data sets
 res <- getCrossAssociation(
-    mae, 
+    mae,
     experiment1 = 1,
     experiment2 = 2,
-    assay.type1 = "log10", 
+    assay.type1 = "log10",
     assay.type2 = "nmr",
     method = "spearman",
     test.signif = TRUE,
@@ -112,7 +112,7 @@ Next, we create a heatmap depicting all cross-correlations between bacterial
 genera and metabolite concentrations.
 
 ```{r cross-correlation6, fig.width=10, fig.height=8}
-library(ComplexHeatmap) 
+library(ComplexHeatmap)
 library(shadowtext)
 
 # Function for marking significant correlations with "X"
diff --git a/inst/pages/differential_abundance.qmd b/inst/pages/differential_abundance.qmd
index 4e42346fe..cb8053ab0 100755
--- a/inst/pages/differential_abundance.qmd
+++ b/inst/pages/differential_abundance.qmd
@@ -6,7 +6,7 @@ chapterPreamble()
 ```
 
 Differential Abundance Analysis (DAA) is a method used to identify
-differences in the abundances of individual taxa (at any taxonomic level) 
+differences in the abundances of individual taxa (at any taxonomic level)
 between two or more groups, such as treatment versus control groups.
 Here, we demonstrate its implementation on
 [Tengeler2020](https://microbiome.github.io/mia/reference/Tengeler2020.html)
@@ -17,13 +17,13 @@ and gain understanding of a complex system by looking at its isolated
 components. For example, the identification of a bacterial taxon that is
 more or less abundant between healthy patients and diseased patients can
 lead to important
-insights into the underlying mechanisms of the disease. In other words, 
-differentially abundant taxa can be involved in the dynamics of the 
-disease, which in turn helps understand the system as a whole. Despite its 
+insights into the underlying mechanisms of the disease. In other words,
+differentially abundant taxa can be involved in the dynamics of the
+disease, which in turn helps understand the system as a whole. Despite its
 relevance in current research, the DAA approach has also been subject to
 debate [@Quinn2021].
 
-## Statistical challenges of microbiome data 
+## Statistical challenges of microbiome data
 
 As discussed in [@sec-stat-challenges], data display unique properties that are
 exclusively addressed by DAA tools developed for microbiome analysis.
@@ -32,13 +32,13 @@ We recommend to have a look at @Nearing2022. In this study,
 multiple DAA methods
 were applied to 38 different datasets and their results were compared to one
 another. The study highlighted that different methods might
-yield varying results 
+yield varying results
 due to differing assumptions and normalization techniques. Interestingly,
 @pelto2024 concluded that elementary DAA methods achieve the highest
 consistency.
 
 Recently, @Yang2022 comprehensively evaluated DAA methods via a
-semi-parametric framework and 106 real 
+semi-parametric framework and 106 real
 datasets, concluding that different methods can produce contradictory results,
 creating the risk of cherry-picking the most favorable options for one’s
 own hypothesis. Therefore, it is highly recommended to perform DAA with
@@ -64,7 +64,7 @@ such as **over-dispersed count models** and **zero-inflated mixture models**.
 DESeq2, edgeR and corncorb are based on over-dispersed count models, whereas
 metagenomeSeq, RAIDA, ZIBB and Omnibus implement zero-inflated mixture models to
 address zero-inflation. Typically, these models assume a negative binomial,
-beta-binomial or normal/log-normal distribution. 
+beta-binomial or normal/log-normal distribution.
 
 Another approach to deal with zero-inflated data is **zero imputation**
 where zeross are replaced with estimated values. ALDEx2 and
@@ -73,7 +73,7 @@ data, accounting for sampling variability and sequencing depth variation. Other
 methods, such as MaAsLin2 and ANCOMBC impute the zeros with a
 pseudo-count strategy.
 
-#### Approaches to handle Compositional Data : 
+#### Approaches to handle Compositional Data :
 
 To address the compositionality of microbiome data, several approaches have been
 developed to perform **robust normalization** with methods specifically designed
@@ -81,18 +81,18 @@ to reduce the bias found in compositional data. Some examples include :
 
 - *Trimmed mean of M-values (TMM)* normalization used by edgeR.
 
-- *Relative log expression (RLE)* normalization used by DESeq2. 
+- *Relative log expression (RLE)* normalization used by DESeq2.
 
 - *Cumulative sum scaling (CSS)* normalization used by metagenomeSeq.
 
 - *Centered log-ratio transformation (CLR)* normalization used by ALDEx2.
 
-- *Geometric mean of pairwise ratios (GMPR)* normalization used by Omnibus 
+- *Geometric mean of pairwise ratios (GMPR)* normalization used by Omnibus
 and Wrench normalization [@Kumar2018], which corrects the compositional bias
-by an empirical Bayes approach. 
+by an empirical Bayes approach.
 
-Other methods to deal with compositional data entail reference taxa approach 
-used by DACOMP and RAIDA, analyzing the pattern of pairwise log ratios as done 
+Other methods to deal with compositional data entail reference taxa approach
+used by DACOMP and RAIDA, analyzing the pattern of pairwise log ratios as done
 by ANCOM and bias-correction applied by ANCOMBC.
 
 ## Using the tools
@@ -127,7 +127,7 @@ table(tse$patient_status, tse$cohort) %>%
   knitr::kable()
 ```
 
-### Preparing the data for DAA 
+### Preparing the data for DAA
 
 Before starting the analysis, it is recommended to reduce the size and
 complexity of the data to make the results more reproducible. For this
@@ -153,7 +153,7 @@ regarded as the method of choice for its consistency, as it normally
 identifies features that are also found by complementary methods
 [@Nearing2022]. A more
 extensive introduction to its functionality is available in the
-[ALDEx2 vignette](https://bioconductor.org/packages/release/bioc/vignettes/ALDEx2/inst/doc/ALDEx2_vignette.html). 
+[ALDEx2 vignette](https://bioconductor.org/packages/release/bioc/vignettes/ALDEx2/inst/doc/ALDEx2_vignette.html).
 
 ALDEx2 estimates technical variation within each sample per taxon by utilizing
 the Dirichlet distribution. It furthermore applies the CLR transformation (or
@@ -171,7 +171,7 @@ library(ALDEx2)
 # Convert each instance using the centered log-ratio transform.
 # This is the input for all further analyses.
 set.seed(123)
-x <- aldex.clr(assay(tse), tse$patient_status)     
+x <- aldex.clr(assay(tse), tse$patient_status)
 ```
 
 The t-test:
@@ -191,7 +191,7 @@ Effect sizes:
 # of the between group difference and the larger of the variance within groups
 x_effect <- aldex.effect(x, CI = TRUE, verbose = FALSE)
 
-# combine all outputs 
+# combine all outputs
 aldex_out <- data.frame(x_tt, x_effect)
 ```
 
@@ -262,19 +262,19 @@ this method provides p-values and confidence intervals for each
 taxon. It also controls the FDR and it is computationally simple to
 implement.
 
-Note that the original method was implemented in the `ancombc()` function (see 
+Note that the original method was implemented in the `ancombc()` function (see
 [extended tutorial](https://www.bioconductor.org/packages/release/bioc/vignettes/ANCOMBC/inst/doc/ANCOMBC.html)).
 The method has since then been updated and new features have been added to enable
-multi-group comparisons and repeated measurements among other improvements. 
-We do not cover the more advanced features of ANCOMBC in this tutorial 
-as these features are documented in detail in this 
+multi-group comparisons and repeated measurements among other improvements.
+We do not cover the more advanced features of ANCOMBC in this tutorial
+as these features are documented in detail in this
 [tutorial](https://www.bioconductor.org/packages/release/bioc/vignettes/ANCOMBC/inst/doc/ANCOMBC2.html).
 
-We now proceed with a simple example.  First, we specify a formula. In this 
-formula, other covariates could potentially be included to adjust for 
-confounding. We show this further below. Again, please make sure to check the 
+We now proceed with a simple example.  First, we specify a formula. In this
+formula, other covariates could potentially be included to adjust for
+confounding. We show this further below. Again, please make sure to check the
 [function documentation](https://rdrr.io/github/FrederickHuangLin/ANCOMBC/man/ancombc.html)
-as well as the linked tutorials to learn about the additional arguments 
+as well as the linked tutorials to learn about the additional arguments
 that we specify.
 
 ```{r run-ancombc, warning=FALSE}
@@ -305,7 +305,7 @@ errors, p-values and q-values. Below we show the first entries of this
 dataframe.
 
 ```{r ancombc-res}
-# store the FDR adjusted results 
+# store the FDR adjusted results
 ancombc2_out$res %>%
   dplyr::select(taxon, lfc_patient_statusControl, q_patient_statusControl) %>%
   filter(q_patient_statusControl < 0.05) %>%
@@ -314,7 +314,7 @@ ancombc2_out$res %>%
   knitr::kable()
 ```
 
-### MaAsLin2 
+### MaAsLin2
 
 Let us next illustrate MaAsLin2 [@Mallick2020]. This method is based on
 generalized linear models and flexible for different study designs
@@ -325,8 +325,8 @@ and covariate structures. For details, check their
 # Load package
 library(Maaslin2)
 
-# maaslin expects features as columns and samples as rows 
-# for both the abundance table as well as metadata 
+# maaslin expects features as columns and samples as rows
+# for both the abundance table as well as metadata
 
 # We can specify different GLMs/normalizations/transforms.
 # specifying a ref is especially important if you have more than 2 levels
@@ -362,7 +362,7 @@ significant taxa.
 
 PhILR is a tree-based method that tests group-wise associations based on
 balances. A detailed introduction to this method is available in
-[this Bioconductor tutorial](https://www.bioconductor.org/packages/devel/bioc/vignettes/philr/inst/doc/philr-intro.html). 
+[this Bioconductor tutorial](https://www.bioconductor.org/packages/devel/bioc/vignettes/philr/inst/doc/philr-intro.html).
 
 ### Comparison of methods
 
@@ -380,7 +380,7 @@ apparent dynamics between the response and the main independent variable.
 They are common in experimental studies. Generally, they can be classified
 into three groups:
 
-- Biological confounders, such as age and sex 
+- Biological confounders, such as age and sex
 
 - Technical confounders produced during sample collection, processing and
 analysis
@@ -423,15 +423,15 @@ abundant taxa are associated with one of the variables when the other two
 are kept constant.
 
 ```{r run-adj-ancombc, warning=FALSE}
-# perform the analysis 
+# perform the analysis
 ancombc2_out <- ancombc2(
     tse,
     assay.type = "counts",
     fix_formula = "patient_status + cohort + library_size",
     p_adj_method = "fdr",
     lib_cut = 0,
-    group = "patient_status", 
-    struc_zero = TRUE, 
+    group = "patient_status",
+    struc_zero = TRUE,
     neg_lb = TRUE,
     alpha = 0.05,
     # multi-group comparison is deactivated automatically
@@ -455,38 +455,38 @@ ancombc2_out$res %>%
   knitr::kable()
 ```
 
-## Further information on tools for DAA : 
-  
+## Further information on tools for DAA :
+
 #### LinDA
-  
+
 LinDA covers linear models for differential abundance analysis of microbiome
-compositional data (@Zhou2022). This is very similar to ANCOMBC with few differences: 
-    
+compositional data (@Zhou2022). This is very similar to ANCOMBC with few differences:
+
   1. LinDA corrects for the compositional bias differently using the mode of
-  all regression coefficients. 
-  
-  2. It is faster (100x-1000x than ANCOMBC and according to the authors); 
-  
+  all regression coefficients.
+
+  2. It is faster (100x-1000x than ANCOMBC and according to the authors);
+
   3. It supports hierarchical models. The latest ANCOMBC versions are also
   supporting hierarchical models.
-  
+
 Nevertheless, LinDA seems a promising tool that achieves a very good power/fdr
 trade-off together with ANCOMBC according to the review. The speed improvements
 might make it critical especially for datasets that have higher sample or
 feature set sizes.
-  
+
 #### ZicoSeq
-  
+
 Subsequently, we demonstrate DAA with ZicoSeq, a method based on linear models
 and permutation. Further details can be found in this tutorial. This approach
 has been assessed to exhibit high power and a low false discovery rate, which
 has the following components:
-    
+
   1. Winsorization to decrease the influence of outliers;
-  
+
   2. Posterior sampling based on a beta mixture prior to address sampling
   variability and zero inflation;
-  
+
   3. Reference-based multiple-stage normalization to address compositional
   effects;Additional resources
 
diff --git a/inst/pages/exercises.qmd b/inst/pages/exercises.qmd
index 75929cbd3..361c7e1c8 100644
--- a/inst/pages/exercises.qmd
+++ b/inst/pages/exercises.qmd
@@ -1,11 +1,11 @@
 # Exercises {#sec-exercises}
 
-Here you can find assignments on different topics. 
+Here you can find assignments on different topics.
 
 **Tips for exercises:**
 
 
-   
+
 ::: {.callout-tip}
 ## Tips for exercises
 
@@ -17,7 +17,7 @@ that results can easily be understood by others without advanced knowledge on
 data analytics.
 - Avoid hard-coding. Use variables which get values in the beginning of the
 pipeline. That way it is easier for you to modify parameters and reuse the code.
-   
+
 :::
 
 ## Basics of R/Bioconductor
@@ -73,17 +73,17 @@ previously Rmarkdown) is to combine text with code in R or other programming
 languages, so that both the analytical pipeline and verbal description can be
 put in one place. In this exercise, we learn how to write and run code in Quarto.
 
-1. Open RStudio and create a new Quarto file (click on File tab - New File - Quarto document).  
+1. Open RStudio and create a new Quarto file (click on File tab - New File - Quarto document).
 2. Initialize a code chunk by pressing `alt` + `cmd` + `i` and define the variables
-   `A <- "my name"` and `B <- 0` in it.  
-3. Write the text `Below is my first code chunk` just above the code chunk.  
-4. Initialize one more code chunk and add 100 to the variable `B` in it.   
-5. Write the text `Below I change variable B` just above the second chunk.  
+   `A <- "my name"` and `B <- 0` in it.
+3. Write the text `Below is my first code chunk` just above the code chunk.
+4. Initialize one more code chunk and add 100 to the variable `B` in it.
+5. Write the text `Below I change variable B` just above the second chunk.
 6. **Extra**: Write the following line of text: `my name is A and I am B years old`,
    where A and B are variables defined in the code chunks upstream and change if
    those variables are modified. Be aware that inline code can be added as
-   `> r my_inline_command` (without `>`).  
-   
+   `> r my_inline_command` (without `>`).
+
 Good job. You are now able to combine text and code in a Quarto document. If you
 got stuck, you can refer to the Quarto docs on
 [using code chunks](https://quarto.org/docs/visual-editor/technical.html#code-chunks).
@@ -97,11 +97,11 @@ syntax `#| option-name: value`, also described
 [here](https://quarto.org/docs/computations/r.html#chunk-options).
 In this exercise, we explore part of the knitr potential.
 
-1. Open RStudio and create a new Quarto file.  
-2. Initialize three code chunks and label them as `setup`, `fig-box` and 
-   `tbl-coldata`, respectively. Remember that the name of a chunk can be 
-   specified with the `label` option.  
-3. Write the following code in the corresponding chunk and render the document.  
+1. Open RStudio and create a new Quarto file.
+2. Initialize three code chunks and label them as `setup`, `fig-box` and
+   `tbl-coldata`, respectively. Remember that the name of a chunk can be
+   specified with the `label` option.
+3. Write the following code in the corresponding chunk and render the document.
 
 ```{r}
 #| label: knitr-options-ex1
@@ -134,7 +134,7 @@ knitr::kable(head(colData(tse)))
 
 4. Set `include: false` in the `setup` chunk, `fig-width: 10` in the `fig-box`
    chunk and `echo: false` in the `tbl-coldata` chunk. Render the document again
-   and find the differences from before.  
+   and find the differences from before.
 
 The setup code and output are no longer visible, resulting in a cleaner
 document without initialization details. Figures in the fig-box chunk are now
@@ -144,14 +144,14 @@ the results rather than the code implementation.
 
 5. Add the options `fig-cap` and `tab-cap` to the `fig-box` and `tbl-coldata`
    chunks, respectively. They require some text input, which makes for the
-   caption of the figures or tables.  
+   caption of the figures or tables.
 6. **Extra**: Create a cross-reference to `fig-box` and `tbl-coldata` in the
    text above the respective code chunk. You can do that with the syntax
-   `@chunk-name`.  
+   `@chunk-name`.
 7. **Extra**: Define a custom folder for storing figures with `fig-path`. Insert
    it in `knitr::opts_chunk$set`, so that it applies globally to all the figures
-   generated in the document.  
-   
+   generated in the document.
+
 Congratulations! You are now familiar with the great potential and flexibility
 of knitr chunk options. An exhaustive list of available options can be found
 in the [knitr documentation](https://yihui.org/knitr/options/).
@@ -166,15 +166,15 @@ execution and figures can be specified. A comprehensive list of yaml options is
 available [here](https://quarto.org/docs/reference/formats/html.html). In this
 exercise, we will get a tiny taste of such functionality.
 
-1. Open RStudio and create a new Quarto file.  
+1. Open RStudio and create a new Quarto file.
 2. In the yaml box at the beginning of the document, change the title from
-   `Untitled` to `My first Quarto`.  
+   `Untitled` to `My first Quarto`.
 4. In the same box, add the two lines `author` and `date` followed by your name
-   and today's date, respectively.  
-5. Render the document and check its appearance.  
+   and today's date, respectively.
+5. Render the document and check its appearance.
 6. **Extra**: Set `toc: true` to produce a table of contents. This line should
-   follow `format` and `html` at the second level of indentation.  
-   
+   follow `format` and `html` at the second level of indentation.
+
 Well done! Now you are able to specify yaml options and understand how they
 affect your Quarto document. If you got stuck, you can check
 [this section](https://quarto.org/docs/tools/rstudio.html#yaml-intelligence) of
@@ -186,14 +186,14 @@ An advanced feature of Quarto consists of execution parameters, which are
 externally pre-defined variables that are also accessible in the Quarto document.
 They can be specified in the yaml box as `params`. Here we learn how to use them.
 
-1. Open RStudio and create a new Quarto file.  
+1. Open RStudio and create a new Quarto file.
 2. In the yaml box at the beginning of the document, add a line named `params`
-   followed by an indented line with `gamma: 10`  
-3. Initialize a code chunk and type `str(params$gamma)` in it.  
-4. Render the document and check what happened.  
+   followed by an indented line with `gamma: 10`
+3. Initialize a code chunk and type `str(params$gamma)` in it.
+4. Render the document and check what happened.
 5. Define one more parameter `beta: 3` and multiply `gamma` by `beta` in a code
-   chunk below.  
-6. Render the document again and check what happened.  
+   chunk below.
+6. Render the document again and check what happened.
 
 Well done! You can now use an advanced feature of Quarto such as parameters.
 If you got stuck,
@@ -229,8 +229,8 @@ url_to_rowdata <- url("https://github.com/microbiome/data/raw/main/OKeefeDSData/
 ```
 
 2. Read in the csv files with `read.csv` and store them into the variables
-`assays`, `rowdata` and `coldata`, respectively.  
-   
+`assays`, `rowdata` and `coldata`, respectively.
+
 ```{r}
 #| label: Data-containers-ex2
 #| eval: FALSE
@@ -245,7 +245,7 @@ okeefe_rowdata <- read.csv(url_to_rowdata, row.names = 1)
 3. Create a TreeSE from the individual components with
 `TreeSummarizedExperiment`. Note that the function requires three arguments:
 assays, rowData and colData, to which you can give the appropriate item.
-   
+
 ```{r}
 #| label: Data-containers-ex3
 #| eval: FALSE
@@ -258,7 +258,7 @@ tse <- TreeSummarizedExperiment(assays = list(counts = okeefe_assay),
 ```
 
 4. Check that importing is done correctly. E.g., choose random samples and
-   features, and check that their values equal between raw files and TreeSE.  
+   features, and check that their values equal between raw files and TreeSE.
 
 ```{r}
 #| label: Data-containers-ex4
@@ -284,11 +284,11 @@ functions explained in chapter [@sec-import-from-file]. You can also check the
 [microbiome data repository](https://github.com/microbiome/data)
 and read the instructions in its
 [README](https://github.com/microbiome/data#training-microbiome-datasets)
-to import and construct datasets from there.  
-2. Import data by using mia. (functions: [importMetaPhlAn](https://microbiome.github.io/mia/reference/loadFromMetaphlan.html) | [importMothur](https://microbiome.github.io/mia/reference/importMothur.html) | 
-[importQIIME2](https://microbiome.github.io/mia/reference/importQIIME2.html) | [convertFromBIOM](https://microbiome.github.io/mia/reference/convertFromBIOM.html) | 
+to import and construct datasets from there.
+2. Import data by using mia. (functions: [importMetaPhlAn](https://microbiome.github.io/mia/reference/loadFromMetaphlan.html) | [importMothur](https://microbiome.github.io/mia/reference/importMothur.html) |
+[importQIIME2](https://microbiome.github.io/mia/reference/importQIIME2.html) | [convertFromBIOM](https://microbiome.github.io/mia/reference/convertFromBIOM.html) |
 [convertFromDADA2](https://microbiome.github.io/mia/reference/convertFromDADA2.html) ...)
-3. Try out conversions between TreeSE and phyloseq data containers ([convertFromPhyloseq](https://microbiome.github.io/mia/reference/convertFromPhyloseq.html); [convertToPhyloseq](https://microbiome.github.io/mia/reference/convertToPhyloseq.html))  
+3. Try out conversions between TreeSE and phyloseq data containers ([convertFromPhyloseq](https://microbiome.github.io/mia/reference/convertFromPhyloseq.html); [convertToPhyloseq](https://microbiome.github.io/mia/reference/convertToPhyloseq.html))
 
 
 ### Preliminary exploration
@@ -331,7 +331,7 @@ and `ncol`. How many samples and features are present?
 dim(tse)
 ```
 
-4. List sample and features names with `rownames` and `colnames`. 
+4. List sample and features names with `rownames` and `colnames`.
 
 ```{r}
 #| label: Preliminary-exploration-ex4
@@ -372,7 +372,7 @@ can use `apply` to count unique elements for each column of rowData.
 
 1. Import the mia package, load any of the example data sets mentioned in
 [@sec-example-data] with `data` and store it into a variable named `tse`.
-   
+
 ```{r}
 #| label: Assay-retrieval-ex1
 #| eval: FALSE
@@ -395,7 +395,7 @@ tse <- GlobalPatterns
 assayNames(tse)
 ```
 
-3. Fetch the list of assays with `assays`.  
+3. Fetch the list of assays with `assays`.
 
 ```{r}
 #| label: Assay-retrieval-ex3
@@ -407,7 +407,7 @@ assays(tse)
 ```
 
 4. Retrieve the first assay of the TreeSE with `assay`, where the second
-argument can be either the name or the index of the desired assay.  
+argument can be either the name or the index of the desired assay.
 
 ```{r}
 #| label: Assay-retrieval-ex4
@@ -422,7 +422,7 @@ Well done! You can now locate and retrieve individual assays of a TreeSE. If you
 got stuck, you can refer to chapter [@sec-assay-slot] of this book.
 
 
-### Sample information 
+### Sample information
 
 1.  Import the mia package, load any of the example data sets mentioned
 in [@sec-example-data] with `data` and store it into a variable named `tse`.
@@ -473,7 +473,7 @@ View(colData(tse))
 ```
 
 5.  Get the abundances of all features for a specific sample, such as
-`ID34`, for an *assay* of your choice.  
+`ID34`, for an *assay* of your choice.
 
 ### Feature information
 
@@ -527,7 +527,7 @@ View(rowData(tse))
 
 5. Get the abundances for a specific feature, such as `OTU1810`, in all the
 samples. You can access feature-specific abundances for an _assay_ of your
-choice.  
+choice.
 
 If you got stuck, you can look up chapters [@sec-taxonomic-information]
 and [@sec-fly-tree] on how to pick specific abundances and generate
@@ -540,8 +540,8 @@ Extract some of the other TreeSE elements listed in chapter [@sec-containers].
 However, such data are not always included.
 
 1. Import the mia package, load any of the example data sets mentioned in
-[@sec-example-data] and store it into a variable named `tse`.  
-   
+[@sec-example-data] and store it into a variable named `tse`.
+
 ```{r}
 #| label: Other-elements-ex1
 #| eval: FALSE
@@ -551,7 +551,7 @@ However, such data are not always included.
 library(mia)
 data("GlobalPatterns", package = "mia")
 tse <- GlobalPatterns
-```   
+```
 
 2. Fetch the metadata of the TreeSE. Is there any information available?
 
@@ -562,23 +562,23 @@ tse <- GlobalPatterns
 #| code-summary: "Show solution"
 
 names(metadata(tse))
-```   
+```
 
 3. Access the phylogenetic tree with `rowTree`. How big is it in terms of tips
-and nodes. If you like you can visualize it with `ggtree`. 
-   
+and nodes. If you like you can visualize it with `ggtree`.
+
 ```{r}
 #| label: Other-elements-ex3
 #| eval: FALSE
 #| code-fold: true
 #| code-summary: "Show solution"
 
-rowTree(tse) 
+rowTree(tse)
 ```
 
 4. Check if a sample tree is available with `colTree`, which is suitable for
 hierarchical or nested study designs.
-   
+
 ```{r}
 #| label: Other-elements-ex4
 #| eval: FALSE
@@ -598,7 +598,7 @@ sequence slot.
 #| code-summary: "Show solution"
 
 colData(tse)$Barcode_full_length
-```   
+```
 
 ## Data manipulation
 
@@ -608,9 +608,9 @@ In this exercise, we'll go over the basics of subsetting a TreeSE object.
 Keep in mind that it's good practice to always keep the original data intact.
 Therefore, we suggest creating a new TreeSE object whenever you subset.
 
-Please have a try at the following exercises. 
+Please have a try at the following exercises.
 
-1. Subset the TreeSE object to specific samples and features.  
+1. Subset the TreeSE object to specific samples and features.
 
 1.1. Find some samples and features to subset with.
 
@@ -645,11 +645,11 @@ cols <- tse$SampleType %in% samples
 tse_subset_by_feature <- tse[rows, cols]
 
 unique(rowData(tse_subset_by_feature)$Phylum)
-```   
+```
 
 ### Library sizes
 
-1. Calculate library sizes  
+1. Calculate library sizes
 2. Subsample / rarify the counts (see: rarefyAssay)
 
 Useful functions: nrow, ncol, dim, summary, table, quantile, unique,
@@ -671,7 +671,7 @@ data("GlobalPatterns", package = "mia")
 tse <- GlobalPatterns
 
 # First, merge the features by rank and add it as an alternative experiment.
-tse_phylum <- agglomerateByRank(tse, "Phylum") 
+tse_phylum <- agglomerateByRank(tse, "Phylum")
 
 # Calculate relative abundance
 tse_phylum <- transformAssay(tse_phylum, method = "relabundance")
@@ -701,7 +701,7 @@ tse_prevalent2 <- subsetByPrevalent(
      tse, assay.type = "counts", detection = 1, prevalence = 0.1)
 
 identical(tse_prevalent1, tse_prevalent2)
-```  
+```
 
 3. How many features are left ?
 
@@ -712,9 +712,9 @@ identical(tse_prevalent1, tse_prevalent2)
 #| code-summary: "Show solution"
 
 nrow(tse_prevalent1)
-``` 
+```
 
-4. Recalculate the most prevalent features based on relative abundance. 
+4. Recalculate the most prevalent features based on relative abundance.
 How many are left?
 
 ```{r}
@@ -730,7 +730,7 @@ tse_prevalent <- subsetByPrevalent(
      detection = 0.01/100, prevalence = 50/100)
 
 nrow(tse_prevalent)
-```  
+```
 
 5. Visualize prevalence of most prevalent Phyla using a violin/bean plot.
 
@@ -754,43 +754,43 @@ rowData(tse_phylum)$prevalence <- res
 
 # Then plot the row data
 plotRowData(tse_phylum, "prevalence", colour_by = "Phylum")
-```   
+```
 
 Useful functions: getPrevalence, getPrevalent, subsetByPrevalent
 
 ### Data exploration
 
 1. Summarize sample metadata variables. (How many age groups, how they are
-distributed? 0%, 25%, 50%, 75%, and 100% quantiles of library size?)  
+distributed? 0%, 25%, 50%, 75%, and 100% quantiles of library size?)
 2. Create two histograms. Another shows the distribution of absolute counts,
-another shows how CLR transformed values are distributed.  
-3. Visualize how relative abundances are distributed between taxa in samples.  
+another shows how CLR transformed values are distributed.
+3. Visualize how relative abundances are distributed between taxa in samples.
 
 Useful functions: nrow, ncol, dim, summary, table, quantile, unique,
 transformAssay, ggplot, wilcox.test, agglomerateByRank, miaViz::plotAbundance
 
 ### Other functions
 
-1. Merge data objects (merge, mergeSEs)  
-2. Melt the data for visualization purposes (meltSE)  
+1. Merge data objects (merge, mergeSEs)
+2. Melt the data for visualization purposes (meltSE)
 
 ### Assay transformation
 
 1. Import the mia package, load any of the example data sets mentioned in
-[@sec-example-data] with `data` and store it into a variable named `tse`.  
+[@sec-example-data] with `data` and store it into a variable named `tse`.
 2. Transform the counts assay into relative abundances with `transformAssay` and
 store it into the TreeSE as an assay named `relabund`
-(see chapter [@sec-assay-transform]).  
+(see chapter [@sec-assay-transform]).
 3. Similarly, perform a clr transformation on the counts assay with a
-`pseudocount` of 1 and add it to the TreeSE as a new assay.  
-4. List the available assays by name with `assays`.  
+`pseudocount` of 1 and add it to the TreeSE as a new assay.
+4. List the available assays by name with `assays`.
 5. Access the clr assay and select a subset of its first 100 features and 10
-samples. Remember that assays are subsettable with `assay[row_idx, col_idx]`.  
+samples. Remember that assays are subsettable with `assay[row_idx, col_idx]`.
 5. Take the same subset from the TreeSE, and check how this affects the
 individual transformed assays. TreeSE can also be subsetted with
-`tse[row_idx, col_idx]`.  
+`tse[row_idx, col_idx]`.
 6. **Extra**: If the data has phylogenetic tree, perform the phILR
-transformation.  
+transformation.
 
 
 ## Abundance tables
@@ -799,17 +799,17 @@ transformation.
 
 1. Import the mia package, load one of the example data sets mentioned in
 [@sec-example-data] with `data` (you need one with taxonomic information at
-Phylum level) and store it into a variable named `tse`.  
-2. List the available taxonomic ranks in the data with `taxonomyRanks`.  
+Phylum level) and store it into a variable named `tse`.
+2. List the available taxonomic ranks in the data with `taxonomyRanks`.
 3. Agglomerate the data to Phylum level with `agglomerateByRank` and the
-appropriate value for `Rank`.  
+appropriate value for `Rank`.
 4. Report the dimensions of the TreeSE before and after agglomerating. You can
-use `dim` for that.  
+use `dim` for that.
 5. **Extra**: Perform CLR transformation on the data. Does this affect
-agglomeration?  
+agglomeration?
 6. **Extra**: List full taxonomic information for a few selected taxa, such as
 `OTU1` and `OTU1368`. For that you can use `mapTaxonomy` on a specific subset
-of the TreeSE.  
+of the TreeSE.
 
 
 ### Alternative experiments
@@ -817,17 +817,17 @@ of the TreeSE.
 1. Import the mia package, load one of the example data sets mentioned in
 [@sec-example-data] with `data` (you need one with taxonomic information) and
 store it into a variable named `tse`. Can you find the same taxonomyRanks from
-rowData? 
+rowData?
 2. Check the taxonomic ranks of the features with `taxonomyRanks`. What is the
-deepest taxonomic rank available?  
+deepest taxonomic rank available?
 3. Agglomerate the TreeSE to each taxonomic rank and store the resulting
-experiments as altExps. This can be performed automatically with 
-`agglomerateByRanks`.  
+experiments as altExps. This can be performed automatically with
+`agglomerateByRanks`.
 4. Check the names of the generated altExps with `altExpNames` and retrieve a
-complete list with `altExps`.  
+complete list with `altExps`.
 5. Retrieve the data agglomerated by genus from the corresponding `altExp`.
-As for assays, you can access the desired altExp by name or index.  
-6. **Extra**: Split the data based on other features with `splitOn`.  
+As for assays, you can access the desired altExp by name or index.
+6. **Extra**: Split the data based on other features with `splitOn`.
 
 ### Beta Diversity
 
@@ -939,7 +939,7 @@ diversity of the original experiment with its agglomerated version. You can
 use `agglomerateByRank` to perform agglomeration and `mean` to calculate the
 mean values of the respective columns in colData.
 
-### Visualization 
+### Visualization
 
 1. Import the mia and scater packages, load any of the example data sets
 mentioned in [@sec-example-data] with `data` and store it into a variable
@@ -954,7 +954,7 @@ differ?
 5. **Extra**: Make a scatterplot of Shannon diversity on the y axis and Faith's
 phylogenetic diversity on the x axis with `plotColData`. Colour the points
 by sample type with the appropriate optional argument.
-   
+
 ### Correlation
 
 1. Import the mia and scater packages, load any of the example data sets
@@ -972,7 +972,7 @@ relate to one another?
 6. **Extra**: Compute the library size of the samples by applying `colSums` to
 the counts assay of the TreeSE, and test the correlation of library size with
 Shannon diversity or coverage. Which index is more correlated with library size?
-   
+
 In this example, we inspected the correlation between two related variables,
 also known as multicollinearity, and checked the correlation to library size,
 which is part of quality control. However, the correlation between alpha
@@ -1058,7 +1058,7 @@ dissimilarity. The distance metric to use can be defined with the optional
 argument `method`, choosing from the methods in `?vegan::vegdist`. If you
 don't want to overwrite the reducedDim object made in point 3, set `name` to
 a name of your choice. Visualize and compare it to the plot from point 4.
-   
+
 Good job! You are now able to produce and visualize reduced dimensions of a
 TreeSE. `runMDS` is actually one of several algorithms for PCoA and
 dimensionality reduction, which you can find in [@sec-other-ord-methods].
@@ -1086,7 +1086,7 @@ Diet with respect to beta diversity?
 dissimilarity matrix of `relabund_assay` calculated with
 `vegdist(relabund_assay, "bray")` and `my_groups` is the vector of Diet values
 obtained from the colData of the TreeSE.
-   
+
 Well done! You went through testing the effect and significance of Diet on beta
 diversity. Keep in mind that the formula fed to `adonis2` can take more than one
 independent variable, so that you can also (and very often should) include
@@ -1110,7 +1110,7 @@ abundance assay (`formula = assay ~ Diet` and `assay.type = relabundance`) in
 terms of Bray-Curtis dissimilarity (`method = "bray"`).
 4. Extract the RDA dimensions from the appropriate reducedDim slot with
 `attr(reducedDim(tse, "RDA"), "rda)` and store it into a variable named `rda`.
-5. Extract the effect and significance of Diet on beta diversity with 
+5. Extract the effect and significance of Diet on beta diversity with
 attr(reducedDim(tse, "RDA"), "significance"). What do the results tell you about Diet?
 6. **Extra**: Check what other parameters are stored in the colData of
 peerj13075, add them to the formula (`formula = assay ~ Diet + ...`) of `getRDA`
@@ -1131,7 +1131,7 @@ walkthrough, which may be simplified in the future.
 1. Import the mia package, load any of the example data sets
 mentioned in [@sec-example-data] with `data` and store it into a variable
 named `tse`.
-   
+
 ```{r}
 #| label: dbRDA-ex1
 #| eval: FALSE
@@ -1143,8 +1143,8 @@ library(mia)
 data("GlobalPatterns", package = "mia") # Feel free to use any dataset
 tse <- GlobalPatterns
 ```
-   
-2. Create a new assay with the relative abundance for your dataset and merge the 
+
+2. Create a new assay with the relative abundance for your dataset and merge the
 features by a rank of your choice.
 
 ```{r}
@@ -1157,7 +1157,7 @@ tse <- transformAssay(tse, MARGIN = "cols", method="relabundance")
 ```
 
 3. Select a group of samples and perform a permutational analysis on the
-calculated relative abundance using Bray-Curtis dissimilarities 
+calculated relative abundance using Bray-Curtis dissimilarities
 
 ```{r}
 #| label: dbRDA-ex3
@@ -1169,7 +1169,7 @@ tse$Group <- tse$SampleType == "Feces"
 ```
 
 4. Implement dbRDA on relative abundances and perform another permutational
-analysis on the redundancy analysis. 
+analysis on the redundancy analysis.
 
 ```{r}
 #| label: dbRDA-ex4
@@ -1189,9 +1189,9 @@ rda_info <- attr(reducedDim(tse, "RDA"), "significance")
 rda_info
 ```
 
-rda_info now holds information on the variation differences between 
-the chosen groups, in our case, the homogeneity test has a p-value of 0.01, 
-indicating that the groups have different  variances, making it so that the 
+rda_info now holds information on the variation differences between
+the chosen groups, in our case, the homogeneity test has a p-value of 0.01,
+indicating that the groups have different  variances, making it so that the
 permanova results are compromised and do not correctly explain whether there are
 significant differences between groups.
 
@@ -1286,7 +1286,7 @@ aldex_out %>%
      rownames_to_column(var = "Genus") %>%
      filter(wi.eBH <= 0.05)  %>%
      dplyr::select(Genus, we.eBH, wi.eBH, effect, overlap)
-```   
+```
 
 6. **Extra**: If these results appear boring, repeat steps 1 - 5, but use
 `Gender` or `Age` as the grouping variable. Do we have any better luck with
@@ -1357,7 +1357,7 @@ linda_res <- linda_out$output$AgeElderly
 5. Filter `linda_res` for features with `reject == TRUE`. How many
 differentially abundant taxa were found? What are their names and how
 significant are they in terms of log-fold change and adjusted p-value?
-   
+
 ```{r}
 #| label: confounders-ex5
 #| eval: FALSE
@@ -1424,7 +1424,7 @@ zicoseq_out <- ZicoSeq(
     feature.dat = as.matrix(assay(tse)),
     meta.dat = as.data.frame(colData(tse)),
     grp.name = "Age",
-    adj.name = c("Diet", "Gender"), 
+    adj.name = c("Diet", "Gender"),
     feature.dat.type = "count",
     return.feature.dat = TRUE,
     perm.no = 999)
@@ -1457,7 +1457,7 @@ zicoseq_res %>%
 1. Get the abundances for an individual feature (taxonomic group / row).
 2. Visualize the abundances per group with boxplot / jitterplot.
 3. Is the difference significant (Wilcoxon test)?
-4. Is the difference significant (linear model with covariates)? 
+4. Is the difference significant (linear model with covariates)?
 5. How do transformations affect the outcome (log10, clr..)?
 6. Get p-values for all features (taxa), for instance with a for loop.
 7. Do multiple testing correction.
@@ -1472,7 +1472,7 @@ lm.test, transformAssay, p.adjust
 1. Install the latest development version of mia from GitHub.
 2. Load experimental dataset from mia.
 3. Compare abundances of each taxa between groups. First, use Wilcoxon or
-Kruskall-Wallis test. Then use some other method dedicated to microbiome data. 
+Kruskall-Wallis test. Then use some other method dedicated to microbiome data.
 4. Summarize findings by plotting a taxa vs samples heatmap. Add column
 annotation that tells the group of each sample, and row annotation that tells
 whether the difference of certain taxa was statistically significant.
@@ -1566,7 +1566,7 @@ choice
 library(mia)
 library(miaViz)
 library(shadowtext)
-library(ComplexHeatmap) 
+library(ComplexHeatmap)
 
 data("Tengeler2020", package = "mia")
 tse <- Tengeler2020
@@ -1598,11 +1598,11 @@ and count assays.
 #| eval: false
 
 correlations <- getCrossAssociation(
-    tse, 
-    test.signif=TRUE,            
-    assay.type1 = "log10", 
+    tse,
+    test.signif=TRUE,
+    assay.type1 = "log10",
     assay.type2 = "counts",
-    method = "spearman", 
+    method = "spearman",
     p.adj.threshold = NULL,
     cor.threshold = NULL,
     # Remove when mia is fixed
@@ -1701,7 +1701,7 @@ pca_aitchison_clusters <- kmeans(reducedDim(pca_aitchison), centers = 3)$cluster
 plotReducedDim(pca_aitchison, colour_by = pca_aitchison_clusters)
 ```
 
-Where: 
+Where:
 - `reducedDim` is a function that extracts the reduced dimensionality
 representation of your data (e.g., from a PCA or PCoA object).
 - `plotReducedDim` is a function that can plot these reduced dimensions and
@@ -1799,7 +1799,7 @@ transformation of assays of individual experiments in the MAE.
    to be transformed with the argument `assay.type`.
 4. Apply a CLR transformation to the counts assay of the microbiota experiment.
    To ensure non-null values in the assay, set `pseudocount` equal to 1.
-   
+
 You made it! You learnt how to apply different transformations to the assays
 of individual experiments in a MAE with `transformAssay`, specifying optional
 arguments based on the used method.
@@ -1848,7 +1848,7 @@ variables `sample_data` and `sample_map`, respectively.
 is an `ExperimentList` (for now include only the microbiota and metabolites
 TreeSEs), the second is colData and the third is sampleMap.
 5. Make sure that the MAE experiments are identical to the original TreeSEs. You
-can do that qualitatively by checking their `head` and quantitatively by 
+can do that qualitatively by checking their `head` and quantitatively by
 looking at their `dim`.
 6. **Extra**: Add the biomarkers TreeSE as a new experiment to the MAE.
 Note that new experiments can be added to a MAE through simple concatenation
@@ -1857,9 +1857,9 @@ with `c(mae, experiment)`.
 Good job! Now you are aware of how MAEs are built and we can proceed to some
 analytical exercises.
 
-### Cross-correlation analysis
+### cross-association analysis
 
-Now we will perform a cross-correlation analysis between two of the
+Now we will perform a cross-association analysis between two of the
 experiments in the MAE.
 
 1. Import the mia package, load HintikkaXOData with `data` and store it into a
@@ -1871,14 +1871,14 @@ which of their assays with `assay.type1` and `assay.type2`.
 3. What does the output look like? By default, correlation is measured in terms
 of Kendall tau coefficients. Repeat point 2, but this time change `method`
 to Spearman coefficients.
-4. Are you able to infer significance from the output? 
+4. Are you able to infer significance from the output?
 5. Visualize results with a heatmap similarly to the example in section
 [@sec-cross-correlation]. Do you see any significant correlations?
 Interpret your results.
-6. **Extra**: Perform cross-correlation analysis between the remaining
+6. **Extra**: Perform cross-association analysis between the remaining
 experiments (microbiota vs metabolites and metabolites vs biomarkers) and
 visualize results with heatmaps.
-   
-Great job! You performed a cross-correlation analysis between two experiments of
+
+Great job! You performed a cross-association analysis between two experiments of
 a MAE and visualized the results with a heatmap. You are also able to customise
 the correlation method and significance testing used for the analysis.
diff --git a/inst/pages/extra_material.qmd b/inst/pages/extra_material.qmd
index b232f0165..1c28d5348 100644
--- a/inst/pages/extra_material.qmd
+++ b/inst/pages/extra_material.qmd
@@ -1,4 +1,4 @@
-# Extra material {#sec-extras} 
+# Extra material {#sec-extras}
 
 ```{r}
 knitr::opts_chunk$set(eval=TRUE, warning=FALSE, message=FALSE)
@@ -160,7 +160,7 @@ Let's convert the data frame to a matrix and make the list of assays.
 counts <- as.matrix(phy_otutable) # Convert to a matrix
 assays <- SimpleList(counts = counts)
 tse <- TreeSummarizedExperiment(
-    assays = assays, 
+    assays = assays,
     colData = phy_sampledata,
     rowData = phy_taxtable
 )
@@ -178,7 +178,7 @@ assayNames(tse)
 
 Adding the assays as a list might seem inconvenient if you only have one type of
 OTU table (counts in our example), but let's see why it is actually very
-convenient to be able to hold multiple assays in one data object. 
+convenient to be able to hold multiple assays in one data object.
 
 Here we'll show an example of how to add relative abundances and CLR normalized
 OTU tables to your tse assays.
@@ -253,14 +253,14 @@ tse_nomock_bacteria <-
   tse[tse$Kingdom == "Bacteria", ]
 
 phy_nomock_bacteria # We have 19008 taxa (only bacteria) and before 19216
-tse_nomock_bacteria 
+tse_nomock_bacteria
 ```
 
 ### Calculating alpha diversity: estimate_richness = estimateDiversity
 
 Now we know how data stored in TreeSE can be accessed and the TreeSE data
 objects created. Let's look at how we can calculate alpha diversity using
-mia compared to phyloseq package. 
+mia compared to phyloseq package.
 
 The mia command estimateDiversity will return a TreeSE and the results are
 stored in colData, unlike the phyloseq command that outputs a data frame with
@@ -273,7 +273,7 @@ sample data to keep it safe with the other sample level data.
 #| label: "alpha_div"
 
 # Alpha diversity with phyloseq
-df <- estimate_richness(phy, measures = "Shannon") 
+df <- estimate_richness(phy, measures = "Shannon")
 head(df) # Inspect
 
 # Add Shannon to the sample_data to keep results safe with other sample data
@@ -311,7 +311,7 @@ df <- colData(tse) %>% data.frame %>% dplyr::select(matches("shannon"))
 
 We can calculate PCoA with Bray-Curtis distances in phyloseq using the
 ordinate command. The beta diversity calculation in mia  outputs a TreeSE
-with a new type of data, reduced dimensions or reducedDim. 
+with a new type of data, reduced dimensions or reducedDim.
 
 Here we will use the scater package that runs the PCoA with runMDS. (PCoA and
 MDS mean the same thing)
@@ -413,7 +413,7 @@ df <-
     Functionality = c(
       "Access sample data", # Row 1
       "Access tax table", # Row 2
-      "Access OTU table", 
+      "Access OTU table",
       "Build data object",
       "Calculate alpha diversity",
       "Calculate beta diversity",
@@ -472,16 +472,16 @@ kable(df2)
 Result of amplicon sequencing is a large number of files that include all the
 sequences that were read from samples. Those sequences need to be matched with
 taxa. Additionally, we need to know how many times each taxa were found from
-each sample. 
+each sample.
 
-There are several algorithms to do that, and DADA2 is one of the most common. 
-You can find DADA2 pipeline tutorial, for example, 
+There are several algorithms to do that, and DADA2 is one of the most common.
+You can find DADA2 pipeline tutorial, for example,
 [here](https://benjjneb.github.io/dada2/tutorial.html).
 After the DADA2 portion of the tutorial is completed, the data is stored into
 _phyloseq_ object  (Bonus: Handoff to phyloseq). To store the data to _TreeSE_,
-follow the example below. 
+follow the example below.
 
-You can find full workflow script without further explanations and comments from 
+You can find full workflow script without further explanations and comments from
 [Rmd file](extra_material/dada2_workflow.Rmd)
 
 ```{r dada2_1, include=FALSE}
@@ -557,7 +557,7 @@ Logistic-Normal Linear Regression model; see
 [vignette](https://jsilve24.github.io/fido/articles/introduction-to-fido.html)
 of package.
 
-The following presents such an exemplary analysis based on the 
+The following presents such an exemplary analysis based on the
 data of @Sprockett2020 available
 through `microbiomeDataSets` package.
 
@@ -646,7 +646,7 @@ Note: Some computational failures could occur (see
 the arguments `multDirichletBoot` `calcGradHess` could be passed in such case.
 
 ```{r}
-priors$Y <- Y 
+priors$Y <- Y
 posterior <- refit(priors, optim_method="adam", multDirichletBoot=0.5) #calcGradHess=FALSE
 ```
 
@@ -679,8 +679,8 @@ Here, we use following packages:
 -   [cobiclust](https://besjournals.onlinelibrary.wiley.com/doi/abs/10.1111/2041-210X.13582)
 
 _cobiclust_ is especially developed for microbiome data whereas _biclust_ is
-more general method. In this section, we show two different cases and example 
-solutions to apply biclustering to them. 
+more general method. In this section, we show two different cases and example
+solutions to apply biclustering to them.
 
 1.   Taxa vs samples
 2.   Taxa vs biomolecule/biomarker
@@ -705,7 +705,7 @@ data("HintikkaXOData")
 mae <- HintikkaXOData
 ```
 
-Only the most prevalent taxa are included in analysis. 
+Only the most prevalent taxa are included in analysis.
 
 ```{r cobiclust_1}
 # Subset data in the first experiment
@@ -717,7 +717,7 @@ mae[[1]] <- transformAssay(mae[[1]], method = "rclr")
 ```
 
 _cobiclust_ takes counts table as an input and gives _cobiclust_ object as an
-output. It includes clusters for taxa and samples. 
+output. It includes clusters for taxa and samples.
 
 ```{r cobiclust_2}
 # Do clustering using counts table
@@ -791,7 +791,7 @@ p1 + p2
 
 ### Taxa vs biomolecules
 
-Here, we analyze cross-correlation between taxa and metabolites. This
+Here, we analyze cross-association between taxa and metabolites. This
 is a case, where we use _biclust_ method which is suitable for numeric
 matrices in general. First we pre-process the data.
 
@@ -825,7 +825,7 @@ corr <- getExperimentCrossCorrelation(
     mode = "matrix", correlation = "spearman")
 ```
 
-_biclust_ takes a matrix as an input and returns a _biclust_ object. 
+_biclust_ takes a matrix as an input and returns a _biclust_ object.
 
 ```{r biclust_3}
 library(biclust)
@@ -842,7 +842,7 @@ The object includes cluster information. However compared to
 _cobiclust_, _biclust_ object includes only information about clusters
 that were found, not general cluster.
 
-Meaning that if one cluster size of 5 features was found out of 20 features, 
+Meaning that if one cluster size of 5 features was found out of 20 features,
 those 15 features do not belong to any cluster. That is why we have to create an
 additional cluster for features/samples that are not assigned into any cluster.
 
@@ -860,7 +860,7 @@ additional cluster for features/samples that are not assigned into any cluster.
     # Get data into right format
     bc_columns <- .manipulate_bc_data(bc_columns, assay, "col")
     bc_rows <- .manipulate_bc_data(bc_rows, assay, "row")
-    
+
     return(list(bc_columns = bc_columns, bc_rows = bc_rows))
 }
 
@@ -891,7 +891,7 @@ additional cluster for features/samples that are not assigned into any cluster.
             bc_clusters <- cbind(bc_clusters, vec)
         }
     }
-    
+
     # Adjust row and column names
     rownames(bc_clusters) <- names
     colnames(bc_clusters) <- paste0("cluster_", 1:ncol(bc_clusters))
@@ -910,7 +910,7 @@ bicluster_columns <- bcs$bc_columns
 head(bicluster_rows)
 ```
 
-Let's collect information for the scatter plot. 
+Let's collect information for the scatter plot.
 
 ```{r biclust_6}
 # Function for obtaining sample-wise sum, mean, median, and mean variance
@@ -931,13 +931,13 @@ Let's collect information for the scatter plot.
         mean1 <- colMeans2(assay1, na.rm = T)
         median1 <- colMedians(assay1, na.rm = T)
         var1 <- colVars(assay1, na.rm = T)
-        
+
         sum2 <- colSums2(assay2, na.rm = T)
         mean2 <- colMeans2(assay2, na.rm = T)
         median2 <- colMedians(assay2, na.rm = T)
         var2 <- colVars(assay2, na.rm = T)
-        
-        list[[i]] <- data.frame(sample = colnames(tse1), sum1, sum2, mean1, 
+
+        list[[i]] <- data.frame(sample = colnames(tse1), sum1, sum2, mean1,
                                  mean2, median1, median2, var1, var2)
     }
     return(list)
diff --git a/inst/pages/import.qmd b/inst/pages/import.qmd
index 07b9f9b21..fdfa29740 100644
--- a/inst/pages/import.qmd
+++ b/inst/pages/import.qmd
@@ -55,7 +55,7 @@ sample_meta_file_path <- system.file(
     "extdata", "Mapping_file_ADHD_aggregated.csv", package = "OMA")
 tree_file_path <- system.file(
     "extdata", "Data_humanization_phylo_aggregation.tre", package = "OMA")
-```  
+```
 
 Now we can read in the biom file and convert it into a `TreeSE` object. In
 addition, we retrieve the rank names from the prefixes of the feature names
@@ -73,7 +73,7 @@ tse <- importBIOM(
 
 # Check
 tse
-```  
+```
 
 The `assays` slot includes a list of abundance tables. The imported
 abundance table is named as "counts".  Let us inspect only the first
@@ -98,11 +98,11 @@ then replace the original rowData with its updated version.
 ```{r}
 # Genus level has additional '\"', so let's delete that also
 rowdata_modified <- BiocParallel::bplapply(
-    rowData(tse), 
-    FUN = stringr::str_remove, 
+    rowData(tse),
+    FUN = stringr::str_remove,
     pattern = '\"')
 
-# rowdata_modified is a list, so convert this back to DataFrame format. 
+# rowdata_modified is a list, so convert this back to DataFrame format.
 # and assign the cleaned data back to the TSE rowData
 rowData(tse) <- DataFrame(rowdata_modified)
 
@@ -190,13 +190,13 @@ sample IDs; other values should be numeric (abundances).
 column provides feature IDs, the first row provides column headers;
 this file usually contains the taxonomic mapping between different
 taxonomic levels. Ideally, the feature IDs (row names) match one-to-one with
-the abundance table row names. 
+the abundance table row names.
 
 - Column data (`coldata.csv`): data table (samples x info); first
 column provides sample IDs, the first row provides column headers;
 this file usually contains the sample metadata/phenodata (such as
 subject age, health etc). Ideally, the sample IDs match one-to-one with
-the abundance table column names. 
+the abundance table column names.
 
 After you have set up the CSV files, you can read them in R:
 
@@ -259,7 +259,7 @@ class(samples) # should be data.frame or DataFrame
 all(rownames(tax) == rownames(counts)) # our dataset
 class(tax) # should be data.frame or DataFrame
 
-# Counts 
+# Counts
 class(counts) # should be a numeric matrix
 ```
 
@@ -300,7 +300,7 @@ downstream analyses.
 
 #### Constructing MAE
 
-To construct a _MAE_ object, just combine multiple _TreeSE_ data containers. 
+To construct a _MAE_ object, just combine multiple _TreeSE_ data containers.
 Here we import metabolite data from the same study.
 
 ```{r importingcsv4, message=FALSE}
@@ -308,7 +308,7 @@ count_file <- system.file("extdata", "assay_metabolites.csv", package = "OMA")
 sample_file <- system.file("extdata", "coldata.csv", package = "OMA")
 
 # Load files
-counts  <- read.csv(count_file, row.names=1)  
+counts  <- read.csv(count_file, row.names=1)
 samples <- read.csv(sample_file, row.names=1)
 
 # Create a TreeSE for the metabolite data
@@ -428,8 +428,8 @@ tse <- curatedMetagenomicData("Vatanen*", dryrun = FALSE, counts = TRUE)
 
 ###  Human microbiome compendium
 
-[MicroBioMap](https://seandavi.github.io/MicroBioMap/) dataset includes 
-over 170k samples of publicly available 16S rRNA amplicon sequencing data, 
+[MicroBioMap](https://seandavi.github.io/MicroBioMap/) dataset includes
+over 170k samples of publicly available 16S rRNA amplicon sequencing data,
 all processed using the same pipeline and reference database [@microbiomap].
 After installing the MicroBioMap package (see the
 [original website](https://github.com/seandavi/MicroBioMap#microbiome-compendium)
@@ -440,7 +440,7 @@ library(MicroBioMap)
 cpd <- getCompendium()
 ```
 
-This returns a `TreeSE` object. Currently, 
+This returns a `TreeSE` object. Currently,
 the `rowTree` slot of the `TreeSE` is not populated.
 
 After loading the compendium, you will have immediate access to
@@ -455,6 +455,6 @@ The current collections provide access to vast microbiome data
 resources. The output has to be converted into `TreeSE/MAE` separately.
 
 - [MGnifyR](https://bioconductor.org/packages/release/bioc/html/MGnifyR.html) provides access to [EBI/MGnify](https://www.ebi.ac.uk/metagenomics/)
-- [HoloFoodR](https://bioconductor.org/packages/release/bioc/html/HoloFoodR.html) provides access to [EBI/HoloFood](https://www.holofooddata.org/) 
-- [qiitr](https://github.com/cran/qiitr) provides access to [QIITA](https://qiita.com/about) 
-- [qiime2R](https://github.com/microbiome/qiime2R) provides access to [QIIME2](https://docs.qiime2.org/2024.2/) 
+- [HoloFoodR](https://bioconductor.org/packages/release/bioc/html/HoloFoodR.html) provides access to [EBI/HoloFood](https://www.holofooddata.org/)
+- [qiitr](https://github.com/cran/qiitr) provides access to [QIITA](https://qiita.com/about)
+- [qiime2R](https://github.com/microbiome/qiime2R) provides access to [QIIME2](https://docs.qiime2.org/2024.2/)
diff --git a/inst/pages/integrated_learner.qmd b/inst/pages/integrated_learner.qmd
index 06c72db4e..824a7b8b8 100644
--- a/inst/pages/integrated_learner.qmd
+++ b/inst/pages/integrated_learner.qmd
@@ -29,9 +29,9 @@ from the multi-omics integrated framework (`IntegratedLearner`) proposed by
 
 ## Input data
 
-We use the publicly available Inflammatory Bowel Diseases (IBD) data from the 
-curatedMetagenomicData package [@Lloyd-Price2019], where we aim to 
-predict IBD disease status based on both taxonomic (species abundances) and 
+We use the publicly available Inflammatory Bowel Diseases (IBD) data from the
+curatedMetagenomicData package [@Lloyd-Price2019], where we aim to
+predict IBD disease status based on both taxonomic (species abundances) and
 functional (pathway abundances) profiles.
 
 ```{r load-pkg-data}
@@ -85,10 +85,10 @@ rownames(feature_species) <- sub('.*s__', '', rownames(feature_species))
 
 feature_pwys <- as.data.frame(assay(se_pathway))
 feature_pwys <- rownames_to_column(feature_pwys, "ID")
-feature_pwys <- feature_pwys %>% 
-  filter(!grepl("\\|", ID)) %>% 
-  filter(!ID %in% c('UNMAPPED', 'UNINTEGRATED')) %>% 
-  column_to_rownames('ID') %>% 
+feature_pwys <- feature_pwys %>%
+  filter(!grepl("\\|", ID)) %>%
+  filter(!ID %in% c('UNMAPPED', 'UNINTEGRATED')) %>%
+  column_to_rownames('ID') %>%
   as.data.frame()
 
 ################################
@@ -148,9 +148,9 @@ library(caret)
 feature_table_t <- as.data.frame(t(feature_table))
 abd_threshold = 0
 prev_threshold = 0.1
-nzv <- nearZeroVar(feature_table_t) 
-features_var_filtered <- feature_table_t[, -nzv] 
-features_var_filtered <- as.data.frame(features_var_filtered) 
+nzv <- nearZeroVar(feature_table_t)
+features_var_filtered <- feature_table_t[, -nzv]
+features_var_filtered <- as.data.frame(features_var_filtered)
 features_filtered <- features_var_filtered[
   , colSums(features_var_filtered > abd_threshold) >
     nrow(features_var_filtered) * prev_threshold
@@ -199,7 +199,7 @@ subjectCvFoldsIN <- createFolds(1:length(subjectID), k = 5, returnTrain=TRUE)
 # Curate subject-level samples per fold #
 #########################################
 
-obsIndexIn <- vector("list", 5) 
+obsIndexIn <- vector("list", 5)
 for(k in 1:length(obsIndexIn)){
   x <- which(!sample_metadata$subjectID %in% subjectID[subjectCvFoldsIN[[k]]])
   obsIndexIn[[k]] <- x
@@ -221,10 +221,10 @@ name_layers <- with(
     droplevels(feature_metadata),
     list(levels = levels(featureType)), nlevels = nlevels(featureType))$levels
 SL_fit_predictions <- vector("list", length(name_layers))
-SL_fit_layers <- vector("list", length(name_layers)) 
+SL_fit_layers <- vector("list", length(name_layers))
 names(SL_fit_layers) <- name_layers
 names(SL_fit_predictions) <- name_layers
-X_train_layers <- vector("list", length(name_layers)) 
+X_train_layers <- vector("list", length(name_layers))
 names(X_train_layers) <- name_layers
 layer_wise_predictions_train <- vector("list", length(name_layers))
 names(layer_wise_predictions_train) <- name_layers
@@ -244,39 +244,39 @@ utilizing the SuperLearner R package.
 ########################################################
 
 for (i in seq_along(name_layers)){
-  
+
     # Prepare single-omic input data
-    include_list <- feature_metadata %>% filter(featureType == name_layers[i]) 
+    include_list <- feature_metadata %>% filter(featureType == name_layers[i])
     t_dat_slice <- feature_table[
         rownames(feature_table) %in% include_list$featureID, ]
     dat_slice <- as.data.frame(t(t_dat_slice))
     Y = sample_metadata$Y
     X = dat_slice
     X_train_layers[[i]] <- X
-    
+
     # Run user-specified base learner
     library(SuperLearner)
-    
+
     SL_fit_layers[[i]] <- SuperLearner(
-        Y = Y, 
+        Y = Y,
         X = X,
-        cvControl = cvControl,  
+        cvControl = cvControl,
         SL.library = c('SL.randomForest'),
         family = binomial())
-    
+
     # Append the corresponding y and X to the results
     SL_fit_layers[[i]]$Y <- sample_metadata['Y']
     SL_fit_layers[[i]]$X <- X
-    
+
     # Remove redundant data frames and collect pre-stack predictions
     rm(t_dat_slice)
     rm(dat_slice)
     rm(X)
     SL_fit_predictions[[i]] <- SL_fit_layers[[i]]$Z
-    
+
     # Re-fit to entire dataset for final predictions
     layer_wise_predictions_train[[i]] <- SL_fit_layers[[i]]$SL.predict
-  
+
 }
 
 ```
@@ -308,9 +308,9 @@ names(combo_final) <- name_layers
 
 # Run user-specified meta learner
 SL_fit_stacked <- SuperLearner(
-    Y = Y, 
-    X = combo, 
-    cvControl = cvControl,    
+    Y = Y,
+    X = combo,
+    cvControl = cvControl,
     verbose = TRUE,
     SL.library = 'SL.glmnet',
     family = binomial())
@@ -326,8 +326,8 @@ SL_fit_stacked$X <- combo
 
 ```
 
-In contrast to late fusion, in early fusion, we will simply supply a combined 
-representation of the data and we will use the `random forest` method for 
+In contrast to late fusion, in early fusion, we will simply supply a combined
+representation of the data and we will use the `random forest` method for
 building an integrated prediction model.
 
 ```{r early-fusion}
@@ -340,9 +340,9 @@ fulldat<-as.data.frame(t(feature_table))
 
 # Early Fusion using Random Forest
 SL_fit_concat <- SuperLearner(
-    Y = Y, 
+    Y = Y,
     X = fulldat,
-    cvControl = cvControl,   
+    cvControl = cvControl,
     SL.library = 'SL.randomForest',
     family = binomial())
 
@@ -355,16 +355,16 @@ SL_fit_concat$X <- fulldat
 
 ```
 
-Finally, we consider the intermediate fusion approach, which combines ideas 
-from both late fusion and early fusion by integrating the usual squared-error 
-loss of predictions with an "agreement" (fusion) penalty ($\rho$) so that the 
+Finally, we consider the intermediate fusion approach, which combines ideas
+from both late fusion and early fusion by integrating the usual squared-error
+loss of predictions with an "agreement" (fusion) penalty ($\rho$) so that the
 predictions from different data views agree.
 
 The intermediate fusion adjusts the degree of fusion in an adaptive manner,
 where the test set prediction error is estimated with a cross-validation method.
 By varying the weight of the fusion penalty - hyperparameter $\rho$, we obtain
 the early and late fusion approaches as special cases. If $\rho = 0$, we have a
-simple form of early fusion; if $\rho = 1$, we obtain a simple form of late 
+simple form of early fusion; if $\rho = 1$, we obtain a simple form of late
 fusion. For $0 < \rho < 1$, we obtain the intermediate fusion.
 
 Here, as examples, we will prepare the input data, fit the multiview model, and
@@ -388,8 +388,14 @@ name_layers <- with(
 dataList <- vector("list", length = length(name_layers))
 names(dataList) <- name_layers
 table(feature_metadata$featureType)
-dataList[[1]] <- t(feature_table[124:nrow(feature_table), ])
-dataList[[2]] <- t(feature_table[1:123, ])
+
+# Select indices of Pathways and Species rows
+pathways_ind <- which(feature_metadata$featureType == "Pathways")
+species_ind <- which(feature_metadata$featureType == "Species")
+
+# Create two data lists for pathways and species
+dataList[[1]] <- t(feature_table[pathways_ind, ])
+dataList[[2]] <- t(feature_table[species_ind, ])
 
 # Extract y and X's
 dataList <- lapply(dataList, as.matrix)
@@ -403,15 +409,15 @@ set.seed(1234)
 library(multiview)
 library(glmnet)
 
-cvfit <- cv.multiview(dataList, Y, family = binomial(), alpha = 0.5) 
+cvfit <- cv.multiview(dataList, Y, family = binomial(), alpha = 0.5)
 DD <- as.data.frame(as.matrix(coef_ordered(cvfit, s="lambda.min", alpha = 0.5)))
 DD$standardized_coef <- as.numeric(DD$standardized_coef)
 DD$coef <- as.numeric(DD$coef)
 
 ```
 
-In this section, we visualize the prediction performance by gathering all 
-predictions and extracting the data necessary for ROC plotting, corresponding 
+In this section, we visualize the prediction performance by gathering all
+predictions and extracting the data necessary for ROC plotting, corresponding
 to each integration paradigm.
 
 ```{r visualization1}
@@ -427,36 +433,36 @@ yhat.train <- cbind(
 colnames(yhat.train) <- c(
     colnames(combo), "Late Fusion", "Early Fusion", "Intermediate Fusion")
 
-# Extract ROC plot data 
+# Extract ROC plot data
 list.ROC <- vector("list", length = ncol(yhat.train))
 names(list.ROC) <- colnames(yhat.train)
-    
-# Loop over layers 
+
+# Loop over layers
 library(ROCR)
 
 for(k in 1:length(list.ROC)){
     preds <- yhat.train[ ,k]
     pred <- prediction(preds, Y)
     AUC <- round(performance(pred, "auc")@y.values[[1]], 2)
-    perf <- performance(pred, "sens", "spec") 
+    perf <- performance(pred, "sens", "spec")
     list.ROC[[k]] <- data.frame(
         sensitivity = slot(perf, "y.values")[[1]],
         specificity = 1 - slot(perf, "x.values")[[1]],
         AUC = AUC,
         layer = names(list.ROC)[k])
 }
-    
+
 # Combine
 ROC_table <- do.call('rbind', list.ROC)
 
 ```
 
-Based on the ROC plot described below, we observe that the AUC is 0.65 when 
+Based on the ROC plot described below, we observe that the AUC is 0.65 when
 considering only the pathway abundance data in the model, and 0.64 for the model
 including only the species abundance data. The AUC increases to 0.68 when using
-the early fusion model and reaches 1 with the late fusion model. In contrast, 
+the early fusion model and reaches 1 with the late fusion model. In contrast,
 the intermediate fusion model achieves an AUC of 0.99. Overall, most integrated
-classifiers outperform individual layers in distinguishing between IBD and 
+classifiers outperform individual layers in distinguishing between IBD and
 non-IBD controls.
 
 ```{r visualization2}
@@ -466,24 +472,24 @@ plot_data <- ROC_table
 plot_data$displayItem <- paste(plot_data$layer, " AUC = ", plot_data$AUC, sep="")
 plot_data$displayItem <- factor(
     plot_data$displayItem, levels = unique(plot_data$displayItem))
-    
+
 # ROC curves
 p <- ggplot(
     plot_data,
     aes(x = specificity,
     y = sensitivity,
-    group = displayItem)) + 
+    group = displayItem)) +
     geom_line(aes(x = specificity,y = sensitivity,color = displayItem)) +
     theme(
-        legend.position = "bottom", 
+        legend.position = "bottom",
         legend.background=element_blank(),
-        legend.box.background=element_rect(colour = "black")) + 
+        legend.box.background=element_rect(colour = "black")) +
     theme_bw() +
     xlab("False Positive Rate") +
     ylab("True Positive Rate") +
     theme(legend.position = "right", legend.direction = "vertical") +
-    labs(color = '') 
-    
+    labs(color = '')
+
 # Print
 print(p)
 
@@ -492,9 +498,9 @@ print(p)
 Finally, we visualize the results for the top 20 features for each layer based
 on the intermediate fusion.
 
-```{r visualization3, fig.width = 6, fig.height = 3}
-#### Change the pathway 
-# Only plot tp to 20 per layer
+```{r visualization3, fig.width = 6, fig.height = 7}
+#### Change the pathway
+# Only plot 20 features per layer
 DD <- DD %>%
   group_by(view) %>%
   top_n(n = 20, wt = abs(standardized_coef))
@@ -504,16 +510,18 @@ DD$view_col <- gsub(":.*", "", DD$view_col)
 # Visualization
 library(forcats)
 
-p <- DD %>% 
-    mutate(view_col = fct_reorder(view_col, standardized_coef)) %>% 
-    ggplot(aes(x = view_col, y = standardized_coef, fill = view, width = 0.75)) + 
+p <- DD %>%
+    mutate(view_col = fct_reorder(view_col, standardized_coef)) %>%
+    ggplot(aes(x = view_col, y = standardized_coef, fill = view, width = 0.75)) +
     geom_bar(stat = "identity", show.legend = FALSE, width = 1) +
     coord_flip() +
     facet_wrap(~ view, scales = 'free_y', nrow = 2) +
     ylab('Standardized LFC') +
     xlab('') +
-    ggtitle('IBD-associated multi-omics features') + 
+    #labs(title = "IBD-associated\nmulti-omics features") +
+    ggtitle('IBD-associated\nmulti-omics features') +
     theme_bw() + theme(
+        plot.title = element_text(hjust = 0.5, vjust = 0.5),
         strip.background  = element_blank(),
         panel.grid.major = element_line(colour = "grey80"),
         panel.border = element_blank(),
@@ -521,7 +529,6 @@ p <- DD %>%
         panel.grid.minor.y = element_blank(),
         panel.grid.major.y = element_blank())
 p
-
 ```
 
 Based on the bar plot, we observe that \emph{Proteobacteria_bacterium_CAG.139}
diff --git a/inst/pages/intro.qmd b/inst/pages/intro.qmd
index 5a9ddf5e1..022ef9088 100644
--- a/inst/pages/intro.qmd
+++ b/inst/pages/intro.qmd
@@ -49,8 +49,8 @@ on how to handle data containers from the SummarizedExperiment family.
 The Bioconductor microbiome data science framework consists of:
 
 - **Data containers**, designed to organize multi-assay microbiome data
-- **R/Bioconductor packages** that provide dedicated methods 
-- **Community** of users and developers 
+- **R/Bioconductor packages** that provide dedicated methods
+- **Community** of users and developers
 
 ![Data containers are central in Bioconductor.](figures/ecosystem.png){width="50%"}
 
diff --git a/inst/pages/introductory_workflow.qmd b/inst/pages/introductory_workflow.qmd
index cfdcfa5db..d050c5fe0 100644
--- a/inst/pages/introductory_workflow.qmd
+++ b/inst/pages/introductory_workflow.qmd
@@ -101,7 +101,7 @@ You can access **assays** [@sec-assay-slot] as such:
 
 ```{r}
 #| label: showAssay
-assay(tse)[1:5,1:10] 
+assay(tse)[1:5,1:10]
 ```
 
 Whereas **colData** [@sec-add-or-modify-data] is accessible through:
@@ -198,9 +198,9 @@ All these variations are referred to as alpha diversity.
 
 # Estimate (observed) richness
 tse_alpha <- addAlpha(
-    tse, 
-    assay.type = "counts", 
-    index = "observed", 
+    tse,
+    assay.type = "counts",
+    index = "observed",
     name="observed")
 
 # Check some of the first values in colData
@@ -215,12 +215,12 @@ graph to visualize this.
 ```{r}
 #| label: plotColdata
 plotColData(
-    tse_alpha, 
-    "observed", 
-    "cohort", 
+    tse_alpha,
+    "observed",
+    "cohort",
     colour_by = "patient_status") +
-    
-    theme(axis.text.x = element_text(angle=45,hjust=1)) + 
+    ggtitle("Richness plot of three patient cohorts") +
+    theme(axis.text.x = element_text(angle=45,hjust=1)) +
     labs(y=expression(Richness[Observed]))
 ```
 
@@ -240,9 +240,9 @@ First, we can easily calculate this measure and add them to our TreeSE.
 ```{r}
 #| label: calculateDiversity
 tse_alpha <- addAlpha(
-    tse_alpha, 
+    tse_alpha,
     assay.type = "counts",
-    index = c("shannon"), 
+    index = c("shannon"),
     name = c("shannon"))
 ```
 
@@ -261,16 +261,26 @@ plots <- lapply(
     colour_by = "patient_status")
 
 # Fine-tune visual appearance
-plots <- lapply(plots, "+", 
+plots <- lapply(plots, "+",
     theme(
         axis.text.x = element_blank(),
         axis.title.x = element_blank(),
         axis.ticks.x = element_blank()))
 
-# Plot the figures
-(plots[[1]] | plots[[2]]) +
-  plot_layout(guides = "collect")
+# Add titles to individual plots
+plots[[1]] <- plots[[1]] +
+  ggtitle("Observed richness")
+plots[[2]] <- plots[[2]] +
+  ggtitle("Shannon diversity")
 
+
+# Plot the figures
+patchwork <- (plots[[1]] | plots[[2]])
+patchwork +
+  plot_layout(guides = "collect") +
+  plot_annotation(
+  title = "Comparison of observed richness estimate and Shannon diversity",
+)
 ```
 
 It's very important to make all these comparisons in order to quantify
@@ -295,7 +305,7 @@ $$
 
 Luckily for us, the [**mia package**](https://microbiome.github.io/mia/)
 provides an easy way to calculate the relative abundance for our TreeSE
-using the `transformAssay` method.
+using the `transformAssay()` method.
 
 ```{r}
 #| label: calculateRelabundance
@@ -331,8 +341,8 @@ In our case, the assay contains 151 rows and 27 columns. Having so many
 columns and thus dimensions can be troublesome for visualizing dissimilarity.
 
 In order to visualize the dissimilarity between the different samples, we can
-conduct a Principal Coordinate analysis on the newly created assay. This
-essentially projects the Bray-curtis dimensions onto a lower space whilst
+conduct a Principal Coordinate Analysis on the newly created assay. This
+essentially projects the Bray-Curtis dimensions onto a lower space whilst
 maintaining as much of the variation as possible, the projected values are
 called principal coordinates. You can read more on @Multidimensional-scaling
 here.
@@ -343,8 +353,9 @@ we can use Bioconductor's scater package and the vegan package, made by
 
 ```{r}
 #| label: showPCoA
-# Create ggplot object
-p <- plotReducedDim(tse, "MDS_bray", colour_by = "cohort")
+# Create a ggplot object
+p <- plotReducedDim(tse, "MDS_bray", colour_by = "cohort") +
+  ggtitle("Bray-Curtis dissimilarity using multidimensional scaling")
 
 # Convert to an interactive plot with ggplotly
 ggplotly(p)
@@ -363,7 +374,8 @@ rel_eig <- e / sum(e[e > 0])
 # Add explained variance for each axis on the plot
 p <- p + labs(
     x = paste("PCoA 1 (", round(100 * rel_eig[[1]], 1), "%", ")", sep = ""),
-    y = paste("PCoA 2 (", round(100 * rel_eig[[2]], 1), "%", ")", sep = ""))
+    y = paste("PCoA 2 (", round(100 * rel_eig[[2]], 1), "%", ")", sep = "")) +
+    ggtitle("Bray-Curtis dissimilarity using multidimensional scaling")
 
 # Reonvert to an interactive plot with ggplotly
 ggplotly(p)
diff --git a/inst/pages/introductory_workflow_dutch_version.qmd b/inst/pages/introductory_workflow_dutch_version.qmd
index b8b231b95..baca33741 100644
--- a/inst/pages/introductory_workflow_dutch_version.qmd
+++ b/inst/pages/introductory_workflow_dutch_version.qmd
@@ -113,7 +113,7 @@ zal er meer uitleg worden gegeven over het maken van subsets.
 
 ```{r}
 #| label: assay
-assay(tse)[1:5,1:10] 
+assay(tse)[1:5,1:10]
 ```
 
 ### colData
@@ -242,9 +242,9 @@ worden ook wel *Alfabetische diversiteit (alpha-diversiteit)* genoemd.
 #| label: rijkdom
 # Schatting (waargenomen) rijkdom
 tse_alpha <- addAlpha(
-    tse, 
-    assay.type = "counts", 
-    index = "observed", 
+    tse,
+    assay.type = "counts",
+    index = "observed",
     name="observed")
 
 # Controleer enkele van de eerste waarden in colData
@@ -259,12 +259,12 @@ kunnen dan vervolgens een figuur tekenen om dit te visualiseren.
 ```{r}
 #| label: rijkdom-grafiek
 plotColData(
-    tse_alpha, 
-    "observed", 
-    "cohort", 
+    tse_alpha,
+    "observed",
+    "cohort",
     colour_by = "patient_status") +
-  
-    theme(axis.text.x = element_text(angle=45,hjust=1)) + 
+
+    theme(axis.text.x = element_text(angle=45,hjust=1)) +
     labs(y=expression(Richness[Observed]))
 ```
 
@@ -288,9 +288,9 @@ Eerst kunnen we deze maatstaf eenvoudig berekenen en toevoegen aan onze TSE.
 ```{r}
 #| label: diversiteit
 tse_alpha <- addAlpha(
-    tse_alpha, 
+    tse_alpha,
     assay.type = "counts",
-    index = c("shannon"), 
+    index = c("shannon"),
     name = c("shannon"))
 ```
 
@@ -309,7 +309,7 @@ plots <- lapply(
     colour_by = "patient_status")
 
 # Weergave verbeteren
-plots <- lapply(plots, "+", 
+plots <- lapply(plots, "+",
     theme(
         axis.text.x = element_blank(),
         axis.title.x = element_blank(),
diff --git a/inst/pages/introductory_workflow_french_version.qmd b/inst/pages/introductory_workflow_french_version.qmd
index 949b7817b..76e2dbcc3 100644
--- a/inst/pages/introductory_workflow_french_version.qmd
+++ b/inst/pages/introductory_workflow_french_version.qmd
@@ -106,7 +106,7 @@ manière :
 
 ```{r}
 #| label: showAssay
-assay(tse)[1:5,1:10] 
+assay(tse)[1:5,1:10]
 ```
 
 Alors que **colData** ([@sec-add-or-modify-data]) est accessible via :
@@ -207,9 +207,9 @@ Toutes ces variations sont appelées diversité alpha.
 
 # Estimate (observed) richness
 tse_alpha <- addAlpha(
-    tse, 
-    assay.type = "counts", 
-    index = "observed", 
+    tse,
+    assay.type = "counts",
+    index = "observed",
     name="observed")
 
 # Check some of the first values in colData
@@ -225,12 +225,12 @@ visualiser cela.
 ```{r}
 #| label: plotColdata
 plotColData(
-    tse_alpha, 
-    "observed", 
-    "cohort", 
+    tse_alpha,
+    "observed",
+    "cohort",
     colour_by = "patient_status") +
-    
-    theme(axis.text.x = element_text(angle=45,hjust=1)) + 
+
+    theme(axis.text.x = element_text(angle=45,hjust=1)) +
     labs(y=expression(Richness[Observed]))
 ```
 
@@ -251,9 +251,9 @@ notre TreeSE.
 ```{r}
 #| label: calculateDiversity
 tse_alpha <- addAlpha(
-    tse_alpha, 
+    tse_alpha,
     assay.type = "counts",
-    index = c("shannon"), 
+    index = c("shannon"),
     name = c("shannon"))
 ```
 
@@ -272,7 +272,7 @@ plots <- lapply(
     colour_by = "patient_status")
 
 # Fine-tune visual appearance
-plots <- lapply(plots, "+", 
+plots <- lapply(plots, "+",
     theme(
         axis.text.x = element_blank(),
         axis.title.x = element_blank(),
diff --git a/inst/pages/machine_learning.qmd b/inst/pages/machine_learning.qmd
index 6530efb31..3d8ff30ee 100644
--- a/inst/pages/machine_learning.qmd
+++ b/inst/pages/machine_learning.qmd
@@ -128,7 +128,7 @@ library(mikropml)
 df <- assay
 df[["diagnosis"]] <- labels
 
-# Run random forest 
+# Run random forest
 results <- run_ml(
     df, "rf", outcome_colname = "diagnosis",
     kfold = 2, cv_times = 5, training_frac = 0.8)
@@ -146,7 +146,7 @@ In this second model, we try to improve the prediction accuracy. We address the
 imbalanced data by applying class weights. Moreover, we utilize a more
 complex model, XGBoost, which is another commonly used algorithm in
 bioinformatics, usually one of the best performing models
-for tabular data [@SHWARTZZIV202284]. We use 
+for tabular data [@SHWARTZZIV202284]. We use
 [`caret`](https://cran.r-project.org/web/packages/caret/index.html) package
 which offers a R framework for machine learning.
 
@@ -166,7 +166,7 @@ class_weights <- weights[ match(labels, names(weights)) ]
 # Specify train control
 train_control <- trainControl(
     method = "cv", number = 5,
-    classProbs = TRUE, 
+    classProbs = TRUE,
     savePredictions = "final",
     allowParallel = TRUE,
     summaryFunction = twoClassSummary
@@ -186,7 +186,7 @@ tune_grid <- expand.grid(
 
 # Train the model
 model <- train(
-    x = assay, 
+    x = assay,
     y = labels,
     preProcess = c("center", "scale"),
     method = "xgbTree",
diff --git a/inst/pages/miaverse.qmd b/inst/pages/miaverse.qmd
index 86ab3e927..343d94e67 100644
--- a/inst/pages/miaverse.qmd
+++ b/inst/pages/miaverse.qmd
@@ -127,13 +127,13 @@ The following DA methods support `(Tree)SummarizedExperiment`.
 - [philr](http://bioconductor.org/packages/devel/bioc/html/philr.html) (@Silverman2017) phylogeny-aware phILR transformation
 - [MicrobiotaProcess](https://bioconductor.org/packages/release/bioc/html/MicrobiotaProcess.html) for the "tidy" analysis of microbiome and other ecological data
 - [Tools for Microbiome Analysis](https://microsud.github.io/Tools-Microbiome-Analysis/)
-site listed over 130 R packages for microbiome data science in 2023. 
+site listed over 130 R packages for microbiome data science in 2023.
 Many of these are not in Bioconductor, or do not directly support the
 data containers  used in this book but can be often used with minor
 modifications. The phyloseq-based tools can be used by converting the
 TreeSE data into phyloseq with `convertToPhyloseq`.
 
-### Open microbiome data 
+### Open microbiome data
 
 Hundreds of published microbiome datasets are readily available in
 these data containers (see [@sec-example-data]).
@@ -157,7 +157,7 @@ script.
 ```{r}
 #| label: all_packages
 #| eval: FALSE
-#| 
+#|
 
 # URL of the raw CSV file on GitHub. It includes all packages needed.
 url <- "https://raw.githubusercontent.com/microbiome/OMA/devel/oma_packages/oma_packages.csv"
@@ -241,6 +241,6 @@ and [@sec-support].
 - TreeSummarizedExperiment is derived from SummarizedExperiment class.
 - miaverse is based on TreeSummarizedExperiment data container.
 - We can borrow methods from packages utilizing SingleCellExperiment and
-SummarizedExperiment. 
+SummarizedExperiment.
 
 :::
diff --git a/inst/pages/mmuphin_meta_analysis.qmd b/inst/pages/mmuphin_meta_analysis.qmd
index 2c84bb13a..c7642cb62 100644
--- a/inst/pages/mmuphin_meta_analysis.qmd
+++ b/inst/pages/mmuphin_meta_analysis.qmd
@@ -99,13 +99,13 @@ library(dplyr)
 
 se_relative <- sampleMetadata |>
     filter(study_name %in% c(
-        "FrankelAE_2017", "GopalakrishnanV_2018", "LeeKA_2022", 
+        "FrankelAE_2017", "GopalakrishnanV_2018", "LeeKA_2022",
         "MatsonV_2018", "PetersBA_2019", "WindTT_2020")) |>
     returnSamples("relative_abundance", rownames = "short")
 
 se_pathway <- sampleMetadata |>
     filter(study_name %in% c(
-        "FrankelAE_2017", "GopalakrishnanV_2018", "LeeKA_2022", 
+        "FrankelAE_2017", "GopalakrishnanV_2018", "LeeKA_2022",
         "MatsonV_2018", "PetersBA_2019", "WindTT_2020")) |>
     returnSamples("pathway_abundance", rownames = "short")
 
@@ -232,7 +232,7 @@ pcoa_before <- cmdscale(D_before, eig = TRUE)
 ord_before <- as.data.frame(pcoa_before$points)
 percent_var_before <- round(pcoa_before$eig / sum(pcoa_before$eig) * 100, 1)[1:2]
 before_labels <- c(
-    paste('Axis 1 (', percent_var_before[1], '%)', sep = ''), 
+    paste('Axis 1 (', percent_var_before[1], '%)', sep = ''),
     paste('Axis 2 (', percent_var_before[2], '%)', sep = ''))
 
 before_phrase <- paste('Before (PERMANOVA R2 = ', R2_before, '%)', sep = '')
@@ -244,7 +244,7 @@ R2_after <- round(fit_adonis_after$aov.tab[1, 5]*100, 1)
 pcoa_after <- cmdscale(D_after, eig = TRUE)
 ord_after <- as.data.frame(pcoa_after$points)
 percent_var_after <- round(pcoa_after$eig / sum(pcoa_after$eig) * 100, 1)[1:2]
-after_labels <- c(paste('Axis 1 (', percent_var_after[1], '%)', sep = ''), 
+after_labels <- c(paste('Axis 1 (', percent_var_after[1], '%)', sep = ''),
                 paste('Axis 2 (', percent_var_after[2], '%)', sep = ''))
 
 after_phrase <- paste('After (PERMANOVA R2 = ', R2_after, '%)', sep = '')
@@ -252,22 +252,22 @@ colnames(ord_after) <- c('PC1', 'PC2')
 ord_after$Study <- data_meta$study_name
 
 # Ordination Plot
-p_before <- ggplot(ord_before, aes(x = PC1, y = PC2, color = Study)) + 
-    geom_point(size = 4) + 
-    theme_bw() + 
-    xlab(before_labels[1]) + 
-    ylab(before_labels[2]) + 
-    ggtitle(before_phrase) + 
-    theme(plot.title = element_text(hjust = 0.5)) + 
+p_before <- ggplot(ord_before, aes(x = PC1, y = PC2, color = Study)) +
+    geom_point(size = 4) +
+    theme_bw() +
+    xlab(before_labels[1]) +
+    ylab(before_labels[2]) +
+    ggtitle(before_phrase) +
+    theme(plot.title = element_text(hjust = 0.5)) +
     theme(legend.position ="none")
 
-p_after <- ggplot(ord_after, aes(x = PC1, y = PC2, color = Study)) + 
-    geom_point(size = 4) + 
-    theme_bw() + 
-    xlab(after_labels[1]) + 
-    ylab(after_labels[2]) + 
-    ggtitle(after_phrase) + 
-    theme(plot.title = element_text(hjust = 0.5)) + 
+p_after <- ggplot(ord_after, aes(x = PC1, y = PC2, color = Study)) +
+    geom_point(size = 4) +
+    theme_bw() +
+    xlab(after_labels[1]) +
+    ylab(after_labels[2]) +
+    ggtitle(after_phrase) +
+    theme(plot.title = element_text(hjust = 0.5)) +
     theme(legend.position ="none")
 
 p_before + p_after
@@ -295,7 +295,7 @@ for (i in 1:length(transform)) {
         exposure = "resvar",
         data = data_meta,
         control = list(analysis_method = 'LM', transform = transform[i]))
-  
+
     num[i] <- sum(fit_lm_meta$meta_fits$qval.fdr < 0.05, na.rm = TRUE)
 }
 
@@ -320,7 +320,7 @@ meta_fits <- fit_lm_meta$meta_fits
 
 meta_fits %>%
     filter(qval.fdr < 0.05) %>%
-    arrange(coef) %>% 
+    arrange(coef) %>%
     mutate(feature = factor(feature, levels = feature)) %>%
     ggplot(aes(y = coef, x = feature)) +
     geom_bar(stat = "identity") +
@@ -423,7 +423,7 @@ pcoa_before <- cmdscale(D_before, eig = TRUE)
 ord_before <- as.data.frame(pcoa_before$points)
 percent_var_before <- round(pcoa_before$eig / sum(pcoa_before$eig) * 100, 1)[1:2]
 before_labels <- c(
-    paste('Axis 1 (', percent_var_before[1], '%)', sep = ''), 
+    paste('Axis 1 (', percent_var_before[1], '%)', sep = ''),
     paste('Axis 2 (', percent_var_before[2], '%)', sep = ''))
 
 before_phrase <- paste('Before (PERMANOVA R2 = ', R2_before, '%)', sep = '')
@@ -436,7 +436,7 @@ pcoa_after <- cmdscale(D_after, eig = TRUE)
 ord_after <- as.data.frame(pcoa_after$points)
 percent_var_after <- round(pcoa_after$eig / sum(pcoa_after$eig) * 100, 1)[1:2]
 after_labels <- c(
-    paste('Axis 1 (', percent_var_after[1], '%)', sep = ''), 
+    paste('Axis 1 (', percent_var_after[1], '%)', sep = ''),
     paste('Axis 2 (', percent_var_after[2], '%)', sep = ''))
 
 after_phrase <- paste('After (PERMANOVA R2 = ', R2_after, '%)', sep = '')
@@ -444,22 +444,22 @@ colnames(ord_after) <- c('PC1', 'PC2')
 ord_after$Study <- data_meta$study_name
 
 # Ordination Plot
-p_before <- ggplot(ord_before, aes(x = PC1, y = PC2, color = Study)) + 
-    geom_point(size = 4) + 
-    theme_bw() + 
-    xlab(before_labels[1]) + 
-    ylab(before_labels[2]) + 
-    ggtitle(before_phrase) + 
-    theme(plot.title = element_text(hjust = 0.5)) + 
+p_before <- ggplot(ord_before, aes(x = PC1, y = PC2, color = Study)) +
+    geom_point(size = 4) +
+    theme_bw() +
+    xlab(before_labels[1]) +
+    ylab(before_labels[2]) +
+    ggtitle(before_phrase) +
+    theme(plot.title = element_text(hjust = 0.5)) +
     theme(legend.position ="none")
 
-p_after <- ggplot(ord_after, aes(x = PC1, y = PC2, color = Study)) + 
-    geom_point(size = 4) + 
-    theme_bw() + 
-    xlab(after_labels[1]) + 
-    ylab(after_labels[2]) + 
-    ggtitle(after_phrase) + 
-    theme(plot.title = element_text(hjust = 0.5)) + 
+p_after <- ggplot(ord_after, aes(x = PC1, y = PC2, color = Study)) +
+    geom_point(size = 4) +
+    theme_bw() +
+    xlab(after_labels[1]) +
+    ylab(after_labels[2]) +
+    ggtitle(after_phrase) +
+    theme(plot.title = element_text(hjust = 0.5)) +
     theme(legend.position ="none")
 
 p <- p_before + p_after
@@ -482,7 +482,7 @@ for (i in 1:length(transform)) {
         exposure = "resvar",
         data = data_meta,
         control = list(analysis_method = 'LM', transform = transform[i]))
-  
+
     num[i] <- sum(fit_lm_meta$meta_fits$pval < 0.05, na.rm = TRUE)
 }
 
@@ -535,7 +535,7 @@ meta_fits2 <- fit_lm_meta2$meta_fits
 # Create the figure
 meta_fits2 %>%
     filter(pval < 0.05) %>%
-    arrange(coef) %>% 
+    arrange(coef) %>%
     mutate(feature = factor(feature, levels = feature)) %>%
     ggplot(aes(y = coef, x = feature)) +
     theme(axis.text.y = element_text(size = 10)) +
diff --git a/inst/pages/msea.qmd b/inst/pages/msea.qmd
index 11136c00c..75ccc6209 100644
--- a/inst/pages/msea.qmd
+++ b/inst/pages/msea.qmd
@@ -13,7 +13,7 @@ detect the modest but coordinated changes in pre-specified sets of related
 microbial features. Such a set might include all the microbes in a specific
 pathway or microbial genes that have been shown to be co-regulated based on
 previously published studies. Like GSEA, MSEA aggregates the per-feature
-statistics across microbes within a microbe set. This corresponds to the 
+statistics across microbes within a microbe set. This corresponds to the
 hypothesis that many relevant phenotype differences are manifested by small but
 consistent changes in a set of features.
 
@@ -54,9 +54,9 @@ se_relative <- sampleMetadata |>
 # Create sample metadata #
 ##########################
 
-sample_metadata <- 
-    colData(se_relative) %>% 
-    as.data.frame() %>% filter(visit_number == 1) %>% 
+sample_metadata <-
+    colData(se_relative) %>%
+    as.data.frame() %>% filter(visit_number == 1) %>%
     .[, c("age", "disease", "antibiotics_current_use")]
 
 #################
@@ -95,17 +95,17 @@ study.
 library(Maaslin2)
 
 fit_data = Maaslin2(
-    input_data = feature_species, 
-    input_metadata = sample_metadata, 
+    input_data = feature_species,
+    input_metadata = sample_metadata,
     normalization = "NONE",
-    output = "output_species", 
+    output = "output_species",
     fixed_effects = c("disease", "age", "antibiotics_current_use"))
 
 ```
 
 Unlike gene expression studies, we do not have well-defined signatures or
 modules for microbiome data. Here, we will construct data-driven modules using
-weighted gene co-expression network analysis (WGCNA) [@Langfelder2008], 
+weighted gene co-expression network analysis (WGCNA) [@Langfelder2008],
 [@Geistlinger2023]. We aim to ensure that the effect of disease and other
 covariates has been removed by working on the residuals. Following the WGCNA
 tutorial, our first step will be to check whether there are any outliers in our
@@ -151,7 +151,7 @@ connectivity graphic for the selection of this power parameter.
 powers = c(c(1:20), seq(from = 22, to=30, by=2))
 sft = pickSoftThreshold(
     datExpr, powerVector = powers, verbose = 5, dataIsExpr = TRUE,
-    RsquaredCut = 0.30) 
+    RsquaredCut = 0.30)
 
 ```
 
@@ -162,16 +162,16 @@ soft threshold parameter selected above.
 
 power = sft$powerEstimate
 net = blockwiseModules(
-    datExpr, 
+    datExpr,
     power = power,
-    corFnc="bicor", 
-    corOptions=list(maxPOutliers=0.1), 
-    networkType ="unsigned", 
+    corFnc="bicor",
+    corOptions=list(maxPOutliers=0.1),
+    networkType ="unsigned",
     maxBlockSize = ncol(datExpr),
     minModuleSize = 3,
-    TOMType = "unsigned", 
-    reassignThreshold = 0, 
-    mergeCutHeight = 0,                       
+    TOMType = "unsigned",
+    reassignThreshold = 0,
+    mergeCutHeight = 0,
     verbose = 3)
 
 ####################
@@ -241,26 +241,26 @@ library(gsEasy)
 run_MSEA <- function(
         microbeSet, # A list
         ranked_features, # Ranked list of featured
-        filter.count = 3, 
+        filter.count = 3,
         seed = 1234,
         fdr.correction = 'BH') {
-    
-  
+
+
     ###################
     # Filter out sets #
     ##################
-    
+
     microbeSet0 <- microbeSet
     cond <- sapply(microbeSet0, function(x) length(x) > filter.count)
     microbeSet <- microbeSet0[cond]
     lengthmicrobeSet <- as.data.frame(
         reshape2::melt(lapply(microbeSet, function(x) length(x))))
     colnames(lengthmicrobeSet) <- c('Freq','Set')
-    
+
     ################
     # Classic MSEA #
     ################
-    
+
     set.seed(seed)
     enrichment <- as.data.frame(
         sapply(microbeSet, function(set) gset(S = set, r = ranked_features)))
@@ -268,13 +268,13 @@ run_MSEA <- function(
     enrichment <- rownames_to_column(enrichment, 'Set')
     enrichment <- merge(enrichment, lengthmicrobeSet, 'Set')
     enrichment$qval <- p.adjust(enrichment$ES, fdr.correction)
-    
+
     ##########
     # Return #
     ##########
-    
+
     return(enrichment)
-  
+
 }
 
 ```
@@ -322,14 +322,14 @@ We can plot the enrichment scores to visualize the MSEA results.
 
 ```{r visualzation}
 
-p <- MSEA %>% 
-    arrange(-pval) %>% 
-    mutate(ID = factor(ID, levels = ID)) %>% 
+p <- MSEA %>%
+    arrange(-pval) %>%
+    mutate(ID = factor(ID, levels = ID)) %>%
     ggplot(aes(y = -log10(pval), x = ID)) +
-    geom_bar(stat = "identity", fill = 'cornflowerblue') + theme_bw() + 
-    coord_flip() + 
-    ggtitle('Statistically significant modules associated with disease') + 
-    xlab('') + 
+    geom_bar(stat = "identity", fill = 'cornflowerblue') + theme_bw() +
+    coord_flip() +
+    ggtitle('Statistically significant modules associated with disease') +
+    xlab('') +
     ylab('MSEA enrichment score')
 
 print(p)
@@ -356,7 +356,7 @@ project and follow the same steps as before.
 ```{r input-data}
 
 ##########################
-# Load HMP2 pathway data # 
+# Load HMP2 pathway data #
 ##########################
 
 se_pathway <- sampleMetadata |>
@@ -367,8 +367,8 @@ se_pathway <- sampleMetadata |>
 # Create sample metadata #
 ##########################
 
-sample_metadata <- colData(se_pathway) %>% 
-    as.data.frame() %>% filter(visit_number == 1) %>% 
+sample_metadata <- colData(se_pathway) %>%
+    as.data.frame() %>% filter(visit_number == 1) %>%
     dplyr::select("age", "disease", "antibiotics_current_use")
 
 # Set reference
@@ -381,10 +381,10 @@ sample_metadata$disease <- relevel(sample_metadata$disease, 'healthy')
 
 feature_pwys_t <- as.data.frame(assay(se_pathway))
 feature_pwys_t <- rownames_to_column(feature_pwys_t, "ID")
-feature_pwys_t <- feature_pwys_t %>% 
-    filter(!grepl("\\|", ID)) %>% 
-    filter(!ID %in% c('UNMAPPED', 'UNINTEGRATED')) %>% 
-    column_to_rownames('ID') %>% 
+feature_pwys_t <- feature_pwys_t %>%
+    filter(!grepl("\\|", ID)) %>%
+    filter(!ID %in% c('UNMAPPED', 'UNINTEGRATED')) %>%
+    column_to_rownames('ID') %>%
     as.data.frame()
 
 ##############################
@@ -404,10 +404,10 @@ construct the modules using residuals from the `MaAsLin2` models.
 ```{r MSEA-preparation}
 
 fit_data = Maaslin2(
-    input_data = feature_pwys, 
-    input_metadata = sample_metadata, 
+    input_data = feature_pwys,
+    input_metadata = sample_metadata,
     normalization = "NONE",
-    output = "output_pwys", 
+    output = "output_pwys",
     fixed_effects = c("disease", "age", "antibiotics_current_use"))
 
 ##########################
@@ -453,16 +453,16 @@ sft = pickSoftThreshold(
 
 power = sft$powerEstimate
 net = blockwiseModules(
-    datExpr, 
+    datExpr,
     power = power,
-    corFnc = "bicor", 
-    corOptions = list(maxPOutliers = 0.1), 
-    networkType ="unsigned", 
+    corFnc = "bicor",
+    corOptions = list(maxPOutliers = 0.1),
+    networkType ="unsigned",
     maxBlockSize = ncol(datExpr),
     minModuleSize = 3,
-    TOMType = "unsigned", 
-    reassignThreshold = 0, 
-    mergeCutHeight = 0,                       
+    TOMType = "unsigned",
+    reassignThreshold = 0,
+    mergeCutHeight = 0,
     verbose = 3)
 
 ####################
@@ -538,14 +538,14 @@ MSEA$ID <- paste(MSEA$ID, ' (', MSEA$Size, ')', sep = '')
 # Plot #
 ########
 
-p <- MSEA %>% 
-  arrange(-pval) %>% 
-  mutate(ID = factor(ID, levels = ID)) %>% 
+p <- MSEA %>%
+  arrange(-pval) %>%
+  mutate(ID = factor(ID, levels = ID)) %>%
   ggplot(aes(y = -log10(pval), x = ID)) +
-  geom_bar(stat = "identity", fill = 'cornflowerblue') + theme_bw() + 
-  coord_flip() + 
-  ggtitle('Statistically significant modules associated with disease') + 
-  xlab('') + 
+  geom_bar(stat = "identity", fill = 'cornflowerblue') + theme_bw() +
+  coord_flip() +
+  ggtitle('Statistically significant modules associated with disease') +
+  xlab('') +
   ylab('MSEA enrichment score')
 
 print(p)
diff --git a/inst/pages/multiassay_ordination.qmd b/inst/pages/multiassay_ordination.qmd
index cd0f231b6..ba24d460d 100644
--- a/inst/pages/multiassay_ordination.qmd
+++ b/inst/pages/multiassay_ordination.qmd
@@ -44,9 +44,9 @@ library(MOFA2)
 # reticulate package is needed
 library(reticulate)
 # Let us assume that these have been installed already.
-#reticulate::install_miniconda(force = TRUE)
-#reticulate::use_miniconda(condaenv = "env1", required = FALSE)
-#reticulate::py_install(packages = c("mofapy2"), pip = TRUE, python_version=3.6)
+reticulate::install_miniconda(force = TRUE)
+reticulate::use_miniconda(condaenv = "env1", required = FALSE)
+reticulate::py_install(packages = c("mofapy2"), pip = TRUE, python_version=3.6)
 ```
 
 The `mae` object could be used straight to create the MOFA model. Yet,
@@ -66,10 +66,10 @@ remove them.
 #| label: mofa3
 
 library(MOFA2)
-# For simplicity, classify all high-fat diets as high-fat, and all the low-fat 
+# For simplicity, classify all high-fat diets as high-fat, and all the low-fat
 # diets as low-fat diets
 colData(mae)$Diet <- ifelse(
-    colData(mae)$Diet == "High-fat" | colData(mae)$Diet == "High-fat + XOS", 
+    colData(mae)$Diet == "High-fat" | colData(mae)$Diet == "High-fat + XOS",
     "High-fat", "Low-fat")
 
 # Agglomerate microbiome data
@@ -96,7 +96,7 @@ assays(mae[[3]]) <- assays(mae[[3]])["standardize"]
 # Building our mofa model
 model <- create_mofa_from_MultiAssayExperiment(
     mae,
-    groups = "Diet", 
+    groups = "Diet",
     extract_metadata = TRUE)
 model
 ```
diff --git a/inst/pages/network_comparison.qmd b/inst/pages/network_comparison.qmd
index 9de85e5f9..9b52ed607 100644
--- a/inst/pages/network_comparison.qmd
+++ b/inst/pages/network_comparison.qmd
@@ -6,8 +6,8 @@ chapterPreamble()
 ```
 
 ```{r, echo=FALSE}
-# The networks shown in this chapter take several minutes to generate. 
-# Therefore, the network objects have been stored in a folder and are loaded 
+# The networks shown in this chapter take several minutes to generate.
+# Therefore, the network objects have been stored in a folder and are loaded
 # here to reduce the time needed to knit the book.
 networks_diet_file  <- system.file(
     "extdata", "networks_diet.RData", package = "OMA")
@@ -26,10 +26,10 @@ networks side by side, or quantitatively using differential network analysis
 tools.
 
 Two approaches for comparing networks between two conditions are considered
-in this chapter: 
+in this chapter:
 
 - **Differential network analysis**, which analyzes differences in network
-metrics and network structure. 
+metrics and network structure.
 - **Differential association analysis**, which focuses on differences in the
 strength of individual associations.
 
@@ -40,7 +40,7 @@ Here we use [NetCoMi](https://github.com/stefpeschel/NetCoMi)
 [@peschel2021netcomi] for network comparison, which includes several
 **differential network analysis approaches** as well as functionality for
 *differential association analysis**, i.e., generating a differential network.
-How to install NetCoMi from GitHub is explained in [@sec-network-learning]. 
+How to install NetCoMi from GitHub is explained in [@sec-network-learning].
 
 The **PeerJ data set** [@potbhare2022skin] containing skin microbial profiles
 for 58 subjects is again used in this chapter. The dataset also includes
@@ -64,14 +64,14 @@ tse <- peerj13075
 
 ```{r preprocessing}
 # Agglomerate to genus level
-tse <- agglomerateByRank(tse, rank = "genus") 
+tse <- agglomerateByRank(tse, rank = "genus")
 
 # Add relative abundances
 tse <- transformAssay(
-    tse, 
-    assay.type = "counts", 
+    tse,
+    assay.type = "counts",
     method = "relabundance",
-    MARGIN = "cols") 
+    MARGIN = "cols")
 
 # Filter by prevalence
 tse <- subsetByPrevalent(
@@ -119,7 +119,7 @@ spring_net_diet <- netConstruct(
         rep.num = 10,
         thresh = 0.05,
         Rmethod = "approx"),
-    sparsMethod = "none", 
+    sparsMethod = "none",
     dissFunc = "signed",
     verbose = 3,
     seed = 13075)
@@ -132,7 +132,7 @@ different from zero.
 
 ```{r netAnalyze_diet, fig.width=10, fig.height=10}
 spring_netprops_diet <- netAnalyze(
-    spring_net_diet, 
+    spring_net_diet,
     clustMethod = "cluster_fast_greedy",
     hubPar = "eigenvector",
     normDeg = FALSE)
@@ -167,12 +167,12 @@ plot(spring_netprops_diet,
      labelScale = FALSE,
      nodeSize = "eigenvector",
      nodeSizeSpread = 2,
-     nodeColor = "cluster", 
+     nodeColor = "cluster",
      sameColThresh = 2,
      hubBorderCol = "darkgray",
      cexNodes = 2,
      edgeTranspHigh = 20,
-     title1 = "Mixed diet", 
+     title1 = "Mixed diet",
      title2 = "Vegetarian",
      showTitle = TRUE,
      cexTitle = 2,
@@ -183,7 +183,7 @@ par(fig=c(0, 1, 0, 1), oma=c(0, 0, 0, 0), mar=c(0, 0, 0, 0), new=TRUE)
 plot(0, 0, type='n', bty='n', xaxt='n', yaxt='n')
 
 legend(-0.2, -0.9, cex = 1.5, title = "estimated correlation:",
-       legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"), 
+       legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"),
        bty = "n", horiz = TRUE)
 ```
 
@@ -201,12 +201,12 @@ plot(spring_netprops_diet,
      labelScale = FALSE,
      nodeSize = "eigenvector",
      nodeSizeSpread = 2,
-     nodeColor = "cluster", 
+     nodeColor = "cluster",
      sameColThresh = 2,
      hubBorderCol = "darkgray",
      cexNodes = 2,
      edgeTranspHigh = 20,
-     title1 = "Mixed diet", 
+     title1 = "Mixed diet",
      title2 = "Vegetarian",
      showTitle = TRUE,
      cexTitle = 2,
@@ -216,7 +216,7 @@ plot(spring_netprops_diet,
 par(fig=c(0, 1, 0, 1), oma=c(0, 0, 0, 0), mar=c(0, 0, 0, 0), new=TRUE)
 plot(0, 0, type='n', bty='n', xaxt='n', yaxt='n')
 legend(-0.2, -0.8, cex = 1.7, title = "estimated correlation:",
-       legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"), 
+       legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"),
        bty = "n", horiz = TRUE)
 ```
 
@@ -254,15 +254,15 @@ p_diet <- plot(
     rmSingles = "inboth",
     labelScale = FALSE,
     nodeSize = "clr",
-    nodeColor = "feature", 
-    featVecCol = phyla, 
+    nodeColor = "feature",
+    featVecCol = phyla,
     colorVec =  colvec,
     nodeTransp = 20,
     sameColThresh = 2,
     highlightHubs = FALSE,
     cexNodes = 2,
     edgeTranspHigh = 20,
-   title1 = "Mixed diet", 
+   title1 = "Mixed diet",
    title2 = "Vegetarian",
    showTitle = TRUE,
    cexTitle = 2,
@@ -275,10 +275,10 @@ col_transp <- colToTransp(colvec, 20)
 par(fig=c(0, 1, 0, 1), oma=c(0, 0, 0, 0), mar=c(0, 0, 0, 0), new=TRUE)
 plot(0, 0, type='n', bty='n', xaxt='n', yaxt='n')
 legend(-0.15, -0.8, cex = 1.7, title = "estimated correlation:",
-       legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"), 
+       legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"),
        bty = "n", horiz = TRUE)
-legend(-0.15, 1.3, cex = 1.7, pt.cex = 2.5, title = "Phylum:", 
-       legend=levels(phyla), col = col_transp, bty = "n", pch = 16) 
+legend(-0.15, 1.3, cex = 1.7, pt.cex = 2.5, title = "Phylum:",
+       legend=levels(phyla), col = col_transp, bty = "n", pch = 16)
 ```
 
 ### Quantitative comparison
@@ -300,7 +300,7 @@ before applying it to your data.
 
 ```{r netcomp_diet_permute, eval=FALSE}
 spring_netcomp_diet <- netCompare(
-    spring_netprops_diet, 
+    spring_netprops_diet,
     permTest = TRUE,
     nPerm = 1000,
     cores = 6,
@@ -311,7 +311,7 @@ spring_netcomp_diet <- netCompare(
 ```
 
 ```{r summary_netcomp_diet}
-summary(spring_netcomp_diet, 
+summary(spring_netcomp_diet,
         groupNames = c("Mixed diet", "Vegetarian"),
         numbNodes = 5)
 ```
@@ -382,12 +382,12 @@ spring_diffnet_unadj <- diffnet(
     adjust = "none")
 ```
 
-The `diffnet` object it now plotted using NetCoMi's plot function. 
+The `diffnet` object it now plotted using NetCoMi's plot function.
 
 ```{r diffnet_plot, fig.width=20, fig.height=14}
-plot(spring_diffnet_unadj, 
-     cexLabels = 2,
-     cexNodes = 0.7, 
+plot(spring_diffnet_unadj,
+     cexLabels = 1,
+     cexNodes = 1.3,
      cexLegend = 2.5,
      cexTitle = 3,
      mar = c(3,2,5,15),
@@ -403,7 +403,7 @@ uncorrelated in the vegetarian group (such as Serratia and Citrobacter), the
 edge color is dark green.
 
 ```{r, eval=FALSE, echo=FALSE}
-save(spring_diffnet, spring_net_diet, spring_netcomp_diet, 
+save(spring_diffnet, spring_net_diet, spring_netcomp_diet,
      file = "general/network_data/networks_diet.RData")
 ```
 
@@ -430,7 +430,7 @@ analysis methods, but focus only on changes in edge sets.
 that incorporate multiple local and global network measures.
 @jardim2019bionetstat present a tool "BioNetStat" for differential analysis
 of biological networks, which is able to compare certain network measures
-between groups. 
+between groups.
 
 The [NetCoMi](https://github.com/stefpeschel/NetCoMi) package used for
 network comparison in this chapter includes the following
@@ -450,4 +450,4 @@ Two methods (Fisher's z-test [@fisher1970statistical] and the Discordant
 method [@siska2016discordant]) are available for identifying differential
 correlations, and **permutation tests** for the more general case of
 identifying **differential associations**. See [@peschel2021netcomi] for
-details. NetCoMi offers also a function for plotting a differential network. 
+details. NetCoMi offers also a function for plotting a differential network.
diff --git a/inst/pages/network_learning.qmd b/inst/pages/network_learning.qmd
index a5391fb29..cb00dc247 100644
--- a/inst/pages/network_learning.qmd
+++ b/inst/pages/network_learning.qmd
@@ -74,7 +74,7 @@ The following list gives a selection of compositionality aware approaches:
     + **Proportionality measures (proportionality aware by definition):**
       - [propr](https://github.com/tpq/propr)
       - [Shrinkage proportionality estimator](https://github.com/MichelleBadri/NormCorr-manuscript/blob/master/code/helpers/norm_functions.R); applied in [@sec-shrinkage-prop]
-        
+
 4.  **Sparsification:** Transforming the estimated associations directly into
 adjacencies would lead to a dense network where all nodes are connected and
 only weighted network measures are meaningful. Therefore, the association
@@ -132,8 +132,8 @@ dim(tse)
 ```
 
 ```{r, echo=FALSE}
-# The networks shown in this chapter take several minutes to generate. 
-# Therefore, the network objects have been stored in a folder and are loaded 
+# The networks shown in this chapter take several minutes to generate.
+# Therefore, the network objects have been stored in a folder and are loaded
 # here to reduce the time needed to knit the book.
 spring_networks_file  <- system.file(
     "extdata", "spring_networks.RData", package = "OMA")
@@ -176,14 +176,14 @@ steps:
 
 ```{r}
 # Agglomerate to genus level
-tse <- agglomerateByRank(tse, rank = "genus") 
+tse <- agglomerateByRank(tse, rank = "genus")
 
 # Add relative abundances
 tse <- transformAssay(
-    tse, 
-    assay.type = "counts", 
+    tse,
+    assay.type = "counts",
     method = "relabundance",
-    MARGIN = "cols") 
+    MARGIN = "cols")
 
 # Filter by prevalence
 tse <- subsetByPrevalent(
@@ -206,7 +206,7 @@ dim(tse)
 As explained in [@sec-network-learning-workflow], we use
 [SPRING](https://github.com/GraceYoon/SPRING) ("Semi-Parametric Rank-based
 approach for INference in Graphical model") as association measure. We first
-use the `SPRING` function directly to construct a conditional dependency graph. 
+use the `SPRING` function directly to construct a conditional dependency graph.
 
 Neither zero replacement nor normalization (steps 1 and 2 in our workflow)
 are required because SPRING uses a modified clr (mclr) transformation that
@@ -236,8 +236,8 @@ library(SPRING)
 ```{r, eval=FALSE}
 set.seed(13075)
 spring_est <- SPRING(
-    t(assay(tse, "counts")), 
-    Rmethod = "approx", 
+    t(assay(tse, "counts")),
+    Rmethod = "approx",
     thresh = 0.05,
     lambdaseq = "data-specific")
 ```
@@ -283,24 +283,24 @@ transform_asso <- function(assoMat, thresh = NULL, dissTrans = "signed") {
   if (!is.null(thresh)) {
       assoMat[abs(assoMat) < thresh] <- 0
   }
-  
+
   # Compute dissimilarity matrix
   if (dissTrans == "signed") {
       dissMat <- sqrt(0.5 * (1 - assoMat))
   } else {
       dissMat <- sqrt(1 - assoMat^2)
   }
-  
+
   # Dissimilarity between nodes with zero correlation is set to 1
   # (these nodes are unconnected and thus should have maximum dissimilarity)
   dissMat[assoMat == 0] <- 1
-  
+
   # Compute similarity matrix
   simMat <- 1 - dissMat
-  
+
   # Turn into igraph object
   graphObj <- SpiecEasi::adj2igraph(simMat)
-  
+
   return(list(graph = graphObj, adja = simMat, asso = assoMat, diss = dissMat))
 }
 ```
@@ -324,7 +324,7 @@ preprocessing, association estimation, sparsification, and transformation.
 The returned `microNet` object can then be passed to `netAnalyze()`
 (the network analysis function) so that all necessary information is available
 for the network analysis workflow.
- 
+
 ```{r, message=FALSE, warning=FALSE}
 library(NetCoMi)
 ```
@@ -344,7 +344,7 @@ netcomi_net <- netConstruct(
     filtTaxPar = list(numbSamp = 0.2),
     measure = "spring",
     measurePar = list(thresh = 0.05, Rmethod = "approx"),
-    sparsMethod = "none", 
+    sparsMethod = "none",
     dissFunc = "signed",
     seed = 13075)
 ```
@@ -418,9 +418,9 @@ set.seed(13075)
 lay_fr <- layout_with_fr(spring_graph)
 
 par(mfrow = c(1,2))
-plot(spring_graph, layout = lay_fr, vertex.size = vsize, 
+plot(spring_graph, layout = lay_fr, vertex.size = vsize,
      vertex.label = NA, main = "SPRING network")
-plot(netcomi_graph, layout = lay_fr, vertex.size = vsize, 
+plot(netcomi_graph, layout = lay_fr, vertex.size = vsize,
      vertex.label = NA, main = "NetCoMi network\n(with SPRING associations)")
 ```
 
@@ -480,7 +480,7 @@ head(vsize_df)
 ```{r plots_centrality, fig.height=10, fig.width=10}
 par(mfrow = c(2,2))
 for (i in seq_along(centr_df)) {
-    plot(spring_graph, layout = lay_fr, vertex.size = vsize_df[, i], 
+    plot(spring_graph, layout = lay_fr, vertex.size = vsize_df[, i],
         vertex.label = NA, main = colnames(centr_df)[i])
 }
 ```
@@ -515,7 +515,7 @@ vsize_df <- get_vsizes(centr_df)
 ```{r plots_centrality_lcc, fig.height=10, fig.width=10}
 par(mfrow = c(2,2))
 for (i in seq_along(centr_df)) {
-    plot(spring_graph, layout = lay_fr, vertex.size = vsize_df[, i], 
+    plot(spring_graph, layout = lay_fr, vertex.size = vsize_df[, i],
         vertex.label = NA, main = colnames(centr_df)[i])
 }
 ```
@@ -550,6 +550,7 @@ df <- data.frame(
 ggplot(data = df, aes(x = Degree, y = Fraction, group = 1)) +
     geom_line() +
     geom_point() +
+    labs(title = "Degree distribution") +
     theme_bw()
 ```
 
@@ -579,14 +580,14 @@ adja_sel <- spring_cor[sel, sel]
 ```{r clustered_heatmap, fig.width=5.8, fig.height=5}
 # Color vector
 col <- colorRamp2(
-    c(-1, -0.5, 0, 0.5, 1), 
+    c(-1, -0.5, 0, 0.5, 1),
     c("royalblue4", "lightblue", "white", "orange", "firebrick3"))
 
 Heatmap(
-    adja_sel, 
-    col = col, 
+    adja_sel,
+    col = col,
     rect_gp = gpar(col = "gray", lwd = 1),
-    show_row_names = FALSE, 
+    show_row_names = FALSE,
     show_column_names = FALSE,
     name = "Association")
 ```
@@ -644,7 +645,7 @@ resulting from Student's t-test in the lower triangle). See `?calcGCM` and
 
 ```{r netAnalyze_single, fig.width=7, fig.height=7, out.width='75%'}
 netcomi_netprops <- netAnalyze(
-    netcomi_net, 
+    netcomi_net,
     clustMethod = "cluster_fast_greedy",
     hubPar = "eigenvector",
     normDeg = FALSE)
@@ -707,17 +708,17 @@ plot(netcomi_netprops,
      labelScale = FALSE,
      nodeSize = "eigenvector",
      nodeSizeSpread = 3,
-     nodeColor = "cluster", 
+     nodeColor = "cluster",
      hubBorderCol = "gray40",
      cexNodes = 1.8,
      edgeTranspHigh = 20,
-     title1 = "Network properties highlighted", 
+     title1 = "Network properties highlighted",
      showTitle = TRUE,
      cexTitle = 2.3,
      mar = c(1, 3, 4, 8))
 
 legend(0.7, 1.1, cex = 1.7, title = "estimated correlation:",
-       legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"), 
+       legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"),
        bty = "n", horiz = TRUE)
 ```
 
@@ -747,28 +748,28 @@ plot(netcomi_netprops,
      shortenLabels = "intelligent",
      labelScale = FALSE,
      nodeSize = "mclr",
-     nodeColor = "feature", 
-     featVecCol = phyla, 
+     nodeColor = "feature",
+     featVecCol = phyla,
      colorVec =  colvec,
      nodeTransp = 20,
      highlightHubs = FALSE,
      cexNodes = 1.8,
      edgeTranspHigh = 20,
-     title1 = "Data features highlighted", 
+     title1 = "Data features highlighted",
      showTitle = TRUE,
      cexTitle = 2.3,
      mar = c(1, 10, 4, 6))
 
 # Add legends
 legend(0.7, 1.1, cex = 1.7, title = "estimated correlation:",
-       legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"), 
+       legend = c("+","-"), lty = 1, lwd = 3, col = c("#009900","red"),
        bty = "n", horiz = TRUE)
 
 # Colors used in the legend should be equally transparent as in the plot
 col_transp <- colToTransp(colvec, 20)
 
-legend(-1.8, 1.1, cex = 1.7, pt.cex = 2.5, title = "Phylum:", 
-       legend=levels(phyla), col = col_transp, bty = "n", pch = 16) 
+legend(-1.8, 1.1, cex = 1.7, pt.cex = 2.5, title = "Phylum:",
+       legend=levels(phyla), col = col_transp, bty = "n", pch = 16)
 ```
 
 **A few things to observe:**
@@ -849,7 +850,7 @@ conditional dependence measures address the high dimensionality of microbiome da
 as an alternative measure of pairwise association for compositional data.
 The idea is that if the relative abundances between two taxa $i$ and $j$ are
 proportional, then their corresponding absolute abundances are also
-proportional: 
+proportional:
 $\frac{\omega_i}{m} \propto \frac{\omega_j}{m} \Rightarrow \omega_i \propto \omega_j$,
 where $m$ is the sum of counts in the sample. It follows that proportionality
 is identical for the observed (relative) read counts and the true unobserved
@@ -866,15 +867,15 @@ even with small sample sizes.
 ## Comparison of association measures
 
 In this section, we provide three additional examples for constructing a
-network using each of the three types of association: 
+network using each of the three types of association:
 
 * Correlation using `SparCC`
 * Partial correlation using `SpiecEasi`
 * Proportionality using the shrinkage proportionality measure
 
 ```{r, echo=FALSE}
-# The networks shown in this chapter take several minutes to generate. 
-# Therefore, the network objects have been stored in a folder and are loaded 
+# The networks shown in this chapter take several minutes to generate.
+# Therefore, the network objects have been stored in a folder and are loaded
 # here to reduce the time needed to knit the book.
 networks_file  <- system.file("extdata", "networks.RData", package = "OMA")
 load(networks_file)
@@ -961,7 +962,7 @@ for Ecological Association and Statistical Inference") approach proposed by
 @Kurtz2015 to estimate a sparse conditional dependency graph. The neighborhood
 selection method ("MB") introduced by @meinshausen2006high is used for network
 learning. The approach is implemented in the R package
-[SpiecEasi](https://github.com/zdk123/SpiecEasi). 
+[SpiecEasi](https://github.com/zdk123/SpiecEasi).
 
 ```{r, message=FALSE, warning=FALSE}
 library(SpiecEasi)
@@ -970,8 +971,8 @@ library(SpiecEasi)
 ```{r, eval=FALSE}
 set.seed(13075)
 se_mb_est <- spiec.easi(
-    t(assay(tse, "counts")), 
-    method = 'mb', nlambda = 20, 
+    t(assay(tse, "counts")),
+    method = 'mb', nlambda = 20,
     pulsar.params = list(rep.num = 20))
 ```
 
@@ -1015,17 +1016,17 @@ set.seed(13075)
 lay_fr <- layout_with_fr(se_mb_graph)
 
 par(mfrow = c(2,2))
-plot(sparcc_graph03, layout = lay_fr, vertex.size = vsize, 
+plot(sparcc_graph03, layout = lay_fr, vertex.size = vsize,
      vertex.label = NA, main = "SparCC (thresh 0.3)")
-plot(sparcc_graph04, layout = lay_fr, vertex.size = vsize, 
+plot(sparcc_graph04, layout = lay_fr, vertex.size = vsize,
      vertex.label = NA, main = "SparCC (thresh 0.4)")
-plot(prop_graph, layout = lay_fr, vertex.size = vsize, 
+plot(prop_graph, layout = lay_fr, vertex.size = vsize,
      vertex.label = NA, main = "Shrinkage proportionality\n(thresh 0.4)")
-plot(se_mb_graph, layout = lay_fr, vertex.size = vsize, 
+plot(se_mb_graph, layout = lay_fr, vertex.size = vsize,
      vertex.label = NA, main = "SpiecEasi (MB)")
 ```
 
-**A few observations:**  
+**A few observations:**
 The density of SparCC (threshold 0.4), proportionality and SpiecEasi is
 comparable, while the SparCC correlation network with threshold 0.3 is much
 denser. However, there are edges in the proportionality and SpiecEasi networks
@@ -1046,7 +1047,7 @@ graph objects we need for the analyses.
 
 ```{r graphlist}
 graphlist <- list(
-    SparCC = sparcc_graph04, 
+    SparCC = sparcc_graph04,
     Proportionality = prop_graph,
     SpiecEasi = se_mb_graph,
     SPRING = spring_graph)
@@ -1075,8 +1076,8 @@ for(i in seq_along(graphlist)) {
 
 # Data frame needed for ggplot2
 df <- data.frame(
-    Degree = rep(seq_len(maxdeg), length(graphlist)), 
-    Fraction = unlist(ddlist), 
+    Degree = rep(seq_len(maxdeg), length(graphlist)),
+    Fraction = unlist(ddlist),
     Method = rep(names(graphlist), each = maxdeg))
 
 ggplot(df, aes(x = Degree, y = Fraction, group = Method)) +
@@ -1136,18 +1137,18 @@ for(i in seq_along(assolist)) {
     } else {
         showlegend <- FALSE
     }
-    
+
     hm_list[[i]] <- Heatmap(
-        assolist[[i]], 
-        col = col, 
+        assolist[[i]],
+        col = col,
         rect_gp = gpar(col = "gray", lwd = 1),
-        show_row_names = FALSE, 
+        show_row_names = FALSE,
         show_column_names = FALSE,
-        column_title = names(assolist)[i], 
+        column_title = names(assolist)[i],
         name = "Association",
         show_heatmap_legend = showlegend) %>%
-      
-        draw() %>% 
+
+        draw() %>%
         grid.grabExpr()
 }
 
diff --git a/inst/pages/quality_control.qmd b/inst/pages/quality_control.qmd
index 229703dc3..f0e582c01 100644
--- a/inst/pages/quality_control.qmd
+++ b/inst/pages/quality_control.qmd
@@ -41,10 +41,10 @@ tse <- transformAssay(tse, MARGIN = "cols", method = "relabundance")
 # assay.type / assay.type / assay.type
 # depending on the mia package version
 plotAbundanceDensity(
-    tse, layout = "jitter", 
+    tse, layout = "jitter",
     assay.type = "relabundance",
-    n = 40, point_size=1, point.shape=19, 
-    point.alpha=0.1) + 
+    n = 40, point_size=1, point.shape=19,
+    point.alpha=0.1) +
     scale_x_log10(label=scales::percent)
 ```
 
@@ -54,9 +54,9 @@ visualized as a density plot over a log-scaled axis, with
 
 ```{r, warning=FALSE, message=FALSE}
 plotAbundanceDensity(
-    tse, layout = "density", 
+    tse, layout = "density",
     assay.type = "relabundance",
-    n = 5, colour.by="nationality", 
+    n = 5, colour.by="nationality",
     point.alpha=1/10) +
     scale_x_log10()
 ```
@@ -73,7 +73,7 @@ which may be _conditionally abundant_ in a small number of samples.
 
 The population prevalence (frequency) at a 1% relative abundance
 threshold (`detection = 1/100` and `as.relative = TRUE`) can look
-like this. 
+like this.
 
 ```{r exploration-prevalence}
 getPrevalence(tse, detection = 1/100, sort = TRUE, as.relative = TRUE) |>
@@ -98,7 +98,7 @@ If the output is to be used for subsetting or storing the data in the
 
 ### Prevalence analysis
 
-To investigate microbiome prevalence at a selected taxonomic level, two 
+To investigate microbiome prevalence at a selected taxonomic level, two
 approaches are available.
 
 First, the data can be agglomerated to the taxonomic level and then
@@ -108,7 +108,7 @@ First, the data can be agglomerated to the taxonomic level and then
 # Agglomerate taxa abundances to Phylum level
 #and add the new table to the altExp slot
 altExp(tse,"Phylum") <- agglomerateByRank(tse, "Phylum")
-# Check prevalence for the Phylum abundance table 
+# Check prevalence for the Phylum abundance table
 #from the altExp slot
 getPrevalence(
     altExp(tse,"Phylum"), detection = 1/100, sort = TRUE,
@@ -149,7 +149,7 @@ prev
 ## Note
 
 The `detection` and `prevalence` thresholds are not the same, as
-`detection` can be applied to relative counts or absolute counts depending on 
+`detection` can be applied to relative counts or absolute counts depending on
 whether `as.relative` is set `TRUE` or `FALSE`
 :::
 
@@ -170,14 +170,14 @@ are stored in the `altExp` slot.
 
 ```{r}
 rowData(altExp(tse,"Phylum"))$prevalence <- getPrevalence(
-    altExp(tse,"Phylum"), detection = 1/100, 
+    altExp(tse,"Phylum"), detection = 1/100,
     sort = FALSE,
     assay.type = "counts", as.relative = TRUE)
 ```
 
 The prevalences can then be plotted using the plotting functions from
 the `scater` package.
- 
+
 ```{r, message=FALSE, warning=FALSE}
 library(scater)
 plotRowData(altExp(tse,"Phylum"), "prevalence", colour_by = "Phylum")
@@ -191,9 +191,9 @@ tse <- agglomerateByRanks(tse)
 altExps(tse) <- lapply(
     altExps(tse), function(y){
         rowData(y)$prevalence <- getPrevalence(
-            y, detection = 1/100, 
+            y, detection = 1/100,
             sort = FALSE,
-            assay.type = "counts", 
+            assay.type = "counts",
             as.relative = TRUE)
         return(y)
     })
@@ -211,7 +211,7 @@ top_phyla_mean <- getTop(
 x <- unsplitByRanks(tse, ranks = taxonomyRanks(tse)[1:6])
 x <- addHierarchyTree(x)
 ```
- 
+
 After some preparation, the data are assembled and can be plotted with
 `plotRowTree`.
 
@@ -241,22 +241,22 @@ microbiome data summaries.
 ```{r, message=FALSE}
 library(mia)
 data("GlobalPatterns", package="mia")
-tse <- GlobalPatterns 
+tse <- GlobalPatterns
 ```
 
-### Top taxa  
+### Top taxa
 
-The function `getTop` identifies top taxa in the data.   
+The function `getTop` identifies top taxa in the data.
 
 ```{r top-feature-taxo}
 # Pick the top taxa
 top_features <- getTop(tse, method="median", top=10)
 
-# Check the information for these 
+# Check the information for these
 rowData(tse)[top_features, taxonomyRanks(tse)][1:5, 1:3]
 ```
 
-### Library size / read count  
+### Library size / read count
 
 The total counts per sample can be calculated using
 `perCellQCMetrics`/`addPerCellQC` from the `scater` package. The former one
@@ -279,36 +279,36 @@ library(ggplot2)
 p1 <- ggplot(colData(tse)) +
         geom_histogram(
             aes(x = sum), color = "black", fill = "gray", bins = 30) +
-        labs(x = "Library size", y = "Frequency (n)") + 
+        labs(x = "Library size", y = "Frequency (n)") +
         # scale_x_log10(
-            # breaks = scales::trans_breaks("log10", function(x) 10^x), 
+            # breaks = scales::trans_breaks("log10", function(x) 10^x),
             # labels = scales::trans_format("log10", scales::math_format(10^.x))) +
         theme_bw() +
         # Removes the grid
-        theme(panel.grid.major = element_blank(), 
+        theme(panel.grid.major = element_blank(),
             panel.grid.minor = element_blank(),
             panel.border = element_blank(),
             panel.background = element_blank(),
             # Adds y-axis
-            axis.line = element_line(colour = "black")) 
+            axis.line = element_line(colour = "black"))
 
 library(dplyr)
 df <- as.data.frame(colData(tse)) %>%
     arrange(sum) %>%
     mutate(index = 1:n())
 p2 <- ggplot(df, aes(y = index, x = sum/1e6)) +
-    geom_point() +	
+    geom_point() +
     labs(
-        x = "Library size (million reads)", 
-        y = "Sample index") +	
+        x = "Library size (million reads)",
+        y = "Sample index") +
     theme_bw() +
     # Removes the grid
-    theme(panel.grid.major = element_blank(), 
+    theme(panel.grid.major = element_blank(),
     panel.grid.minor = element_blank(),
     panel.border = element_blank(),
     panel.background = element_blank(),
     # Adds y-axis
-    axis.line = element_line(colour = "black")) 
+    axis.line = element_line(colour = "black"))
 
 library(patchwork)
 p1 + p2
@@ -318,28 +318,28 @@ Library sizes and other variables from `colData` can be visualized using a
 specified function called `plotColData`.
 
 ```{r plot-viz-lib-size-2, fig.width=8, fig.height=4, fig.cap="Library sizes per sample."}
-# Sort samples by read count, 
-# order the factor levels, 
+# Sort samples by read count,
+# order the factor levels,
 # and store back to tse as DataFrame
-# TODO: plotColData could include an option 
+# TODO: plotColData could include an option
 # for sorting samples based on colData variables
 colData(tse) <- as.data.frame(colData(tse)) %>%
     arrange(X.SampleID) %>%
     mutate(X.SampleID = factor(X.SampleID, levels=X.SampleID)) %>%
         DataFrame
-plotColData(tse,"sum","X.SampleID", colour_by = "SampleType") + 
+plotColData(tse,"sum","X.SampleID", colour_by = "SampleType") +
     theme(axis.text.x = element_text(angle = 45, hjust=1)) +
-    labs(y = "Library size (N)", x = "Sample ID") 	    
+    labs(y = "Library size (N)", x = "Sample ID")
 ```
 
 ```{r plot-viz-lib-size-3, fig.width=8, fig.height=4, fig.cap="Library sizes per sample type."}
-plotColData(tse,"sum","SampleType", colour_by = "SampleType") + 
+plotColData(tse,"sum","SampleType", colour_by = "SampleType") +
     theme(axis.text.x = element_text(angle = 45, hjust=1))
 ```
 
 In addition, data can be rarefied with
 [rarefyAssay](https://microbiome.github.io/mia/reference/rarefyAssay.html),
-which normalizes the samples to an equal number of reads. 
+which normalizes the samples to an equal number of reads.
 This remains controversial, however, and strategies to mitigate the information
 loss in rarefaction have been proposed [@Schloss2024rarefaction1]
 [@Schloss2024rarefaction2].
@@ -352,9 +352,9 @@ Samples might be contaminated with exogenous sequences. The impact of
 each contaminant can be estimated based on its frequencies and
 concentrations across the samples.
 
-The following 
+The following
 [decontam functions](https://microbiome.github.io/mia/reference/isContaminant.html)
 are based on [@davis2018simple] and support such functionality:
 
-* `isContaminant`, `isNotContaminant`  
-* `addContaminantQC`, `addNotContaminantQC`  
+* `isContaminant`, `isNotContaminant`
+* `addContaminantQC`, `addNotContaminantQC`
diff --git a/inst/pages/resources.qmd b/inst/pages/resources.qmd
index 735e8d142..30e573c7f 100644
--- a/inst/pages/resources.qmd
+++ b/inst/pages/resources.qmd
@@ -4,34 +4,34 @@
 
 ### Data container documentation
 
-* SingleCellExperiment [@R_SingleCellExperiment]  
+* SingleCellExperiment [@R_SingleCellExperiment]
 
-   + [Online tutorial](https://bioconductor.org/packages/release/bioc/vignettes/SingleCellExperiment/inst/doc/intro.html)  
-   + [Project page](https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html)  
-   + [Publication](https://doi.org/10.1038/s41592-019-0654-x)  
+   + [Online tutorial](https://bioconductor.org/packages/release/bioc/vignettes/SingleCellExperiment/inst/doc/intro.html)
+   + [Project page](https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html)
+   + [Publication](https://doi.org/10.1038/s41592-019-0654-x)
 
-* SummarizedExperiment [@R_SummarizedExperiment]  
+* SummarizedExperiment [@R_SummarizedExperiment]
 
-   + [Online tutorial](https://bioconductor.org/packages/release/bioc/vignettes/SummarizedExperiment/inst/doc/SummarizedExperiment.html)  
-   + [Project page](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html)  
+   + [Online tutorial](https://bioconductor.org/packages/release/bioc/vignettes/SummarizedExperiment/inst/doc/SummarizedExperiment.html)
+   + [Project page](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html)
 
-* TreeSummarizedExperiment [@R_TreeSummarizedExperiment]  
+* TreeSummarizedExperiment [@R_TreeSummarizedExperiment]
 
-   + [Online tutorial](https://bioconductor.org/packages/release/bioc/vignettes/TreeSummarizedExperiment/inst/doc/Introduction_to_treeSummarizedExperiment.html)  
-   + [Project page](https://www.bioconductor.org/packages/release/bioc/html/TreeSummarizedExperiment.html)  
-   + [Publication](https://doi.org/10.12688/f1000research.26669.2)  
+   + [Online tutorial](https://bioconductor.org/packages/release/bioc/vignettes/TreeSummarizedExperiment/inst/doc/Introduction_to_treeSummarizedExperiment.html)
+   + [Project page](https://www.bioconductor.org/packages/release/bioc/html/TreeSummarizedExperiment.html)
+   + [Publication](https://doi.org/10.12688/f1000research.26669.2)
 
-* MultiAssayExperiment [@Ramos2017]  
+* MultiAssayExperiment [@Ramos2017]
 
-   + [Online tutorial](https://www.bioconductor.org/packages/release/bioc/vignettes/MultiAssayExperiment/inst/doc/MultiAssayExperiment.html)  
-   + [Project page](https://bioconductor.org/packages/release/bioc/html/MultiAssayExperiment.html)     
-   + [Publication](https://doi.org/10.1158/0008-5472.CAN-17-0344)  
+   + [Online tutorial](https://www.bioconductor.org/packages/release/bioc/vignettes/MultiAssayExperiment/inst/doc/MultiAssayExperiment.html)
+   + [Project page](https://bioconductor.org/packages/release/bioc/html/MultiAssayExperiment.html)
+   + [Publication](https://doi.org/10.1158/0008-5472.CAN-17-0344)
 
 ### Other relevant containers
 
-* [DataFrame](https://rdrr.io/bioc/S4Vectors/man/DataFrame-class.html) which behaves similarly to `data.frame`, yet efficient and fast when used with large datasets.  
-* [DNAString](https://rdrr.io/bioc/Biostrings/man/DNAString-class.html) along with `DNAStringSet`,`RNAString` and `RNAStringSet`  efficient storage and handling of long biological sequences are offered within the Biostrings package [@R_Biostrings].    
-* GenomicRanges ([@GenomicRanges2013]) offers an efficient representation and manipulation of genomic annotations and alignments, see e.g. `GRanges` and `GRangesList` at [An Introduction to the GenomicRangesPackage](https://bioconductor.org/packages/release/bioc/vignettes/GenomicRanges/inst/doc/GenomicRangesIntroduction.html).    
+* [DataFrame](https://rdrr.io/bioc/S4Vectors/man/DataFrame-class.html) which behaves similarly to `data.frame`, yet efficient and fast when used with large datasets.
+* [DNAString](https://rdrr.io/bioc/Biostrings/man/DNAString-class.html) along with `DNAStringSet`,`RNAString` and `RNAStringSet`  efficient storage and handling of long biological sequences are offered within the Biostrings package [@R_Biostrings].
+* GenomicRanges ([@GenomicRanges2013]) offers an efficient representation and manipulation of genomic annotations and alignments, see e.g. `GRanges` and `GRangesList` at [An Introduction to the GenomicRangesPackage](https://bioconductor.org/packages/release/bioc/vignettes/GenomicRanges/inst/doc/GenomicRangesIntroduction.html).
 
 [NGS Analysis Basics](http://girke.bioinformatics.ucr.edu/GEN242/tutorials/rsequences/rsequences/) provides a walk-through of the above-mentioned features with detailed examples.
 
@@ -88,9 +88,9 @@ of the datasets increased. Further details on such results can be found in the
 The phyloseq container provides analogous methods to TreeSE. The following
 material can be used to familiarize with such alternative methods:
 
- * [List of R tools for microbiome analysis](https://microsud.github.io/Tools-Microbiome-Analysis/)  
- * phyloseq [@McMurdie2013]  
- * [microbiome tutorial](http://microbiome.github.io/tutorials/)  
+ * [List of R tools for microbiome analysis](https://microsud.github.io/Tools-Microbiome-Analysis/)
+ * phyloseq [@McMurdie2013]
+ * [microbiome tutorial](http://microbiome.github.io/tutorials/)
  * [microbiomeutilities](https://microsud.github.io/microbiomeutilities/)
  * [phyloseq/TreeSE cheatsheet](https://microbiome.github.io/OMA/docs/devel/pages/97_extra_materials.html#cheatsheet)
  * Bioconductor Workflow for Microbiome Data Analysis: from raw reads to community analyses [@Callahan2016].
@@ -104,10 +104,10 @@ for a kickstart to R programming. Further support resources are
 available through the Bioconductor
 project [@Huber2015].
 
- * [Base R and RStudio cheatsheets](https://geomoer.github.io/moer-base-r/cheatsheet.html)  
- * [Package-specific cheatsheets](https://www.rstudio.com/resources/cheatsheets/)  
- * [Visualization with ggplot2](https://ggplot2.tidyverse.org/)   
- * [R graphics cookbook](http://www.cookbook-r.com/Graphs/)  
+ * [Base R and RStudio cheatsheets](https://geomoer.github.io/moer-base-r/cheatsheet.html)
+ * [Package-specific cheatsheets](https://www.rstudio.com/resources/cheatsheets/)
+ * [Visualization with ggplot2](https://ggplot2.tidyverse.org/)
+ * [R graphics cookbook](http://www.cookbook-r.com/Graphs/)
 
 
 ### Bioconductor Classes {#sec-bioc_intro}
@@ -119,17 +119,17 @@ project [@Huber2015].
 S4 class system has brought several useful features to the
 object-oriented programming paradigm within R, and it is constantly
 deployed in R/Bioconductor packages [@Huber2015].
-    
+
 |   Online Document:
 
-* Hervé Pagès, [A quick overview of the S4 class system](https://bioconductor.org/packages/release/bioc/vignettes/S4Vectors/inst/doc/S4QuickOverview.pdf).  
-* Laurent Gatto, [A practical tutorial on S4 programming](https://bioconductor.org/help/course-materials/2013/CSAMA2013/friday/afternoon/S4-tutorial.pdf)  
-* How S4 Methods Work [@Chambers2006]  
+* Hervé Pagès, [A quick overview of the S4 class system](https://bioconductor.org/packages/release/bioc/vignettes/S4Vectors/inst/doc/S4QuickOverview.pdf).
+* Laurent Gatto, [A practical tutorial on S4 programming](https://bioconductor.org/help/course-materials/2013/CSAMA2013/friday/afternoon/S4-tutorial.pdf)
+* How S4 Methods Work [@Chambers2006]
 
 |   Books:
 
-* John M. Chambers. Software for Data Analysis: Programming with R. Springer, New York, 2008. ISBN-13 978-0387759357 [@Chambers2008]  
-* I Robert Gentleman. R Programming for Bioinformatics. Chapman & Hall/CRC, New York, 2008. ISBN-13 978-1420063677 [@gentleman2008r]  
+* John M. Chambers. Software for Data Analysis: Programming with R. Springer, New York, 2008. ISBN-13 978-0387759357 [@Chambers2008]
+* I Robert Gentleman. R Programming for Bioinformatics. Chapman & Hall/CRC, New York, 2008. ISBN-13 978-1420063677 [@gentleman2008r]
 
 ## Reproducible reporting with Quarto {#sec-quarto}
 
@@ -159,21 +159,21 @@ to interactive reproducible reporting.
 Being able to use Quarto in R partly relies on your previous knowledge of
 Rmarkdown. The following resources can help you get familiar with Rmarkdown:
 
- * [Online tutorial](https://www.markdowntutorial.com/)  
- * [Cheatsheet](https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf)  
- * [Documentation](https://rmarkdown.rstudio.com/lesson-1.html)  
- * [Dr. C Titus Brown's tutorial](https://rpubs.com/marschmi/RMarkdown)  
+ * [Online tutorial](https://www.markdowntutorial.com/)
+ * [Cheatsheet](https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf)
+ * [Documentation](https://rmarkdown.rstudio.com/lesson-1.html)
+ * [Dr. C Titus Brown's tutorial](https://rpubs.com/marschmi/RMarkdown)
 
-**Figure sources:** 
+**Figure sources:**
 
 **Original article**
 
-- Huang R _et al_. (2021) TreeSummarizedExperiment: a S4 class 
+- Huang R _et al_. (2021) TreeSummarizedExperiment: a S4 class
 for data with hierarchical structure. F1000Research 9:1246. [@Huang2021]
 
 **Reference Sequence slot extension**
 
-- Lahti L _et al_. (2020) [Upgrading the R/Bioconductor ecosystem for microbiome 
+- Lahti L _et al_. (2020) [Upgrading the R/Bioconductor ecosystem for microbiome
 research](https://doi.org/10.7490/
 f1000research.1118447.1) F1000Research 9:1464 (slides).
 
diff --git a/inst/pages/session_info.qmd b/inst/pages/session_info.qmd
index 8479dfc5d..342946188 100644
--- a/inst/pages/session_info.qmd
+++ b/inst/pages/session_info.qmd
@@ -2,15 +2,15 @@
 
 ## Docker image {#sec-docker-image}
 
-A `Docker` image built from this repository is available here: 
+A `Docker` image built from this repository is available here:
 
 👉 [ghcr.io/microbiome/oma](https://ghcr.io/microbiome/oma) 🐳
 
 ::: {.callout-tip icon='true'}
 ## Get started now 🎉
 
-You can get access to all the packages used in this book in < 1 minute, 
-using this command in a terminal: 
+You can get access to all the packages used in this book in < 1 minute,
+using this command in a terminal:
 
 ```{sh "docker", filename="bash"}
 #| eval: false
@@ -21,7 +21,7 @@ docker run -it ghcr.io/microbiome/oma:devel R
 
 ## RStudio Server {-}
 
-An RStudio Server instance can be initiated from the `Docker` image as follows: 
+An RStudio Server instance can be initiated from the `Docker` image as follows:
 
 ```{sh "rstudio", filename="bash"}
 #| eval: false
@@ -32,7 +32,7 @@ docker run \
     ghcr.io/microbiome/oma:devel
 ```
 
-The initiated RStudio Server instance will be available at 
+The initiated RStudio Server instance will be available at
 [https://localhost:8787](https://localhost:8787).
 
 ## Session info {#sec-session-info}
@@ -44,7 +44,7 @@ The initiated RStudio Server instance will be available at
 ```{r "session info"}
 #| cache: false
 sessioninfo::session_info(
-    installed.packages()[,"Package"], 
+    installed.packages()[,"Package"],
     include_base = TRUE
 )
 ```
diff --git a/inst/pages/taxonomy.qmd b/inst/pages/taxonomy.qmd
index 54c8903ed..010fd4830 100644
--- a/inst/pages/taxonomy.qmd
+++ b/inst/pages/taxonomy.qmd
@@ -19,7 +19,7 @@ and annotation data used.
 Therefore, the mia package expects a loose assembly of taxonomic information
 and assumes certain key aspects:
 
-* Taxonomic information is given as character vectors or factors in the 
+* Taxonomic information is given as character vectors or factors in the
 `rowData` of a `SummarizedExperiment` object.
 * The columns containing the taxonomic information must be named `domain`,
 `kingdom`, `phylum`, `class`, `order`, `family`, `genus`, `species` or with
@@ -33,8 +33,8 @@ clusters of taxa that co-vary across samples.
 ## Assigning taxonomic information
 
 There are a number of methods to assign taxonomic information. We like to give
-a short introduction about the methods available without ranking one over the 
-other. This has to be your choice based on the result for the individual 
+a short introduction about the methods available without ranking one over the
+other. This has to be your choice based on the result for the individual
 dataset.
 
 ### DADA2
@@ -61,7 +61,7 @@ found on the [DECIPHER website](http://www2.decipher.codes/Classification.html).
 checkTaxonomy(tse)
 ```
 
-Since the `rowData` can contain other data, `taxonomyRanks` will return the 
+Since the `rowData` can contain other data, `taxonomyRanks` will return the
 columns `mia` assumes to contain the taxonomic information.
 
 ```{r}
@@ -74,7 +74,7 @@ This can then be used to subset the `rowData` to columns needed.
 rowData(tse)[, taxonomyRanks(tse)]
 ```
 
-`taxonomyRankEmpty` checks for empty values in the given `rank` and returns a 
+`taxonomyRankEmpty` checks for empty values in the given `rank` and returns a
 logical vector of `length(x)`.
 
 ```{r}
diff --git a/inst/pages/training.qmd b/inst/pages/training.qmd
index fd85f2e00..702fdfb47 100644
--- a/inst/pages/training.qmd
+++ b/inst/pages/training.qmd
@@ -22,7 +22,7 @@ We recommend installing and setting up the relevant software packages on
 your own computer as this will support later use. The essential
 components to install include:
 
-- [R (the latest official release)](https://www.r-project.org/) 
+- [R (the latest official release)](https://www.r-project.org/)
 
 - [RStudio](https://www.rstudio.com/products/rstudio/download/);
 choose "Rstudio Desktop" to download the latest version. Check the
@@ -39,7 +39,7 @@ further examples from this tutorial, modifying and applying these
 techniques to your own data. Plain source code for the individual chapters
 of this book are available via
 [Github](https://github.com/microbiome/OMA/tree/master/R)
-  
+
 - If you have access to CSC notebook you can find instructions from
 [here](https://microbiome.github.io/outreach/).
 
@@ -49,8 +49,8 @@ We encourage you to familiarize yourself with the material and test examples
 in advance but this is optional:
 
 - [Introduction to data analysis with R and Bioconductor](https://carpentries-incubator.github.io/bioc-intro/) (for beginners with R)
- 
-- [Short online videos](https://www.youtube.com/playlist?list=PLjiXAZO27elAJEptP59BN3whVJ61XIkST) on microbiome data science with R/Bioconductor  
+
+- [Short online videos](https://www.youtube.com/playlist?list=PLjiXAZO27elAJEptP59BN3whVJ61XIkST) on microbiome data science with R/Bioconductor
 
 - [Quarto presentations](https://microbiome.github.io/outreach/index.html)
 
diff --git a/inst/pages/transformation.qmd b/inst/pages/transformation.qmd
index c56232980..bd0b83cd1 100644
--- a/inst/pages/transformation.qmd
+++ b/inst/pages/transformation.qmd
@@ -44,7 +44,7 @@ references.
    high-throughput assays (16S, metagenomic sequencing) is
    compositional by nature, even if the data is provided as counts
    [@Gloor2017].
- 
+
  * 'clr' Centered log ratio transformation [@Aitchison1986] is used to
    reduce data skewness and compositionality bias in relative
    abundances, while bringing the data to the logarithmic scale. This
@@ -69,14 +69,14 @@ references.
    Karwowska2024].
 
  * 'standardize' Standardize(or 'z-score') transformation scales data to zero
-   mean and unit variance. This is used to bring features (or samples) to more 
+   mean and unit variance. This is used to bring features (or samples) to more
    comparable levels in terms of mean and scale of the values. This can enhance
    visualization and interpretation of the data
 
 * 'log', 'log2', 'log10' Logarithmic transformations, used e.g. to
    reduce data skewness. With compositional data, the `clr` (or `rclr`)
    transformation is often preferred.
-   
+
  * 'hellinger' Hellinger transformation is equal to the square root of
    relative abundances. This ecological transformation can be useful
    if we are interested in changes in relative abundances.
@@ -102,16 +102,16 @@ available in the function
 
 ::: {.callout-important}
 
-Pseudocount is a small non-negative value added to the normalized data to avoid 
-taking the logarithm of zero. It's value can have a significant impact on the results when applying 
-a logarithm transformation to normalized data, as the logarithm transformation 
+Pseudocount is a small non-negative value added to the normalized data to avoid
+taking the logarithm of zero. It's value can have a significant impact on the results when applying
+a logarithm transformation to normalized data, as the logarithm transformation
 is a nonlinear operation that can fundamentally change the data distribution [@Costea2014].
 
 
-Pseudocount should be chosen consistently across all normalization methods being 
-compared, for example, by setting it to a value smaller than the minimum abundance 
-value before transformation. Some tools, like ancombc2, take into account the effect 
-of the pseudocount by performing sensitivity tests using multiple pseudocount 
+Pseudocount should be chosen consistently across all normalization methods being
+compared, for example, by setting it to a value smaller than the minimum abundance
+value before transformation. Some tools, like ancombc2, take into account the effect
+of the pseudocount by performing sensitivity tests using multiple pseudocount
 values. See [@sec-differential-abundance].
 
 :::
@@ -125,12 +125,12 @@ data("GlobalPatterns", package = "mia")
 tse <- GlobalPatterns
 
 # Transform "counts" assay to relative abundances ("relabundance"), with
-# pseudocount 1 
+# pseudocount 1
 tse <- transformAssay(
      tse, assay.type = "counts", method = "relabundance", pseudocount = 1)
 
 # Transform relative abundance assay ("relabundance") to "clr", using
-# pseudocount if necessary; name the resulting assay to "clr" 
+# pseudocount if necessary; name the resulting assay to "clr"
 tse <- transformAssay(
     x = tse, assay.type = "relabundance", method = "clr", pseudocount = TRUE,
     name = "clr")
@@ -177,4 +177,3 @@ You can find more information on normalization from
 [OSCA book](https://bioconductor.org/books/3.18/OSCA.basic/normalization.html#normalization-transformation).
 
 :::
-
diff --git a/inst/pages/visualization.qmd b/inst/pages/visualization.qmd
index 25abe54f9..a7a057f9d 100644
--- a/inst/pages/visualization.qmd
+++ b/inst/pages/visualization.qmd
@@ -306,14 +306,14 @@ plots <- plotAbundance(
     order.col.by = "Clostridiales",
     features = "SampleType")
 
-# Modify the legend of the first plot to be smaller 
+# Modify the legend of the first plot to be smaller
 plots[[1]] <- plots[[1]] +
     theme(
         legend.key.size = unit(0.3, 'cm'),
         legend.text = element_text(size = 6),
         legend.title = element_text(size = 8))
 
-# Modify the legend of the second plot to be smaller 
+# Modify the legend of the second plot to be smaller
 plots[[2]] <- plots[[2]] +
     theme(
         legend.key.height = unit(0.3, 'cm'),
@@ -324,17 +324,17 @@ plots[[2]] <- plots[[2]] +
 
 # Load required packages
 library(ggpubr)
-library(patchwork) 
+library(patchwork)
 # Combine legends
 legend <- wrap_plots(
     as_ggplot(get_legend(plots[[1]])),
     as_ggplot(get_legend(plots[[2]])),
-    ncol = 1) 
+    ncol = 1)
 
 # Removes legends from the plots
 plots[[1]] <- plots[[1]] + theme(legend.position = "none")
 plots[[2]] <- plots[[2]] +
-    theme(legend.position = "none", axis.title.x=element_blank()) 
+    theme(legend.position = "none", axis.title.x=element_blank())
 
 # Combine plots
 plot <- wrap_plots(plots[[2]], plots[[1]], ncol = 1, heights = c(2, 10))
@@ -371,7 +371,7 @@ tse_phylum_subset <- transformAssay(
     assay.type = "counts", pseudocount = 1)
 # Does standardize-transformation
 tse_phylum_subset <- transformAssay(
-    tse_phylum_subset, assay.type = "clr", MARGIN = "rows", 
+    tse_phylum_subset, assay.type = "clr", MARGIN = "rows",
     method = "standardize", name = "clr_z")
 
 # Get n most abundant taxa, and subsets the data by them
@@ -397,7 +397,7 @@ taxa_hclust <- hclust(dist(mat), method = "complete")
 taxa_tree <- as.phylo(taxa_hclust)
 
 # Plot taxa tree
-taxa_tree <- ggtree(taxa_tree) + 
+taxa_tree <- ggtree(taxa_tree) +
   theme(plot.margin=margin(0,0,0,0)) # removes margins
 
 # Get order of taxa in plot
@@ -407,7 +407,7 @@ taxa_ordered <- get_taxa_name(taxa_tree)
 # taxa_tree
 ```
 
-Based on phylo tree, we decide to create three clusters. 
+Based on phylo tree, we decide to create three clusters.
 
 ```{r}
 #| label = "pheatmap3"
@@ -419,7 +419,7 @@ taxa_clusters <- data.frame(clusters = taxa_clusters)
 taxa_clusters$clusters <- factor(taxa_clusters$clusters)
 
 # Order data so that it's same as in phylo tree
-taxa_clusters <- taxa_clusters[taxa_ordered, , drop = FALSE] 
+taxa_clusters <- taxa_clusters[taxa_ordered, , drop = FALSE]
 
 # Prints taxa and their clusters
 taxa_clusters
@@ -449,7 +449,7 @@ sample_hclust <- hclust(dist(t(mat)), method = "complete")
 sample_tree <- as.phylo(sample_hclust)
 
 # Plot sample tree
-sample_tree <- ggtree(sample_tree) + layout_dendrogram() + 
+sample_tree <- ggtree(sample_tree) + layout_dendrogram() +
   theme(plot.margin=margin(0,0,0,0)) # removes margins
 
 # Get order of samples in plot
@@ -465,9 +465,9 @@ sample_clusters <- factor(cutree(tree = sample_hclust, k = 3))
 sample_data <- data.frame(clusters = sample_clusters)
 
 # Order data so that it's same as in phylo tree
-sample_data <- sample_data[samples_ordered, , drop = FALSE] 
+sample_data <- sample_data[samples_ordered, , drop = FALSE]
 
-# Order data based on 
+# Order data based on
 tse_phylum_subset <- tse_phylum_subset[ , rownames(sample_data)]
 
 # Add sample type data
@@ -482,12 +482,12 @@ Now we can create heatmap with additional annotations.
 #| label = "pheatmap6"
 # Determines the scaling of colors
 # Scale colors
-breaks <- seq(-ceiling(max(abs(mat))), ceiling(max(abs(mat))), 
+breaks <- seq(-ceiling(max(abs(mat))), ceiling(max(abs(mat))),
               length.out = ifelse( max(abs(mat))>5, 2*ceiling(max(abs(mat))), 10 ) )
 colors <- colorRampPalette(c("darkblue", "blue", "white", "red", "darkred"))(length(breaks)-1)
 
 pheatmap(
-    mat, annotation_row = taxa_clusters, 
+    mat, annotation_row = taxa_clusters,
     annotation_col = sample_data,
     breaks = breaks,
     color = colors)
@@ -505,11 +505,11 @@ metadata(tse_phylum_subset)$anno_colors$SampleType <- c(
 
 # Create a plot
 sechm(
-    tse_phylum_subset, 
-    features = rownames(tse_phylum_subset), 
-    assayName = "clr", 
-    do.scale = TRUE, 
-    top_annotation = c("SampleType"), 
+    tse_phylum_subset,
+    features = rownames(tse_phylum_subset),
+    assayName = "clr",
+    do.scale = TRUE,
+    top_annotation = c("SampleType"),
     gaps_at = "SampleType",
     cluster_cols = TRUE, cluster_rows = TRUE)
 ```
@@ -525,7 +525,7 @@ taxa_clusters$Feature <- rownames(taxa_clusters)
 taxa_clusters$Feature <- factor(taxa_clusters$Feature, levels = taxa_clusters$Feature)
 
 # Create annotation plot
-row_annotation <- ggplot(taxa_clusters) + 
+row_annotation <- ggplot(taxa_clusters) +
     geom_tile(aes(x = NA, y = Feature, fill = clusters)) +
     coord_equal(ratio = 1) +
     theme(
@@ -584,7 +584,7 @@ limits <- c(-maxval, maxval)
 breaks <- seq(from = min(limits), to = max(limits), by = 0.5)
 colours <- c("darkblue", "blue", "white", "red", "darkred")
 
-heatmap <- ggplot(melted_mat) + 
+heatmap <- ggplot(melted_mat) +
     geom_tile(aes(x = Sample, y = Taxa, fill = clr_z)) +
     theme(
         axis.title.y = element_blank(),
@@ -595,10 +595,10 @@ heatmap <- ggplot(melted_mat) +
         legend.key.height = unit(1, 'cm')
     ) +
     scale_fill_gradientn(
-        name = "CLR + Z transform", 
-        breaks = breaks, 
-        limits = limits, 
-        colours = colours) + 
+        name = "CLR + Z transform",
+        breaks = breaks,
+        limits = limits,
+        colours = colours) +
     scale_y_discrete(position = "right")
 
 heatmap
@@ -629,16 +629,16 @@ plot <- row_annotation +
     plot_layout(
         design = design, guides = "collect",
         # Specify layout, collect legends
-        
+
         # Adjust widths and heights to align plots.
         # When annotation plot is larger, it might not fit into
         # its column/row.
         # Then you need to make column/row larger.
-        
+
         # Relative widths and heights of each column and row:
         # Currently, the width of the first column is 15 % and the height of
         # first two rows are 30 % the size of others
-        
+
         # To get this work most of the times, you can adjust all sizes to be
         # 1, i.e. equal, but then the gaps between plots are larger.
         widths = c(0.15, 1, 1),
@@ -666,9 +666,9 @@ design <- c(
 # plot(design)
 
 # Combine plots
-plot <- taxa_tree + 
+plot <- taxa_tree +
     row_annotation +
-    sample_tree + 
+    sample_tree +
     sample_clusters_annotation +
     sample_types_annotation +
     heatmap +
@@ -723,7 +723,7 @@ tse_Genus <- runMDS(
     method = "bray",
     ncomponents = 3, # calculates three principal coordinates
     assay.type = "relative_abundance", # calculates Bray-Curtis from relative abundace
-    name = "MDS_bray") #name of the PCoA within the tse 
+    name = "MDS_bray") #name of the PCoA within the tse
 
 e <- attr(reducedDim(tse_Genus, "MDS_bray"), "eig")
 rel_eig <- e / sum(e[e > 0])
diff --git a/inst/pages/wrangling.qmd b/inst/pages/wrangling.qmd
index 1f5fdd560..d92f27878 100644
--- a/inst/pages/wrangling.qmd
+++ b/inst/pages/wrangling.qmd
@@ -13,7 +13,7 @@ SummarizedExperiment objects. See [@sec-containers] for basics of `TreeSE`.
 ## Subsetting {#sec-treese_subsetting}
 
 **Subsetting** data helps to draw the focus of an analysis on particular
-sets of samples and / or features. When dealing with large datasets, 
+sets of samples and / or features. When dealing with large datasets,
 the subset of interest can be extracted and investigated
 separately. This might improve performance and reduce the
 computational load.
@@ -42,7 +42,7 @@ the number of rows to decrease.
 data("GlobalPatterns", package="mia")
 tse <- GlobalPatterns
 # Show dimensions (features x samples)
-dim(tse) 
+dim(tse)
 ```
 
 ### Subset by sample (column-wise)
@@ -52,7 +52,7 @@ samples of human origin (feces, skin or tongue), stored as `SampleType` within
 `colData(tse)` as well as in `tse`.
 
 First, we would like to see all the possible values that `SampleType` can have
-and how frequent those are: 
+and how frequent those are:
 
 ```{r}
 # Inspect possible values for SampleType
@@ -67,7 +67,7 @@ tse$SampleType %>% table()
 ```{r echo = FALSE}
 # Show the frequency of each value
 tse$SampleType %>% table() %>% kable() %>%
-    kableExtra::kable_styling("striped", latex_options="scale_down") %>% 
+    kableExtra::kable_styling("striped", latex_options="scale_down") %>%
     kableExtra::scroll_box(width = "100%")
 ```
 
@@ -116,7 +116,7 @@ possible need for agglomeration.
 
 As previously, we would first like to see all the possible values that `Phylum`
 can have and how frequent those are:
-  
+
 ```{r}
 # Inspect possible values for phylum
 # and show the first five values
@@ -126,9 +126,9 @@ cat(paste(unique(rowData(tse)$Phylum)[1:5], collapse = "\n"))
 
 ```{r}
 # Show the frequency of each value
-rowData(tse)$Phylum %>% table() %>% head() %>% 
+rowData(tse)$Phylum %>% table() %>% head() %>%
   kable() %>%
-  kableExtra::kable_styling("striped", latex_options="scale_down") %>% 
+  kableExtra::kable_styling("striped", latex_options="scale_down") %>%
   kableExtra::scroll_box(width = "100%")
 
 ```
@@ -184,7 +184,7 @@ features of phyla Actinobacteria or Chlamydiae.
 # Subset by sample and feature and remove NAs
 tse_sub <- tse[
   rowData(tse)$Phylum %in% c("Actinobacteria", "Chlamydiae") &
-      !is.na(rowData(tse)$Phylum), 
+      !is.na(rowData(tse)$Phylum),
   tse$SampleType %in% c("Feces", "Skin", "Tongue")]
 
 # Show dimensions
@@ -211,12 +211,12 @@ the samples. This can occur, for example, after data subsetting. In certain
 analyses, we might want to remove those instances.
 
 ```{r}
-# Agglomerate data at Genus level 
+# Agglomerate data at Genus level
 tse_genus <- agglomerateByRank(tse, rank = "Genus", na.rm = FALSE)
 # List bacteria that we want to include
 genera <- c(
-    "Class:Thermoprotei", 
-    "Genus:Sulfolobus", 
+    "Class:Thermoprotei",
+    "Genus:Sulfolobus",
     "Genus:Sediminicola")
 # Subset data
 tse_sub <- tse_genus[genera, ]
@@ -255,7 +255,7 @@ sum(rowSums(assay(tse_sub, "counts")) == 0)
 ```
 
 We can see that there are bacteria that are not present in the samples we chose.
-We can remove those bacteria from the data. 
+We can remove those bacteria from the data.
 
 ```{r}
 # Take only those bacteria that are present
@@ -267,10 +267,10 @@ tse_sub
 
 ## Splitting
 
-You can split the data based on variables by using the functions 
+You can split the data based on variables by using the functions
 `agglomerateByRanks` and `splitOn`.
 
-`agglomerateByRanks` splits the data based on taxonomic ranks. Since the 
+`agglomerateByRanks` splits the data based on taxonomic ranks. Since the
 elements of the output list share columns, they can be stored into `altExp` by
 setting `as.list = FALSE`.
 
@@ -278,7 +278,7 @@ setting `as.list = FALSE`.
 agglomerateByRanks(tse, as.list = TRUE)
 ```
 
-If you want to split the data based on another variable than taxonomic rank, use 
+If you want to split the data based on another variable than taxonomic rank, use
 `splitOn`. It works for row-wise and column-wise splitting.
 
 ```{r splitOn}
@@ -315,8 +315,8 @@ head(tse$NewVariable)
 ## Melting data
 
 For several custom analysis and visualization packages, such as those from
-`tidyverse`, the `SE` data can be converted to a long data.frame format with 
-`meltSE`.    
+`tidyverse`, the `SE` data can be converted to a long data.frame format with
+`meltSE`.
 
 ```{r}
 library(mia)
@@ -337,9 +337,9 @@ molten_tse[, 1:5]
 
 `mia` package has `mergeSEs` function that merges multiple
 `SummarizedExperiment` objects. For example, it is possible to combine
-multiple `TreeSE` objects which each includes one sample. 
+multiple `TreeSE` objects which each includes one sample.
 
-`mergeSEs` works much like standard joining operations. It combines rows and 
+`mergeSEs` works much like standard joining operations. It combines rows and
 columns and allows you to specify the merging method.
 
 ```{r merge1}