Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added post - gene set test #23

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,213 @@
---
title: "Test and Visualise Gene Enrichment with GSEA over MSigDB"
author: 'Stefano Mangiola '
date: '2021-07-12'
output:
html_document:
df_print: paged
categories: Case study
tags:
- tidytranscriptomics
- gene enrichment
- GSEA
- MSigDB
- differential expression
lastmod: '2021-07-12T12:29:10+10:00'
keywords: []
description: ''
comment: no
toc: no
autoCollapseToc: no
postMetaInFooter: no
hiddenFromHomePage: no
contentCopyright: no
reward: no
mathjax: no
mathjaxEnableSingleDollar: no
mathjaxEnableAutoNumber: no
hideHeaderAndFooter: no
flowchartDiagrams:
enable: no
options: ''
sequenceDiagrams:
enable: no
options: ''
slug: []
---

```{r message=FALSE, warning=FALSE}
# dataset
library(airway)

# tidyverse core packages
library(tibble)
library(dplyr)
library(tidyr)
library(readr)
library(stringr)
library(ggplot2)
library(purrr)

# tidyverse-friendly packages
library(plotly)
library(ggrepel)
library(GGally)
library(tidybulk)
library(tidySummarizedExperiment) # we'll load this below to show what it can do
library(enrichplot)
library(patchwork)
```

tidySummarizedExperiment provides a bridge between Bioconductor [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) [@morgan2020summarized] and the tidyverse [@wickham2019welcome]. It enables viewing the
Bioconductor *SummarizedExperiment* object as a tidyverse tibble, and provides SummarizedExperiment-compatible *dplyr*, *tidyr*, *ggplot* and *plotly* functions. This allows users to get the best of both Bioconductor and tidyverse worlds.


Here we will demonstrate performing a bulk RNA sequencing analysis using *tidySummarizedExperiment* and *tidybulk*. We will use data from the *airway* package, which comes from the paper by [@himes2014rna]. It includes 8 samples from human airway smooth muscle cells, from 4 cell lines. For each cell line treated (with dexamethasone) and untreated (negative control) a sample has undergone RNA sequencing and gene counts have been generated.


```{r}
# load airway RNA sequencing data
data(airway)
# take a look
airway
```

## Full pipeline

In one modular step it is possible to go from raw counts to enriched pathways

```{r}
airway %>%

# Annotate
mutate(entrez = AnnotationDbi::mapIds(org.Hs.eg.db::org.Hs.eg.db,
keys = feature,
keytype = "ENSEMBL",
column = "ENTREZID",
multiVals = "first"
)) %>%

# Filter
keep_abundant(factor_of_interest = dex) %>%

# Test differential gene transcript abundance
test_differential_abundance(
~ dex + cell,
method="edger_robust_likelihood_ratio",
test_above_log2_fold_change = 1,
) %>%

# Test gene enrichment
filter(PValue %>% is.na %>% `!`) %>%
test_gene_rank(
.entrez = entrez,
.arrange_desc = logFC ,
species="Homo sapiens",
gene_sets = c("H", "C2", "C5")
)
```

## Step-by-step

We'll set up the airway data for our RNA sequencing analysis. We'll create a column with shorter sample names and a column with entrez ID. We can get the entrez ID for these Ensembl gene ids using the Bioconductor annotation package for human, `org.Hs.eg.db`.


```{r}
# setup data workflow
counts <-
airway %>%
mutate(entrez = AnnotationDbi::mapIds(org.Hs.eg.db::org.Hs.eg.db,
keys = feature,
keytype = "ENSEMBL",
column = "ENTREZID",
multiVals = "first"
))
# take a look
counts
```

We filter out lowly expressed genes using tidybulk `keep_abundant` or `identify_abundant`. These functions can use the *edgeR* `filterByExpr` function described in [@law2016rna] to automatically identify the genes with adequate abundance for differential expression testing.


```{r}
# Filtering counts
counts_abundant <- counts %>%
keep_abundant(factor_of_interest = dex)

# take a look
counts_abundant
```

*tidybulk* integrates several popular methods for differential transcript abundance testing: the edgeR quasi-likelihood [@chen2016reads] (tidybulk default method), edgeR likelihood ratio [@mccarthy2012differential], limma-voom [@law2014voom] and DESeq2 [@love2014moderated]. A common question researchers have is which method to choose. With tidybulk, we can easily run multiple methods and compare.

We give `test_differential_abundance` our tidybulk counts object and a formula, specifying the column that contains our groups to be compared. If all our samples were from the same cell line, and there were no additional factors contributing variance, such as batch differences, we could use the formula `~ dex`. However, each treated and untreated sample is from a different cell line, so we add the cell line as an additional factor `~ dex + cell`.

```{r message=FALSE}
de_all <-
counts_abundant %>%
test_differential_abundance(
~ dex + cell,
method="edger_robust_likelihood_ratio",
test_above_log2_fold_change = 1,
)

```

Execute `GSEA` using `MSigDB`, `clusterProfiler` and `enrichplot`

```{r}

de_all_gene_rank =
de_all %>%
filter(PValue %>% is.na %>% `!`) %>%
test_gene_rank(
.entrez = entrez,
.arrange_desc = logFC ,
species="Homo sapiens",
gene_sets = c("H", "C2", "C5")
)
```

Examine significantly enriched gene sets

```{r}
de_all_gene_rank %>%
filter(gs_cat == "C2" ) %>%
dplyr::select(-fit) %>%
unnest(test) %>%
filter(p.adjust < 0.05)

```

Visualise enrichment

```{r}
de_all_gene_rank_plots =
de_all_gene_rank %>%
unnest(test) %>%

# Select top 10
slice(1:10) %>%
mutate(plot = pmap(
list(fit, ID, idx_for_plotting, p.adjust),
~ enrichplot::gseaplot2(
..1,
geneSetID = ..3,
title = sprintf("%s \nadj pvalue %s", ..2, round(..4, 2)),
base_size = 6, rel_heights = c(1.5, 0.5), subplots = c(1, 2)
)
))

# Visualise the first plot
de_all_gene_rank_plots %>%
pull(plot) %>%
wrap_plots()

```

List all citations used in this analysis
```{r}
get_bibliography(de_all_gene_rank)

```

2 changes: 1 addition & 1 deletion blog/layouts/shortcodes/blogdown/postref.html
Original file line number Diff line number Diff line change
@@ -1 +1 @@
{{ .Page.Permalink }}
{{ if eq (getenv "BLOGDOWN_POST_RELREF") "true" }}{{ .Page.RelPermalink }}{{ else }}{{ .Page.Permalink }}{{ end }}
9 changes: 4 additions & 5 deletions blog/public/404.html
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@



<meta name="generator" content="Hugo 0.80.0 with theme even" />
<meta name="generator" content="Hugo 0.85.0 with theme even" />


<link rel="canonical" href="https://stemangiola.github.io/tidytranscriptomics/404.html" />
Expand All @@ -35,7 +35,7 @@



<link href="/tidytranscriptomics/sass/main.min.21eb984d56dca3b1630720eb15d66f4f5c668318df572a68582d3e5d6f0540fa.css" rel="stylesheet">
<link href="/tidytranscriptomics/sass/main.min.7431b79066cbc03d3c6828a31c3df6dcfa144fc83335aaf63956af9650280d8b.css" rel="stylesheet">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@fancyapps/[email protected]/dist/jquery.fancybox.min.css" integrity="sha256-7TyXnr2YU040zfSP+rEcz29ggW4j56/ujTPwjMzyqFY=" crossorigin="anonymous">


Expand All @@ -45,8 +45,7 @@
<meta property="og:url" content="https://stemangiola.github.io/tidytranscriptomics/404.html" />

<meta itemprop="name" content="404 Page not found">
<meta itemprop="description" content="The blog about tidy transcriptomics.">
<meta name="twitter:card" content="summary"/>
<meta itemprop="description" content="The blog about tidy transcriptomics."><meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="404 Page not found"/>
<meta name="twitter:description" content="The blog about tidy transcriptomics."/>

Expand Down Expand Up @@ -182,7 +181,7 @@ <h1 class="error-emoji"></h1>



<script type="text/javascript" src="/tidytranscriptomics/js/main.min.c12618f9a600c40bd024996677e951e64d3487006775aeb22e200c990006c5c7.js"></script>
<script type="text/javascript" src="/tidytranscriptomics/js/main.min.c99b103c33d1539acf3025e1913697534542c4a5aa5af0ccc20475ed2863603b.js"></script>



Expand Down
18 changes: 7 additions & 11 deletions blog/public/about/index.html
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@



<meta name="generator" content="Hugo 0.80.0 with theme even" />
<meta name="generator" content="Hugo 0.85.0 with theme even" />


<link rel="canonical" href="https://stemangiola.github.io/tidytranscriptomics/about/" />
Expand All @@ -37,7 +37,7 @@



<link href="/tidytranscriptomics/sass/main.min.21eb984d56dca3b1630720eb15d66f4f5c668318df572a68582d3e5d6f0540fa.css" rel="stylesheet">
<link href="/tidytranscriptomics/sass/main.min.7431b79066cbc03d3c6828a31c3df6dcfa144fc83335aaf63956af9650280d8b.css" rel="stylesheet">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@fancyapps/[email protected]/dist/jquery.fancybox.min.css" integrity="sha256-7TyXnr2YU040zfSP+rEcz29ggW4j56/ujTPwjMzyqFY=" crossorigin="anonymous">


Expand All @@ -46,21 +46,17 @@
It makes use of a variety of open source projects including:
Cobra Viper J Walter Weatherman Cast Learn more and contribute on GitHub." />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://stemangiola.github.io/tidytranscriptomics/about/" />
<meta property="og:url" content="https://stemangiola.github.io/tidytranscriptomics/about/" /><meta property="article:section" content="" />
<meta property="article:published_time" content="2017-08-20T21:38:52+08:00" />
<meta property="article:modified_time" content="2017-08-28T21:41:52+08:00" />

<meta itemprop="name" content="About">
<meta itemprop="description" content="Hugo is a static site engine written in Go.
It makes use of a variety of open source projects including:
Cobra Viper J Walter Weatherman Cast Learn more and contribute on GitHub.">
<meta itemprop="datePublished" content="2017-08-20T21:38:52+08:00" />
Cobra Viper J Walter Weatherman Cast Learn more and contribute on GitHub."><meta itemprop="datePublished" content="2017-08-20T21:38:52+08:00" />
<meta itemprop="dateModified" content="2017-08-28T21:41:52+08:00" />
<meta itemprop="wordCount" content="32">



<meta itemprop="keywords" content="" />
<meta name="twitter:card" content="summary"/>
<meta itemprop="keywords" content="" /><meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="About"/>
<meta name="twitter:description" content="Hugo is a static site engine written in Go.
It makes use of a variety of open source projects including:
Expand Down Expand Up @@ -202,7 +198,7 @@



<script type="text/javascript" src="/tidytranscriptomics/js/main.min.c12618f9a600c40bd024996677e951e64d3487006775aeb22e200c990006c5c7.js"></script>
<script type="text/javascript" src="/tidytranscriptomics/js/main.min.c99b103c33d1539acf3025e1913697534542c4a5aa5af0ccc20475ed2863603b.js"></script>



Expand Down
9 changes: 4 additions & 5 deletions blog/public/categories/index.html
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@



<meta name="generator" content="Hugo 0.80.0 with theme even" />
<meta name="generator" content="Hugo 0.85.0 with theme even" />


<link rel="canonical" href="https://stemangiola.github.io/tidytranscriptomics/categories/" />
Expand All @@ -37,7 +37,7 @@



<link href="/tidytranscriptomics/sass/main.min.21eb984d56dca3b1630720eb15d66f4f5c668318df572a68582d3e5d6f0540fa.css" rel="stylesheet">
<link href="/tidytranscriptomics/sass/main.min.7431b79066cbc03d3c6828a31c3df6dcfa144fc83335aaf63956af9650280d8b.css" rel="stylesheet">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@fancyapps/[email protected]/dist/jquery.fancybox.min.css" integrity="sha256-7TyXnr2YU040zfSP+rEcz29ggW4j56/ujTPwjMzyqFY=" crossorigin="anonymous">


Expand All @@ -47,8 +47,7 @@
<meta property="og:url" content="https://stemangiola.github.io/tidytranscriptomics/categories/" />

<meta itemprop="name" content="Categories">
<meta itemprop="description" content="The blog about tidy transcriptomics.">
<meta name="twitter:card" content="summary"/>
<meta itemprop="description" content="The blog about tidy transcriptomics."><meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Categories"/>
<meta name="twitter:description" content="The blog about tidy transcriptomics."/>

Expand Down Expand Up @@ -177,7 +176,7 @@



<script type="text/javascript" src="/tidytranscriptomics/js/main.min.c12618f9a600c40bd024996677e951e64d3487006775aeb22e200c990006c5c7.js"></script>
<script type="text/javascript" src="/tidytranscriptomics/js/main.min.c99b103c33d1539acf3025e1913697534542c4a5aa5af0ccc20475ed2863603b.js"></script>



Expand Down
11 changes: 5 additions & 6 deletions blog/public/index.html
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@



<meta name="generator" content="Hugo 0.80.0 with theme even" />
<meta name="generator" content="Hugo 0.85.0 with theme even" />


<link rel="canonical" href="https://stemangiola.github.io/tidytranscriptomics/" />
Expand All @@ -37,18 +37,17 @@



<link href="/tidytranscriptomics/sass/main.min.21eb984d56dca3b1630720eb15d66f4f5c668318df572a68582d3e5d6f0540fa.css" rel="stylesheet">
<link href="/tidytranscriptomics/sass/main.min.7431b79066cbc03d3c6828a31c3df6dcfa144fc83335aaf63956af9650280d8b.css" rel="stylesheet">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@fancyapps/[email protected]/dist/jquery.fancybox.min.css" integrity="sha256-7TyXnr2YU040zfSP+rEcz29ggW4j56/ujTPwjMzyqFY=" crossorigin="anonymous">


<meta property="og:title" content="Tidy transcriptomics" />
<meta property="og:description" content="The blog about tidy transcriptomics." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://stemangiola.github.io/tidytranscriptomics/" />
<meta property="og:updated_time" content="2021-02-18T15:30:19+11:00" />

<meta itemprop="name" content="Tidy transcriptomics">
<meta itemprop="description" content="The blog about tidy transcriptomics.">
<meta name="twitter:card" content="summary"/>
<meta itemprop="description" content="The blog about tidy transcriptomics."><meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Tidy transcriptomics"/>
<meta name="twitter:description" content="The blog about tidy transcriptomics."/>

Expand Down Expand Up @@ -202,7 +201,7 @@ <h1 class="post-title"><a class="post-link" href="/tidytranscriptomics/post/2021



<script type="text/javascript" src="/tidytranscriptomics/js/main.min.c12618f9a600c40bd024996677e951e64d3487006775aeb22e200c990006c5c7.js"></script>
<script type="text/javascript" src="/tidytranscriptomics/js/main.min.c99b103c33d1539acf3025e1913697534542c4a5aa5af0ccc20475ed2863603b.js"></script>



Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading