Merge pull request #206 from AmandaKwok28/main
EC1
Showing 4 changed files with 124 additions and 10 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
--- | ||
layout: post | ||
title: "Dimensionality Reduction using GGanimate" | ||
author: Amanda Kwok | ||
jhed: akwok1 | ||
categories: [ HW EC1 ] | ||
image: homework/hwEC1/akwok1.gif | ||
featured: false | ||
--- | ||
|
||
### Description of figure | ||
In the GIF, I visualize the PCA (linear) dimensionality reduction in 2D space transitioning into the tSNE (nonlinear) dimensionality reduction in 2D space. The PCA axes are PC1 and PC2, since those principal components capture the most variance in the data. The two kinds of dimensionality reduction can suggest drastically different things about the same dataset. The tSNE embedding suggests several clusters (cell types or bundles of related genes), whereas the PCA shows 2 distinct groups; the PCA alone does not tell you whether those 2 groups contain further subgroups. The benefit of PCA is that it is a linear projection, so relative positions along PC1 and PC2 reflect the spatial relationships in the original data, which makes the plot more straightforward to interpret. tSNE, on the other hand, does not necessarily preserve spatial relationships: it aims to minimize the KL divergence between the pairwise similarity distributions in the original and embedded spaces, which can lead to incorrectly interpreted results. | ||
|
||
|
||
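The claim that PC1 and PC2 capture the most variance can be checked directly from the `prcomp` output. The following is a minimal sketch in base R, assuming a synthetic matrix (`mat`, hypothetical) in place of the normalized expression data; the names `pcs_demo` and `var_explained` are illustrative and not part of the original code.

```r
# Synthetic stand-in for a normalized expression matrix (hypothetical data):
# 200 observations, 10 features, with the first two features given extra spread.
set.seed(1)
mat <- matrix(rnorm(200 * 10), nrow = 200)
mat[, 1] <- mat[, 1] * 5
mat[, 2] <- mat[, 2] * 3

pcs_demo <- prcomp(mat)

# The variance of each PC is the square of its standard deviation
var_explained <- pcs_demo$sdev^2 / sum(pcs_demo$sdev^2)

# Cumulative fraction of total variance retained by the first two PCs
round(cumsum(var_explained)[1:2], 3)
```

Applied to the real data, the same calculation on `pcs$sdev` would show how much of the total variance the PC1 vs. PC2 plot actually retains, which helps judge how much structure the 2D view may be hiding.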
### Code: | ||
# gganimate EC | ||
|
||
# clear everything | ||
rm(list = ls()) | ||
|
||
# load data | ||
data <- read.csv(paste('C:/Users/Amand/OneDrive/Documents/JHU-undergrad', | ||
'/Junior Year/Sem2/Genomic Data/Amanda-K/data/pikachu.csv.gz', sep = ''), | ||
row.names = 1) | ||
|
||
# load relevant Libraries | ||
library(ggplot2) | ||
library(gganimate) | ||
library(Rtsne) | ||
library(patchwork) | ||
|
||
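# separate the position columns (4-5) from the gene expression columns (6 onward) | ||
pos <- data[,4:5] | ||
gexp <- data[,6:ncol(data)] | ||
|
||
gexp[1:5,1:5] | ||
# normalize: scale each row to the mean total count, then log10-transform with a pseudocount of 1 | ||
gexpnorm <- log10(gexp/rowSums(gexp) * mean(rowSums(gexp))+1) | ||
hist(rowSums(gexpnorm)) | ||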
|
||
# linear dimensionality reduction (pca) | ||
pcs <- prcomp(gexpnorm) | ||
df <- data.frame(pcs$x) | ||
p1 <- ggplot(df) + geom_point(aes(x=PC1, y=PC2)) | ||
|
||
p1 | ||
|
||
# nonlinear dimensionality reduction (tSNE) | ||
set.seed(123) | ||
emb <- Rtsne(gexpnorm)$Y | ||
df2 <- data.frame(emb) | ||
p2 <- ggplot(df2) + geom_point(aes(x=X1, y=X2)) | ||
|
||
p2 | ||
|
||
|
||
# use gganimate | ||
pc12 <- df[,1:2] | ||
colnames(pc12) <- c('Dim1', 'Dim2') | ||
tsne <- df2 | ||
colnames(tsne) <- c('Dim1', 'Dim2') | ||
|
||
df3 <- rbind(cbind(pc12, order="PCA"), | ||
cbind(tsne, order="tSNE")) | ||
df3$order <- factor(df3$order) | ||
|
||
# Create animated plot | ||
animated_plot <- ggplot(df3) + | ||
geom_point(aes(Dim1,Dim2)) + | ||
labs(title = "Linear vs. NonLinear \n Dimensionality Reduction") + | ||
transition_states( | ||
order, | ||
transition_length = 2, | ||
state_length = 1 | ||
) + | ||
view_follow() + | ||
theme(plot.title = element_text(hjust=0.5)) # center the title | ||
|
||
|
||
animated_plot | ||
anim_save("EC1.gif", animated_plot, path = 'C:/Users/Amand/OneDrive/Documents/JHU-undergrad/Junior Year/Sem2/Genomic Data/EC') | ||
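The normalization step in the code above divides each row by its total count, rescales to the mean total, and then applies `log10` with a pseudocount. As a hedged illustration of what that scaling does, here is a small base R sketch on a made-up count matrix (`counts`, hypothetical); it is not the original dataset.

```r
# Hypothetical count matrix: 5 rows with very different total counts
set.seed(2)
counts <- matrix(
  rpois(5 * 8, lambda = rep(c(5, 10, 20, 50, 100), times = 8)),
  nrow = 5
)

totals <- rowSums(counts)

# Same scaling as in the post: divide by row total, multiply by mean total
scaled <- counts / totals * mean(totals)

# After scaling, every row sums to the mean total count
rowSums(scaled)

# The log transform with a pseudocount of 1 keeps zeros at zero
lognorm <- log10(scaled + 1)
range(lognorm)
```

Because every row ends up with the same total after scaling, differences that remain between rows reflect relative expression rather than overall count depth.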
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|