This repo contains a tutorial for quantifying the expression of transposable elements (TEs) from bulk RNA-seq experiments.
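download the TE annotation GTF and decompress it, since the featureCounts command below reads the uncompressed .gtf file
wget https://labshare.cshl.edu/shares/mhammelllab/www-data/TEtranscripts/TE_GTF/GRCh38_GENCODE_rmsk_TE.gtf.gz
gunzip GRCh38_GENCODE_rmsk_TE.gtf.gz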
use the featureCounts function from the Subread package to assign all reads, both uniquely mapped and multimapped, to the features in the reduced GTF repeat annotation file
module load subread
featureCounts -T 16 -F GTF -a GRCh38_GENCODE_rmsk_TE.gtf -t exon -g gene_id --extraAttributes transcript_id,family_id,class_id -O -M --fraction --primary -s 0 -p -o te_counts.txt *.bam
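Because -M, -O and --fraction are combined, the values in te_counts.txt are generally not integers. As a rough illustration of the rule given in the featureCounts help text (each alignment carries a count of 1/(x*y), where x is the number of alignments reported for the read and y is the number of features that alignment overlaps), the sketch below computes that weight for a few cases. The function name fraction_weight is made up for this example and is not part of Subread.

```shell
# Fractional count carried by one alignment under -M -O --fraction.
# $1 = x, alignments reported for the read (NH tag)
# $2 = y, features overlapped by this alignment
# fraction_weight is a hypothetical helper, not a Subread command.
fraction_weight() {
  awk -v x="$1" -v y="$2" 'BEGIN { printf "%.4f\n", 1 / (x * y) }'
}

fraction_weight 1 1   # uniquely mapped read, one feature   -> 1.0000
fraction_weight 4 1   # read with 4 alignments, one feature -> 0.2500
fraction_weight 4 2   # same read, alignment hits 2 features -> 0.1250
```

Keep this in mind if the counts are later passed to a tool that expects integer counts.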
awk -F'\t' 'BEGIN{OFS=FS}{$2=$3=$4=$5=$6=$7=""; gsub("[\t]+","\t")}1' te_counts.txt > output_reduced.txt
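To see what the awk step keeps, here is a minimal sketch that runs the same awk program on a two-line toy table. The toy values are invented for illustration, and the column order (extra attribute columns placed right after Length) is an assumption about the featureCounts output layout; check the header of your own te_counts.txt before relying on it. The helper names toy_counts and reduce_counts are hypothetical.

```shell
# Toy stand-in for te_counts.txt (made-up values, one TE, one sample).
# Assumed columns: Geneid, Chr, Start, End, Strand, Length,
# transcript_id, family_id, class_id, then one count column per BAM.
toy_counts() {
  printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    Geneid Chr Start End Strand Length transcript_id family_id class_id sample1.bam \
    AluY chr1 100 400 + 301 AluY_dup1 Alu SINE 12.5
}

# Same awk program as in the tutorial: blank columns 2-7,
# then collapse the resulting runs of tabs into single tabs.
reduce_counts() {
  awk -F'\t' 'BEGIN{OFS=FS}{$2=$3=$4=$5=$6=$7=""; gsub("[\t]+","\t")}1'
}

toy_counts | reduce_counts
```

On this toy input the result has four tab-separated columns (Geneid, family_id, class_id, sample1.bam), so under the assumed layout the transcript_id column is dropped along with the coordinate and length columns.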
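visualize the differential expression results as a volcano plot in R
library(ggplot2)
# ddx_P1 is assumed to be an edgeR topTags() result whose table holds logFC and FDR columns
deg <- as.data.frame(ddx_P1[["table"]])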
deg$sig <- "No Change"
# label a TE as significant only when |logFC| > 0.1 and FDR < 0.05
deg$sig[abs(deg$logFC) > 0.1 & deg$FDR < 0.05] <- "Significant"
ggplot(deg, aes(x = logFC, y = -log10(FDR), col = sig)) +
  geom_point() +
  theme_classic()