This repository has been archived by the owner on Mar 4, 2020. It is now read-only.

68 lines (46 loc) · 2.32 KB

rna_expression.md

megSAP - RNA Expression Analysis

Basics

Single-sample RNA expression analysis is performed using the analyze_rna.php script. To see all available parameters, have a look at the help:

php megSAP/src/Pipelines/analyze_rna.php --help

The main parameters that you have to provide are:

  • folder - The output folder containing all result files.
  • name - Basename/prefix for all output files.

In addition, you may want to specify:

  • steps - Analysis steps to perform. Use ma,rc,an,fu to perform mapping, read counting, annotation and fusion detection.
  • system - The processing system INI file.

RNA-seq Expression Pipeline gives a detailed description of the pipeline. The structured output facilitates downstream analysis; for a primer, see Downstream Analysis.

Running an analysis

If all data to analyze resides in a sample folder as produced by Illumina's bcl2fastq tool, the whole analysis can be run with a single command, for example:

php megSAP/src/Pipelines/analyze_rna.php \
  -folder Sample_X_01 -name X_01 \
  -system truseq.ini -steps ma,rc,an

In the example above, the configuration of the pipeline is done using the truseq.ini file, which contains all necessary information (see processing system INI file).
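To process several sample folders in one go, the single-sample command above can be wrapped in a loop. The following is only a minimal sketch, assuming that all folders in the current directory follow the Sample_<name> pattern from the example and that truseq.ini fits every sample. DRY_RUN, sample_name and run are helpers of this sketch, not part of megSAP.

```shell
#!/bin/sh
# Sketch: call analyze_rna.php once per Sample_* folder in the current directory.
# DRY_RUN is a variable of this sketch (not a megSAP option):
# 1 = only print the commands, 0 = execute them.
DRY_RUN="${DRY_RUN:-1}"

# Derive the -name value from a folder name, e.g. "Sample_X_01/" -> "X_01".
sample_name() {
  name="${1%/}"            # drop a trailing slash, if any
  echo "${name#Sample_}"   # drop the "Sample_" prefix
}

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

for folder in Sample_*/; do
  [ -d "$folder" ] || continue   # nothing matched the pattern
  run php megSAP/src/Pipelines/analyze_rna.php \
    -folder "${folder%/}" -name "$(sample_name "$folder")" \
    -system truseq.ini -steps ma,rc,an
done
```

Running the script as is only prints the commands, so they can be checked first; setting DRY_RUN=0 executes them one sample after another.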

Output

After the analysis, these files are created in the output folder:

  1. mapped reads in BAM format
  2. raw read counts, in featureCounts tabular output format
  3. normalized read counts, annotated with gene symbols
  4. QC data in qcML format, which can be opened with a web browser

Using other genomes

To use the RNA expression pipeline with other genomes, you need to provide:

  • the genome FASTA file, e.g. megSAP/data/genomes/CustomGenome.fa
  • the STAR genome index, e.g. megSAP/data/genomes/STAR/CustomGenome/
  • the gene annotation file in Ensembl-like GTF format, e.g. megSAP/data/dbs/gene_annotations/CustomGenome.gtf

The genome can be specified in the processing system INI file.
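If no STAR index exists yet for the custom genome, it can be built from the FASTA and GTF files with STAR's genomeGenerate mode. The following is a sketch under assumptions, not the megSAP procedure: the paths are the example paths from the list above, and the overhang (100) and thread count (4) are placeholder values that should be adapted to the read length and machine at hand.

```shell
#!/bin/sh
# Sketch: build a STAR index for a custom genome.
# Paths follow the example paths listed above; adjust GENOME as needed.
GENOME=CustomGenome
FASTA="megSAP/data/genomes/${GENOME}.fa"
INDEX_DIR="megSAP/data/genomes/STAR/${GENOME}/"
GTF="megSAP/data/dbs/gene_annotations/${GENOME}.gtf"

build_star_index() {
  mkdir -p "$INDEX_DIR"
  # --sjdbOverhang is usually set to (read length - 1); 100 is a placeholder.
  STAR --runMode genomeGenerate \
    --genomeDir "$INDEX_DIR" \
    --genomeFastaFiles "$FASTA" \
    --sjdbGTFfile "$GTF" \
    --sjdbOverhang 100 \
    --runThreadN 4
}

# Only build the index if STAR is installed and both input files exist.
if command -v STAR >/dev/null 2>&1 && [ -f "$FASTA" ] && [ -f "$GTF" ]; then
  build_star_index
else
  echo "STAR or input files not found, index not built." >&2
fi
```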

back to the start page