CommandLineMode

Nathan Salomonis edited this page Dec 14, 2021 · 16 revisions

Running AltAnalyze from Command Line Interface

In addition to the graphical user interface (GUI), AltAnalyze can be run from the command line. This includes jobs run locally or on a remote Linux server or cluster. To do so, the user needs to know the file paths of the directories containing the input files and of the output directory, and must already have created the files containing the groups and comparisons for all samples analyzed.

Creating Groups and Comparison Files - The groups and comparison files must be created before running an analysis, which is straightforward. Just follow the directions listed here. This step can also be automated if the input files follow a defined naming structure.
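The automated route mentioned above can be sketched in Python. This is a minimal sketch under stated assumptions, not part of AltAnalyze: it assumes sample files are named with a group prefix followed by an underscore (a hypothetical naming scheme), that the groups file is tab-delimited with sample name, group number and group name, and that the comparisons file lists pairs of group numbers. Check both layouts against the linked directions before use. The function names are illustrative only.

```python
import csv
import io
import re


def build_groups_and_comps(sample_files, pattern=r"^(?P<group>[A-Za-z0-9]+)_"):
    """Infer groups from a filename prefix (hypothetical naming scheme)."""
    group_numbers = {}  # group name -> 1-based number, in order of first appearance
    group_rows = []
    for name in sample_files:
        match = re.match(pattern, name)
        if match is None:
            raise ValueError(f"cannot infer a group from {name!r}")
        group = match.group("group")
        number = group_numbers.setdefault(group, len(group_numbers) + 1)
        group_rows.append((name, number, group))
    # Assumption: the first group seen is the control; compare every other group to it.
    comp_rows = [(n, 1) for n in sorted(group_numbers.values()) if n != 1]
    return group_rows, comp_rows


def to_tsv(rows):
    """Render rows as tab-delimited text, one row per line."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


groups, comps = build_groups_and_comps(["WT_rep1.bed", "WT_rep2.bed", "KO_rep1.bed"])
print(to_tsv(groups))
print(to_tsv(comps))
```

The two strings can then be written to groups.YourExperiment.txt and comps.YourExperiment.txt in the input directory.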

Running Command Line from Source or Compiled Versions - The command-line can be run from the source code or OS-specific binaries. The binaries are recommended since these already contain graphical, statistical and webservice dependencies that need to be separately installed for the source code (see more information here).

Depending on how AltAnalyze was installed, call the corresponding binary or script directly:

  • Windows OS: AltAnalyze.exe
  • Mac OS X: AltAnalyze.app/Contents/MacOS/AltAnalyze
  • PyPI (pip) installed: altanalyze or AltAnalyze
  • Python source code: python AltAnalyze.py

Example Options

Downloading and installing a species-specific database (mouse)

python AltAnalyze.py --species Mm --update Official --version EnsMart72
  --additional all
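When commands like the one above are launched from a script, it can help to build the argument list programmatically so the same options work with any of the launchers listed earlier. The following is a minimal sketch, not an AltAnalyze API: the `LAUNCHERS` table and `build_command` helper are hypothetical names, and only the launchers and flags already shown on this page are used.

```python
import shlex

# Launcher prefixes copied from the list of installation types above.
LAUNCHERS = {
    "windows": ["AltAnalyze.exe"],
    "mac": ["AltAnalyze.app/Contents/MacOS/AltAnalyze"],
    "pip": ["altanalyze"],
    "source": ["python", "AltAnalyze.py"],
}


def build_command(install, options):
    """Return an argv list: launcher prefix followed by --flag value pairs."""
    argv = list(LAUNCHERS[install])
    for flag, value in options.items():
        argv += [f"--{flag}", str(value)]
    return argv


command = build_command(
    "source",
    {"species": "Mm", "update": "Official", "version": "EnsMart72", "additional": "all"},
)
print(shlex.join(command))
# To execute it, pass the list to subprocess.run(command, check=True).
```

Passing a list rather than a single string avoids shell-quoting problems with paths or gene lists that contain spaces.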

Analyzing RNA-Seq files – FASTQ file directory using ICGS Population Discovery

python AltAnalyze.py --runICGS yes --platform "RNASeq" --species Mm 
  --column_method hopach --rho 0.2 --ExpressionCutoff 1 --FoldDiff 4 
  --SamplesDiffering 4 --excludeCellCycle conservative --output "C:/FASTQ_Files/" 
  --expname "Mm_HSCs" --fastq_dir "C:/FASTQ_Files/" --runKallisto yes

Analyzing RNA-Seq files – FASTQ file directory using known groups

python AltAnalyze.py --platform "RNASeq" --species Mm 
  --column_method hopach --rho 0.2 --ExpressionCutoff 1 --FoldDiff 4 
  --SamplesDiffering 4 --excludeCellCycle conservative --output "C:/FASTQ_Files/" 
  --expname "HSCs" --fastq_dir "C:/FASTQ_Files/" --runKallisto yes
  --groupdir "C:/FASTQ_Files/groups.HSCs.txt" 
  --compdir "C:/FASTQ_Files/comps.HSCs.txt" --GEelitefold 1.5 
  --GEelitepval 0.05 --GEeliteptype "adjp"

Analyzing RNA-Seq files – BAM files using default options and GO-Elite

python AltAnalyze.py --species Hs --platform RNASeq --bedDir "C:/BAMFiles"
  --groupdir "C:/BAMFiles/groups.YourExperiment.txt"
  --compdir "C:/BAMFiles/comps.YourExperiment.txt" --output "C:/BAMFiles"
  --expname "YourExperiment" --runGOElite yes --GEelitefold 1.5 
  --GEelitepval 0.05 --GEeliteptype "adjp"
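In the examples above, the groups and comparison files sit in the input directory and are named groups.<expname>.txt and comps.<expname>.txt. A small helper can derive these paths from the experiment name so they stay consistent. This is a sketch that only mirrors the naming pattern seen in these examples. The `experiment_paths` function is hypothetical, and whether AltAnalyze requires this exact naming is not confirmed here.

```python
from pathlib import PurePosixPath


def experiment_paths(input_dir, expname):
    """Derive option values that follow the naming pattern used in the examples."""
    base = PurePosixPath(input_dir)  # forward-slash paths, as in the examples
    return {
        "groupdir": str(base / f"groups.{expname}.txt"),
        "compdir": str(base / f"comps.{expname}.txt"),
        "output": str(base),
        "expname": expname,
    }


paths = experiment_paths("C:/BAMFiles", "YourExperiment")
for option, value in paths.items():
    print(f"--{option} {value}")
```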

Analyzing CEL files – Affymetrix 3’ array using default options and GO-Elite

python AltAnalyze.py --species Mm --platform "3'array" --celdir "C:/CELFiles"
  --groupdir "C:/CELFiles/groups.YourExperiment.txt"
  --compdir "C:/CELFiles/comps.YourExperiment.txt" --output "C:/CELFiles"
  --expname "YourExperiment" --runGOElite yes

Analyzing RNA-Seq files – TPM text file using ICGS Population Discovery

python AltAnalyze.py --platform RNASeq --species Mm --column_method hopach
  --ExpressionCutoff 1 --FoldDiff 4 --SamplesDiffering 4 --rho 0.2 
  --excludeCellCycle conservative --removeOutliers no --row_method hopach
  --expdir tests/demo_data/Fluidigim_TPM/input/BoneMarrow-scRNASeq.txt
  --output tests/demo_data/Fluidigim_TPM/output/ --restrictBy protein_coding
  --runICGS yes --expname BoneMarrow-scRNASeq --column_metric cosine

Analyzing RNA-Seq files – BAM file directory using ICGS Population Discovery

python AltAnalyze.py --platform RNASeq --species Hs --column_method hopach 
  --column_metric cosine --rho 0.2 --removeOutliers no --row_method hopach
  --SamplesDiffering 3 --restrictBy protein_coding --excludeCellCycle no
  --bedDir tests/demo_data/BAM/input/ --expname cancer --ExpressionCutoff 1 
  --FoldDiff 4 --output tests/demo_data/BAM/input/ --runICGS yes

Analyzing RNA-Seq files – 10X Genomics Sparse Matrix file using ICGS Population Discovery

python AltAnalyze.py --platform RNASeq --species Hs --column_method hopach 
  --column_metric cosine --rho 0.2 --removeOutliers no --row_method hopach
  --SamplesDiffering 3 --restrictBy protein_coding --excludeCellCycle no
  --ChromiumSparseMatrix tests/demo_data/10X/input/filtered_feature_bc_matrix.h5 
  --expname cancer --ExpressionCutoff 1 --output tests/demo_data/FASTQ/output/ 
  --FoldDiff 4 --runICGS yes

Other Options:

  • --k 50 (increase the estimated number of ICGS-predicted clusters for NMF)
  • --downsample 5000 (increase/decrease the number of cells to downsample to [default=2500])
  • --numVarGenes 500 (increase/decrease the number of variable genes for downsampling [default=500])
  • --numGenesExp 500 (increase/decrease the number of genes/cell expressed for filtering [default=500])
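These optional flags are appended to any of the ICGS commands above. As a minimal sketch, assuming the run is scripted in Python, the hypothetical helper below emits only the options that were explicitly set, so unset options fall back to the defaults noted above. The helper is illustrative and not part of AltAnalyze.

```python
def icgs_tuning_args(k=None, downsample=None, num_var_genes=None, num_genes_exp=None):
    """Return extra command-line arguments for the ICGS options that were set."""
    pairs = [
        ("--k", k),
        ("--downsample", downsample),
        ("--numVarGenes", num_var_genes),
        ("--numGenesExp", num_genes_exp),
    ]
    args = []
    for flag, value in pairs:
        if value is not None:  # skip unset options so the tool's defaults apply
            args += [flag, str(value)]
    return args


base = ["python", "AltAnalyze.py", "--platform", "RNASeq", "--species", "Hs", "--runICGS", "yes"]
print(" ".join(base + icgs_tuning_args(k=50, downsample=5000)))
```

The `base` list here is abbreviated for illustration. A real run would carry the full set of options from the example it extends.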

Separate custom UMAPs colored for specific genes and restricted to cells from certain samples

python AltAnalyze.py --image "UMAP" --plotType 2D --display False --species Mm
  --input "/Users/exp/ICGS-NMF/exp.MarkerHeatmap.txt" --platform RNASeq --zscore no
  --labels no --maskGroups "/Users/exp/ICGS-NMF/biological-replicates.txt" 
  --genes "Gfi1 Irf8 Vwf" --reimportModelScores False --separateGenePlots yes

Clustering an expression file to produce a heatmap with a groups file

python AltAnalyze.py --image hierarchical --platform RNASeq --species Mm
  --input "/Users/exp/exp.cancer_genes.txt" --contrast 5
  --column_method ward --row_method ward --column_metric cosine
  --color_gradient yellow_black_blue --row_metric correlation 
  --normalization median

Create a custom Heatmap with enrichment of single-cell marker gene sets (BioMarkers)

python AltAnalyze.py --image hierarchical --platform RNASeq --species Mm  
  --input "/Users/exp/ICGS-NMF/exp.MarkerHeatmap.txt" --contrast 5 
  --display False --color_gradient yellow_black_blue --row_method None
  --column_method None --column_metric cosine --row_metric correlation 
  --normalization median --clusterGOElite BioMarkers --justShowTheseIDs 
  "Hnrnpa2b1 Hnrnpc Rbm10 Sf3b1 Srsf10 Srsf7 Irf8"

Create a custom Heatmap for genes correlated and anti-correlated to a target gene Prdm1

python AltAnalyze.py --image hierarchical --platform RNASeq --species Mm
  --input "/Users/exp/ICGS-NMF/exp.MarkerHeatmap.txt" --contrast 5
  --column_method None --row_method None --column_metric cosine
  --color_gradient yellow_black_blue --row_metric correlation 
  --normalization median --genes "amplify Prdm1" --rho 0.3
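To produce the same kind of heatmap for several target genes, the command above can be generated in a loop. The sketch below only assembles the argument lists using the flags from the example. The `correlated_heatmap_command` helper is hypothetical, and the genes other than Prdm1 are taken from the UMAP example purely for illustration. How AltAnalyze names the output of successive runs is not covered here, so check that results are not overwritten.

```python
def correlated_heatmap_command(input_file, gene, rho=0.3):
    """Build the heatmap command from the example above for one target gene."""
    return [
        "python", "AltAnalyze.py",
        "--image", "hierarchical", "--platform", "RNASeq", "--species", "Mm",
        "--input", input_file, "--contrast", "5",
        "--column_method", "None", "--row_method", "None",
        "--column_metric", "cosine", "--color_gradient", "yellow_black_blue",
        "--row_metric", "correlation", "--normalization", "median",
        "--genes", f"amplify {gene}",  # one argv entry, so no shell quoting is needed
        "--rho", str(rho),
    ]


commands = [
    correlated_heatmap_command("/Users/exp/ICGS-NMF/exp.MarkerHeatmap.txt", gene)
    for gene in ["Prdm1", "Gfi1", "Irf8"]
]
for command in commands:
    print(command[command.index("--genes") + 1])
# Each list can be run with subprocess.run(command, check=True).
```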

Details

Additional example workflows and detailed option descriptions for the various AltAnalyze functions are provided in the links below.

Full AltAnalyze Workflows

Pathway Enrichment Analysis and Visualization

Clustering, QC, and Alternative Exons Visualization

File comparison, ID translation and visualization

LineageProfiler and Sample Classification
