FischbachLab/nf-dsqc


NF-DSQC

This pipeline builds on the results of ninjaMap. It runs BLAST on both the unmapped and the missed reads for a defined community and returns the lowest common ancestor (LCA) counts.

A DSQC workflow is here

MITI Drug Substance QC batch job

Below is an example seedfile: a headerless CSV file in which each line gives a sample name and its S3 path.

DS001,s3://genomics-workflow-core/Results/Ninjamap/MITI-001/20240223_DS-mNGS_updated
DS003,s3://genomics-workflow-core/Results/Ninjamap/MITI-001/20240223_DS-mNGS_updated
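A quick format check before submitting a job can catch a stray header row or a missing path. The following is a minimal sketch, not part of the pipeline: `validate_seedfile` is a hypothetical helper name, and the only rules it enforces are the ones stated above (two comma-separated fields per line, with the second starting with `s3://`).

```shell
# Hypothetical helper (not shipped with nf-dsqc): check that every line of a
# seedfile has the form "sample_name,s3://path". Prints one message per bad
# line and returns a non-zero status if any line fails.
validate_seedfile() {
    awk -F',' '
        BEGIN            { bad = 0 }
        NF != 2          { printf "line %d: expected 2 fields, got %d\n", NR, NF; bad = 1; next }
        $1 == ""         { printf "line %d: empty sample name\n", NR; bad = 1; next }
        $2 !~ /^s3:\/\// { printf "line %d: path does not start with s3://\n", NR; bad = 1 }
        END              { exit bad }
    ' "$1"
}
```

Usage: `validate_seedfile test_seedfile.csv && echo "seedfile OK"`. A header line such as `sample,path` is rejected because its second field is not an S3 path; blank lines are also reported as errors.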

Example 1: Batch submission using the --seedfile option

aws batch submit-job \
    --job-name nf-dsqc \
    --job-queue priority-maf-pipelines \
    --job-definition nextflow-production \
    --container-overrides command="FischbachLab/nf-dsqc, \
"--seedfile", "s3://nextflow-pipelines/nf-DSQC/test/test_seedfile.csv", \
"--project","20240223", \
"--outdir","s3://genomics-workflow-core/Results/DSQC/" "

Example 2: The AWS Batch job parameters can also be supplied with the -params-file option. A copy of the parameters is automatically saved to a JSON file (parameters.json) in the run output bucket.

aws batch submit-job \
    --job-name nf-dsqc \
    --job-queue priority-maf-pipelines \
    --job-definition nextflow-production \
    --container-overrides command="fischbachlab/nf-dsqc, \
    "-params-file", "s3://genomics-workflow-core/Results/DSQC/parameters/example_parameters.json" " 
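For orientation, a parameters file equivalent to Example 1 might look like the sketch below. This is an assumption rather than a copy of example_parameters.json: it relies on the standard Nextflow convention that each `--name value` option corresponds to a top-level key in the params file, and it reuses only the three options shown in Example 1.

```shell
# Sketch: write a params file mirroring the --seedfile, --project and --outdir
# options from Example 1. The real example_parameters.json may contain more
# (or differently named) keys; treat this layout as an assumption.
cat > parameters.json <<'EOF'
{
    "seedfile": "s3://nextflow-pipelines/nf-DSQC/test/test_seedfile.csv",
    "project": "20240223",
    "outdir": "s3://genomics-workflow-core/Results/DSQC/"
}
EOF
```

The file would then be uploaded to S3 and its path passed to -params-file as in Example 2.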

The structure of the output directory

01_preprocess/
02_blast/
03_postprocess/
04_report/

Example output: The QC reports for missed and unmapped reads are saved in 04_report/missed/ and 04_report/unmapped/, respectively.

s3://genomics-workflow-core/Results/DSQC/project_name/04_report/missed/
s3://genomics-workflow-core/Results/DSQC/project_name/04_report/unmapped/

Sample BLAST LCA output

LCA,Rank,20231116_DS003_B04_REDO,20231128_DS004_C05_REDO
Bacillota,phylum,1,1
Bacteriophage sp.,species,1,1
Bifidobacterium breve,species,0,2
Bifidobacterium longum subsp. longum KACC 91563,strain,0,1
Blautia producta ATCC 27340 = DSM 2950,strain,1,0
Blautia producta,species,62,0
Caudoviricetes sp.,species,7,7
Clostridia,class,0,3
Clostridiaceae bacterium,species,0,1
Faecalitalea cylindroides T2-87,strain,0,1
Lachnospiraceae bacterium,species,2,1
Lachnospiraceae,family,0,1
Oscillospiraceae bacterium D1,species,0,2
root,no rank,1,3
Segatella hominis,species,0,2
Subdoligranulum variabile,species,0,2
uncultured bacterium,species,2,1
uncultured human fecal virus,species,3,4
uncultured organism,species,0,1
unidentified plasmid,species,0,1
Viruses,superkingdom,1,1
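To get a quick overview of a table like the one above, the counts can be summed per taxonomic rank for each sample. The sketch below is illustrative only: `lca_rank_totals` is a hypothetical helper, and it assumes the layout shown here (a header row, the taxon name in column 1, the rank in column 2, one count column per sample, and no commas inside taxon names).

```shell
# Hypothetical helper (not shipped with nf-dsqc): sum the read counts of an
# LCA table per rank, one output column per sample.
lca_rank_totals() {
    awk -F',' '
        NR == 1 {
            # Keep the sample names from the header row (columns 3 onward).
            header = "Rank"
            for (i = 3; i <= NF; i++) header = header "," $i
            ncol = NF
            next
        }
        {
            ranks[$2] = 1
            for (i = 3; i <= ncol; i++) total[$2, i] += $i
        }
        END {
            print header
            for (r in ranks) {
                line = r
                for (i = 3; i <= ncol; i++) line = line "," total[r, i]
                print line
            }
        }
    ' "$1" | {
        # Pass the header through unchanged, then sort the rank rows by name.
        IFS= read -r header
        printf '%s\n' "$header"
        LC_ALL=C sort
    }
}
```

Usage: `lca_rank_totals lca_counts.csv`, where `lca_counts.csv` is a placeholder name for a downloaded report file. A large share of reads at coarse ranks such as `no rank` or `superkingdom` indicates hits that BLAST could not resolve to a specific taxon.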
