This pipeline is based on the results of ninjaMap. It BLASTs both the unmapped and the missed reads on a defined community and returns the LCA (lowest common ancestor) counts.
A DSQC workflow is here
DS001,s3://genomics-workflow-core/Results/Ninjamap/MITI-001/20240223_DS-mNGS_updated
DS003,s3://genomics-workflow-core/Results/Ninjamap/MITI-001/20240223_DS-mNGS_updated
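The two lines above appear to be seedfile rows of the kind passed to --seedfile: a sample name, then the S3 path to that sample's ninjaMap results. A minimal sketch for writing such a file and checking its shape before submitting a job follows; the local file name seedfile.csv and the headerless two-column layout are assumptions based only on the rows shown here.

```shell
# Write a two-column seedfile: sample name, S3 path to the ninjaMap results.
# (seedfile.csv is a hypothetical local file name.)
cat > seedfile.csv <<'EOF'
DS001,s3://genomics-workflow-core/Results/Ninjamap/MITI-001/20240223_DS-mNGS_updated
DS003,s3://genomics-workflow-core/Results/Ninjamap/MITI-001/20240223_DS-mNGS_updated
EOF

# Count rows that are not exactly "name,s3://path".
bad_rows=$(awk -F',' 'NF != 2 || $1 == "" || $2 !~ /^s3:\/\// { n++ } END { print n + 0 }' seedfile.csv)
echo "malformed rows: ${bad_rows}"
```

A non-zero count points at a row with a missing field, an extra comma, or a path that is not an S3 URI.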
aws batch submit-job \
--job-name nf-dsqc \
--job-queue priority-maf-pipelines \
--job-definition nextflow-production \
--container-overrides command="FischbachLab/nf-dsqc, \
"--seedfile", "s3://nextflow-pipelines/nf-DSQC/test/test_seedfile.csv", \
"--project","20240223", \
"--outdir","s3://genomics-workflow-core/Results/DSQC/" "
Example 2: The AWS Batch job parameters can also be supplied with the -params-file option. A copy of the parameters is automatically saved to a JSON file (parameters.json) in the run output bucket.
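Before submitting, the parameters file has to exist at the S3 location given to -params-file. A sketch of what such a file might contain follows. Nextflow maps each top-level key to the pipeline parameter of the same name, so the keys here mirror the --seedfile, --project, and --outdir flags used on the command line; the choice of exactly these three keys and the local file name are assumptions, not taken from the pipeline's own documentation.

```shell
# Write a hypothetical parameters file locally; keys mirror the command-line flags.
cat > example_parameters.json <<'EOF'
{
  "seedfile": "s3://nextflow-pipelines/nf-DSQC/test/test_seedfile.csv",
  "project": "20240223",
  "outdir": "s3://genomics-workflow-core/Results/DSQC/"
}
EOF

# Count the key/value lines as a quick sanity check.
param_count=$(grep -c '": "' example_parameters.json)
echo "parameters written: ${param_count}"

# Upload it before submitting the job (commented out: needs AWS credentials).
# aws s3 cp example_parameters.json s3://genomics-workflow-core/Results/DSQC/parameters/example_parameters.json
```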
aws batch submit-job \
--job-name nf-dsqc \
--job-queue priority-maf-pipelines \
--job-definition nextflow-production \
--container-overrides command="fischbachlab/nf-dsqc, \
"-params-file", "s3://genomics-workflow-core/Results/DSQC/parameters/example_parameters.json" "
01_preprocess/
02_blast/
03_postprocess/
04_report/
Example output: The QC reports for missed reads and unmapped reads are saved in 04_report/missed/ and 04_report/unmapped/, respectively.
s3://genomics-workflow-core/Results/DSQC/project_name/04_report/missed/
s3://genomics-workflow-core/Results/DSQC/project_name/04_report/unmapped/
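The report locations can be derived from the job parameters. The sketch below assumes that project_name in the paths above is the value passed to --project and that the reports sit directly under the --outdir prefix; neither is confirmed beyond the example paths shown here.

```shell
# Values taken from the submit-job example; project_name is assumed to equal --project.
outdir="s3://genomics-workflow-core/Results/DSQC/"
project="20240223"

# Strip any trailing slash from outdir before joining path components.
report_root="${outdir%/}/${project}/04_report"
missed_reports="${report_root}/missed/"
unmapped_reports="${report_root}/unmapped/"

echo "${missed_reports}"
echo "${unmapped_reports}"

# List the reports (commented out: needs AWS credentials and a finished run).
# aws s3 ls "${missed_reports}"
# aws s3 ls "${unmapped_reports}"
```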
Sample BLAST LCA output
LCA,Rank,20231116_DS003_B04_REDO,20231128_DS004_C05_REDO
Bacillota,phylum,1,1
Bacteriophage sp.,species,1,1
Bifidobacterium breve,species,0,2
Bifidobacterium longum subsp. longum KACC 91563,strain,0,1
Blautia producta ATCC 27340 = DSM 2950,strain,1,0
Blautia producta,species,62,0
Caudoviricetes sp.,species,7,7
Clostridia,class,0,3
Clostridiaceae bacterium,species,0,1
Faecalitalea cylindroides T2-87,strain,0,1
Lachnospiraceae bacterium,species,2,1
Lachnospiraceae,family,0,1
Oscillospiraceae bacterium D1,species,0,2
root,no rank,1,3
Segatella hominis,species,0,2
Subdoligranulum variabile,species,0,2
uncultured bacterium,species,2,1
uncultured human fecal virus,species,3,4
uncultured organism,species,0,1
unidentified plasmid,species,0,1
Viruses,superkingdom,1,1
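The LCA table above is a plain CSV with one row per taxon and one count column per sample, so per-sample totals can be computed with awk. A minimal sketch follows, assuming the taxon names contain no commas (true for the rows shown) and using a hypothetical local file, lca_sample.csv, that holds a subset of those rows.

```shell
# A subset of the sample LCA output, saved under a hypothetical file name.
cat > lca_sample.csv <<'EOF'
LCA,Rank,20231116_DS003_B04_REDO,20231128_DS004_C05_REDO
Bacillota,phylum,1,1
Bifidobacterium breve,species,0,2
Blautia producta,species,62,0
Caudoviricetes sp.,species,7,7
Clostridia,class,0,3
root,no rank,1,3
EOF

# Sum the read counts in every sample column (columns 3 onward).
totals=$(awk -F',' '
  NR == 1 { ncol = NF; for (i = 3; i <= NF; i++) name[i] = $i; next }  # header: remember sample names
  { for (i = 3; i <= NF; i++) total[i] += $i }                          # data row: accumulate counts
  END { for (i = 3; i <= ncol; i++) print name[i] "," total[i] }
' lca_sample.csv)
echo "${totals}"
```

Splitting on commas is only safe while no taxon name contains one; a CSV-aware parser would be the more robust choice for arbitrary taxonomy strings.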