Home
This documentation is under construction, and reflects changes that are not yet (but soon will be) on the main branch of the code. Please see the README for instructions that do match the code.
Clone this repository and then from its root, build either a Singularity or Docker container.
To build a Singularity container:
singularity build viridian_workflow.img Singularity.def
To build a Docker container:
docker build --network=host .
(Without --network=host, pip install will likely time out and the build will fail.)
Both the Docker and Singularity containers will have the main script viridian_workflow installed.
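As a rough sketch of how the installed script might be called through either container, the helper below only assembles and prints a command line; it does not run anything. The function name container_cmd is hypothetical. The Docker image tag viridian_workflow is an assumption: the build command above does not tag the image, so you would need to add -t viridian_workflow to it or substitute the image ID. Mounting the current directory at /data is an illustrative choice, and the sketch assumes the image has no entrypoint that changes how the arguments are interpreted.

```shell
#!/usr/bin/env bash
# Print (do not run) a container command line for viridian_workflow.
# Usage: container_cmd singularity|docker [viridian_workflow arguments...]
container_cmd() {
    local runtime="$1"
    shift
    case "$runtime" in
        singularity)
            # viridian_workflow.img is the image produced by "singularity build".
            echo singularity exec viridian_workflow.img viridian_workflow "$@"
            ;;
        docker)
            # ASSUMPTION: the image was tagged "viridian_workflow", and the
            # current directory is mounted at /data so relative paths resolve.
            echo docker run --rm -v "$PWD:/data" -w /data viridian_workflow viridian_workflow "$@"
            ;;
        *)
            echo "unknown runtime: $runtime" >&2
            return 1
            ;;
    esac
}

# Example: show the Singularity form of a nanopore run.
container_cmd singularity run_one_sample --tech ont \
    --ref_fasta data/MN908947.fasta --reads reads.fastq.gz --outdir OUT
```

Once the printed command looks right for your setup, copy it into the terminal or remove the echo to execute it directly.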
The examples below will run the default pipeline, using the built-in SARS-CoV-2 amplicon schemes ARTIC V3, ARTIC V4, and Midnight-1200. The pipeline automatically detects the scheme that best matches the input reads. To use your own amplicon scheme and/or force the choice of scheme, please read the amplicon schemes page.
To run on paired Illumina reads:
viridian_workflow run_one_sample \
--tech illumina \
--ref_fasta data/MN908947.fasta \
--reads1 reads_1.fastq.gz \
--reads2 reads_2.fastq.gz \
--outdir OUT
To run on unpaired nanopore reads:
viridian_workflow run_one_sample \
--tech ont \
--ref_fasta data/MN908947.fasta \
--reads reads.fastq.gz \
--outdir OUT
The FASTA file in those commands can be found in the data/ directory of this repository.
Other options:
- --sample_name MY_NAME: use this to change the sample name (default is "sample") that is put in the final FASTA file, BAM file, and VCF file.
- --keep_bam: use this option to keep the BAM file of original input reads mapped to the reference genome.
- --force: use with caution, as it will overwrite the output directory if it already exists.
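When several paired Illumina samples need processing, the options above can be combined in a loop. The following is a minimal sketch, not part of viridian_workflow itself: the function name batch_commands, the file naming pattern <sample>_1.fastq.gz / <sample>_2.fastq.gz, and the OUT.<sample> output directory names are all assumptions made for illustration. It prints one command per sample rather than running it.

```shell
#!/usr/bin/env bash
# Print one run_one_sample command per paired-end sample in a directory.
# ASSUMPTION: reads are named <sample>_1.fastq.gz and <sample>_2.fastq.gz.
batch_commands() {
    local reads_dir="$1" r1 r2 sample
    for r1 in "$reads_dir"/*_1.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        sample="$(basename "$r1" _1.fastq.gz)"
        r2="$reads_dir/${sample}_2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "skipping $sample: missing $r2" >&2
            continue
        fi
        # Each sample gets its own name and its own output directory.
        echo viridian_workflow run_one_sample \
            --tech illumina \
            --ref_fasta data/MN908947.fasta \
            --reads1 "$r1" \
            --reads2 "$r2" \
            --sample_name "$sample" \
            --outdir "OUT.$sample"
    done
}

# Default to the current directory when no argument is given.
batch_commands "${1:-.}"
```

Printing first makes it easy to review the commands before committing compute time; removing the echo runs each sample in turn. Note that --force is deliberately left out, so an existing output directory is never overwritten by the loop.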
The default files in the output directory are:
- consensus.fa: a FASTA file of the consensus sequence.
- variants.vcf: a VCF file of the identified variants between the consensus sequence and the reference genome.
- log.json: contains logging information for the viridian workflow run.
If the option --keep_bam is used, then a sorted BAM file of the reads mapped to the reference will also be present, called reference_mapped.bam (and its index file reference_mapped.bam.bai).
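A quick sanity check after a run might look like the sketch below. The helper names check_outputs and count_variants are hypothetical and not part of viridian_workflow; the file names come from the list above. The variant count relies only on the standard VCF convention that header lines start with "#".

```shell
#!/usr/bin/env bash
# Report which expected output files are missing from a run directory.
# Usage: check_outputs OUTDIR [--keep_bam]
check_outputs() {
    local outdir="$1" expect_bam="${2:-}" missing=0 f
    # Default outputs, replacing the function's positional parameters.
    set -- consensus.fa variants.vcf log.json
    if [ "$expect_bam" = "--keep_bam" ]; then
        set -- "$@" reference_mapped.bam reference_mapped.bam.bai
    fi
    for f in "$@"; do
        if [ ! -e "$outdir/$f" ]; then
            echo "missing: $outdir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Count VCF records, i.e. non-empty lines that are not "#" header lines.
count_variants() {
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}
```

For example, check_outputs OUT --keep_bam && count_variants OUT/variants.vcf prints the number of variant records only if every expected file, including the BAM and its index, is present.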