drop trinity from de novo assembly pipeline #168

Merged 2 commits on Nov 15, 2020
4 changes: 1 addition & 3 deletions docs/description.rst
@@ -37,8 +37,7 @@ Viral genome assembly
~~~~~~~~~~~~~~~~~~~~~

The filtered and trimmed reads are subsampled to at most 100,000 pairs.
- *de novo* assembly is performed using Trinity_. SPAdes_ is also offered as
- an alternative *de novo* assembler.
+ *de novo* assembly is performed using SPAdes_.
Reference-assisted assembly improvements follow (contig scaffolding, orienting, etc.)
with MUMMER_ and MUSCLE_ or MAFFT_. Gap2Seq_ is used to seal gaps between scaffolded *de novo* contigs with sequencing reads.

@@ -51,7 +50,6 @@ reads were changed to N.

This align-call-refine cycle is iterated twice, to minimize reference bias in the assembly.

- .. _Trinity: http://trinityrnaseq.github.io/
.. _SPAdes: http://bioinf.spbau.ru/en/spades
.. _MUMMER: https://mummer4.github.io/
.. _MUSCLE: https://www.drive5.com/muscle/
40 changes: 3 additions & 37 deletions pipes/WDL/tasks/tasks_assembly.wdl
@@ -5,11 +5,10 @@ task assemble {
File reads_unmapped_bam
File trim_clip_db

- Int? trinity_n_reads=250000
Int? spades_n_reads=10000000
Int? spades_min_contig_len=0

- String? assembler="trinity" # trinity, spades, or trinity-spades
+ String? assembler="spades"
Boolean? always_succeed=false

# do this in two steps in case the input doesn't actually have "taxfilt" in the name
@@ -28,18 +27,7 @@ task assemble {

assembly.py --version | tee VERSION

- if [[ "${assembler}" == "trinity" ]]; then
- assembly.py assemble_trinity \
- ${reads_unmapped_bam} \
- ${trim_clip_db} \
- ${sample_name}.assembly1-${assembler}.fasta \
- ${'--n_reads=' + trinity_n_reads} \
- ${true='--alwaysSucceed' false="" always_succeed} \
- --JVMmemory "$mem_in_mb"m \
- --outReads=${sample_name}.subsamp.bam \
- --loglevel=DEBUG
-
- elif [[ "${assembler}" == "spades" ]]; then
+ if [[ "${assembler}" == "spades" ]]; then
assembly.py assemble_spades \
${reads_unmapped_bam} \
${trim_clip_db} \
@@ -50,28 +38,6 @@ task assemble {
--memLimitGb $mem_in_gb \
--outReads=${sample_name}.subsamp.bam \
--loglevel=DEBUG

- elif [[ "${assembler}" == "trinity-spades" ]]; then
- assembly.py assemble_trinity \
- ${reads_unmapped_bam} \
- ${trim_clip_db} \
- ${sample_name}.assembly1-trinity.fasta \
- ${'--n_reads=' + trinity_n_reads} \
- --JVMmemory "$mem_in_mb"m \
- --outReads=${sample_name}.subsamp.bam \
- ${true='--always_succeed' false='' always_succeed} \
- --loglevel=DEBUG
- assembly.py assemble_spades \
- ${reads_unmapped_bam} \
- ${trim_clip_db} \
- ${sample_name}.assembly1-${assembler}.fasta \
- --contigsUntrusted=${sample_name}.assembly1-trinity.fasta \
- ${'--nReads=' + spades_n_reads} \
- ${true='--alwaysSucceed' false='' always_succeed} \
- ${'--minContigLen=' + spades_min_contig_len} \
- --memLimitGb $mem_in_gb \
- --loglevel=DEBUG

else
echo "unrecognized assembler ${assembler}" >&2
exit 1
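
With the Trinity branches removed, any `assembler` value other than `spades` falls through to the `else` branch and the task exits with status 1. That includes the previously accepted `trinity` and `trinity-spades`. A minimal dry-run sketch of that control flow in POSIX shell follows. `build_assemble_cmd` is a hypothetical helper, not part of the pipeline: it only prints the leading part of the command rather than running `assembly.py`, and the optional flags from the task are omitted.

```shell
# Hypothetical dry-run helper (illustration only, not in the repository).
# Prints the assemble_spades invocation the task would start with, or
# fails the same way the task's else branch does.
# Usage: build_assemble_cmd ASSEMBLER READS_BAM TRIM_CLIP_DB SAMPLE_NAME
build_assemble_cmd() {
  case "$1" in
    spades)
      # Output name follows the ${sample_name}.assembly1-${assembler}.fasta pattern.
      echo "assembly.py assemble_spades $2 $3 $4.assembly1-$1.fasta"
      ;;
    *)
      # "trinity" and "trinity-spades" now land here as well.
      echo "unrecognized assembler $1" >&2
      return 1
      ;;
  esac
}
```

Callers that still pass `assembler="trinity"` in their workflow inputs would need to drop or update that input after this change.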
@@ -117,7 +83,7 @@ task scaffold {
String docker="quay.io/broadinstitute/viral-assemble:2.1.4.0"

# do this in multiple steps in case the input doesn't actually have "assembly1-x" in the name
- String sample_name = basename(basename(basename(contigs_fasta, ".fasta"), ".assembly1-trinity"), ".assembly1-spades")
+ String sample_name = basename(basename(contigs_fasta, ".fasta"), ".assembly1-spades")
}

command {
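
The new `sample_name` expression in `task scaffold` strips `.fasta` and then `.assembly1-spades`, so a contigs file still named `*.assembly1-trinity.fasta` keeps that infix in its derived sample name. Below is a rough shell analogue, assuming POSIX `basename` behaves like WDL's `basename(path, suffix)` for these inputs. `strip_sample_name` is a hypothetical name used only for illustration.

```shell
# Hypothetical analogue of the nested WDL basename() calls (illustration only).
# Usage: strip_sample_name PATH_TO_CONTIGS_FASTA
strip_sample_name() {
  # Inner call: drop the directory part and a trailing ".fasta", if present.
  name=$(basename "$1" .fasta)
  # Outer call: drop a trailing ".assembly1-spades", if present.
  basename "$name" .assembly1-spades
}

strip_sample_name /data/s1.assembly1-spades.fasta    # prints: s1
strip_sample_name /data/s1.fasta                     # prints: s1
strip_sample_name /data/s1.assembly1-trinity.fasta   # prints: s1.assembly1-trinity
```

The last call shows the one behavioral difference from the old three-level expression: legacy Trinity-named inputs are no longer normalized to the bare sample name.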