-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shortread subworkflow #716
base: dev
Are you sure you want to change the base?
Conversation
This PR is against the
|
Think this is ready for review @jfy133. Simple PR moving short read preprocessing into subworkflow. |
…to always declare the ch_short_reads_assembly even if assembly-input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a straight forward conversion to me! thank you @muabnezor you're making the code base much cleaner!
include { FASTQC as FASTQC_RAW } from '../../modules/nf-core/fastqc/main' | ||
include { FASTQC as FASTQC_TRIMMED } from '../../modules/nf-core/fastqc/main' | ||
include { FASTP } from '../../modules/nf-core/fastp/main' | ||
include { ADAPTERREMOVAL as ADAPTERREMOVAL_PE } from '../../modules/nf-core/adapterremoval/main' | ||
include { ADAPTERREMOVAL as ADAPTERREMOVAL_SE } from '../../modules/nf-core/adapterremoval/main' | ||
include { BOWTIE2_REMOVAL_BUILD as BOWTIE2_HOST_REMOVAL_BUILD } from '../../modules/local/bowtie2_removal_build' | ||
include { BOWTIE2_REMOVAL_ALIGN as BOWTIE2_HOST_REMOVAL_ALIGN } from '../../modules/local/bowtie2_removal_align' | ||
include { BOWTIE2_REMOVAL_BUILD as BOWTIE2_PHIX_REMOVAL_BUILD } from '../../modules/local/bowtie2_removal_build' | ||
include { BOWTIE2_REMOVAL_ALIGN as BOWTIE2_PHIX_REMOVAL_ALIGN } from '../../modules/local/bowtie2_removal_align' | ||
include { CAT_FASTQ } from '../../modules/nf-core/cat/fastq/main' | ||
include { SEQTK_MERGEPE } from '../../modules/nf-core/seqtk/mergepe/main' | ||
include { BBMAP_BBNORM } from '../..//modules/nf-core/bbmap/bbnorm/main' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please align :)
If you use VSCode, you should install the Nextflow extension which now includes a formatter you can activate via the command palette and should do that for you :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. thanks for suggestion.
ch_multiqc_files = ch_multiqc_files.mix(FASTQC_RAW.out.zip) | ||
|
||
ch_bowtie2_removal_host_multiqc = Channel.empty() | ||
if (!params.assembly_input) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!params.assembly_input)
in the future, it feels like we can move away from this. Could the assembly_input logic be determined by the users input?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering about this actually myself:
If you've pre-assembled reads, they should already be cleaned. Which means you could skip this entire subworkflow, so maybe this if else statement could be pushed to the main mag.nf
around the new subworkflow? @prototaxites what do you think, is that a reasonable assuemption to make (i.e., skip preprocessing of reads)
But what do you mean by 'users input'? This is basically that, if you give the additional extra sheet to --assembly_input
(that contains the paths to contig FASTAS) then we can skip assembly. It's not a boolean flag, if that is what you were meaning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But what do you mean by 'users input'? This is basically that, if you give the additional extra sheet to --assembly_input (that contains the paths to contig FASTAS) then we can skip assembly. It's not a boolean flag, if that is what you were meaning.
Ah, then this makes more sense. I thought it was a boolean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you've pre-assembled reads, they should already be cleaned. Which means you could skip this entire subworkflow, so maybe this if else statement could be pushed to the main mag.nf around the new subworkflow?
yes, I actually did this first, but then moved it back into the subworkflow. Not sure what makes the most sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you've pre-assembled reads, they should already be cleaned. Which means you could skip this entire subworkflow, so maybe this if else statement could be pushed to the main mag.nf around the new subworkflow? @prototaxites what do you think, is that a reasonable assuemption to make (i.e., skip preprocessing of reads)
Yes, I think this probably makes the most sense. While I'm sure there's someone out there with the use-case that they want to map a new set of reads against an existing assembly? I think we should assume that the reads accompanying pre-computed assemblies are already clean - someone with that use case probably knows what they are doing!
Personally I'm in favour of putting the if!(assembly_input)
line in the main WF rather than the subWF; the subWF code should only operate when we actually want to use it. Plus I think it helps make the logic of the main WF clearer by stating when and when not things are run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for input. I think it makes sense to have it in the main workflow also. If you think this looks fine @jfy133 and @prototaxites I'll go ahead and merge.
ch_versions = ch_versions.mix(FASTQC_RAW.out.versions.first()) | ||
ch_multiqc_files = ch_multiqc_files.mix(FASTQC_RAW.out.zip) | ||
|
||
ch_bowtie2_removal_host_multiqc = Channel.empty() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why we have a dedicated channel for that... maybe can be mixed directly into ch_multiqc_files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't even seem to be used anyway so can be removed :)
ch_versions = ch_versions.mix(FASTQC_RAW.out.versions.first()) | ||
ch_multiqc_files = ch_multiqc_files.mix(FASTQC_RAW.out.zip) | ||
|
||
ch_bowtie2_removal_host_multiqc = Channel.empty() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ch_bowtie2_removal_host_multiqc = Channel.empty() |
def new_reads = meta.single_end ? reads[0] : reads.flatten() | ||
[meta, new_reads] | ||
} | ||
SHORTREAD_PREPROCESSING( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this stil needs the !params.assembly_input
bit? And the else
should then be assigning the raw reads directly to ch_short_reads
and ch_short_reads_assembly
etc (assuming it has the right channel construct)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is there, line 191, and 205?
PR checklist
nf-core pipelines lint
).nextflow run . -profile test,docker --outdir <OUTDIR>
).nextflow run . -profile debug,test,docker --outdir <OUTDIR>
).docs/usage.md
is updated.docs/output.md
is updated.CHANGELOG.md
is updated.README.md
is updated (including new tool citations and authors/contributors).This PR moves the short read preprocessing into a subworkflow