CZID-7790: Expand cov viz threshold for ONT workflow #307
This looks good and ran successfully on staging. Just a nit on the new attribute name. Otherwise looks good to me.
How was this reviewed?
- Read through the code
- Ran the code and examined output files
- Stepped through the code
Are there files that were not reviewed?
- Ignored whitespace changes in run.wdl
Are there tests included
- Yes
- No - and why? - I'm not sure it's possible to test the idseq-dag functions anymore?
@@ -35,6 +35,7 @@ def run(self):
max_num_bins_coverage = self.additional_attributes.get("max_num_bins_coverage", MAX_NUM_BINS_COVERAGE)
num_accessions_per_taxon = self.additional_attributes.get("num_accessions_per_taxon", NUM_ACCESSIONS_PER_TAXON)
min_contig_size = self.additional_attributes.get("min_contig_size", MIN_CONTIG_SIZE)
is_long_read_run = self.additional_attributes.get("is_long_read_run", False)
It might be more straightforward to name this so that it directly describes what the flag does, e.g. keep_taxons_with_only_reads or something.
Good point!
CZID-7790
Expands the creation of coverage visualization files in the ONT workflow so that it happens for a taxon with any number of NT reads, not only when the taxon has NT contigs present.
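For reviewers who want the intent in one place, here is a minimal sketch of the selection rule described above. It is not the idseq-dag implementation: the function name `select_taxons_for_coverage_viz` and the `contigs` / `reads` dictionary layout are hypothetical, chosen only for illustration. The only name taken from this PR is the `is_long_read_run` flag.

```python
from typing import Dict, List


def select_taxons_for_coverage_viz(
    taxon_hits: Dict[str, Dict[str, List[str]]],
    is_long_read_run: bool = False,
) -> List[str]:
    """Return the taxon IDs that should get coverage viz files.

    `taxon_hits` is a hypothetical structure mapping a taxon ID to its
    NT hits, e.g. {"contigs": [...], "reads": [...]}.
    """
    selected = []
    for taxon_id, hits in taxon_hits.items():
        has_contigs = len(hits.get("contigs", [])) > 0
        has_reads = len(hits.get("reads", [])) > 0
        # Previous behavior: a taxon needs at least one NT contig.
        # ONT (long read) runs: any number of NT reads is enough.
        if has_contigs or (is_long_read_run and has_reads):
            selected.append(taxon_id)
    return selected
```

With the flag left at its default of False, a taxon that has reads but no contigs is skipped; with the flag set, it is kept.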
Testing
Most of the work around this ticket was in verifying the results of the change. Here was my process.
[…] coverage_viz_summary.json, and the specific viz files for each accession that we might show to the user, ${ACCESSION_ID}_coverage_viz.json (e.g. CP065714.1_coverage_viz.json). The cov viz files are generated in the WDL task GenerateCoverageViz.
[…] gsnap.blast.top.m8 file, so I downloaded that file from the OriginalRun as well. I then manually read through the m8 file for those accessions and compared it to what was present in the corresponding cov viz JSON files. Everything matched as expected. (It's conceivable I missed something; manually reading those files is a bit difficult, but as far as I can tell everything matched. Plus, from what I see in the way the code works, I think this part should either work or fail in a pretty obvious way.)
[…] gsnap.blast.top.m8 file gets produced the same either way in a previous task, and the meat of GenerateCoverageViz is pretty much just reading those results out and formatting them to how we structure our cov viz JSON files, with no heavy algorithmic stuff in there.)
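The manual m8 reading described above could be made a little less painful with a small helper. This is only a sketch: it assumes the m8 file uses the standard 12-column BLAST tabular layout (tab-separated, no header), and the helper name `hits_by_accession` is hypothetical. It deliberately makes no assumption about the layout of the cov viz JSON files, so the final comparison is still done by eye.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

# Standard BLAST tabular (m8 / outfmt 6) column order.
M8_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def hits_by_accession(m8_text: str) -> Dict[str, List[str]]:
    """Group query IDs (reads or contigs) by the accession they hit."""
    grouped = defaultdict(list)
    reader = csv.reader(io.StringIO(m8_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines, if any are present.
        if not row or row[0].startswith("#"):
            continue
        record = dict(zip(M8_COLUMNS, row))
        grouped[record["sseqid"]].append(record["qseqid"])
    return dict(grouped)
```

For example, `hits_by_accession(open("gsnap.blast.top.m8").read())["CP065714.1"]` would list the query IDs to look for in CP065714.1_coverage_viz.json, assuming the accession appears in the sseqid column in that exact form.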