scaffolding regression fixes plus docker updates #547
…tes all the time" This reverts commit 60ffe0e.
~{true='--allow_incomplete_output' false="" allow_incomplete_output} \
--loglevel=DEBUG

cut -f 3 "~{sample_name}.refs_skani_dist.full.tsv" | tail +2 | head -1 > SKANI_ANI
Lines 323-325 are a bit repetitive with line 305 above, and it may be nice to also have these values in the log for debugging. What about something like:
chosen_ref_data_fields=($(sed -n '2{p;q;}' "~{sample_name}.refs_skani_dist.full.tsv"))
CHOSEN_REF_FASTA="${chosen_ref_data_fields[0]}"
tee <<< "${CHOSEN_REF_FASTA}" CHOSEN_REF
tee <<< "${CHOSEN_REF_FASTA%.*}" CHOSEN_REF_BASENAME
tee <<< "${chosen_ref_data_fields[2]}" SKANI_ANI
tee <<< "${chosen_ref_data_fields[3]}" SKANI_REF_AF
tee <<< "${chosen_ref_data_fields[4]}" SKANI_CONTIGS_AF
The merge of this to |
This PR:
- … assemble_denovo on example exercises. This PR now has the scaffold task fall back to the old brute-force reference selection if ANI-based reference selection fails to find any matches at all. This will work fine (same as before) in most historically normal use cases, but will not behave well if given a very large array of reference genomes to choose from. (This change does not impact the behavior of scaffold_and_refine_multitaxa, which does not pass multiple references to the scaffold task anyway.)
- … util.file.zstd_open in favor of zstandard.open
- … reports.alignment_metrics … samtools ampliconstats in the case of multi-segment targets; also allow samtools ampliconstats to fail silently in the case that it's getting too picky about the bed file
- … mafft workflow to use multi_align_mafft_ref instead of multi_align_mafft
- … terra_tsv_to_table workflow resilient to non-existent TSVs in its input array, to simplify running it on a Terra table full of TSVs that may or may not exist
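
For readers following the scaffold change described above, here is a minimal sketch of the fallback control flow: try ANI-based selection first, and only if it returns no match at all, fall back to brute-force selection over every supplied reference. This is not the code from this PR. The function name `choose_reference` and the two selector callables are hypothetical stand-ins for whatever the scaffold task actually invokes.

```python
from typing import Callable, Optional, Sequence, Tuple


def choose_reference(
    refs: Sequence[str],
    ani_select: Callable[[Sequence[str]], Optional[str]],
    brute_force_select: Callable[[Sequence[str]], str],
) -> Tuple[str, str]:
    """Pick a reference and report which method picked it.

    All names here are hypothetical; the selectors are injected so the
    control flow can be shown without calling any real tool.
    """
    if not refs:
        raise ValueError("no reference genomes supplied")

    # First choice: ANI-based selection (assumed to return None on zero hits).
    chosen = ani_select(refs)
    if chosen is not None:
        return chosen, "ani"

    # Fallback: the older brute-force selection, which evaluates every
    # reference. Its cost grows with len(refs), which is why the PR
    # description warns about very large reference arrays.
    return brute_force_select(refs), "brute_force"


if __name__ == "__main__":
    refs = ["ref_a.fasta", "ref_b.fasta"]
    # Stub selectors for illustration only.
    print(choose_reference(refs, lambda r: None, lambda r: r[0]))
```

Passing the selectors in as arguments keeps the sketch independent of any particular ANI tool and makes the two branches easy to exercise separately.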