Commit

pipeline: new option "-report-all-metrics" added
alexeigurevich committed Jun 7, 2022
1 parent efc98ab commit 7d94c96
Showing 5 changed files with 25 additions and 3 deletions.
6 changes: 4 additions & 2 deletions CHANGES.txt
@@ -12,8 +12,10 @@
- auN, auNG, auNA, auNGA (areas under the Nx/NGx/NAx/NGAx curves; for more detail see
https://lh3.github.io/2020/04/08/a-new-metric-on-assembly-contiguity or the manual).

3. New option:
- "--local-mis-size" for setting minimal local misassembly size (default is 200, was 86).
3. New options:
- "--local-mis-size" for setting minimal local misassembly size (default is 200, was 86);
- "--report-all-metrics" for keeping the same content (list of metrics) in the main report
independently of inputs/options.

4. MetaQUAST change:
- preserving explicitly specified reference genomes in the reports (they were previously
9 changes: 9 additions & 0 deletions manual.html
@@ -626,6 +626,15 @@ <h3>2.3 Command line options</h3>
By default, the value is automatically detected as the median insert size of provided paired-end reads.
If no paired-end reads are provided, 255 is used as the default value.

<div class='option'>
<code><b>--report-all-metrics</b></code>
</div>
Keep all quality metrics in the main report. By default, metrics that are not relevant to the run are omitted, e.g., reference-based metrics in the no-reference mode.
Also, if a metric's value is undefined ('-') for all input assemblies, the metric is removed from the report.
The only exception to the latter rule is the NG/NGA/LG/LGA-like metrics, which are explicitly reported as '-' if a reference was specified but (the aligned parts of) all assemblies are too small to reach, e.g., NG50 (NGA50).
<br>
The <code>--report-all-metrics</code> option changes this behaviour and forces QUAST (metaQUAST) to keep in the report every metric that can be reported in principle. The number of rows in the main report is then always the same regardless of inputs and running mode/options, which simplifies automatic parsing of the report.

<div class='option'>
<code><b>--plots-format</b></code> <code>&lt;format&gt;</code>
</div>
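The manual text above says a fixed set of rows simplifies automatic parsing of the report. A minimal sketch of such a parser follows, assuming the main report is available as tab-separated text whose first row lists the assembly names and whose remaining rows each hold one metric. The function name `parse_report_tsv` and the sample values are hypothetical and invented for illustration; they are not part of this commit.

```python
import csv
import io


def parse_report_tsv(text):
    """Parse tab-separated report text into {assembly: {metric: value}}."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE))
    header, metric_rows = rows[0], rows[1:]
    assemblies = header[1:]  # the first cell is the row label, not an assembly
    report = {name: {} for name in assemblies}
    for row in metric_rows:
        if not row:
            continue  # skip blank lines
        metric, values = row[0], row[1:]
        for name, value in zip(assemblies, values):
            # '-' marks an undefined value; map it to None
            report[name][metric] = None if value == "-" else value
    return report


# Hypothetical two-assembly report; all numbers are made up.
SAMPLE = (
    "Assembly\tasm_a\tasm_b\n"
    "# contigs\t12\t30\n"
    "N50\t50000\t21000\n"
    "NG50\t-\t-\n"
)

report = parse_report_tsv(SAMPLE)
```

With `--report-all-metrics`, a consumer like this can rely on every metric key being present and only needs to handle `None` values, rather than checking whether a row exists at all.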
4 changes: 4 additions & 0 deletions quast_libs/options_parser.py
@@ -625,6 +625,10 @@ def parse_options(logger, quast_args):
callback_kwargs={'min_value': qconfig.optimal_assembly_min_IS,
'max_value': qconfig.optimal_assembly_max_IS})
),
(['--report-all-metrics'], dict(
dest='report_all_metrics',
action='store_true')
),
(['--plots-format'], dict(
dest='plot_extension',
type='string',
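The new entry in `options_parser.py` is a plain boolean flag. The sketch below shows how a `(flags, kwargs)` pair of that shape behaves when handed to the standard library's `optparse`. It assumes the pairs are forwarded to `OptionParser.add_option`; the surrounding loop is not visible in this hunk, so `build_parser` and `OPTIONS` are hypothetical names used only for illustration.

```python
from optparse import OptionParser

# Option spec in the same (flags, kwargs) shape as the hunk above.
OPTIONS = [
    (['--report-all-metrics'], dict(
        dest='report_all_metrics',
        action='store_true')
     ),
]


def build_parser():
    """Build a parser from (flags, kwargs) pairs (assumed wiring, not QUAST's own code)."""
    parser = OptionParser()
    for flags, kwargs in OPTIONS:
        parser.add_option(*flags, **kwargs)
    # Mirror the default set in qconfig.py (report_all_metrics = False).
    parser.set_defaults(report_all_metrics=False)
    return parser


opts_on, _ = build_parser().parse_args(['--report-all-metrics'])
opts_off, _ = build_parser().parse_args([])
```

Because the action is `store_true`, the flag takes no argument: passing it sets the attribute to `True`, and omitting it leaves the default in place.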
5 changes: 4 additions & 1 deletion quast_libs/qconfig.py
@@ -84,6 +84,7 @@
run_busco = False
large_genome = False
use_kmc = False
report_all_metrics = False

# ideal assembly section
optimal_assembly = False
@@ -486,6 +487,8 @@ def usage(show_hidden=False, mode=None, short=True, stream=sys.stdout):
stream.write(" --upper-bound-min-con <int> Minimal number of 'connecting reads' needed for joining upper bound contigs into a scaffold\n")
stream.write(" [default: %d for mate-pairs and %d for long reads]\n" % (MIN_CONNECT_MP, MIN_CONNECT_LR))
stream.write(" --est-insert-size <int> Use provided insert size in upper bound assembly simulation [default: auto detect from reads or %d]\n" % optimal_assembly_default_IS)
stream.write(" --report-all-metrics Keep all quality metrics in the main report even if their values are '-' for all assemblies or \n"
" if they are not applicable (e.g., reference-based metrics in the no-reference mode)\n")
stream.write(" --plots-format <str> Save plots in specified format [default: %s].\n" % plot_extension)
stream.write(" Supported formats: %s\n" % ', '.join(supported_plot_extensions))
stream.write(" --memory-efficient Run everything using one thread, separately per each assembly.\n")
@@ -506,7 +509,7 @@ def usage(show_hidden=False, mode=None, short=True, stream=sys.stdout):
stream.write(" --sam <filename,filename,...> Comma-separated list of SAM alignment files obtained by aligning reads to assemblies\n"
" (use the same order as for files with contigs)\n")
stream.write(" --bam <filename,filename,...> Comma-separated list of BAM alignment files obtained by aligning reads to assemblies\n"
" (use the same order as for files with contigs)\n")
" (use the same order as for files with contigs)\n")
stream.write(" Reads (or SAM/BAM file) are used for structural variation detection and\n")
stream.write(" coverage histogram building in Icarus\n")
stream.write(" --sv-bedpe <filename> File with structural variations (in BEDPE format)\n")
4 changes: 4 additions & 0 deletions quast_libs/reporting.py
@@ -457,6 +457,10 @@ def table(order=Fields.order, ref_name=None):
required_fields = []

def define_required_fields():
if qconfig.report_all_metrics:
required_fields.extend(Fields.order)
return

# if a reference is specified, keep the same number of Nx/Lx-like genome-based metrics in different reports
# (no matter what percent of the genome was assembled)
report = get(assembly_fpaths[0], ref_name=ref_name)
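The early return added to `define_required_fields` marks every field in `Fields.order` as required when the flag is set. The standalone sketch below illustrates the effect on the number of rows. It is a simplification under stated assumptions: `FIELD_ORDER`, `select_fields` and `render_rows` are hypothetical stand-ins, and the default branch here simply drops fields that are undefined for every assembly, whereas QUAST's own default path also handles the NG/NGA-like exception described in the manual.

```python
# Hypothetical metric order; the real list lives in reporting.Fields.order.
FIELD_ORDER = ["# contigs", "Total length", "N50", "NG50", "Genome fraction (%)"]


def select_fields(values_by_field, report_all_metrics=False, order=FIELD_ORDER):
    """Return the fields to print, in report order."""
    if report_all_metrics:
        # Same idea as the early return in the hunk: keep everything.
        return list(order)
    # Simplified default: keep a field only if some assembly has a value for it.
    return [f for f in order
            if any(v is not None for v in values_by_field.get(f, []))]


def render_rows(values_by_field, n_assemblies, report_all_metrics=False):
    """Build report rows, printing '-' for undefined values."""
    rows = []
    for field in select_fields(values_by_field, report_all_metrics):
        values = values_by_field.get(field, [None] * n_assemblies)
        rows.append([field] + ['-' if v is None else str(v) for v in values])
    return rows


# A no-reference run: reference-based metrics have no values at all.
no_ref_run = {
    "# contigs": [12, 30],
    "Total length": [100000, 98000],
    "N50": [50000, 21000],
}

default_rows = render_rows(no_ref_run, 2)
full_rows = render_rows(no_ref_run, 2, report_all_metrics=True)
```

In this sketch the default report has three rows, while the full report always has one row per entry in `FIELD_ORDER`, with `'-'` filling the reference-based metrics.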
