sample_analysis_opts fraction.min is confusing #59

ressy · 2021-11-18T21:09:39Z

The output data tables, both per-file and per-sample, have FractionOfTotal and FractionOfLocus columns, and there is a configurable threshold, fraction.min, for the fraction of reads required to consider a peak a candidate allele. But the fraction compared against that threshold is neither of those two columns. Its denominator is the sum of the read counts in each processed-samples table, which is a more stringently filtered set than just the reads matching the locus via primer(s).

To summarize:

FractionOfTotal: denominator is the number of reads in the whole input file
FractionOfLocus: denominator is the number of reads for all entries sharing a MatchingLocus column (determined by forward primer and optionally reverse primer)
Fraction applied when categorizing each row via analyze_sample() (currently not exposed as an explicit column): denominator is the number of reads matching the per-locus primer(s), repeat motif, and length range
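
To make the difference concrete, here is a minimal Python sketch of the three denominators. It is only an illustration and not the package's code: the row layout, the read counts, the `fractions` helper, and the threshold value 0.22 are all hypothetical, chosen so that one row passes under the filtered denominator but would fail under the FractionOfLocus denominator.

```python
# Hypothetical rows: (locus matched via primer(s), read count,
# passes the repeat motif and length range checks for that locus).
rows = [
    ("A", 60, True),
    ("A", 20, True),
    ("A", 20, False),   # primer match only; fails motif/length
    ("B", 50, True),
    (None, 50, False),  # no primer match at all
]

def fractions(rows, locus):
    """Compute the three fractions for every row matching `locus`."""
    total = sum(count for _, count, _ in rows)
    locus_total = sum(count for loc, count, _ in rows if loc == locus)
    filtered_total = sum(
        count for loc, count, ok in rows if loc == locus and ok)
    result = []
    for loc, count, ok in rows:
        if loc != locus:
            continue
        result.append({
            "count": count,
            "fraction_of_total": count / total,
            "fraction_of_locus": count / locus_total,
            # Only rows that survive filtering get the third fraction.
            "fraction_of_filtered": count / filtered_total if ok else None,
        })
    return result

FRACTION_MIN = 0.22  # illustrative threshold, not a package default

per_row = fractions(rows, "A")
# Rows that would count as candidates under the filtered denominator.
candidates = [
    r for r in per_row
    if r["fraction_of_filtered"] is not None
    and r["fraction_of_filtered"] >= FRACTION_MIN
]
```

In this made-up data the 20-read row that passes filtering has a fraction of 0.25 against the filtered denominator but only 0.2 against the locus denominator, so reading FractionOfLocus alongside the threshold would suggest it should have been excluded.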

This should be clarified in the documentation and outputs.

ressy added the enhancement label Nov 18, 2021

ressy added this to the Version 0.3.2 milestone Nov 18, 2021

ressy modified the milestones: Version 0.3.2, Version 0.4.0 Feb 10, 2022

ressy modified the milestones: Version 0.4.0, Version 0.5.0 Apr 25, 2022
