sample_analysis_opts fraction.min is confusing #59

Open
ressy opened this issue Nov 18, 2021 · 0 comments

ressy commented Nov 18, 2021

The output data tables, both per-file and per-sample, have FractionOfTotal and FractionOfLocus columns, and we have a configurable threshold, fraction.min, for the fraction of reads required to consider a peak a candidate allele. But the fraction compared against that threshold is neither of those two columns. Its denominator is the sum of the read counts in each processed-samples table, which is a more stringently filtered set of reads than those matching the locus by primer(s) alone.

To summarize:

  • FractionOfTotal: denominator is the number of reads in the whole input file
  • FractionOfLocus: denominator is the number of reads for all entries sharing a MatchingLocus column (determined by forward primer and optionally reverse primer)
  • the fraction compared against fraction.min when categorizing each row via analyze_sample(), which is currently not reported in any explicit column: denominator is the number of reads matching the per-locus primer(s), repeat motif, and length range
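
To make the difference concrete, here is a minimal Python sketch of the three denominators on a toy table. It is an illustration only, not the package's implementation: the `Row` fields, the `compare_fractions()` helper, the `FractionOfFiltered` name, the read counts, and the 0.22 threshold are all hypothetical, chosen so that one row lands on different sides of the threshold depending on which fraction is used.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One unique sequence in a toy per-file table (hypothetical fields)."""
    name: str
    count: int
    matching_locus: str   # locus assigned by primer match
    motif_match: bool     # contains the locus repeat motif
    length_match: bool    # falls inside the locus length range


def compare_fractions(rows, locus, fraction_min):
    """Compute the three fractions for each row that survives filtering."""
    # Denominator 1: every read in the input file.
    total = sum(r.count for r in rows)
    # Denominator 2: reads sharing the same MatchingLocus (primer match only).
    locus_rows = [r for r in rows if r.matching_locus == locus]
    locus_total = sum(r.count for r in locus_rows)
    # Denominator 3: reads also matching repeat motif and length range.
    filtered = [r for r in locus_rows if r.motif_match and r.length_match]
    filtered_total = sum(r.count for r in filtered)
    results = []
    for r in filtered:
        frac_filtered = r.count / filtered_total
        results.append({
            "name": r.name,
            "FractionOfTotal": r.count / total,
            "FractionOfLocus": r.count / locus_total,
            # Hypothetical column name; no such column exists today.
            "FractionOfFiltered": frac_filtered,
            "passes_threshold": frac_filtered >= fraction_min,
        })
    return results


rows = [
    Row("seq1", 600, "A", True, True),
    Row("seq2", 200, "A", True, True),
    Row("seq3", 200, "A", False, True),   # primer match, but no repeat motif
    Row("seq4", 1000, "B", True, True),   # a different locus
]
results = compare_fractions(rows, locus="A", fraction_min=0.22)
for res in results:
    print(res)
```

With these made-up counts, seq2 has a FractionOfTotal of 0.1 and a FractionOfLocus of 0.2, but its fraction of the filtered reads is 0.25. Against a threshold of 0.22 it passes, even though both reported columns are below the threshold, which is exactly the confusion described above.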

This should be clarified in the documentation and outputs.

@ressy ressy added this to the Version 0.3.2 milestone Nov 18, 2021
@ressy ressy modified the milestones: Version 0.3.2, Version 0.4.0 Feb 10, 2022
@ressy ressy modified the milestones: Version 0.4.0, Version 0.5.0 Apr 25, 2022