Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in allele frequency output in `WC_FST_Diploids_2Alleles' #28

Open
akijarl opened this issue Mar 16, 2021 · 0 comments
Open

Error in allele frequency output in `WC_FST_Diploids_2Alleles' #28

akijarl opened this issue Mar 16, 2021 · 0 comments

Comments

@akijarl
Copy link

akijarl commented Mar 16, 2021

Hello,

I found a minor bug in the WC_FST_Diploids_2Alleles function. Presently the p_freqs object tallies the homozygous absent counts (Sample_Mat[, 1], instead of Sample_Mat[, 3]), along with half the heterozygotes to tabulate the meanAlleleFreq output.
Here's an example of the Sample_Mat object layout:

> Sample_Mat
     0    1    2
1  701 1595 2704
2 4524  180  296

So the meanAlleleFreq output is actually tallying up the frequency of the ancestral, or REF, alleles, instead of the the derived, or ALT, alleles. This leads to a meanAlleleFreq value that's 1-p if you're interested in the ALT allele frequency. I highlighted the change in the function below with #'s.

WC_FST_Diploids_2Alleles <- function (Sample_Mat) 
{
  sample_sizes = rowSums(Sample_Mat)
  n_ave = mean(sample_sizes)
  n_pops = nrow(Sample_Mat)
  r = n_pops
  n_c = (n_pops * n_ave - sum(sample_sizes^2)/(n_pops * n_ave))/(n_pops - 1)
  #####################
  p_freqs = (Sample_Mat[, 3] + Sample_Mat[, 2]/2)/sample_sizes ### changed this - was Sample_Mat[, 1] which is hom. absent, instead of hom. present
  #####################
  p_ave = sum(sample_sizes * p_freqs)/(n_ave * n_pops)
  s2 = sum(sample_sizes * (p_freqs - p_ave)^2)/((n_pops - 1) * n_ave)
  if (s2 == 0) {
    return(0)
    break
  }
  h_freqs = Sample_Mat[, 2]/sample_sizes
  h_ave = sum(sample_sizes * h_freqs)/(n_ave * n_pops)
  a <- n_ave/n_c * (s2 - 1/(n_ave - 1) * (p_ave * (1 - p_ave) - ((r - 1)/r) * s2 - (1/4) * h_ave))
  b <- n_ave/(n_ave - 1) * (p_ave * (1 - p_ave) - (r - 1)/r * s2 - (2 * n_ave - 1)/(4 * n_ave) * h_ave)
  c <- 1/2 * h_ave
  aNoCorr <- n_ave/n_c * (s2)
  bNoCorr <- (p_ave * (1 - p_ave) - (r - 1)/r * s2 - (2 * n_ave)/(4 * n_ave) * h_ave)
  cNoCorr <- 1/2 * h_ave
  He <- 1 - sum(p_ave^2, (1 - p_ave)^2)
  FST <- a/(a + b + c)
  FSTNoCorr = aNoCorr/(aNoCorr + bNoCorr + cNoCorr)
  return(list(He = He, FST = FST, T1 = a, T2 = (a + b + c), FSTNoCorr = FSTNoCorr, T1NoCorr = aNoCorr, T2NoCorr = (aNoCorr + bNoCorr + cNoCorr), meanAlleleFreq = p_ave))
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant