As we increase the mismatch rates, we increase the number of mutations relative to edges in the TS. When dating, all other things being equal, this will make most node times appear older. We suspect that the human mutation rate we use, 1e-8, is calculated under infinite-sites assumptions, and that this might be pushing the OOA peak in our inferred+dated tree sequences too far back in time. There are a few ways we could check whether this makes a difference:
1. Simulate the OOA model with error and test some different mismatch rates. I have a large number of OOA inferred TSs on cycloid; we can just run tsdate on those.
2. Tsdate the TGP tree sequence that was produced without mismatch, and plot the Afr/Afr vs non-Afr/non-Afr tMRCA histograms to see whether they show the same pattern as in the merged data, but shifted in time (we don't need to do this for all individuals, just one or two).
3. Remove the non-IS sites, reduce the mutation rate correspondingly, and run tsdate again on just those sites, plotting the same histograms. To find the IS sites to keep, we could either remove sample mutations first (leaving 92% IS sites) or, if we are worried that this will bias the estimates, use only the 40% of sites that carry a single mutation. Either way, we will need to decrease the mutation rate in tsdate by multiplying by 0.92 or 0.4 respectively (see the sketch after this list).
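A minimal sketch of the single-mutation-site version of 3., assuming the inferred tree sequence is on disk and using standard tskit/tsdate calls; the filename and Ne value are placeholders, and the 0.4 factor is recomputed from the fraction of sites actually kept rather than hard-coded:

```python
import tskit
import tsdate

MUTATION_RATE = 1e-8  # the human rate discussed above
ts = tskit.load("tgp_inferred.trees")  # placeholder filename

# Keep only sites carrying exactly one mutation (the ~40% subset);
# delete_sites() takes the IDs of the sites to drop.
to_drop = [site.id for site in ts.sites() if len(site.mutations) != 1]
ts_is = ts.delete_sites(to_drop)

# Scale the mutation rate by the fraction of sites retained and re-date.
# Keyword names (Ne vs population_size) depend on the tsdate version.
fraction_kept = ts_is.num_sites / ts.num_sites
dated = tsdate.date(ts_is, mutation_rate=MUTATION_RATE * fraction_kept, Ne=10000)
```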
Here's a basic answer to 1., using the data from OOA simulated trees inferred with sequencing and ancestral-state error. It looks like we underestimate (not overestimate) the OOA event in general, possibly because we are not accounting for the demography. Tsdate was run with mutation_rate=1.29e-08 and Ne=10000 (the mutation rate is that used in the OOA simulations).
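For concreteness, a sketch of that dating call (not the exact script used); the filename is hypothetical and the keyword names assume the tsdate.date(ts, mutation_rate=..., Ne=...) signature:

```python
import tskit
import tsdate

ts = tskit.load("ooa_inferred.trees")  # placeholder filename
dated_ts = tsdate.date(ts, mutation_rate=1.29e-8, Ne=10000)
dated_ts.dump("ooa_dated.trees")
```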
It looks like the extra mutations don't make much of a difference to the OOA peak, increasing it from 2000 generations to ~3500 generations (it is actually at 5600 generations in the model, where the red line is). It is interesting that, whatever the mismatch rate, we put a high peak at 15000 generations in the CEU/CEU plot, whereas in the original data it is actually much lower, about the same height as the OOA peak.
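A rough sketch of how the per-pair tMRCA histograms above can be pulled out of a dated tree sequence; the population indices (0 = YRI, 1 = CEU), the filename, and the use of a single sample pair per population are assumptions for illustration:

```python
import matplotlib.pyplot as plt
import tskit

dated_ts = tskit.load("ooa_dated.trees")  # placeholder filename

def pair_tmrcas(ts, u, v):
    """Return (tMRCA, tree span) for one sample pair across all trees."""
    times, spans = [], []
    for tree in ts.trees():
        left, right = tree.interval
        times.append(tree.tmrca(u, v))
        spans.append(right - left)
    return times, spans

# One within-population pair each, assuming populations 0 = YRI, 1 = CEU.
yri_pair = dated_ts.samples(population=0)[:2]
ceu_pair = dated_ts.samples(population=1)[:2]

for label, (u, v) in [("YRI/YRI", yri_pair), ("CEU/CEU", ceu_pair)]:
    times, spans = pair_tmrcas(dated_ts, u, v)
    plt.hist(times, bins=50, weights=spans, alpha=0.5, label=label)
plt.axvline(5600, color="red")  # OOA split time in the model (generations)
plt.xlabel("tMRCA (generations)")
plt.legend()
plt.show()
```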