Problems with the number of populations #8

Open
katymoo opened this issue Aug 11, 2017 · 3 comments

Comments


katymoo commented Aug 11, 2017

Dear Michael and Katie,

I am trying to run OutFLANK on my SNP dataset and am running into a problem with setting the number of populations. I think there must be something wrong with the way I am specifying the population names, but I can't figure out what. I have 10 populations and 132 sampled individuals.

I tried using the following script:

SNPmat <- read.table("SNPmat.txt")
locusNames <- seq(from = 1, to = 108586, by = 1)
popNames <- c(rep("CAM", 22), rep("GAM", 2), rep("KAS", 9), rep("LOB", 13), rep("LOP", 9), rep("MCR", 19), rep("MBD", 10), rep("MIN", 14), rep("NDI", 22), rep("TAK", 12))
FstDataFrame <- MakeDiploidFSTMat(SNPmat, locusNames, popNames)
OutFLANK(FstDataFrame, LeftTrimFraction = 0.05, RightTrimFraction = 0.05, Hmin = 0.1, NumberOfSamples = 10, qthreshold = 0.05)

However, I get the following error message:
Error in optim(NumberOfSamples, localNLLAllData, lower = 2, method = "L-BFGS-B") :
L-BFGS-B needs finite values of 'fn'
In addition: Warning messages:
1: In IncompleteGammaFunction(df/2, df * HighTrimPoint/(2 * Fstbar)) :
value out of range in 'gammafn'
2: In IncompleteGammaFunction(df/2, df * LowTrimPoint/(2 * Fstbar)) :
value out of range in 'gammafn'

When I change the number of sampled populations to 132 (the number of individuals), the error message goes away. So OutFLANK(FstDataFrame, LeftTrimFraction=0.05, RightTrimFraction=0.05, Hmin=0.1, NumberOfSamples=132, qthreshold=0.05) runs fine, but OutFLANK is then presumably treating each individual as a sampled population, so the results would be meaningless.
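The error message shows NumberOfSamples being passed as the first argument to optim(), which is the optimizer's starting value. The sketch below is only a toy illustration of how L-BFGS-B can fail from one starting value and run from another. toy_nll is a made-up objective, not OutFLANK's likelihood, and this does not establish what actually went wrong with the data above.

```r
# Toy objective (hypothetical, NOT OutFLANK's likelihood): it returns Inf
# for small df to mimic a function that is not finite everywhere.
toy_nll <- function(df) {
  if (df < 50) Inf else (df - 100)^2
}

# Starting at 10, the first evaluation is Inf, so L-BFGS-B stops with an error.
bad_start <- tryCatch(
  optim(10, toy_nll, lower = 2, method = "L-BFGS-B"),
  error = function(e) conditionMessage(e)
)
print(bad_start)

# Starting at 132, the first evaluation is finite and the optimizer can proceed.
good_start <- optim(132, toy_nll, lower = 2, method = "L-BFGS-B")
print(good_start$par)
```

If something similar is happening here, a run that only succeeds with a different NumberOfSamples would point at the values going into the likelihood rather than at the population labels, but that is a guess until the inputs are checked.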

My popNames look like this:

[1] "CAM" "CAM" "CAM" "CAM" "CAM" "CAM" "CAM" "CAM" "CAM" "CAM" "CAM" "CAM"
[13] "CAM" "CAM" "CAM" "CAM" "CAM" "CAM" "CAM" "CAM" "CAM" "CAM" "GAM" "GAM"
[25] "KAS" "KAS" "KAS" "KAS" "KAS" "KAS" "KAS" "KAS" "KAS" "LOB" "LOB" "LOB"
[37] "LOB" "LOB" "LOB" "LOB" "LOB" "LOB" "LOB" "LOB" "LOB" "LOB" "LOP" "LOP"
[49] "LOP" "LOP" "LOP" "LOP" "LOP" "LOP" "LOP" "MCR" "MCR" "MCR" "MCR" "MCR"
[61] "MCR" "MCR" "MCR" "MCR" "MCR" "MCR" "MCR" "MCR" "MCR" "MCR" "MCR" "MCR"
[73] "MCR" "MCR" "MBD" "MBD" "MBD" "MBD" "MBD" "MBD" "MBD" "MBD" "MBD" "MBD"
[85] "MIN" "MIN" "MIN" "MIN" "MIN" "MIN" "MIN" "MIN" "MIN" "MIN" "MIN" "MIN"
[97] "MIN" "MIN" "NDI" "NDI" "NDI" "NDI" "NDI" "NDI" "NDI" "NDI" "NDI" "NDI"
[109] "NDI" "NDI" "NDI" "NDI" "NDI" "NDI" "NDI" "NDI" "NDI" "NDI" "NDI" "NDI"
[121] "TAK" "TAK" "TAK" "TAK" "TAK" "TAK" "TAK" "TAK" "TAK" "TAK" "TAK" "TAK"
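A quick way to confirm that a label vector like this has one entry per individual and the expected number of distinct populations is sketched below. It rebuilds the labels from the sample sizes given in the script above; pop_sizes is a helper name introduced only for this example.

```r
# Per-population sample sizes taken from the script in this post.
pop_sizes <- c(CAM = 22, GAM = 2, KAS = 9, LOB = 13, LOP = 9,
               MCR = 19, MBD = 10, MIN = 14, NDI = 22, TAK = 12)

# One label per individual, in the same order as the sizes above.
popNames <- rep(names(pop_sizes), times = pop_sizes)

n_individuals <- length(popNames)          # 132
n_populations <- length(unique(popNames))  # 10
print(c(individuals = n_individuals, populations = n_populations))

# Individuals per population (table() sorts the labels alphabetically).
print(table(popNames))
```

Both counts agree with the numbers stated in the post, so on this evidence the labels themselves look consistent.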

I have also tried creating an input file specifying the population names (see attached file)
PopNames.txt

Then I used the following commands to try and run OutFLANK:

pops <- read.csv("PopNames.txt")
FstDataFrame <- MakeDiploidFSTMat(SNPmat, locusNames, popNames = pops$x)

However, I still have the same problem: OutFLANK only runs if I set the number of sampled populations to 132.
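When the labels come from a file, it may be worth checking what read.csv() actually returned before passing pops$x on. The layout of the attached PopNames.txt is not shown in this thread, so the sketch below writes a small stand-in file and assumes a header row with a single column named x. The labels in it are shortened for illustration.

```r
# Stand-in for PopNames.txt (assumed layout: header row, one column "x").
tmp <- tempfile(fileext = ".txt")
labels <- rep(c("CAM", "GAM", "KAS"), times = c(3, 2, 4))
write.csv(data.frame(x = labels), tmp, row.names = FALSE)

pops <- read.csv(tmp, stringsAsFactors = FALSE)

# pops$x is NULL if the column is missing or named differently,
# so check that first, then check for missing labels.
stopifnot(!is.null(pops$x))
popNames <- as.character(pops$x)
stopifnot(!anyNA(popNames))

print(length(popNames))          # should equal the number of rows in SNPmat
print(length(unique(popNames)))  # should equal the number of populations

unlink(tmp)
```

If the real file has no header row, read.csv() would treat the first label as a column name and the vector would come back one element short, which the length check would catch.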

I am sure I'm making a silly little mistake but I really can't figure out what it might be. Any help would be greatly appreciated!

Thank you!

Katy

Collaborator

DrK-Lo commented Aug 13, 2017 via email

Author

katymoo commented Aug 15, 2017

Hi, many thanks for your response!

I'm still really confused. Maybe something is wrong with my SNPmat file, so I have zipped it up and attached it here.

SNPmat2.txt.tar.gz

I am trying to set NumberOfSamples to 10 (there are 10 populations; CM and GC are two separate pops), but I get the following error message:
Error in optim(NumberOfSamples, localNLLAllData, lower = 2, method = "L-BFGS-B")
The analysis only seems to run if I set NumberOfSamples to 132, which is the number of individuals rather than populations. So I'm still not sure what I'm doing wrong.

This is my code:

SNPmat <- read.table("SNPmat2.txt")
locusNames <- seq(from = 1, to = 108586, by = 1)
pops <- read.csv("PopNames.txt")
FstDataFrame <- MakeDiploidFSTMat(SNPmat, locusNames, popNames = pops$x)
OutFLANK(FstDataFrame, LeftTrimFraction = 0.05, RightTrimFraction = 0.05, Hmin = 0.1, NumberOfSamples = 10, qthreshold = 0.05)
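Before calling MakeDiploidFSTMat, the three inputs can be checked against each other. The sketch below uses a small simulated matrix rather than the attached file. It assumes, as I understand the OutFLANK documentation, that SNPmat has one row per individual and one column per locus, with genotypes coded 0, 1 or 2 and 9 for missing. check_outflank_inputs is a hypothetical helper written for this example, not an OutFLANK function.

```r
# Hypothetical helper (not part of OutFLANK): returns named TRUE/FALSE checks.
check_outflank_inputs <- function(SNPmat, locusNames, popNames) {
  SNPmat <- as.matrix(SNPmat)
  c(
    rows_match_individuals = nrow(SNPmat) == length(popNames),
    cols_match_loci        = ncol(SNPmat) == length(locusNames),
    # Assumed coding: 0/1/2 allele copies, 9 for missing; NA counts as a failure.
    codes_are_0_1_2_9      = all(SNPmat %in% c(0, 1, 2, 9))
  )
}

# Simulated example: 6 individuals, 4 loci, 2 populations.
set.seed(1)
sim_mat   <- matrix(sample(c(0, 1, 2, 9), 24, replace = TRUE), nrow = 6)
sim_loci  <- seq_len(4)
sim_pops  <- rep(c("A", "B"), each = 3)

ok_result <- check_outflank_inputs(sim_mat, sim_loci, sim_pops)
print(ok_result)

# A matrix that uses NA for missing data fails the coding check.
bad_mat <- sim_mat
bad_mat[1, 1] <- NA
bad_result <- check_outflank_inputs(bad_mat, sim_loci, sim_pops)
print(bad_result)
```

With the real data, the same helper could be called as check_outflank_inputs(SNPmat, locusNames, pops$x). A FALSE in the first two entries would suggest a transposed matrix or a header row read as data, and a FALSE in the third would suggest a different genotype coding.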

Many thanks for your help!


EveTC commented Sep 4, 2020

Hi @DrK-Lo,

I am receiving the same error as @katymoo. Did anyone find a solution to this error?

I have made sure to pass the number of populations in the NumberOfSamples argument, like so:

sw_out <- OutFLANK(FstDataFrame=sw.Fs, LeftTrimFraction=0.05, RightTrimFraction=0.05, Hmin=0.1, NumberOfSamples=36, qthreshold=0.1)

but it outputs this error:

Error in optim(NumberOfSamples, localNLLAllData, lower = 2, method = "L-BFGS-B") :
  L-BFGS-B needs finite values of 'fn'

I also receive the same error message when I set the argument to the number of individuals.
Is there a limit to how many populations OutFLANK can handle?

Any help would be greatly appreciated!
Thanks,
Eve


3 participants