Error in quantile.default(pi0, prob = 0.1) : missing values and NaN's not allowed if 'na.rm' is FALSE #27

Open
CelineReisser opened this issue Mar 10, 2021 · 10 comments


@CelineReisser

Hi,

My colleague and I are trying to run OutFLANK on a set of 30,000 loci in 18 samples.
We prepared the input and selected the pruned loci, and everything goes well until we reach the OutFLANK function, where we get the following error message:

out_trim <- OutFLANK(FstDataFrame = my_fst[which_pruned, ], LeftTrimFraction = 0.05, RightTrimFraction = 0.05, NumberOfSamples = 18, qthreshold = 0.05)
Error in quantile.default(pi0, prob = 0.1) : missing values and NaN's not allowed if 'na.rm' is FALSE

We removed all NA genotypes from our VCF and kept only loci with MAF > 0.15.

I am not sure what is happening here. I looked at the source code of the functions called inside OutFLANK, but couldn't identify the source of the problem.
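For reference, the message itself comes from base R's quantile(), which stops as soon as its input holds an NA or NaN while na.rm is left at FALSE. A minimal sketch with a made-up pi0 vector (the values are hypothetical, not taken from the dataset in this issue) reproduces the same text:

```r
# Hypothetical pi0 estimates; one of them is NaN.
pi0 <- c(0.90, NaN, 0.80)

# quantile() refuses NA/NaN input unless na.rm = TRUE,
# so we capture the error message instead of stopping.
msg <- tryCatch(
  quantile(pi0, probs = 0.1),
  error = function(e) conditionMessage(e)
)
print(msg)

# With na.rm = TRUE the non-finite entry is dropped first.
q10 <- quantile(pi0, probs = 0.1, na.rm = TRUE)
print(q10)
```

So the vector named pi0 presumably holds at least one NaN by the time quantile() is reached; where that NaN comes from is the open question.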

Any ideas?

Thank you very much for any help.

Celine

@DrK-Lo
Collaborator

DrK-Lo commented Mar 10, 2021 via email

@CelineReisser
Author

Hi there,

Thank you for the very quick answer.

As suggested, I generated a random variable of 100,000 values with the rchisq function and passed it to the qvalue function, and it worked... So this might not be the reason?
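One way to write such a sanity check is sketched below. It assumes the chi-square draws are converted to right-tail p-values before being handed to qvalue (df = 2 and the seed are arbitrary choices), and it only calls qvalue if that package happens to be installed:

```r
set.seed(1)

# Simulated chi-square statistics and their right-tail p-values.
x <- rchisq(100000, df = 2)
p <- pchisq(x, df = 2, lower.tail = FALSE)

# qvalue needs p-values inside [0, 1] with no NA/NaN.
p_ok <- all(is.finite(p)) && all(p >= 0 & p <= 1)
print(p_ok)

# Only run qvalue when the package is available.
if (requireNamespace("qvalue", quietly = TRUE)) {
  q <- qvalue::qvalue(p)
  print(q$pi0)
}
```

If this runs cleanly, well-behaved p-values on their own are unlikely to trigger the error, which is consistent with the observation above.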

However, we just found some weird behavior:
We have two large datasets of 11 million SNPs (with missing values) and 6 million SNPs (with no missing values, as we saw that bigsnpr does not handle them properly). We created a subset of each file containing 30,000 SNPs for testing purposes. The OutFLANK function works on the dataset containing missing data, but not on the one without NA... Everything in those files is identical, except that there is no NA in the latter.

We inspected the R objects created along the pipeline, and they look the same for both datasets. The Fst calculation goes well for both; only the OutFLANK function fails...

@DrK-Lo
Collaborator

DrK-Lo commented Mar 10, 2021 via email

@CelineReisser
Author

Apparently not. I ran the following command:

table(is.na(my_fst$FST))

FALSE
30000
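Checking FST alone may not be enough, since the data frame passed to OutFLANK carries other numeric columns as well. A rough sketch, using a toy data frame whose column names (He, FST, FSTNoCorr) only stand in for the real ones, counts NA, NaN and infinite values in every numeric column:

```r
# Count non-finite entries (NA, NaN, Inf, -Inf) per numeric column.
count_non_finite <- function(df) {
  num_cols <- vapply(df, is.numeric, logical(1))
  vapply(df[num_cols], function(x) sum(!is.finite(x)), integer(1))
}

# Toy stand-in for my_fst; names and values are hypothetical.
toy <- data.frame(
  LocusName = c("a", "b", "c"),
  He        = c(0.30, 0.00, 0.20),
  FST       = c(0.10, 0.20, 0.05),
  FSTNoCorr = c(0.12, NaN, 0.07)
)

bad <- count_non_finite(toy)
print(bad)
```

On the real data this would be called as count_non_finite(my_fst[which_pruned, ]); any column with a non-zero count would be a candidate for where the NaN enters.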

@yvanpapa

yvanpapa commented May 6, 2021

Hi, I am encountering the same error when using the OutFLANK wrapper implemented in dartR.
gl.outflank(gl_new,plot=T,na.rm=T)->outflank
Error in quantile.default(pi0, prob = 0.1) : missing values and NaN's not allowed if 'na.rm' is FALSE
Is the origin of this error still unknown?

@CelineReisser
Author

We still don't know on our side.

We have been working around the problem by using other packages for the outlier detection, and I plan to come back to it in the next few weeks to try to understand it better. It seems the error is generated by the qvalue package...

@jpfontenelle

Heya. Has anyone found a solution to this? I get the same error as the people above.

@CelineReisser
Author

Hi there,
No solution so far.

@jpfontenelle

I've been playing around with it a bit. While I still can't figure out a way to pass na.rm=T to the quantile() function that OutFLANK calls internally, I could "hack" around it by adjusting the LeftTrimFraction and RightTrimFraction parameters, mostly by passing higher values than the defaults. Might be worth a try, since the problem appears to be dataset related. Not ideal, but it is something.
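A sketch of that workaround: retry with progressively larger trim fractions and keep the first run that does not error. The helper name try_outflank, the list of trim values, and the stub used for the demo are all made up for illustration; the fit argument defaults to OutFLANK::OutFLANK and is only looked up if no replacement is supplied. Larger trims change which loci are used to fit the neutral distribution, so this sidesteps the error rather than explaining it.

```r
# Retry a fitting function with increasing trim fractions.
# Returns the first successful result and the trim used, or NULL.
try_outflank <- function(fst_df, n_samples,
                         trims = c(0.05, 0.10, 0.20, 0.30),
                         fit = OutFLANK::OutFLANK) {
  for (tr in trims) {
    res <- tryCatch(
      fit(FstDataFrame = fst_df,
          LeftTrimFraction = tr, RightTrimFraction = tr,
          NumberOfSamples = n_samples, qthreshold = 0.05),
      error = function(e) NULL  # swallow the error and try the next trim
    )
    if (!is.null(res)) return(list(trim = tr, result = res))
  }
  NULL
}

# Hypothetical stub standing in for OutFLANK so the sketch runs
# without the package: it fails for trims below 0.2.
stub_fit <- function(FstDataFrame, LeftTrimFraction, RightTrimFraction,
                     NumberOfSamples, qthreshold) {
  if (LeftTrimFraction < 0.2) {
    stop("missing values and NaN's not allowed if 'na.rm' is FALSE")
  }
  list(ok = TRUE)
}

demo <- try_outflank(data.frame(FST = c(0.1, 0.2)), n_samples = 18,
                     fit = stub_fit)
print(demo$trim)

# With the real package installed, the call would look like:
# try_outflank(my_fst[which_pruned, ], n_samples = 18)
```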

@Afei99357

Has anyone figured out the issue? I have the same error too.
