Missing data -- allowed in OutFLANK but not bigsnpr #15

Open
CassinSackett opened this issue Sep 8, 2018 · 0 comments


CassinSackett commented Sep 8, 2018

Perhaps this is a philosophical rather than a technical question.

I would like to use OutFLANK to detect possible outliers in several population pairs, each of which has up to 20% missing data (70k to 200k SNPs per population pair). OutFLANK handles missing data just fine, but the method depends on trimming SNPs first, and the suggested tool for that step is bigsnpr. However, bigsnpr does not accept missing data and instead recommends imputing genotypes.

Do you have any reason to believe imputing data might reduce the power to detect outliers because it should tend to homogenize loci? Or, conversely, could imputation lead to false positives at loci that did not have missing data, if many other sites were imputed?

Alternatively, do you know of other R packages that could trim SNPs that do allow missing data?
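
To make the request concrete, here is a minimal Python sketch of what trimming that tolerates missing genotypes might look like: a greedy thinning pass that measures correlation between two SNPs using only the individuals genotyped at both. This is an illustration under stated assumptions, not the procedure used by OutFLANK or bigsnpr. The function names (`pairwise_r2`, `thin_snps`), the thresholds (`r2_max`, `window`, `min_shared`), and the encoding of missing calls as `None` are all hypothetical.

```python
def pairwise_r2(a, b, min_shared=5):
    """Squared Pearson correlation between two SNPs (genotypes coded 0/1/2),
    computed only over individuals with a call at both SNPs (None = missing)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n < min_shared:
        # Assumption: with too little overlap, treat the pair as unlinked.
        return 0.0
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        # Monomorphic among the shared individuals: correlation is undefined.
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return (sxy * sxy) / (sxx * syy)


def thin_snps(genotypes, r2_max=0.2, window=50):
    """Greedy thinning over SNPs in order: keep a SNP unless its r^2 with an
    already-kept SNP within `window` positions exceeds r2_max.
    `genotypes` is a list of SNPs, each a list of per-individual calls."""
    kept = []
    for j, snp in enumerate(genotypes):
        nearby = [k for k in kept if j - k <= window]
        if all(pairwise_r2(snp, genotypes[k]) <= r2_max for k in nearby):
            kept.append(j)
    return kept


# Toy data: 3 SNPs x 8 individuals, with some missing calls.
snp0 = [0, 1, 2, 0, 1, 2, None, 1]
snp1 = [0, 1, 2, 0, None, 2, 1, 1]   # matches snp0 wherever both are called
snp2 = [2, 0, 1, 1, 0, 2, 1, None]   # uncorrelated with snp0 on shared calls
kept_indices = thin_snps([snp0, snp1, snp2])
```

No genotype is filled in at any point in this sketch. The cost is that each pairwise estimate rests on a different subset of individuals, so estimates get noisier as missingness grows.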

Thank you for your thoughts,
Loren
