Allow reading of gzipped files #12

Closed
lckarssen opened this issue Nov 13, 2015 · 2 comments

Comments

@lckarssen
Member

At least for genetic data the option of reading gzipped files would be great. With current imputed data sets this would save a lot of disk space.

@maarten-k has proof-of-principle code in his fork at https://github.com/maarten-k/ProbABEL

@lckarssen
Member Author

See also Issue #20.

lckarssen added a commit to PolyOmica/ProbABEL that referenced this issue Mar 1, 2016
According to Yurii's reply on a forum topic, this option never worked as
intended and should have been removed a long time ago: "Never use
'--interaction_only' option! This is something from our early
experiments with GxE, and it has proven to generate crap results (GC
lambda << 1)."
@lckarssen
Member Author

Given that, with PR #42, the read-gzipped-genotypes branch now allows users to use gzipped info, map, invsigma and dose/prob files, I think we have enough to close this issue.
Reading gzipped phenotype data is not possible yet, but also not really required, IMHO. See my motivation in the comment on PR #42:

We could add reading of gzipped phenotype files as well, but since most people create these from R, I guess they won't bother to zip them. Moreover, the phenotype file is currently opened and closed several times in the process of extracting all phenotype info (determining the number of samples and covariates, finding out which lines have NAs, etc.), so doing that for a zipped file would be more time-consuming. Implementing a "read phenotype data once" strategy in the current code isn't trivial either.
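
For illustration only, here is a rough sketch of what a "read phenotype data once" strategy could look like. It is written in Python rather than in ProbABEL's C++ and is not taken from PR #42 or from the ProbABEL code base. The function names (`open_maybe_gzip`, `summarise_phenotypes`) are hypothetical, and the file layout is an assumption: one whitespace-separated header line, an id column, one trait column, then the covariates, with `NA` marking missing values. The file is opened a single time, so a gzipped file is decompressed only once while the number of samples, the number of covariates and the rows containing NAs are collected in the same pass.

```python
import gzip
from typing import NamedTuple, TextIO


class PhenoSummary(NamedTuple):
    n_samples: int      # number of data rows (header excluded)
    n_covariates: int   # columns after the assumed id and trait columns
    na_rows: list       # 0-based indices of data rows with a missing value


def open_maybe_gzip(path: str) -> TextIO:
    """Open a plain or gzip-compressed text file for reading."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":    # gzip magic number
        return gzip.open(path, "rt")
    return open(path, "rt")


def summarise_phenotypes(path: str, na_token: str = "NA") -> PhenoSummary:
    """Collect sample count, covariate count and NA rows in one pass."""
    with open_maybe_gzip(path) as fh:
        header = fh.readline().split()
        # Assumption: column 1 is the id, column 2 the trait.
        n_covariates = max(len(header) - 2, 0)
        n_samples = 0
        na_rows = []
        for line in fh:
            fields = line.split()
            if not fields:      # skip blank lines
                continue
            if na_token in fields:
                na_rows.append(n_samples)
            n_samples += 1
    return PhenoSummary(n_samples, n_covariates, na_rows)
```

Calling `summarise_phenotypes("pheno.txt.gz")` and `summarise_phenotypes("pheno.txt")` on the same data should give identical summaries, since the gzip check relies on the file's first two bytes rather than on its extension.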
