Here we create a new version of GEMMA that acts like a toolbox for GWA, mapping and inference.
GEMMA: Genome-wide efficient ‘exact’ mixed-model analysis for association studies that accounts for population stratification and sample structure.
In this source code repository we work on the next generation of GEMMA and LMM related software with the goal of creating a library of functionality which can be used from languages such as R and Python.
GEMMA2/lib is therefore meant to be a library written in Rust, C and D. The front-end is initially written in Python. R and Racket FFIs may follow.
On our http://genenetwork.org/ systems we run GEMMA every day. As part of a Systems Genetics and Precision Medicine Project we are targeting GEMMA2/lib to be a faster and more flexible tool.
IMPORTANT NOTICE: this software is no longer under active development. YMMV. Part of the functionality has moved to gemma-wrapper.
For more information contact Pjotr Prins. A BLOG about the project can be found here.
See ./INSTALL.org.
GEMMA is called from the command line using the gemma2 command. Try
gemma2 --help
for a list of commands.
GEMMA2 differs from GEMMA1 but adds a compatibility layer interpreting GEMMA1 switches. To do a pass-through to GEMMA1 simply use the gemma1 command:
gemma2 gemma1 [old switches]
To try the examples:
# compute Kinship matrix
gemma2 gemma1 -g ./example/mouse_hs1940.geno.txt.gz -p ./example/mouse_hs1940.pheno.txt \
-gk -o mouse_hs1940
# run univariate LMM
gemma2 gemma1 -g ./example/mouse_hs1940.geno.txt.gz \
-p ./example/mouse_hs1940.pheno.txt -n 1 -a ./example/mouse_hs1940.anno.txt \
-k ./output/mouse_hs1940.cXX.txt -lmm -o mouse_hs1940_CD8_lmm
You can set the gemma1 binary with the --bin switch or the GEMMA1_BIN environment variable.
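To illustrate the binary lookup described above, here is a minimal Python sketch. The helper name `resolve_gemma1_bin`, the precedence (switch value first, then GEMMA1_BIN) and the fallback name `gemma` are assumptions made for this example, not documented gemma2 behaviour.

```python
import os


def resolve_gemma1_bin(bin_switch=None, environ=None):
    """Pick the gemma1 executable to call (hypothetical helper).

    Assumed precedence: value of the --bin switch, then the GEMMA1_BIN
    environment variable, then the plain name "gemma" found via PATH.
    """
    env = os.environ if environ is None else environ
    if bin_switch:
        return bin_switch
    return env.get("GEMMA1_BIN", "gemma")
```

Passing the environment as a parameter keeps the lookup easy to check without changing the real process environment.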
gemma2 is always invoked with a command (e.g. filter, grm and lmm) followed by specific switches. Before the command a number of generic information switches can be used:
gemma2 [-vv] [--log INFO] [--debug] command [specific switches]
Repeated -v switches increase verbosity. The --log switch sets the log level (DEBUG|INFO|WARNING|ERROR), which is WARNING by default. The --debug switch puts gemma2 in debug mode.
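The ordering of generic switches, command and specific switches can be sketched in Python as follows. The helper `gemma2_argv` is hypothetical and only assembles an argument list in the order shown in the usage line; it does not run gemma2 or reproduce its own option parsing.

```python
LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR")


def gemma2_argv(command, specific=(), verbosity=0, log=None, debug=False):
    """Assemble a gemma2 command line as a list (hypothetical helper)."""
    argv = ["gemma2"]
    if verbosity > 0:
        # Repeated -v switches are written in the compact form: -v, -vv, ...
        argv.append("-" + "v" * verbosity)
    if log is not None:
        if log not in LOG_LEVELS:
            raise ValueError("unknown log level: " + log)
        argv += ["--log", log]
    if debug:
        argv.append("--debug")
    # Generic switches come first, then the command and its own switches.
    argv.append(command)
    argv.extend(specific)
    return argv


print(" ".join(gemma2_argv("grm", ["-c", "mouse_hs1940.json"],
                           verbosity=2, log="INFO", debug=True)))
```

With gemma2 installed, the resulting list could be handed to `subprocess.run(argv, check=True)`.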
The convert command converts plink and BIMBAM formats to R/qtl2 format. GEMMA2, unlike GEMMA1, uses an R/qtl2-based format (from now on the GEMMA2 format) in which genotypes and phenotypes are stored in a ‘tidy’ format and metadata is represented in YAML/JSON.
If you want a quick preview use the --debug switch:
gemma2 -vv --log INFO --debug ALL convert --plink example/mouse_hs1940
cat mouse_hs1940.json
{
  "description": "mouse_hs1940",
  "crosstype": "hs",
  "sep": "\t",
  "na.strings": [
    "-",
    "NA"
  ],
  "comment.char": "#",
  "individuals": 1940,
  "markers": 12226,
  "phenotypes": 7,
  "geno": "mouse_hs1940_geno.tsv.gz",
  "pheno": "mouse_hs1940_pheno.tsv",
  "alleles": [
    "A",
    "B",
    "H"
  ],
  "genotypes": {
    "A": 1,
    "H": 2,
    "B": 3
  },
  "geno_sep": false,
  "geno_transposed": true
}
Note that this format has no concept of minor/major allele encoding as is used in plink and BIMBAM formats.
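As a rough illustration of how the control file drives decoding, the Python sketch below reads a cut-down copy of the metadata above and maps genotype letters to their numeric codes. The helper `make_decoder` is hypothetical, and the reading of "geno_sep": false as "one character per genotype call, with no separator" is an assumption for this example.

```python
import json

# Cut-down copy of the control file shown above
CONTROL = """
{
  "na.strings": ["-", "NA"],
  "genotypes": {"A": 1, "H": 2, "B": 3},
  "geno_sep": false
}
"""


def make_decoder(control):
    """Return a function that decodes one genotype field (hypothetical helper)."""
    codes = control["genotypes"]
    missing = set(control["na.strings"])

    def decode(field):
        if control["geno_sep"]:
            calls = field.split(control.get("sep", "\t"))
        else:
            # Assumption: without a separator every character is one call
            calls = list(field)
        return [None if call in missing else codes[call] for call in calls]

    return decode


decode = make_decoder(json.loads(CONTROL))
print(decode("ABH-"))  # [1, 3, 2, None]
```

The codes are looked up purely by label, which matches the note that the format carries no minor/major allele encoding.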
GEMMA2 can write BIMBAM from the GEMMA2 (R/qtl2) format using the export command and a control file, e.g.
gemma2 -vv --log INFO --debug ALL export --bimbam -c control
The grm command likewise takes the control file generated by convert:
gemma2 --debug --log INFO -vv grm -c mouse_hs1940.json
GEMMA and GEMMA2/lib are published under the GPLv3 LICENSE.