Sleipnir is a C++ library enabling efficient analysis, integration, mining, and machine learning over genomic data. This includes a particular focus on microarrays, since they make up the bulk of available data for many organisms, but Sleipnir can also integrate a wide variety of other data types, from pairwise physical interactions to sequence similarity or shared transcription factor binding sites.
Main documentation:
The Sleipnir wiki and bug reporting system are at: (TBD)
The file README.developer has notes for Sleipnir developers.
Sleipnir also includes the code to compile SEEK (the human coexpression search engine). See the link for information on its installation.
The latest version of Sleipnir software can be obtained by issuing the following command:
git clone
Install g++, cmake
Install libraries
- On Mac:
brew install libsvm
brew install libomp
brew install thrift
brew install gsl
brew install boost
- On CentOS Linux:
sudo yum install libsvm
sudo yum install libgomp
sudo yum install thrift-devel
sudo yum install gsl
sudo yum install boost
- On Ubuntu Linux:
apt-get update
apt-get install build-essential
apt-get install libsvm-dev
apt-get install libomp-dev
apt-get install libthrift-dev
apt-get install libgsl-dev
apt-get install libboost-dev
apt-get install libboost-graph-dev
apt-get install libboost-regex-dev
apt-get install libreadline-dev
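The per-platform lists above can be collapsed into one command each. The following is a minimal sketch, not part of Sleipnir: the function name `print_install_cmd` and the platform keys `mac`, `centos`, and `ubuntu` are assumptions made for illustration, while the package names are copied from the lists above. It only prints the command, so nothing is installed until you run the output yourself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) one install command per platform.
# Package names come from the lists above; everything else is illustrative.
print_install_cmd() {
  case "$1" in
    mac)
      echo "brew install libsvm libomp thrift gsl boost" ;;
    centos)
      echo "sudo yum install libsvm libgomp thrift-devel gsl boost" ;;
    ubuntu)
      echo "apt-get install build-essential libsvm-dev libomp-dev libthrift-dev libgsl-dev libboost-dev libboost-graph-dev libboost-regex-dev libreadline-dev" ;;
    *)
      # Unknown platform key: report on stderr and fail.
      echo "unsupported platform: $1" >&2
      return 1 ;;
  esac
}

# Example (prints the Homebrew command):
#   print_install_cmd mac
```

On Ubuntu, run `apt-get update` first, as in the list above, and prefix the printed command with `sudo` if you are not root.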
Clone repository
git clone
cd sleipnir
git submodule init
git submodule update
Prep make files with cmake
mkdir Debug
cd Debug/
cmake -DCMAKE_BUILD_TYPE=Debug ..
- Alternatively, replace 'Debug' with 'Release' in all of the above commands to make the release build
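The Debug/Release substitution can be sketched as a small shell function that takes the build type as an argument. The function name `configure_build` is an assumption for illustration; the directory layout and the cmake invocation mirror the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: configure a Debug or Release build directory.
# Run from the repository root. Note that it changes into the build directory.
configure_build() {
  build_type="${1:-Debug}"
  case "$build_type" in
    Debug|Release) ;;
    *)
      echo "unknown build type: $build_type" >&2
      return 1 ;;
  esac
  mkdir -p "$build_type" &&
    cd "$build_type" &&
    cmake -DCMAKE_BUILD_TYPE="$build_type" ..
}

# Example:
#   configure_build Release
```

Keeping each build type in its own directory lets the Debug and Release builds coexist without overwriting each other.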
Build the code
- (On Mac) - Edit sleipnir/src/libsvm.h
- Replace: #include <libsvm/svm.h>
- With: #include <svm.h>
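The include change above can also be made from the command line instead of an editor. This is a sketch under the assumption that the include line appears exactly as shown; the function name `fix_libsvm_include` is hypothetical. The `-i.bak` form is used because it is accepted by both the BSD sed shipped with macOS and GNU sed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the libsvm include line in the given file.
# sed writes a .bak backup, which is removed once the edit succeeds.
fix_libsvm_include() {
  sed -i.bak 's|#include <libsvm/svm.h>|#include <svm.h>|' "$1" &&
    rm -f "$1.bak"
}

# Example, using the path given in the text above:
#   fix_libsvm_include sleipnir/src/libsvm.h
```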
cd Debug/
make
- In case of errors:
make clean
make VERBOSE=1
- If the build still fails on Mac, check that sleipnir/src/libsvm.h has been edited as described above.
[Optional] Install SVM_PERF libraries to build: Data2SVM, SVMperfer, SVMperfing, SVMfe, SVMer
mkdir svm_perf; cd svm_perf; tar xzvf ../svm_perf.tar.gz
ar rcs libsvmperf.a *.o */*.o
cd ..; cp -a svm_perf /usr/local/lib/
ln -s /usr/local/lib/svm_perf/libsvmperf.a /usr/local/lib
ln -s /usr/local/lib/svm_perf /usr/local/include
One-time prep: create the conda environment (by default this will create the 'genomics' conda env)
conda env create --file scripts/seek/conda_environment.yml
Run the C++ unit tests
Test the scripts for building and merging SEEK database compendiums
conda activate genomics
python -m pytest -s -v scripts/seek/tests
Run the SEEK system tests (test SeekMiner and SeekRPC)
conda activate genomics
python -m pytest -s -v tests/
Run the SEEK DB tests (test that the database gives the expected bio-informative results). These tests can only be run where the full SEEK database is installed.
cd tests/bioinform_tests
- PREP: Install and init Git LFS (Large File Storage)
- On Mac:
brew install git-lfs
- On CentOS:
yum install git-lfs
- On Ubuntu:
apt-get install git-lfs
- Initialize git-lfs:
git lfs install
- Refresh the gold standard tgz files (each should be multiple MB in size)
rm gold_standard_results/*
git restore gold_standard_results/*
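If a gold standard file is only a few hundred bytes after the refresh, it is probably still a Git LFS pointer rather than the real archive. A minimal check, assuming the standard pointer format whose first line begins with `version https://git-lfs.github.com/spec/v1`; the function name `is_lfs_pointer` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file is a Git LFS pointer stub
# (a small text file) instead of the real binary content.
is_lfs_pointer() {
  head -c 100 "$1" 2>/dev/null |
    grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Report any gold standard file that has not been replaced by real content.
for f in gold_standard_results/*; do
  [ -f "$f" ] || continue
  if is_lfs_pointer "$f"; then
    echo "still an LFS pointer: $f"
  fi
done
```

If any files are reported, confirm that `git lfs install` was run and repeat the refresh step.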
- Run the tests:
(The bioinform test accepts a -t option that selects the test length, i.e. how many queries are run)
bash -v -s <path_to_seek_db> -b <path_to_seek_binaries>
bash -v -s <path_to_seek_db> -b <path_to_seek_binaries> -t [tiny,short,medium,long]