Code required to reproduce results from the book chapter "Machine learning from omics data" (Springer Protocols, 2021)

Evotec-Bioinformatics/ml-from-omics

Machine learning from omics data

This repository contains the code required to reproduce all results and figures presented in the book chapter "Machine learning from omics data" (Springer Protocols, 2021).

This is an updated version of the original code. It uses a newer Python version, more recent dependencies and Optuna for hyper-parameter search.

Instructions

First, install Python 3.10.10 (newer versions may also work) and then clone this repository. Next, install the dependencies using pip (preferably in a new virtualenv):

pip install -r requirements.txt
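Before installing, it can help to confirm that the active interpreter is recent enough. The following is a minimal sketch, not part of this repository: the helper name `is_supported` and the choice of 3.10 as a lower bound are assumptions made for illustration, since the instructions only state that 3.10.10 is the reference version and that newer versions may also work.

```python
import sys

# Assumption: treat Python 3.10 as the lower bound, because 3.10.10 is the
# version named in the instructions. Older versions are simply untested.
MINIMUM_VERSION = (3, 10)


def is_supported(version=None):
    """Return True if the (major, minor) version is at least MINIMUM_VERSION.

    `version` is any sequence like sys.version_info; it defaults to the
    running interpreter. This helper is hypothetical and not in the repo.
    """
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MINIMUM_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "looks fine" if is_supported() else "is older than the reference version"
    print(f"Python {current} {status}")
```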

Optionally, set the number of processes used during the hyper-parameter search through the SLURM_CPUS_ON_NODE environment variable. The default is 4. If you run under SLURM, the variable is set automatically.

export SLURM_CPUS_ON_NODE=20
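How a script might pick up this setting can be sketched as follows. This is an illustration under stated assumptions, not the code in main.py: the function name `resolve_process_count` is hypothetical, and only the variable name SLURM_CPUS_ON_NODE and the default of 4 come from the instructions above.

```python
import os

# Default number of processes stated in the instructions.
DEFAULT_PROCESSES = 4


def resolve_process_count(env=None):
    """Return the process count from SLURM_CPUS_ON_NODE, or the default.

    `env` is a mapping of environment variables and defaults to os.environ.
    Hypothetical helper; the repository may read the variable differently.
    """
    env = os.environ if env is None else env
    raw = env.get("SLURM_CPUS_ON_NODE")
    if raw is None:
        return DEFAULT_PROCESSES
    count = int(raw)  # raises ValueError on non-numeric input
    if count < 1:
        raise ValueError("SLURM_CPUS_ON_NODE must be a positive integer")
    return count


if __name__ == "__main__":
    print(f"Using {resolve_process_count()} processes")
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.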

Now change into the src/ directory and train the models:

python main.py

This will take a while. Note that this script also downloads and extracts all required datasets and therefore needs about 41 GB of disk space. Afterwards, generate all four figures by calling:

python create_plots.py
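Because training needs about 41 GB of disk space, a pre-flight check before running main.py can save a failed download. The sketch below is not part of this repository: the helper names are hypothetical, and it assumes that "GB" means decimal gigabytes (10^9 bytes) and that the datasets land on the same filesystem as the working directory.

```python
import shutil

# Approximate requirement stated in the instructions.
REQUIRED_GB = 41


def free_gigabytes(path="."):
    """Return free space on the filesystem containing `path`, in decimal GB."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem containing `path` has enough free space.

    Hypothetical helper; the 41 GB figure is an approximation, so a real
    check may want some extra headroom on top of it.
    """
    return free_gigabytes(path) >= required_gb


if __name__ == "__main__":
    available = free_gigabytes(".")
    if has_enough_space("."):
        print(f"{available:.1f} GB free: enough for the datasets")
    else:
        print(f"{available:.1f} GB free: less than the {REQUIRED_GB} GB needed")
```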

Neither the SVM nor UMAP is fully deterministic. Hence, minor deviations from the published results are to be expected.
