Machine learning from omics data

This repository contains the code required to reproduce all results and figures presented in the book chapter "Machine learning from omics data" (Springer Protocols, 2021).

This is an updated version of the original code. It uses a newer Python version, more recent dependencies and Optuna for hyper-parameter search.

Instructions

First, install Python 3.10.10 (newer versions may also work) and then clone this repository. Next, install the dependencies using pip (preferably in a new virtualenv):

pip install -r requirements.txt

Optionally, set the number of processes used during the hyper-parameter search through the SLURM_CPUS_ON_NODE environment variable. The default is 4. If you use SLURM, the variable is set automatically.

export SLURM_CPUS_ON_NODE=20
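
For illustration, a script can pick up this setting roughly as follows. This is a minimal sketch and not the code in src/. The helper name resolve_process_count and the fallback to the default for missing, non-numeric or non-positive values are assumptions. Only the variable name and the default of 4 come from this README.

```python
import os

DEFAULT_PROCESSES = 4  # default stated in this README


def resolve_process_count(environ=None):
    """Return the worker count from SLURM_CPUS_ON_NODE, falling back to 4.

    Hypothetical helper: the real main.py may read the variable differently.
    """
    env = os.environ if environ is None else environ
    raw = env.get("SLURM_CPUS_ON_NODE", "")
    try:
        count = int(raw)
    except ValueError:
        # Variable unset or not an integer: use the documented default.
        return DEFAULT_PROCESSES
    # Assumption: zero or negative values are treated as "not set".
    return count if count > 0 else DEFAULT_PROCESSES


if __name__ == "__main__":
    print(resolve_process_count())
```

Passing a plain dictionary instead of os.environ makes the helper easy to check without touching the real environment.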

Now, you can cd src/ and train the models:

python main.py

This will take a while. Note that this script also downloads and extracts all required datasets and therefore needs about 41 GB of disk space. Afterwards, generate all four figures by calling:

python create_plots.py
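
Because the training step needs about 41 GB of disk space, it can be worth checking the free space before starting main.py. The following is a minimal sketch using only the standard library. The function names are hypothetical, and reading "GB" as 10**9 bytes is an assumption.

```python
import shutil

REQUIRED_GB = 41  # approximate figure from this README


def free_space_gb(path="."):
    """Free space in GB (10**9 bytes) on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """True if the filesystem holding `path` has at least `required_gb` free."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    if not has_enough_space("."):
        print(f"Only {free_space_gb('.'):.1f} GB free, about {REQUIRED_GB} GB needed.")
```

Run it from the directory where the datasets will be stored, since the check applies to the filesystem of the given path.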

Neither the SVM nor UMAP is fully deterministic, so minor deviations from the published results are to be expected.

Evotec-Bioinformatics/ml-from-omics
