Microbiome Toolbox is a collection of tools and methods for microbiome data. It covers data analysis and exploration, data preparation, microbiome trajectory modeling, outlier discovery and intervention. The toolbox also brings together most of the common machine learning algorithms that are otherwise spread across different packages.
Features:
- Data analysis and exploration of microbiota data: analyze the bacteria in a given dataset, including longitudinal analysis
- Data preparation: different ways of preparing data for machine learning model training to build a microbiome trajectory
- Feature extraction: decreasing the number of features used for building the trajectory by selecting the top important ones
- Reference vs. non-reference data analysis: using different techniques to classify samples into these two groups (optional)
- Log-ratio data transformation: transforming bacteria abundances to log-ratios w.r.t. the chosen bacteria
- Microbiome trajectory: build different microbiome trajectories with machine learning models and compare them with statistical tests
- Anomaly detection: detect samples that are labeled as reference but are actually outliers based on the bacteria analysis
- Boxplot with time: check which bacteria are the most important in a reference group of samples w.r.t. time
- Intervention simulation: find the few bacteria that should be modified to move a non-reference sample into the reference group
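To make the log-ratio transformation in the list above concrete, here is a minimal sketch for a single sample. It is not the toolbox's own implementation: the function name `log_ratio_transform`, the example abundances and the `pseudocount` used to avoid taking the logarithm of zero are all assumptions made for illustration.

```python
import math


def log_ratio_transform(abundances, reference, pseudocount=1e-6):
    """Return log(abundance / reference abundance) for every other taxon.

    `abundances` maps a taxon name to its relative abundance in one sample.
    The pseudocount (an assumed value) keeps zero abundances finite.
    """
    denominator = abundances[reference] + pseudocount
    return {
        taxon: math.log((value + pseudocount) / denominator)
        for taxon, value in abundances.items()
        if taxon != reference  # the reference taxon becomes the denominator
    }


# Hypothetical sample with three taxa
sample = {"Bifidobacterium": 0.50, "Bacteroides": 0.25, "Escherichia": 0.25}
ratios = log_ratio_transform(sample, reference="Bacteroides")
print(ratios)
```

A taxon with the same abundance as the reference gets a log-ratio of zero, while more abundant taxa get positive values and less abundant ones negative values.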
Dataset
Input | |
---|---|
Processing | |
Output | |
Reference | https://github.com/JelenaBanjac/microbiome-toolbox/blob/main/microbiome/dataset.py |
Trajectory
Input | |
---|---|
Processing | |
Output | |
Reference | https://github.com/JelenaBanjac/microbiome-toolbox/blob/main/microbiome/trajectory.py |
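As a rough illustration of the trajectory idea, and not the code in trajectory.py, the sketch below fits a straight line that predicts sample age from a single feature using reference samples, then flags a sample whose actual age is far from the prediction. The age-prediction framing, the data, the names `fit_line`, `predict_age` and `is_outlier`, and the fixed tolerance are all assumptions made for illustration. The toolbox itself builds trajectories with machine learning models, and a one-feature least-squares line is swapped in here only for brevity.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical reference samples: one feature value and the age in months
feature = [0.0, 1.0, 2.0, 3.0]
age = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(feature, age)


def predict_age(x):
    """Age predicted by the fitted line for feature value x."""
    return slope * x + intercept


def is_outlier(x, actual_age, tolerance=2.0):
    """Flag a sample whose actual age differs from the prediction by more than the tolerance."""
    return abs(predict_age(x) - actual_age) > tolerance


print(is_outlier(2.0, 5.5))   # close to the fitted line
print(is_outlier(2.0, 10.0))  # far from the fitted line
```

A fixed tolerance keeps the example short. A prediction interval estimated from the reference samples would be a more principled threshold.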
The Microbiome Toolbox is available as a PyPI package.
Python virtual environment:
# change to the project root
cd $ROOT
# create virtual environment for python
python3.9 -m venv venv
# activate the environment
. venv/bin/activate
# make sure pip is the latest version!
pip install --upgrade pip
# install microbiome-toolbox in editable mode for development
pip install -e .
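A quick way to confirm that the editable install is visible from the active environment is to ask Python whether it can locate the package. This is only a sketch: the helper `is_installed` is hypothetical, and the import name `microbiome` is an assumption inferred from the repository layout (microbiome/dataset.py).

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate the top-level module in the active environment."""
    return importlib.util.find_spec(module_name) is not None


# "microbiome" is an assumed import name, taken from the repository folder name
print("microbiome importable:", is_installed("microbiome"))
```

If this prints False, check that the virtual environment is activated and that the install command was run from the project root.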
Note: if you add new libraries to requirements.in, you need to regenerate requirements.txt in order to reproduce the working state.
# install pip-tools, which provides pip-compile
pip install pip-tools
# generate requirements.txt from requirements.in
pip-compile --allow-unsafe
After you have successfully installed microbiome-toolbox and activated the environment, run the following commands:
# run the server
python webapp/index.py
# if it complains about LegacyVersion, install packaging manually
# pip install packaging==21.3.0
For toolbox usage, check the notebooks.
- Paper publication available on Bioinformatics.
- We are listed in Plotly&Dash 500, a curated set of STEM-focused Plotly Dash apps (as Microbiome-Toolbox)!
- The dashboard is also hosted on Azure (temporarily down).
If you notice any issues, please report them on GitHub Issues. Also, feel free to contribute!
The project is licensed under the MIT license.
Jelena Banjac, Shaillay Kumar Dogra, Norbert Sprenger
Please cite our paper if you use it.
BibTeX citation style:
@article{10.1093/bioinformatics/btac781,
author = {Banjac, Jelena and Sprenger, Norbert and Dogra, Shaillay Kumar},
title = "{Microbiome Toolbox: methodological approaches to derive and visualize microbiome trajectories}",
journal = {Bioinformatics},
volume = {39},
number = {1},
year = {2022},
month = {12},
abstract = "{The gut microbiome changes rapidly under the influence of different factors such as age, dietary changes or medications to name just a few. To analyze and understand such changes, we present a Microbiome Toolbox. We implemented several methods for analysis and exploration to provide interactive visualizations for easy comprehension and reporting of longitudinal microbiome data. Based on the abundance of microbiome features such as taxa as well as functional capacity modules, and with the corresponding metadata per sample, the Microbiome Toolbox includes methods for (i) data analysis and exploration, (ii) data preparation including dataset-specific preprocessing and transformation, (iii) best feature selection for log-ratio denominators, (iv) two-group analysis, (v) microbiome trajectory prediction with feature importance over time, (vi) spline and linear regression statistical analysis for testing universality across different groups and differentiation of two trajectories, (vii) longitudinal anomaly detection on the microbiome trajectory and (viii) simulated intervention to return anomaly back to a reference trajectory. The software tools are open source and implemented in Python. For developers interested in additional functionality of the Microbiome Toolbox, it is modular allowing for further extension with custom methods and analysis. The code, python package and the link to the interactive dashboard of Microbiome Toolbox are available on GitHub https://github.com/JelenaBanjac/microbiome-toolbox Supplementary data are available at Bioinformatics online.}",
issn = {1367-4811},
doi = {10.1093/bioinformatics/btac781},
url = {https://doi.org/10.1093/bioinformatics/btac781},
note = {btac781},
eprint = {https://academic.oup.com/bioinformatics/article-pdf/39/1/btac781/48520839/btac781.pdf},
}