Ms2 basEd saMple vectOrization (MEMO) is a method that allows a Retention Time (RT)-agnostic alignment of metabolomics samples using the fragmentation spectra (MS2) of their constituents. The occurrence of MS2 peaks and neutral losses (relative to the precursor) in each sample is counted and used to generate an MS2 fingerprint of the sample. In a second stage, these fingerprints can be aligned to compare different samples. Once the aligned fingerprints are obtained, different filtering (for example, removing peaks/losses found in blanks) and visualization techniques (MDS/PCoA, TMAP, heatmap, ...) can be used. MEMO is particularly well suited to comparing chemodiverse samples, i.e. samples with poor feature overlap, or samples with a strong RT shift, acquired using different LC methods or even different mass spectrometer technologies (maXis Q-ToF vs Q-Exactive Orbitrap).
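The fingerprinting idea described above can be illustrated with a small, self-contained sketch. This is a toy illustration in plain Python, not MEMO's actual implementation or API: the function names, the `peak@`/`loss@` word format, the two-decimal binning, and the example spectra are all assumptions made for the example.

```python
from collections import Counter


def spectrum_to_words(precursor_mz, peaks_mz, decimals=2):
    """Turn one MS2 spectrum into 'peak@...' and 'loss@...' words (hypothetical format)."""
    words = []
    for mz in peaks_mz:
        words.append(f"peak@{mz:.{decimals}f}")
        loss = precursor_mz - mz
        if loss > 0:  # a neutral loss only makes sense below the precursor m/z
            words.append(f"loss@{loss:.{decimals}f}")
    return words


def sample_fingerprint(spectra, decimals=2):
    """Count, for one sample, in how many spectra each word occurs."""
    counts = Counter()
    for precursor_mz, peaks_mz in spectra:
        counts.update(set(spectrum_to_words(precursor_mz, peaks_mz, decimals)))
    return counts


def align_fingerprints(fingerprints):
    """Align per-sample fingerprints on the union of their words (no RT involved)."""
    vocabulary = sorted(set().union(*fingerprints.values()))
    matrix = {
        name: [fingerprint.get(word, 0) for word in vocabulary]
        for name, fingerprint in fingerprints.items()
    }
    return vocabulary, matrix


# Made-up data: each spectrum is (precursor m/z, list of fragment m/z values).
samples = {
    "sample_A": [(300.00, [120.05, 282.00]), (250.10, [120.05])],
    "sample_B": [(310.00, [120.05, 292.00])],
}

fingerprints = {name: sample_fingerprint(spectra) for name, spectra in samples.items()}
vocabulary, matrix = align_fingerprints(fingerprints)

for name, row in matrix.items():
    print(name, dict(zip(vocabulary, row)))
```

In this toy data, both samples share the fragment at m/z 120.05 and a neutral loss of 18.00, even though their precursors differ and no retention time is used. That shared vocabulary is what makes the fingerprints comparable across samples.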
For documentation, see our Read the Docs page. Application examples and comparisons with other MS/MS-based metrics are available here, and the corresponding notebooks are available on GitHub.
If you use MEMO, please cite the following papers:
- Gaudry A, Huber F, Nothias L-F, Cretton S, Kaiser M, Wolfender J-L, et al. MEMO: Mass Spectrometry-Based Sample Vectorization to Explore Chemodiverse Datasets. Frontiers in Bioinformatics. 2022;2. https://doi.org/10.3389/fbinf.2022.842964
- Huber, Florian, Stefan Verhoeven, Christiaan Meijer, Hanno Spreeuw, Efraín Castilla, Cunliang Geng, Justin van der Hooft, et al. 2020. “Matchms - Processing and Similarity Evaluation of Mass Spectrometry Data.” Journal of Open Source Software 5 (52): 2411. https://doi.org/10.21105/joss.02411
- Huber, Florian, Lars Ridder, Stefan Verhoeven, Jurriaan H. Spaaks, Faruk Diblen, Simon Rogers, and Justin J. J. van der Hooft. 2021. “Spec2Vec: Improved Mass Spectral Similarity Scoring through Learning of Structural Relationships.” PLoS Computational Biology 17 (2): e1008724. https://doi.org/10.1371/journal.pcbi.1008724
First, make sure Anaconda is installed.
A.1. Create a new conda environment to avoid clashes:
conda create --name memo python=3.8
conda activate memo
A.2. Install with pip:
pip install numpy
pip install memo-ms
If you get an error, try installing scikit-bio from conda-forge (available for Mac and Linux users) or with pip (for Windows users) before installing the package with pip. Windows users will also need to install the C++ build tools (download them from https://visualstudio.microsoft.com/visual-cpp-build-tools/; see this answer for help: https://stackoverflow.com/a/50210015):
conda install -c conda-forge scikit-bio
# or, for Windows users
pip install scikit-bio
pip install memo-ms
You can clone the repository to get the demo spectra and quant table files and test the package using the Tutorial notebook!
NB: If you have this error when loading the memo package:
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
Uninstall scikit-bio and reinstall it from source, bypassing the pip cache, using these commands:
pip uninstall scikit-bio
pip install scikit-bio --no-cache-dir --no-binary :all:
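This kind of error usually means that a compiled extension was built against a different NumPy version than the one currently installed. Before and after reinstalling, it can help to check which versions are actually present in the active environment. The snippet below is a minimal diagnostic sketch using only the standard library (Python 3.8+); the helper names `installed_version` and `environment_report` are hypothetical and not part of MEMO.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def environment_report(distributions=("numpy", "scikit-bio", "memo-ms")):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in distributions}


# Print one line per package for the currently active environment.
for name, version in environment_report().items():
    print(f"{name}: {version or 'not installed'}")
```

Run it inside the activated `memo` environment so that the report reflects the packages pip and conda installed there.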
B.1. First, clone the repository using git clone from the command line:
git clone https://github.com/mandelbrot-project/memo.git # or ssh
B.2. Create a new conda environment to avoid clashes:
conda create --name memo python=3.8
conda activate memo
B.3. Move into the cloned repository and install the package locally using pip:
cd memo
pip install .
The tutorial notebook is located in the tutorial folder of the repository.
You can also find a set of notebooks that reproduce the results of the MEMO paper. The repository is available at https://github.com/mandelbrot-project/memo_publication_examples
Create a development environment with:
git clone https://github.com/mandelbrot-project/memo.git
cd memo
conda create --name memo-dev python=3.8
conda activate memo-dev
Then install dependencies and memo:
python -m pip install --upgrade pip
pip install numpy
pip install --editable .[dev]
# on macOS (zsh), quote the brackets: pip install -e '.[dev]'
MEMO tests can be run with:
pytest
And the code linter with:
prospector
MEMO is licensed under the GNU General Public License v3.0. Permissions of this strong copyleft license are conditioned on making available complete source code of licensed works and modifications, which include larger works using a licensed work, under the same license. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights.