diff --git a/docs/404.html b/docs/404.html index ce9bef3..e52b936 100644 --- a/docs/404.html +++ b/docs/404.html @@ -23,7 +23,7 @@ - + @@ -76,7 +76,7 @@ } @media print { pre > code.sourceCode { white-space: pre-wrap; } -pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; } +pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; } } pre.numberSource code { counter-reset: source-line 0; } @@ -174,209 +174,209 @@
Once you have a peak table or feature table, the next step is to annotate the peaks. See this review (Domingo-Almenara, Montenegro-Burke, Benton, et al. 2018) or other reviews (Chaleckis et al. 2019; Lai et al. 2018; Nash and Dunn 2019; Mark R. Viant et al. 2017; Allard, Genta-Jouve, and Wolfender 2017) for detailed notes on annotation. The first review proposed five levels for current computational annotation strategies.
Level 1: Peak Grouping: MS pseudospectra extraction based on peak shape similarity and peak abundance correlation
The major issue in annotation is redundant peaks from the same metabolite. Unlike genes in a genome, peaks or features from peak selection are not independent of each other. Adducts, in-source fragments and isotopes can lead to wrong annotation. A common solution is to compare the mass distances between peaks against those of known adducts, neutral losses, molecular multimers or multiply charged ions.
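A minimal sketch of the mass distance comparison described above, assuming the input peaks already co-elute (retention time grouping is not shown). The function name `label_redundant_pairs`, the short distance table, and the 0.005 Da tolerance are illustrative assumptions, not part of any package mentioned in this section.

```python
from itertools import combinations

# Mass distances (Da) between common positive-mode ion species.
# A deliberately short, illustrative table; real adduct lists are much longer.
KNOWN_DISTANCES = {
    "13C isotope": 1.003355,          # 13C - 12C
    "[M+NH4]+ vs [M+H]+": 17.026549,  # NH3
    "[M+Na]+ vs [M+H]+": 21.981944,   # Na - H
    "[M+K]+ vs [M+H]+": 37.955882,    # K - H
}


def label_redundant_pairs(mzs, tol=0.005):
    """Return (low_mz, high_mz, label) for pairs whose distance matches the table.

    `tol` is an absolute tolerance in Da (an assumed value for this sketch).
    """
    hits = []
    for low, high in combinations(sorted(mzs), 2):
        diff = high - low
        for label, dist in KNOWN_DISTANCES.items():
            if abs(diff - dist) <= tol:
                hits.append((low, high, label))
    return hits


# Glucose ions [M+H]+, [M+Na]+, [M+K]+ plus one unrelated peak.
peaks = [181.07066, 203.05261, 219.02655, 250.1000]
for low, high, label in label_redundant_pairs(peaks):
    print(f"{low:.4f} -> {high:.4f}: {label}")
```

Peaks that receive a label can then be treated as redundant signals of one compound instead of being searched against a database independently.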
Another issue concerns MS/MS databases. Only 10% of known metabolites in databases have experimental spectral data, so in silico prediction is required. Some works try to bridge the gap among experimental data, theoretical values (from chemical databases such as ChemSpider) and predictions. Here is a nice review of MS/MS prediction (Hufsky, Scheubert, and Böcker 2014).
According to the definition from the Chemical Analysis Working Group of the Metabolomics Standards Initiative (Lloyd W. Sumner et al. 2007; Mark R. Viant et al. 2017), four levels of confidence can be assigned to identification:
For specific groups of compounds such as PFASs, the way confidence levels are communicated can be slightly different (Charbonnet et al. 2022).
Although MS/MS seems to be a required step for identification, recent studies found that ESI might also generate fragment ions usable for structure identification (Xue, Guijas, et al. 2020; Xue et al. 2021, 2023; Bernardo-Bermejo et al. 2023).
Cheminformatics helps with MS annotation. The first task is molecular formula assignment. For a given accurate mass, candidate formulas should be constrained by the predefined element types and atom numbers, the mass error window, and rules of chemical bonding, such as the double bond equivalent (DBE) and the nitrogen rule. The nitrogen rule states that an odd nominal molecular mass implies an odd number of nitrogen atoms. This rule should only be used with nominal (integer) masses. The degree of unsaturation, or DBE, is calculated as the rings-plus-double-bonds equivalent (RDBE), which should be an integer; otherwise the candidate molecular formula is not valid. The elements oxygen and sulphur are not taken into account.
\[RDBE = C+Si - 1/2(H+F+Cl+Br+I) + 1/2(N+P)+1 \]
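The RDBE formula and the nitrogen rule can be checked with a few lines of code. This is a minimal sketch, assuming formulas are given as element-count dictionaries for neutral molecules; the helper names `rdbe` and `passes_nitrogen_rule` and the small nominal mass table are illustrative choices.

```python
# Nominal (integer) masses of the most abundant isotopes.
NOMINAL_MASS = {
    "C": 12, "H": 1, "N": 14, "O": 16, "S": 32, "P": 31,
    "F": 19, "Cl": 35, "Br": 79, "I": 127, "Si": 28,
}


def rdbe(counts):
    """RDBE = C + Si - 1/2 (H + F + Cl + Br + I) + 1/2 (N + P) + 1.

    Oxygen and sulphur do not enter the formula.
    """
    def n(element):
        return counts.get(element, 0)

    return (n("C") + n("Si")
            - 0.5 * (n("H") + n("F") + n("Cl") + n("Br") + n("I"))
            + 0.5 * (n("N") + n("P"))
            + 1)


def passes_nitrogen_rule(counts):
    """Odd nominal mass must go with an odd number of nitrogen atoms, and even with even."""
    nominal = sum(NOMINAL_MASS[element] * k for element, k in counts.items())
    return nominal % 2 == counts.get("N", 0) % 2


caffeine = {"C": 8, "H": 10, "N": 4, "O": 2}
print(rdbe(caffeine), passes_nitrogen_rule(caffeine))

# A candidate with a non-integer RDBE would be filtered out.
candidate = {"C": 6, "H": 7}
print(rdbe(candidate), float(rdbe(candidate)).is_integer())
```

In a formula generator, these two checks would typically run as cheap filters before more expensive steps such as isotope pattern matching.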
To assign a molecular formula to a mass-to-charge ratio, the Seven Golden Rules (Kind and Fiehn 2007) for heuristic filtering of molecular formulas should be considered:
@@ -542,171 +542,171 @@
Full scan mass spectra always contain many redundant peaks such as adducts, isotopes, fragments, multiply charged ions and oligomers. Such peaks dominate the feature table (Xu, Lu, and Rabinowitz 2015; Sindelar and Patti 2020; Mahieu and Patti 2017). Annotation tools can label those peaks either with a known list or by frequency analysis of the paired mass distances (Ju et al. 2020; Kouřil et al. 2020).
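The frequency analysis idea can be illustrated by counting how often each pairwise mass distance occurs in a peak list. This is a simplified sketch, not the algorithm of any cited tool: the function name `pmd_frequency` and the rounding to two decimals are assumptions, and real tools also restrict pairs by retention time.

```python
from collections import Counter
from itertools import combinations


def pmd_frequency(mzs, digits=2):
    """Count pairwise mass distances after rounding to `digits` decimals."""
    counts = Counter()
    for low, high in combinations(sorted(mzs), 2):
        counts[round(high - low, digits)] += 1
    return counts


# Two hypothetical compounds, each observed with a partner 21.98 Da higher
# (consistent with [M+Na]+ next to [M+H]+), plus one unrelated peak.
mzs = [100.00, 121.98, 150.00, 171.98, 230.50]
freq = pmd_frequency(mzs)
print(freq[21.98])  # how many pairs share this distance
```

Distances that recur far more often than expected by chance point to adducts, isotopes or neutral losses, even when they are missing from a predefined list.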
-You can find a list of adducts here, from the commonMZ project.
Here is Isotope pattern prediction.
Common annotation for the xcms workflow (Kuhl et al. 2012).
The software can be found here (C. D. Broeckling et al. 2014; Corey D. Broeckling et al. 2016). The package includes a vignette to follow.
BioCAn combines the results from database searches and in silico fragmentation analyses and places these results into a relevant biological context for the sample as captured by a metabolic model (Alden et al. 2017).
mzMatch is a modular, open source and platform independent data processing pipeline for metabolomics LC/MS data, written in Java (Chokkathukalam et al. 2013; Scheltema et al. 2011). MetAssign is a probabilistic annotation method using a Bayesian clustering approach, and is part of mzMatch (Daly et al. 2014).
The software can be found here (Uppal, Walker, and Jones 2017).
mWise is an algorithm for context-based annotation of liquid chromatography–mass spectrometry features through diffusion in graphs (Barranco-Altirriba et al. 2021).
You could find source code here(Fernández-Albert et al. 2014).
Paired mass distance (PMD) analysis removes redundant peaks in GC/LC-MS based nontarget analysis (M. Yu, Olkowicz, and Pawliszyn 2019).
nontarget performs isotope and adduct peak grouping as well as homologue series detection (Loos and Singer 2017).
Binner: deep annotation of untargeted LC-MS metabolomics data (Kachman et al. 2020).
You could find source code here (Mahieu et al. 2016) and it’s for detecting and exploring complex relationships in accurate-mass mass spectrometry data.
ms-flo is a tool to minimize false positive peak reports in untargeted liquid chromatography–mass spectroscopy (LC-MS) data processing (DeFelice et al. 2017).
CliqueMS is a computational tool for annotating in-source metabolite ions from LC-MS untargeted metabolomics data based on a coelution similarity network (Senan et al. 2019).
This package annotates and interprets deconvoluted mass spectra (mass*intensity pairs) from high resolution mass spectrometry devices. You can use it to find molecular ions for GC-MS (Jaeger et al. 2016).
NetID is a global network optimization approach to annotate untargeted LC-MS metabolomics data(L. Chen et al. 2021).
De novo recognition of in-source fragments for liquid chromatography–mass spectrometry data (J. Guo et al. 2021).
Ultra-fast and accurate electron ionization mass spectrum matching for compound identification with a million-scale in silico library (Qiong Yang et al. 2023).
Three-step workflow: MS1 full scan peak picking, selection of precursor ions for MS2 from the MS1 data with the GlobalStd algorithm, and MS2 data collection followed by annotation with GNPS (M. Yu, Dolios, and Petrick 2022).
A molecular-formula-oriented method to target the metabolome(Giné et al. 2021).
Similar work can be found here with inclusion list of differential and preidentified ions (dpDDA)(Y. Zhang et al. 2023).
A computational approach to generate a database of high-resolution MSn spectra by converting existing low-resolution MSn spectra using complementary high-resolution MS2 spectra generated by beam-type CAD (Lieng et al. 2023).
MS/MS annotation generates a matching score between a query spectrum and library spectra. The most popular matching algorithm is dot product similarity. Recent studies found that a spectral entropy algorithm outperformed dot product similarity (Y. Li et al. 2021; Y. Li and Fiehn 2023). A comparison of cosine, modified cosine, and neutral loss based spectrum alignment showed that modified cosine similarity outperformed neutral loss matching and cosine similarity in all cases; the performance of MS/MS spectrum alignment depends on the location and type of the modification, as well as the chemical compound class of the fragmented molecules (Bittremieux et al. 2022). Another work proposed a method that weights low-intensity MS/MS ions and m/z frequency for spectral library annotation, which helps to annotate unknown spectra (Engler Hart et al. 2024). BLINK enables ultrafast tandem mass spectrometry cosine similarity scoring (Harwood et al. 2023). MS2Query enables reliable and scalable MS2 mass spectra-based analogue search by machine learning (de Jonge et al. 2023). However, a spectroscopic test suggests that fragment ion structure annotations in MS/MS libraries are frequently incorrect (van Tetering et al. 2024).
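To make the two scoring ideas concrete, here is a minimal sketch of dot product (cosine) similarity and an unweighted spectral entropy similarity. It assumes peaks are matched by fixed-width m/z binning, which is a simplification of the tolerance-based matching used in practice; the function names and the 0.01 bin width are illustrative, and the intensity weighting applied in the published entropy method is omitted.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.01):
    """Sum intensities of (mz, intensity) pairs into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return dict(binned)


def cosine_similarity(a, b, bin_width=0.01):
    """Dot product of the binned intensity vectors, normalized to [0, 1]."""
    va, vb = bin_spectrum(a, bin_width), bin_spectrum(b, bin_width)
    dot = sum(va[k] * vb[k] for k in va.keys() & vb.keys())
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def _entropy(dist):
    """Shannon entropy (natural log) of a normalized intensity distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def entropy_similarity(a, b, bin_width=0.01):
    """Unweighted entropy similarity: 1 - (2*S_AB - S_A - S_B) / ln(4)."""
    va, vb = bin_spectrum(a, bin_width), bin_spectrum(b, bin_width)
    ta, tb = sum(va.values()), sum(vb.values())
    if ta == 0 or tb == 0:
        return 0.0
    pa = {k: v / ta for k, v in va.items()}
    pb = {k: v / tb for k, v in vb.items()}
    # 1:1 mixture of the two normalized spectra
    mixed = {k: 0.5 * (pa.get(k, 0.0) + pb.get(k, 0.0))
             for k in pa.keys() | pb.keys()}
    return 1 - (2 * _entropy(mixed) - _entropy(pa) - _entropy(pb)) / math.log(4)


query = [(85.03, 30.0), (127.04, 100.0), (145.05, 55.0)]
library = [(85.03, 25.0), (127.04, 100.0), (163.06, 40.0)]
print(cosine_similarity(query, library), entropy_similarity(query, library))
```

Both scores are 1 for identical spectra and 0 for spectra with no shared bins; they differ in how partially overlapping spectra are ranked.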
Machine learning can also be applied for MS2 annotation(Codrean et al. 2023; H. Guo et al. 2023; Bilbao et al. 2023).
You can check the \[Workflow\] section for popular platforms. Here is some stand-alone annotation software:
-Matchms is an open-source Python package to import, process, clean, and compare mass spectrometry data (MS/MS). It allows users to implement and run an easy-to-follow, easy-to-reproduce workflow from raw mass spectra to pre- and post-processed spectral data. Spectral data can be imported from common formats such as mzML, mzXML, msp, metabolomics-USI, MGF, or json (e.g. GNPS-style json files). Matchms then provides filters for metadata cleaning and checking, as well as for basic peak filtering. Finally, matchms was built to import and apply different similarity measures to compare large amounts of spectra. This includes common cosine scores, but can also easily be extended by custom measures. Examples of spectrum similarity measures that were designed to work in matchms are Spec2Vec and MS2DeepScore (Huber et al. 2020).
MetDNA is the Metabolic reaction network-based recursive metabolite annotation for untargeted metabolomics (Shen et al. 2019).
Java based integration of compound identification strategies. You could access the application here (Gerlich and Neumann 2013).
MS2Analyzer can annotate small molecule substructures from accurate tandem mass spectra (Ma et al. 2014).
MetFrag could be used to make in silico prediction/match of MS/MS data(Ruttkies et al. 2016; Wolf et al. 2010).
CFM-ID uses Metlin’s data to make predictions (Allen et al. 2014); a version 4.0 is also available (Allen et al. 2014).
A machine learning framework for structural annotation of small-molecule data arising from liquid chromatography–tandem mass spectrometry (LC-MS2) measurements (Bach, Schymanski, and Rousu 2022).
LipidFrag could be used to make in silico prediction/match of lipid related MS/MS data (Witting et al. 2017).
in silico: in silico lipid mass spectrum search (Koelmel et al. 2017).
Bar coding selects the mass-to-charge regions containing the most informative metabolite fragments and designates them as bins. Each metabolite fragmentation pattern is then translated into a binary code by assigning 1’s to bins containing fragments and 0’s to bins without fragments. Such coding annotation can be used for MRM data (Spalding et al. 2016).
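The translation from a fragmentation pattern to a binary code can be sketched as follows. The bin boundaries and fragment values here are hypothetical placeholders; in the published method the bins are chosen as the most informative m/z regions, which this sketch does not attempt.

```python
def barcode(fragment_mzs, bins):
    """Return one bit per bin: 1 if any fragment m/z falls in [low, high), else 0."""
    return [
        1 if any(low <= mz < high for mz in fragment_mzs) else 0
        for low, high in bins
    ]


# Hypothetical bins (m/z ranges) and fragments, for illustration only.
bins = [(50.0, 60.0), (60.0, 70.0), (70.0, 80.0), (80.0, 90.0)]
fragments = [55.05, 84.04]
print(barcode(fragments, bins))  # [1, 0, 0, 1]
```

Two metabolites can then be compared by their bit patterns, which is a compact representation when only a fixed set of transitions is monitored.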
This online application is a network-based computation method for annotation (Aguilar-Mogas et al. 2017).
An XGBoost-based MS/MS spectral cleaning tool using intensity ratio fluctuation, appearance rate, and relative intensity (T. Zhao et al. 2023).
Composite spectra analysis for chemical annotation of untargeted metabolomics datasets (Baygi, Kumar, and Barupal 2023).
Physicochemical properties can be used for annotation with a specific experimental design (Abrahamsson et al. 2023).