diff --git a/docs/404.html b/docs/404.html index ce9bef3..e52b936 100644 --- a/docs/404.html +++ b/docs/404.html @@ -23,7 +23,7 @@ - + @@ -76,7 +76,7 @@ } @media print { pre > code.sourceCode { white-space: pre-wrap; } -pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; } +pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; } } pre.numberSource code { counter-reset: source-line 0; } @@ -174,209 +174,209 @@
  • Preface
  • -
  • 2 Introduction +
  • 1 Introduction
  • -
  • 3 Experimental design(DoE) +
  • 2 Experimental design(DoE)
  • -
  • 4 Pretreatment +
  • 3 Pretreatment
  • -
  • 5 Instrumental analysis +
  • 4 Instrumental analysis
  • -
  • 6 Workflow +
  • 5 Workflow
  • -
  • 7 Raw data pretreatment +
  • 6 Raw data pretreatment
  • -
  • 8 Annotation +
  • 7 Annotation
  • -
  • 9 Omics analysis +
  • 8 Omics analysis
  • -
  • 10 Peaks normalization +
  • 9 Peaks normalization
  • -
  • 11 Statistical analysis +
  • 10 Statistical analysis
  • -
  • 12 Exposome +
  • 11 Exposome
  • References
  • diff --git a/docs/Metabolomics_files/figure-html/bem-1.png b/docs/Metabolomics_files/figure-html/bem-1.png index e38ffff..a70df89 100644 Binary files a/docs/Metabolomics_files/figure-html/bem-1.png and b/docs/Metabolomics_files/figure-html/bem-1.png differ diff --git a/docs/annotation.html b/docs/annotation.html index e32ee88..3a58a6f 100644 --- a/docs/annotation.html +++ b/docs/annotation.html @@ -4,18 +4,18 @@ - Chapter 8 Annotation | Meta-Workflow + Chapter 7 Annotation | Meta-Workflow - + - + @@ -23,7 +23,7 @@ - + @@ -76,7 +76,7 @@ } @media print { pre > code.sourceCode { white-space: pre-wrap; } -pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; } +pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; } } pre.numberSource code { counter-reset: source-line 0; } @@ -174,209 +174,209 @@
  • Preface
  • -
  • 2 Introduction +
  • 1 Introduction
  • -
  • 3 Experimental design(DoE) +
  • 2 Experimental design(DoE)
  • -
  • 4 Pretreatment +
  • 3 Pretreatment
  • -
  • 5 Instrumental analysis +
  • 4 Instrumental analysis
  • -
  • 6 Workflow +
  • 5 Workflow
  • -
  • 7 Raw data pretreatment +
  • 6 Raw data pretreatment
  • -
  • 8 Annotation +
  • 7 Annotation
  • -
  • 9 Omics analysis +
  • 8 Omics analysis
  • -
  • 10 Peaks normalization +
  • 9 Peaks normalization
  • -
  • 11 Statistical analysis +
  • 10 Statistical analysis
  • -
  • 12 Exposome +
  • 11 Exposome
  • References
  • @@ -400,8 +400,8 @@

    -
    -

    Chapter 8 Annotation

    +
    +

    Chapter 7 Annotation

    Once you have the peaks table or features table, annotating the peaks is the next step. Check this review(Domingo-Almenara, Montenegro-Burke, Benton, et al. 2018) or other reviews(Chaleckis et al. 2019; Lai et al. 2018; Nash and Dunn 2019; Mark R. Viant et al. 2017; Allard, Genta-Jouve, and Wolfender 2017) for detailed notes on annotation. The first paper proposed five levels for current computational annotation strategies.

    • Level 1: Peak Grouping: MS pseudospectra extraction based on peak shape similarity and peak abundance correlation

    • @@ -411,13 +411,13 @@

      Chapter 8 Annotation -

      8.1 Issues in annotation

      +
      +

      7.1 Issues in annotation

      The major issue in annotation is redundant peaks from the same metabolite. Unlike genes in a genome, peaks or features from peak selection are not independent of each other. Adducts, in-source fragments and isotopes can lead to wrong annotation. A common solution is to compare the mass distances between peaks against those of known adducts, neutral losses, molecular multimers or multiply charged ions.

      Another issue is the MS/MS database. Only 10% of known metabolites in databases have experimental spectral data, so in silico prediction is required. Some works try to bridge the gap between experimental data, theoretical values(from chemical databases like chemspider) and prediction. Here is a nice review about MS/MS prediction(Hufsky, Scheubert, and Böcker 2014).
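
      The mass-distance comparison described above can be sketched as follows. This is a minimal illustration, not the method of any particular tool: the distance table, the 0.005 Da tolerance and the function name `label_redundant_pairs` are assumptions made for the example, and the peaks are assumed to co-elute.

```python
from itertools import combinations

# Mass distances (Da) between common redundant peaks in positive mode.
# Illustrative values, rounded to 5 decimals.
KNOWN_DISTANCES = {
    "Na/H adduct exchange": 21.98194,
    "NH4/H adduct exchange": 17.02655,
    "K/H adduct exchange": 37.95588,
    "13C isotope": 1.00336,
}


def label_redundant_pairs(mzs, tol=0.005):
    """Return (low_mz, high_mz, label) for every pair of peaks whose
    m/z difference matches a known distance within tol (Da)."""
    hits = []
    for low, high in combinations(sorted(mzs), 2):
        delta = high - low
        for name, dist in KNOWN_DISTANCES.items():
            if abs(delta - dist) <= tol:
                hits.append((low, high, name))
    return hits


# Hypothetical co-eluting peaks: [M+H]+, its 13C isotope, [M+Na]+ and [M+K]+
coeluting = [181.0707, 182.0741, 203.0526, 219.0266]
pairs = label_redundant_pairs(coeluting)
for low, high, name in pairs:
    print(f"{low:.4f} -> {high:.4f}: {name}")
```

      In practice the pairs should also be restricted by retention time and peak shape correlation before any label is trusted.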

      -
      -

      8.2 Peak misidentification

      +
      +

      7.2 Peak misidentification

      • Isomer
      @@ -430,8 +430,8 @@

      8.2 Peak misidentification

      • In-source degradation products

    -
    -

    8.3 Annotation v.s. identification

    +
    +

    7.3 Annotation v.s. identification

    According to the definition from the Chemical Analysis Working Group of the Metabolomics Standards Initiative(Lloyd W. Sumner et al. 2007; Mark R. Viant et al. 2017), four levels of confidence could be assigned to identification:

    -
    -

    8.4 Molecular Formula Assignment

    +
    +

    7.4 Molecular Formula Assignment

    Cheminformatics helps with MS annotation. The first task is molecular formula assignment. For a given accurate mass, the formula should be constrained by predefined element types and atom numbers, a mass error window and rules of chemical bonding, such as the double bond equivalent (DBE) and the nitrogen rule. The nitrogen rule states that an odd nominal molecular mass implies an odd number of nitrogen atoms. This rule should only be used with nominal (integer) masses. The degree of unsaturation or DBE uses rings-plus-double-bonds equivalent (RDBE) values, which should be an integer; otherwise the molecular formula is not valid. The elements oxygen and sulphur are not taken into account in the calculation.

    \[RDBE = C+Si - 1/2(H+F+Cl+Br+I) + 1/2(N+P)+1 \]
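
    As a worked illustration of the RDBE formula and the nitrogen rule above, here is a minimal sketch. The helper names (`parse_formula`, `rdbe`, `passes_nitrogen_rule`) are made up for this example, and the parser assumes a simple neutral formula string without brackets or charges.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple formula string such as 'C8H10N4O2'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def rdbe(formula):
    """RDBE = C + Si - 1/2(H+F+Cl+Br+I) + 1/2(N+P) + 1 (O and S ignored)."""
    n = parse_formula(formula)
    monovalent = sum(n[e] for e in ("H", "F", "Cl", "Br", "I"))
    trivalent = n["N"] + n["P"]
    return n["C"] + n["Si"] - monovalent / 2 + trivalent / 2 + 1


def passes_nitrogen_rule(formula, nominal_mass):
    """Odd nominal mass <-> odd number of nitrogen atoms."""
    return nominal_mass % 2 == parse_formula(formula)["N"] % 2


print(rdbe("C6H6"))                            # benzene
print(rdbe("C8H10N4O2"))                       # caffeine
print(passes_nitrogen_rule("C8H10N4O2", 194))  # even mass, even N count
print(passes_nitrogen_rule("C5H5N", 79))       # odd mass, odd N count
```

    A candidate formula whose RDBE is not an integer, or which fails the nitrogen rule at its nominal mass, can be discarded before the more expensive filters are applied.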

    To assign a molecular formula to a mass-to-charge ratio, the Seven Golden Rules (Kind and Fiehn 2007) for heuristic filtering of molecular formulas should be considered:

    @@ -542,171 +542,171 @@

    8.4 Molecular Formula Assignment
  • BUDDY can perform molecular formula discovery via bottom-up MS/MS interrogation(Xing et al. 2023).
  • -
    -

    8.5 Redundant peaks

    +
    +

    7.5 Redundant peaks

    Full scan mass spectra always contain many redundant peaks such as adducts, isotopes, fragments, multiply charged ions and oligomers. Such peaks dominate the features table(Xu, Lu, and Rabinowitz 2015; Sindelar and Patti 2020; Mahieu and Patti 2017). Annotation tools can label those peaks either with a known list or by frequency analysis of the paired mass distances(Ju et al. 2020; Kouřil et al. 2020).
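
    The frequency analysis of paired mass distances mentioned above can be sketched as below. This is an illustrative simplification and not the algorithm of any cited package: the function name `pmd_frequency`, the 2 s retention time window and the rounding to two decimals are assumptions for the example.

```python
from collections import Counter
from itertools import combinations


def pmd_frequency(peaks, rt_window=2.0, digits=2):
    """Count rounded m/z distances between peaks that co-elute.

    peaks: list of (mz, rt) tuples, rt in seconds.
    Only pairs within rt_window seconds of each other are considered.
    """
    counts = Counter()
    for (mz1, rt1), (mz2, rt2) in combinations(peaks, 2):
        if abs(rt1 - rt2) <= rt_window:
            counts[round(abs(mz1 - mz2), digits)] += 1
    return counts


# Hypothetical feature table: three compounds, each seen as [M+H]+ and [M+Na]+
peaks = [
    (181.0707, 60.0), (203.0526, 60.5),
    (166.0863, 120.0), (188.0682, 120.3),
    (132.1019, 200.0), (154.0838, 200.2),
]
counts = pmd_frequency(peaks)
print(counts.most_common(3))
```

    Distances that recur across many retention time groups, such as 21.98 Da here, are candidates for adduct or isotope relationships even when no adduct list is supplied.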

    -
    -

    8.5.1 Adducts list

    +
    +

    7.5.1 Adducts list

    You could find adducts list here from commonMZ project.

    -
    -

    8.5.2 Isotope

    +
    +

    7.5.2 Isotope

    Here is Isotope pattern prediction.

    -
    -

    8.5.3 CAMERA

    +
    +

    7.5.3 CAMERA

    Common annotation for xcms workflow(Kuhl et al. 2012).

    -
    -

    8.5.4 RAMClustR

    +
    +

    7.5.4 RAMClustR

    The software could be found here (C. D. Broeckling et al. 2014; Corey D. Broeckling et al. 2016). The package included a vignette to follow.

    -
    -

    8.5.5 BioCAn

    +
    +

    7.5.5 BioCAn

    BioCAn combines the results from database searches and in silico fragmentation analyses and places these results into a relevant biological context for the sample as captured by a metabolic model (Alden et al. 2017).

    -
    -

    8.5.6 mzMatch

    +
    +

    7.5.6 mzMatch

    mzMatch is a modular, open source and platform independent data processing pipeline for metabolomics LC/MS data written in the Java language(Chokkathukalam et al. 2013; Scheltema et al. 2011). MetAssign, which is part of mzMatch, is a probabilistic annotation method using a Bayesian clustering approach(Daly et al. 2014).

    -
    -

    8.5.7 xMSannotator

    +
    +

    7.5.7 xMSannotator

    The software could be found here(Uppal, Walker, and Jones 2017).

    -
    -

    8.5.8 mWise

    +
    +

    7.5.8 mWise

    mWise is an Algorithm for Context-Based Annotation of Liquid Chromatography–Mass Spectrometry Features through Diffusion in Graphs(Barranco-Altirriba et al. 2021).

    -
    -

    8.5.9 MAIT

    +
    +

    7.5.9 MAIT

    You could find source code here(Fernández-Albert et al. 2014).

    -
    -

    8.5.10 pmd

    +
    +

    7.5.10 pmd

    Paired Mass Distance(PMD) analysis for GC/LC-MS based nontarget analysis to remove redundant peaks(M. Yu, Olkowicz, and Pawliszyn 2019).

    -
    -

    8.5.11 nontarget

    +
    +

    7.5.11 nontarget

    nontarget can group isotope and adduct peaks and perform homologue series detection (Loos and Singer 2017).

    -
    -

    8.5.12 Binner

    +
    +

    7.5.12 Binner

    Binner performs deep annotation of untargeted LC-MS metabolomics data (Kachman et al. 2020).

    -
    -

    8.5.13 mz.unity

    +
    +

    7.5.13 mz.unity

    You could find source code here (Mahieu et al. 2016) and it’s for detecting and exploring complex relationships in accurate-mass mass spectrometry data.

    -
    -

    8.5.14 MS-FLO

    +
    +

    7.5.14 MS-FLO

    MS-FLO is a tool to minimize false positive peak reports in untargeted liquid chromatography–mass spectroscopy (LC-MS) data processing (DeFelice et al. 2017).

    -
    -

    8.5.15 CliqueMS

    +
    +

    7.5.15 CliqueMS

    CliqueMS is a computational tool for annotating in-source metabolite ions from LC-MS untargeted metabolomics data based on a coelution similarity network (Senan et al. 2019).

    -
    -

    8.5.16 InterpretMSSpectrum

    +
    +

    7.5.16 InterpretMSSpectrum

    This package annotates and interprets deconvoluted mass spectra (mass*intensity pairs) from high resolution mass spectrometry devices. You could use this package to find molecular ions for GC-MS (Jaeger et al. 2016).

    -
    -

    8.5.17 NetID

    +
    +

    7.5.17 NetID

    NetID is a global network optimization approach to annotate untargeted LC-MS metabolomics data(L. Chen et al. 2021).

    -
    -

    8.5.18 ISfrag

    +
    +

    7.5.18 ISfrag

    De Novo Recognition of In-Source Fragments for Liquid Chromatography–Mass Spectrometry Data(J. Guo et al. 2021)

    -
    -

    8.5.19 FastEI

    +
    +

    7.5.19 FastEI

    Ultra-fast and accurate electron ionization mass spectrum matching for compound identification with million-scale in-silico library(Qiong Yang et al. 2023)

    -
    -

    8.6 MS1 MS2 connection

    -
    -

    8.6.1 PMDDA

    +
    +

    7.6 MS1 MS2 connection

    +
    +

    7.6.1 PMDDA

    A three-step workflow: MS1 full scan peak-picking, precursor ion selection for MS2 from the MS1 data with the GlobalStd algorithm, and MS2 data collection followed by annotation with GNPS(M. Yu, Dolios, and Petrick 2022).

    -
    -

    8.6.2 HERMES

    +
    +

    7.6.2 HERMES

    A molecular-formula-oriented method to target the metabolome(Giné et al. 2021).

    -
    -

    8.6.3 dpDDA

    +
    +

    7.6.3 dpDDA

    Similar work can be found here with inclusion list of differential and preidentified ions (dpDDA)(Y. Zhang et al. 2023).

    -
    -

    8.7 MS2 MSn connection

    +
    +

    7.7 MS2 MSn connection

    A computational approach to generate a database of high-resolution MSn spectra by converting existing low-resolution MSn spectra using complementary high-resolution MS2 spectra generated by beam-type CAD(Lieng et al. 2023).

    -
    -

    8.8 MS/MS annotation

    +
    +

    7.8 MS/MS annotation

    MS/MS annotation generates a matching score between a query spectrum and library spectra. The most popular matching algorithm is dot product similarity. Recent studies found that the spectral entropy algorithm outperformed dot product similarity(Y. Li et al. 2021; Y. Li and Fiehn 2023). A comparison of cosine, modified cosine, and neutral loss based spectrum alignment showed that modified cosine similarity outperformed neutral loss matching and cosine similarity in all cases. The performance of MS/MS spectrum alignment depends on the location and type of the modification, as well as the chemical compound class of the fragmented molecules(Bittremieux et al. 2022). Another work proposed a method weighting low-intensity MS/MS ions and m/z frequency for spectral library annotation, which helps to annotate unknown spectra(Engler Hart et al. 2024). BLINK enables ultrafast tandem mass spectrometry cosine similarity scoring(Harwood et al. 2023). MS2Query enables reliable and scalable MS2 mass spectra-based analogue search by machine learning(de Jonge et al. 2023). However, a spectroscopic test suggests that fragment ion structure annotations in MS/MS libraries are frequently incorrect(van Tetering et al. 2024).
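
    To make the two scores discussed above concrete, here is a minimal sketch of cosine (dot product) similarity and of an unweighted spectral entropy similarity. It is a simplification under stated assumptions: spectra are lists of (m/z, intensity) pairs, peaks are paired greedily within a 0.01 Da tolerance by the hypothetical helper `align`, and none of the intensity weighting or noise filtering used by published implementations is applied.

```python
import math


def align(spec_a, spec_b, tol=0.01):
    """Pair peaks of two spectra within tol Da (greedy, sorted by m/z).
    Returns (intensity_a, intensity_b) tuples; unmatched peaks get 0."""
    a, b = sorted(spec_a), sorted(spec_b)
    pairs, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if abs(a[i][0] - b[j][0]) <= tol:
            pairs.append((a[i][1], b[j][1]))
            i, j = i + 1, j + 1
        elif a[i][0] < b[j][0]:
            pairs.append((a[i][1], 0.0))
            i += 1
        else:
            pairs.append((0.0, b[j][1]))
            j += 1
    pairs += [(peak[1], 0.0) for peak in a[i:]]
    pairs += [(0.0, peak[1]) for peak in b[j:]]
    return pairs


def cosine_similarity(spec_a, spec_b, tol=0.01):
    pairs = align(spec_a, spec_b, tol)
    dot = sum(x * y for x, y in pairs)
    norm_a = math.sqrt(sum(x * x for x, _ in pairs))
    norm_b = math.sqrt(sum(y * y for _, y in pairs))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def _entropy(intensities):
    """Shannon entropy of intensities after normalising them to sum to 1."""
    total = sum(intensities)
    return -sum(v / total * math.log(v / total) for v in intensities if v > 0)


def entropy_similarity(spec_a, spec_b, tol=0.01):
    """1 - (2*S_AB - S_A - S_B) / ln(4), with S_AB from the 1:1 merged spectrum."""
    pairs = align(spec_a, spec_b, tol)
    sum_a = sum(x for x, _ in pairs)
    sum_b = sum(y for _, y in pairs)
    if not sum_a or not sum_b:
        return 0.0
    pa = [x / sum_a for x, _ in pairs]
    pb = [y / sum_b for _, y in pairs]
    merged = [x + y for x, y in zip(pa, pb)]
    return 1 - (2 * _entropy(merged) - _entropy(pa) - _entropy(pb)) / math.log(4)


query = [(85.03, 40.0), (127.04, 100.0), (145.05, 20.0)]
library = [(85.03, 35.0), (127.04, 100.0), (163.06, 10.0)]
print(cosine_similarity(query, library), entropy_similarity(query, library))
```

    Both scores equal 1 for identical spectra and 0 for spectra with no shared peaks; they differ in how strongly low-intensity fragments influence the result.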

    Machine learning can also be applied for MS2 annotation(Codrean et al. 2023; H. Guo et al. 2023; Bilbao et al. 2023).

    You could check the Workflow section for popular platforms. Here is some stand-alone annotation software:

    -
    -

    8.8.1 Matchms

    +
    +

    7.8.1 Matchms

    Matchms is an open-source Python package to import, process, clean, and compare mass spectrometry data (MS/MS). It supports an easy-to-follow, easy-to-reproduce workflow from raw mass spectra to pre- and post-processed spectral data. Spectral data can be imported from common formats such as mzML, mzXML, msp, metabolomics-USI, MGF, or json (e.g. GNPS-style json files). Matchms then provides filters for metadata cleaning and checking, as well as for basic peak filtering. Finally, matchms was built to import and apply different similarity measures to compare large numbers of spectra. This includes common Cosine scores, but can also easily be extended by custom measures. Examples of spectrum similarity measures that were designed to work in matchms are Spec2Vec and MS2DeepScore(Huber et al. 2020).

    -
    -

    8.8.2 MetDNA

    +
    +

    7.8.2 MetDNA

    MetDNA is the Metabolic reaction network-based recursive metabolite annotation for untargeted metabolomics (Shen et al. 2019).

    -
    -

    8.8.3 MetFusion

    +
    +

    7.8.3 MetFusion

    Java based integration of compound identification strategies. You could access the application here (Gerlich and Neumann 2013).

    -
    -

    8.8.4 MS2Analyzer

    +
    +

    7.8.4 MS2Analyzer

    MS2Analyzer could annotate small molecule substructure from accurate tandem mass spectra. (Ma et al. 2014)

    -
    -

    8.8.5 MetFrag

    +
    +

    7.8.5 MetFrag

    MetFrag could be used to make in silico prediction/match of MS/MS data(Ruttkies et al. 2016; Wolf et al. 2010).

    -
    -

    8.8.6 CFM-ID

    +
    +

    7.8.6 CFM-ID

    CFM-ID uses Metlin’s data to make predictions (Allen et al. 2014); version 4.0 is also available (Allen et al. 2014).

    -
    -

    8.8.7 LC-MS2Struct

    +
    +

    7.8.7 LC-MS2Struct

    A machine learning framework for structural annotation of small-molecule data arising from liquid chromatography–tandem mass spectrometry (LC-MS2) measurements.(Bach, Schymanski, and Rousu 2022)

    -
    -

    8.8.8 LipidFrag

    +
    +

    7.8.8 LipidFrag

    LipidFrag could be used to make in silico prediction/match of lipid related MS/MS data (Witting et al. 2017).

    -
    -

    8.8.9 Lipidmatch

    +
    +

    7.8.9 Lipidmatch

    Lipidmatch performs in silico lipid mass spectrum search (Koelmel et al. 2017).

    -
    -

    8.8.10 BarCoding

    +
    +

    7.8.10 BarCoding

    Bar coding selects the mass-to-charge regions containing the most informative metabolite fragments and designates them as bins. Each metabolite fragmentation pattern is then translated into a binary code by assigning 1’s to bins containing fragments and 0’s to bins without fragments. Such coding annotation could be used for MRM data (Spalding et al. 2016).
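
    The binary coding step described above can be illustrated with a short sketch. The bin boundaries and fragment masses below are hypothetical and chosen only for the example; how the informative regions are actually selected is not shown.

```python
def barcode(fragment_mzs, bins):
    """Translate a fragmentation pattern into a binary code:
    '1' for each bin holding at least one fragment, '0' otherwise.

    bins: list of (low, high) m/z regions, low inclusive, high exclusive.
    """
    return "".join(
        "1" if any(low <= mz < high for mz in fragment_mzs) else "0"
        for low, high in bins
    )


# Hypothetical informative m/z regions and fragments of one metabolite
bins = [(50, 60), (60, 70), (70, 80), (80, 90)]
code = barcode([56.05, 84.04, 84.08], bins)
print(code)  # 1001
```

    Two metabolites can then be compared or looked up by their codes instead of by full spectra.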

    -
    -

    8.8.11 iMet

    +
    +

    7.8.11 iMet

    This online application is a network-based computation method for annotation (Aguilar-Mogas et al. 2017).

    -
    -

    8.8.12 DNMS2Purifier

    +
    +

    7.8.12 DNMS2Purifier

    XGBoost based MS/MS spectral cleaning tool using intensity ratio fluctuation, appearance rate, and relative intensity(T. Zhao et al. 2023).

    -
    -

    8.8.13 IDSL.CSA

    +
    +

    7.8.13 IDSL.CSA

    Composite Spectra Analysis for Chemical Annotation of Untargeted Metabolomics Datasets(Baygi, Kumar, and Barupal 2023).

    -
    -

    8.9 Knowledge based annotation

    -
    -

    8.9.1 Experimental design

    +
    +

    7.9 Knowledge based annotation

    +
    +

    7.9.1 Experimental design

    Physicochemical properties can be used for annotation with a specific experimental design(Abrahamsson et al. 2023).

    -