FeatureFinderMetabo - RASpicer/MetabolomicsTools GitHub Wiki

FeatureFinderMetabo

Version: 2.2.0

Website

http://ftp.mi.fu-berlin.de/pub/OpenMS/release-documentation/html/TOPP_FeatureFinderMetabo.html

Description

FeatureFinderMetabo provides label-free quantification of metabolites in LC-MS data. It is available as part of OpenMS. The algorithm has two stages: mass trace detection and feature assembly.

During mass trace detection, the peaks of each data set are sorted by decreasing intensity, and peaks below a user-defined threshold are filtered out. Each remaining peak, starting with the most intense, is then considered a potential seeding point for mass trace construction. A mass trace is a series of peaks with a similar m/z that occur in adjacent scans. Each mass trace is extended from its seeding point in both directions along the retention time axis. Further peaks with a similar m/z are recruited into the trace according to an intensity-weighted Gaussian model, which reflects that low-intensity peaks are less reliable than high-intensity peaks. Extension is aborted when a specified number of scans have been searched without finding an adequate peak. Once a peak has been added to a trace, it cannot be used as a seeding point or be added to another trace.

The mass traces then need to be separated, so that each trace contains only a single feature. LOWESS with a polynomial of degree 2 is used as the smoothing technique. It is then tested whether the chromatographic peaks within a trace are sufficiently separated to be considered separate peaks. If they are, the minimum between the two chromatographic peaks is used as the splitting point.

During feature assembly, the aim is to find mass traces originating from the same progenitor metabolite and to cluster them. To identify isotopes, a HiRes-generated library of isotopic masses is used. A scoring function assesses how likely it is that a set of mass traces is caused by the same metabolite, based on precomputed mass difference distributions of potential metabolite compositions. A correlation similarity score is used to assess whether mass traces grouped by m/z are also compatible in their elution times (the traces must coincide over at least 70% of the full width at half-maximum). For each hypothesis a combined score is calculated. To give preference to high-intensity signals, the scores are weighted by peak area, normalised by the total sum over the mass traces. The result is a list of features that should not contain multiple peaks for a single metabolite.
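
The mass trace detection stage described above can be illustrated with a minimal Python sketch. This is not the OpenMS implementation: the function name detect_mass_traces and the parameters noise_threshold, mz_tol_ppm and max_missing_scans are hypothetical, and a fixed ppm window around an intensity-weighted mean m/z stands in for the intensity-weighted Gaussian model. It only shows the greedy flow: filter, sort by decreasing intensity, seed, extend in both directions, abort after too many empty scans, and never reuse a peak.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    scan: int         # index of the scan, in retention time order
    mz: float
    intensity: float


def detect_mass_traces(scans, noise_threshold=100.0, mz_tol_ppm=10.0,
                       max_missing_scans=2):
    """Greedy mass trace detection (simplified illustration, not OpenMS code).

    scans: list of scans in retention time order; each scan is a list of
           (mz, intensity) tuples.
    Returns a list of traces; each trace is a list of Peak sorted by scan.
    """
    # Filter peaks below the threshold and sort by decreasing intensity.
    peaks = [Peak(i, mz, inten)
             for i, scan in enumerate(scans)
             for mz, inten in scan
             if inten >= noise_threshold]
    peaks.sort(key=lambda p: p.intensity, reverse=True)

    used = set()  # (scan, mz) keys of peaks already assigned to a trace
    traces = []

    def best_match(scan_idx, centroid_mz):
        """Closest unused peak within the ppm window in one scan, or None."""
        tol = centroid_mz * mz_tol_ppm * 1e-6
        candidates = [(abs(mz - centroid_mz), mz, inten)
                      for mz, inten in scans[scan_idx]
                      if inten >= noise_threshold
                      and (scan_idx, mz) not in used
                      and abs(mz - centroid_mz) <= tol]
        if not candidates:
            return None
        _, mz, inten = min(candidates)
        return Peak(scan_idx, mz, inten)

    for seed in peaks:
        if (seed.scan, seed.mz) in used:
            continue  # a recruited peak can no longer act as a seed
        used.add((seed.scan, seed.mz))
        trace = [seed]
        weighted_mz = seed.mz * seed.intensity
        weight_sum = seed.intensity

        # Extend along the retention time axis in both directions.
        for step in (-1, 1):
            missing = 0
            idx = seed.scan + step
            while 0 <= idx < len(scans) and missing < max_missing_scans:
                hit = best_match(idx, weighted_mz / weight_sum)
                if hit is None:
                    missing += 1  # one more scan without an adequate peak
                else:
                    missing = 0
                    used.add((hit.scan, hit.mz))
                    trace.append(hit)
                    # Update the intensity-weighted mean m/z of the trace.
                    weighted_mz += hit.mz * hit.intensity
                    weight_sum += hit.intensity
                idx += step

        trace.sort(key=lambda p: p.scan)
        traces.append(trace)

    return traces


if __name__ == "__main__":
    example = [
        [(100.0000, 50.0)],
        [(100.0001, 500.0), (200.0000, 300.0)],
        [(100.0000, 1000.0), (200.0002, 400.0)],
        [(100.0002, 600.0)],
        [],
        [(100.0000, 200.0)],
    ]
    for trace in detect_mass_traces(example):
        print([(p.scan, p.mz) for p in trace])
```

In this sketch a larger max_missing_scans lets a trace bridge gaps in the signal, while a smaller value breaks the same signal into several shorter traces. The later steps (smoothing, splitting at the minimum, and feature assembly) are not covered here.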

Functionality

  • Preprocessing

Instrument Data Type

  • MS/LC-MS/Centroid LC-MS

Approaches

Computer Skills

  • Advanced
  • Medium

Software Type

Package

Interface

  • Command line interface
  • Graphical User Interface

Operating System (OS)

Unix/Linux

Language

C++

Dependencies

N/A

Input Formats - Open

ConsensusXML, DTA, DTA2D, EDTA, featureXML, KROENIK, MGF, ms2, mzML, mzXML, mzData, PEPLIST, TSV

Input Formats - Proprietary

fid/XMASS

Published

2014

Last Updated

2017

License

Three-clause BSD license

Paper

http://www.ncbi.nlm.nih.gov/pubmed/24176773

PMID

24176773