An R implementation of these classifiers and the overall computational approach is available as the MetaClean package at. As a complementary approach, applying our framework to peaks retained after filtering by 30% RSD across pooled QC samples was able to further distinguish poorly integrated peaks that were not removed from filtering alone. The best performing classifier was found to be a combination of the AdaBoost algorithm and a set of 11 peak quality metrics previously explored in untargeted metabolomics and proteomics studies. These classifiers were compared to using a residual standard deviation (RSD) cut-off in pooled quality-control (QC) samples, which aims to remove peaks with analytical error. Specifically, we comprehensively and systematically compared the performance of 24 different classifiers generated by combining eight classification algorithms and three sets of peak quality metrics on the task of distinguishing reliably integrated peaks from poorly integrated ones. To address this problem, we propose a computational methodology that combines machine learning and peak quality metrics to filter out low quality peaks. As a result, the output of these pre-processing software may retain incorrectly calculated metabolite abundances that can perpetuate in downstream analyses. Despite the availability of several pre-processing software, poor peak integration remains a prevalent problem in untargeted metabolomics data generated using liquid chromatography high–resolution mass spectrometry (LC–MS).
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. Archives
January 2023
Categories |