Early Detection of Glutaric Acidemia Type I by Urinary Metabolomics Analysis Based on Gas Chromatography-Mass Spectrometry Coupled with Chemometrics
  
Keywords: glutaric acidemia type Ⅰ; early detection; gas chromatography-mass spectrometry; partial least squares discriminant analysis; bootstrap; stable feature selection
  
Authors: XIAO Wen, NIU Qian-qian, SUN Zhi-yong, YANG Qin, WU Ben-qing
Institutions: 1. School of Physics and Optoelectronic Engineering, Yangtze University, Jingzhou, China; 2. Rare Disease Engineering Research Center of Metabolomics in Precision Medicine, Shenzhen Aone Medical Laboratory Co., Ltd., Shenzhen, China; 3. Shenzhen Hospital, University of Chinese Academy of Sciences, Shenzhen, China
Abstract:
      An efficient early detection framework for glutaric acidemia type Ⅰ (GA-Ⅰ) was developed using urinary metabolomics analysis based on gas chromatography-mass spectrometry (GC-MS) coupled with chemometrics, with the aim of overcoming the modeling problems posed by small sample sizes and high dimensionality. The framework builds on the ability of partial least squares discriminant analysis (PLS-DA) to handle collinearity and support data interpretation. Bootstrap resampling was introduced to perturb the data and induce multiple base classifiers, and their feature selection results were integrated to form a novel algorithm, BS-PLSDA. Using three informative vectors, namely loading weights (LW), variable importance in the projection (VIP) and significance multivariate correlation (sMC), BS-PLSDA screened for discriminative features strong enough to survive across multiple base classifiers. When applied to GC-MS urinary metabolomic profiles of GA-Ⅰ, the BS-PLSDA variants for all three informative vectors outperformed the corresponding single-classifier PLS-DA models in selection stability, even when the sample partitioning ratio was changed from 7:3 to 6:4, which increased the sample differences among training sets. At a partitioning ratio of 7:3, the Kuncheva index of BS-VIP-PLSDA reached 0.8075. Furthermore, the stable discriminative features that were screened showed close biological correlations with the metabolic mechanism of GA-Ⅰ and included several previously reported diagnostic organic acids. They also yielded the desired predictive power: the average area under the receiver operating characteristic curve (AUC) was 0.7739, 0.8548 and 0.8471, and the average Matthews correlation coefficient (MCC) was 0.6719, 0.7838 and 0.8013 for BS-LW-PLSDA, BS-VIP-PLSDA and BS-sMC-PLSDA, respectively. Finally, PLS-DA was compared with support vector machine recursive feature elimination (SVM-RFE). Equipped with the same ensemble feature selection strategy, the model BS-RBF-SVMRFE, which uses a nonlinear radial basis function (RBF) kernel, was superior to the BS-PLSDA models in classification performance but had poor model interpretability. Overall, the results showed that the proposed BS-PLSDA is feasible in terms of both classification performance and data interpretation, and therefore meets clinical demand well. It can guide the early detection of GA-Ⅰ and aid clinical diagnosis and understanding of the disease mechanism.
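
The selection stability reported in the abstract is measured with the Kuncheva index, which compares feature subsets chosen on different training sets and corrects for overlap expected by chance. The following is a minimal sketch of that index together with one possible way to keep features that recur across bootstrap-induced base classifiers. It is not the authors' implementation: the abstract does not state the aggregation rule, so the function `stable_features` and its `min_frequency` threshold are hypothetical, and the index is written in its standard pairwise form for equal-sized subsets.

```python
from collections import Counter
from itertools import combinations


def kuncheva_index(a, b, n_features):
    """Kuncheva consistency index for two equal-sized feature subsets.

    KI = (r*n - k^2) / (k*(n - k)), where r is the overlap size,
    k the subset size and n the total number of features.
    """
    a, b = set(a), set(b)
    k = len(a)
    if len(b) != k:
        raise ValueError("subsets must have the same size")
    if not 0 < k < n_features:
        raise ValueError("subset size must satisfy 0 < k < n_features")
    r = len(a & b)
    return (r * n_features - k * k) / (k * (n_features - k))


def mean_kuncheva(subsets, n_features):
    """Average the Kuncheva index over all pairs of selected subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two subsets")
    return sum(kuncheva_index(a, b, n_features) for a, b in pairs) / len(pairs)


def stable_features(subsets, min_frequency=0.8):
    """Keep features selected in at least `min_frequency` of the subsets.

    Assumption: a simple frequency vote; the threshold is illustrative only.
    """
    counts = Counter(f for s in subsets for f in set(s))
    m = len(subsets)
    return sorted(f for f, c in counts.items() if c / m >= min_frequency)


# Toy example: three subsets of size 3 drawn from 10 features.
toy_subsets = [[1, 2, 3], [1, 2, 4], [1, 3, 4]]
toy_stability = mean_kuncheva(toy_subsets, n_features=10)
toy_stable = stable_features(toy_subsets, min_frequency=0.8)
```

In this sketch each subset would come from ranking features by LW, VIP or sMC within one base classifier; an index of 1 means identical subsets, and values near 0 mean the overlap is no better than chance.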