Origin Discrimination Model of Crataegi Folium Based on Sparse Principal Component Analysis for Feature Selection
  
Keywords: near infrared spectroscopy; feature selection; Crataegi Folium; geographic origin discrimination; sparse principal component analysis for feature selection; support vector machines
  
Authors: LIANG Xiao-juan, WANG Ya-ni, MA Jin-fang, SUN Peng, GUO Tuo, YAN Shi-kai, XIAO Xue
Institutions:
1. School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, China
2. Institute of Traditional Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, China
3. Department of Electro-Optical Engineering, Jinan University, Guangzhou, China
4. School of Pharmacy, Shanghai Jiao Tong University, Shanghai, China
5. Innovative Institute of Chinese Medicine and Pharmacy, Shandong University of Traditional Chinese Medicine, Jinan, China
Abstract:
      A qualitative analysis method combining sparse principal component analysis for feature selection (SPCAFS) with support vector machine (SVM) modeling is proposed for rapid discrimination of the geographic origin of Crataegi Folium. Near-infrared (NIR) spectra of 123 Crataegi Folium samples, from 41 batches across 6 regions, were collected by integrating-sphere diffuse reflectance spectroscopy. After data preprocessing, representative characteristic bands were selected by SPCAFS, and an NIR origin discrimination model for Crataegi Folium was built with SVM. To evaluate its prediction performance, the model was compared with models based on three other feature selection algorithms, namely the successive projections algorithm (SPA), regularized self-representation (RSR) and sparse subspace clustering (SSC), using accuracy, precision and sensitivity as evaluation criteria. Compared with full-wavelength modeling, SPCAFS reduced the number of characteristic bands from 1500 to 21, while the accuracy and precision of the predictions improved from 78% and 76% to 97% and 100%, respectively. Compared with the SPA, RSR and SSC algorithms, the accuracy improved by 6%, 3% and 3%, and the precision by 13%, 10% and 5%, respectively. The prediction ability of the model was significantly improved. The SVM discrimination model based on SPCAFS can rapidly discriminate between the northern and southern geographic origins of Crataegi Folium.
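
The three evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `evaluate`, the labels and the toy predictions are hypothetical, and macro-averaging precision and sensitivity over the classes is an assumption, since the abstract does not say how the per-class values were combined.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return (accuracy, precision, sensitivity) for a set of predictions.

    Precision and sensitivity are macro-averaged over classes; this
    averaging is an assumption, not something the abstract specifies.
    """
    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # samples that really belong to each class
    pred_count = Counter(y_pred)   # samples assigned to each class
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct hits

    accuracy = sum(tp.values()) / len(y_true)
    # Precision: of the samples assigned to a class, the share that is correct.
    precision = sum(
        tp[c] / pred_count[c] if pred_count[c] else 0.0 for c in labels
    ) / len(labels)
    # Sensitivity (recall): of the samples in a class, the share that is found.
    sensitivity = sum(
        tp[c] / true_count[c] if true_count[c] else 0.0 for c in labels
    ) / len(labels)
    return accuracy, precision, sensitivity


# Hypothetical toy data, not taken from the paper.
y_true = ["north", "north", "north", "south", "south", "south"]
y_pred = ["north", "north", "south", "south", "south", "south"]
accuracy, precision, sensitivity = evaluate(y_true, y_pred)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"sensitivity={sensitivity:.3f}")
```

In this toy case one northern sample is misassigned to the south, which lowers the sensitivity for the northern class and the precision for the southern class, showing why the three criteria can move independently.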