Classification of Raw and Roasted Semen Cassiae Samples with the Use of Fourier Transform Infrared Fingerprints and Least Squares Support Vector Machines
Volume 64, Number 6 (June 2010), Pages 649-656
Lai, Yanhua; Ni, Yongnian; Kokot, Serge
Raw and roasted Semen Cassiae seeds, a complex traditional Chinese medicine (TCM), are used as examples to research and develop a method of classification analysis based on measurements of Fourier transform infrared (FT-IR) spectral fingerprints. Eighty samples of the TCM were measured in the mid-infrared range, 400-2000 cm−1 (KBr pellets), and the complex overlapping spectra were submitted for interpretation to a principal component analysis least squares support vector machine (PC-LS-SVM), kernel principal component analysis least squares support vector machine (KPC-LS-SVM), and radial basis function artificial neural networks (RBF-ANN). The LS-SVM models were developed with an RBF kernel function and a grid search technique. Training models were constructed with the use of raw and first-derivative spectra and these were then verified by another data set containing both raw and roasted spectral objects. It was demonstrated that the first-derivative data set produced the best separation of the spectral objects. In general, satisfactory analytical performance was obtained with the PC-LS-SVM, KPC-LS-SVM, and RBF-ANN training models and with the classification of the verification spectral objects. With regard to chemometrics modeling, the performance of KPC-LS-SVM was somewhat more economical than that of the PC-LS-SVM model. It would appear that the latter relatively simple model would be sufficient for application to most small to medium sized FT-IR fingerprint data sets, but with larger matrices the more complex models, such as the RBF-ANN and KPC-LS-SVM, may be more advantageous on a computational basis.
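The PC-LS-SVM workflow outlined in the abstract (first-derivative spectra, principal component scores, an LS-SVM classifier with an RBF kernel tuned by grid search, then a separate verification set) can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the article: the spectra are synthetic, and the 40/40 class balance, the 60/20 training/verification split, the use of three principal components, the hold-out validation inside the grid search, and the grid values for `gamma` and `sigma` are all hypothetical. The classifier follows the standard textbook LS-SVM linear system, which may differ in detail from the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for FT-IR fingerprints (NOT the article's data).
wavenumbers = np.linspace(400, 2000, 200)  # cm^-1, range quoted in the abstract

def band(center, width):
    """Gaussian absorption band on the wavenumber axis."""
    return np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)

def simulate(n, roasted):
    """n noisy spectra; the 'roasted' class gets a hypothetical extra band."""
    base = band(1000, 60) + (0.5 * band(1600, 40) if roasted else 0.0)
    offset = rng.normal(0.0, 0.2, size=(n, 1))            # per-sample baseline shift
    noise = rng.normal(0.0, 0.01, size=(n, wavenumbers.size))
    return base + offset + noise

X = np.vstack([simulate(40, False), simulate(40, True)])
y = np.r_[-np.ones(40), np.ones(40)]                      # -1 = raw, +1 = roasted

# First-derivative spectra by finite differences (removes the constant baseline).
dX = np.gradient(X, axis=1)

# Assumed split: 60 training objects, 20 verification objects.
idx = rng.permutation(len(y))
train, verify = idx[:60], idx[60:]
y_train, y_verify = y[train], y[verify]

# PCA via SVD, fitted on the training spectra only.
mean = dX[train].mean(axis=0)
_, _, Vt = np.linalg.svd(dX[train] - mean, full_matrices=False)

def pc_scores(A, k=3):
    return (A - mean) @ Vt[:k].T

scale = pc_scores(dX[train])[:, 0].std()                  # one common scale factor
T_train = pc_scores(dX[train]) / scale
T_verify = pc_scores(dX[verify]) / scale

def rbf_kernel(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(T, labels, gamma, sigma):
    """Solve the LS-SVM classifier system [[0, y^T], [y, Omega + I/gamma]]."""
    n = len(labels)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = labels
    A[1:, 0] = labels
    A[1:, 1:] = np.outer(labels, labels) * rbf_kernel(T, T, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.r_[0.0, np.ones(n)])
    return sol[0], sol[1:]                                # bias b, multipliers alpha

def lssvm_predict(T_new, T, labels, b, alpha, sigma):
    return np.sign(rbf_kernel(T_new, T, sigma) @ (alpha * labels) + b)

# Grid search over (gamma, sigma) using a hold-out part of the training set.
fit, val = np.arange(40), np.arange(40, 60)
best_acc, gamma, sigma = -1.0, None, None
for g in (1.0, 10.0, 100.0):
    for s in (0.5, 1.0, 2.0):
        b, alpha = lssvm_fit(T_train[fit], y_train[fit], g, s)
        pred = lssvm_predict(T_train[val], T_train[fit], y_train[fit], b, alpha, s)
        acc = np.mean(pred == y_train[val])
        if acc > best_acc:
            best_acc, gamma, sigma = acc, g, s

# Refit on the full training set and classify the verification objects.
b, alpha = lssvm_fit(T_train, y_train, gamma, sigma)
verify_pred = lssvm_predict(T_verify, T_train, y_train, b, alpha, sigma)
verify_accuracy = np.mean(verify_pred == y_verify)
print(f"gamma={gamma}, sigma={sigma}, verification accuracy={verify_accuracy:.2f}")
```

Because fitting an LS-SVM reduces to solving one linear system of size n + 1, each grid point is cheap for data sets of this size. Replacing the SVD step with a kernel PCA would give a KPC-LS-SVM variant of the same sketch.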