Multi-Objective Genetic Algorithm-Based Sample Selection for Partial Least Squares Model Building with Applications to Near-Infrared Spectroscopic Data

Volume 60, Number 6 (June 2006), Pages 631-640

Shinzawa, Hideyuki; Li, Boyan; Nakagawa, Takehiro; Maruo, Katsuhiko; Ozaki, Yukihiro

In this study, multi-objective genetic algorithms (GAs) are applied to partial least squares (PLS) model building. The method aims to improve the performance and robustness of the PLS model by removing samples with systematic errors, including outliers, from the original data. The multi-objective GA optimizes the combination of samples to be removed. Training and validation sets are both used so that the multi-objective GA does not over-fit the training set, and reducing this over-fitting yields accurate and robust PLS models. To visualize clearly the factors behind the systematic errors, an index defined from the original PLS model and a specific Pareto-optimal solution is also introduced. The method is applied to three kinds of near-infrared (NIR) spectra to build PLS models. The results demonstrate that the multi-objective GA significantly improves the performance of the PLS models. They also show that sample selection by the multi-objective GA enhances the ability of the PLS models to detect samples with systematic errors.
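
The idea of Pareto-optimal sample removal can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not confirm: a one-variable least-squares line stands in for the PLS model, exhaustive enumeration of small removal sets stands in for the GA search, the two objectives are taken to be the root-mean-square errors on the training and validation sets, and the data are made up for illustration. The helper names (fit_line, rmse, pareto_front) are hypothetical.

```python
from itertools import combinations


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (a simple stand-in for a PLS model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def rmse(model, xs, ys):
    """Root-mean-square prediction error of the fitted line on (xs, ys)."""
    a, b = model
    return (sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


def pareto_front(points):
    """Indices of non-dominated points when every objective is minimized."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(qk <= pk for qk, pk in zip(q, p))
            and any(qk < pk for qk, pk in zip(q, p))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Made-up toy data following y = 2x + 1; training sample 3 is corrupted.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [1.0, 3.0, 5.0, 15.0, 9.0, 11.0]  # index 3 would be 7.0 if clean
val_x = [0.5, 2.5, 4.5]
val_y = [2.0, 6.0, 10.0]

# Candidate solutions: remove zero, one, or two training samples.
# (Assumption: exhaustive enumeration replaces the GA search here.)
candidates = [
    removed
    for k in range(3)
    for removed in combinations(range(len(train_x)), k)
]

# Two objectives per candidate: training RMSE and validation RMSE.
objectives = []
for removed in candidates:
    keep = [i for i in range(len(train_x)) if i not in removed]
    kept_x = [train_x[i] for i in keep]
    kept_y = [train_y[i] for i in keep]
    model = fit_line(kept_x, kept_y)
    objectives.append((rmse(model, kept_x, kept_y), rmse(model, val_x, val_y)))

front = pareto_front(objectives)
for i in front:
    print("removed:", candidates[i], "objectives:", objectives[i])
```

In this toy setting, every removal set on the Pareto front contains the corrupted sample, which mirrors the abstract's claim that the selection helps to expose samples with systematic errors. A real application would replace the line fit with a PLS model on spectra and the enumeration with a multi-objective GA, since the number of candidate removal sets grows combinatorially with the number of samples.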