Comparison Between Linear and Non-linear Variable Selection Methods with Applications to Spectroscopic (UV-Vis/NIR) Data
Main Authors: | , , , , , |
---|---|
Language: | English |
Published: | Science Faculty of Chiang Mai University, 2020 |
Subjects: | |
Online Access: | http://epg.science.cmu.ac.th/ejournal/dl.php?journal_id=10583 http://cmuir.cmu.ac.th/jspui/handle/6653943832/67342 |
Institution: | Chiang Mai University |
Summary: | Variable selection aims to identify the parameters that matter most for the predicted responses, and the variables selected can differ depending on the method used. In this research, the important variables identified by a linear and a non-linear variable selection method, based respectively on partial least squares-variable importance in prediction (PLS-VIP) and the self-organizing map discrimination index (SOM-DI), were compared. Two datasets were used for the demonstration: near-infrared (NIR) spectra of adulterated Thai Jasmine rice and ultraviolet-visible (UV-Vis) spectra of food colorant mixtures. The advantages and disadvantages of the different algorithms were compared and discussed. For the NIR data, the calibration model using the supervised self-organizing map (SSOM) offered better prediction results, and the SOM-DI variable selection method identified the spectral changes in the NIR overtone regions as significant. In contrast, the PLS calibration model gave higher prediction errors, while the PLS-VIP variable selection captured variation in the visible region between 664 nm and 884 nm. For the UV-Vis data, PLS appeared to focus only on the region of maximum peak absorbance, whereas the SSOM model highlighted the variation around the isosbestic spectral regions between the mixture components. The drawback of using a mixture design to construct the calibration models, which can lead to wrong interpretation of the important variables, was also discussed. |
---|