A sparse QSRR model for predicting retention indices of essential oils based on robust screening approach
Main Authors: | , , , |
---|---|
Format: | Article |
Published: | Taylor and Francis Ltd., 2017 |
Online Access: | http://eprints.utm.my/id/eprint/75754/ https://www.scopus.com/inward/record.uri?eid=2-s2.0-85030458483&doi=10.1080%2f1062936X.2017.1375010&partnerID=40&md5=47d22807f6a4795a52fa1244310bb90b |
Institution: | Universiti Teknologi Malaysia |
Summary: | A robust screening approach and a sparse quantitative structure–retention relationship (QSRR) model are proposed for predicting the retention indices (RIs) of 169 constituents of essential oils. The approach proceeds in two steps. First, dimension reduction is performed using the proposed modified robust sure independence screening (MR-SIS) method. Second, RIs are predicted using the proposed robust sparse QSRR with smoothly clipped absolute deviation (SCAD) penalty (RSQSRR). The RSQSRR model was internally and externally validated based on (Formula presented.), (Formula presented.), (Formula presented.), (Formula presented.), the Y-randomization test, (Formula presented.), (Formula presented.), and the applicability domain. The validation results indicate that the model is robust and that its performance is not due to chance correlation. On the training dataset, the descriptor selection and prediction performance of the RSQSRR outperform those of the other two modelling methods considered. The RSQSRR shows the highest (Formula presented.), (Formula presented.), and (Formula presented.), and the lowest (Formula presented.). For the test dataset, the RSQSRR shows a high external validation value ((Formula presented.)) and a low value of (Formula presented.) compared with the other methods, indicating its higher predictive ability. In conclusion, the results show that the proposed RSQSRR is an efficient approach for modelling high-dimensional QSRRs and that the method is useful for estimating the RIs of essential oils that have not been experimentally tested. |
---|
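The two-step idea in the summary (screen the descriptors first, then fit a sparse SCAD-penalized model on the survivors) can be illustrated with a minimal sketch. This record does not give the details of MR-SIS or RSQSRR, so the code below is not the authors' method: it assumes Spearman rank correlation as a stand-in robust screening statistic and an ordinary least-squares loss with the standard SCAD thresholding rule (a = 3.7) fitted by coordinate descent. All function names are hypothetical.

```python
import numpy as np

def rank(v):
    # Ranks 0..n-1; ties are broken by position, adequate for continuous descriptors.
    return np.argsort(np.argsort(v))

def screen_by_rank_correlation(X, y, keep):
    """Assumed stand-in for robust screening: keep the `keep` columns of X
    with the largest absolute Spearman rank correlation with y."""
    ry = rank(y).astype(float)
    ry -= ry.mean()
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        rx = rank(X[:, j]).astype(float)
        rx -= rx.mean()
        denom = np.sqrt((rx @ rx) * (ry @ ry))
        scores[j] = abs(rx @ ry) / denom if denom > 0 else 0.0
    return np.sort(np.argsort(scores)[::-1][:keep])

def scad_threshold(z, lam, a=3.7):
    """Univariate SCAD solution for a standardized predictor."""
    az = abs(z)
    if az <= 2 * lam:
        return np.sign(z) * max(az - lam, 0.0)       # soft-thresholding zone
    if az <= a * lam:
        return ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)
    return z                                          # large effects are not shrunk

def fit_scad(X, y, lam, a=3.7, n_iter=200):
    """Coordinate descent for least squares + SCAD (assumption: non-robust loss).
    Returns (intercept, coefficients) on the original descriptor scale."""
    n, p = X.shape
    mu, sd = X.mean(axis=0), X.std(axis=0)
    Z = (X - mu) / sd                                 # standardized columns
    y0 = y.mean()
    r = y - y0                                        # current residual
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            zj = Z[:, j] @ r / n + beta[j]
            new = scad_threshold(zj, lam, a)
            r -= Z[:, j] * (new - beta[j])
            beta[j] = new
    coef = beta / sd
    return y0 - mu @ coef, coef
```

In use, one would call `screen_by_rank_correlation` on the full descriptor matrix, pass only the retained columns to `fit_scad`, and choose `lam` by cross-validation. A robust variant would replace the squared-error residual update with a robust loss, which this sketch does not attempt.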