Identifying significant metabolic pathways using multi-block partial least-squares analysis

In metabolomics, identification of metabolic pathways altered by disease, genetics, or environmental perturbations is crucial to uncover the underlying biological mechanisms. A number of pathway analysis methods are currently available, which are generally based on equal-probability, topological-cen...

Bibliographic Details
Main Authors: Deng, Lingli, Guo, Fanjing, Cheng, Kian-Kai, Zhu, Jiangjiang, Gu, Haiwei, Raftery, Daniel, Dong, Jiyang
Format: Article
Published: American Chemical Society 2020
Subjects: TP Chemical technology
Online Access: http://eprints.utm.my/id/eprint/87246/
http://dx.doi.org/10.1021/acs.jproteome.9b00793
Institution: Universiti Teknologi Malaysia
id my.utm.87246
record_format eprints
spelling my.utm.87246 2020-10-31T12:27:04Z http://eprints.utm.my/id/eprint/87246/ TP Chemical technology American Chemical Society 2020 Article PeerReviewed
Deng, Lingli and Guo, Fanjing and Cheng, Kian-Kai and Zhu, Jiangjiang and Gu, Haiwei and Raftery, Daniel and Dong, Jiyang (2020) Identifying significant metabolic pathways using multi-block partial least-squares analysis. Journal of Proteome Research, 19 (5). pp. 1965-1974. ISSN 1535-3893 http://dx.doi.org/10.1021/acs.jproteome.9b00793
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic TP Chemical technology
description In metabolomics, identifying the metabolic pathways altered by disease, genetics, or environmental perturbations is crucial for uncovering the underlying biological mechanisms. A number of pathway analysis methods are currently available; they are generally based on equal-probability, topological-centrality, or model-separability approaches. In brief, the first two types require prior identification of significant metabolites, while model-separability-based methods model each pathway separately. None of these methods takes interactions between metabolic pathways into consideration. The current study aims to develop a novel metabolic pathway identification method based on multi-block partial least squares (MB-PLS) analysis, which includes all pathways in a single global model to facilitate biological interpretation. The detected metabolites are first assigned to pathway blocks based on their roles in metabolism as defined by the KEGG pathway database. The metabolite intensity or concentration data matrix is then reorganized into data blocks according to these metabolite subsets, and an MB-PLS model is built on the blocks. A new metric, the pathway importance in projection (PIP), is proposed to evaluate the significance of each metabolic pathway for group separation. A simulated dataset was generated by imposing artificial perturbations on four predefined pathways in the healthy control group of a colorectal cancer study. The performance of the proposed method was evaluated and compared with seven other commonly used methods on both an actual metabolomics dataset and the simulated dataset. For the real dataset, most of the significant pathways identified by the proposed method were consistent with the published literature. For the simulated dataset, the identified significant pathways were highly consistent with the predefined pathways. These results demonstrate that the proposed method is effective for identifying significant metabolic pathways, which may facilitate biological interpretation of metabolomics data.
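The workflow outlined in the description (assign metabolites to pathway blocks, build one global model over all blocks, score each block) can be illustrated with a rough sketch. Everything in it is an assumption rather than the authors' implementation: the function names, the toy pathway map, and the synthetic data are hypothetical; the global model is approximated by ordinary PLS on block-scaled, concatenated blocks; and `block_importance` is a generic VIP-style score aggregated per block, not the PIP formula, which this record does not define.

```python
import numpy as np

def build_blocks(X, metabolites, pathway_map):
    """Split the data matrix into one block per pathway (a metabolite may appear in several blocks)."""
    index = {m: j for j, m in enumerate(metabolites)}
    blocks = {}
    for pathway, members in pathway_map.items():
        cols = [index[m] for m in members if m in index]
        if cols:
            blocks[pathway] = X[:, cols]
    return blocks

def scale_block(B):
    """Autoscale columns, then divide by sqrt(#columns) so every block has the same total variance."""
    Z = (B - B.mean(axis=0)) / B.std(axis=0, ddof=1)
    return Z / np.sqrt(B.shape[1])

def pls1_nipals(X, y, n_components):
    """PLS1 via NIPALS. Returns unit-norm weights W (p x A), scores T (n x A), y-loadings q (A,)."""
    X = X.copy()
    y = y - y.mean()
    W, T, q = [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)
        t = X @ w
        tt = t @ t
        p = X.T @ t / tt
        c = (y @ t) / tt
        X = X - np.outer(t, p)   # deflate X
        y = y - c * t            # deflate y
        W.append(w); T.append(t); q.append(c)
    return np.array(W).T, np.array(T).T, np.array(q)

def block_importance(W, T, q, block_slices):
    """VIP-style score per block: squared block weights averaged over components,
    weighted by the y-variance each component explains."""
    ss = (q ** 2) * np.sum(T ** 2, axis=0)
    n_blocks = len(block_slices)
    scores = {}
    for name, sl in block_slices.items():
        wb2 = np.sum(W[sl, :] ** 2, axis=0)
        scores[name] = float(np.sqrt(n_blocks * np.sum(ss * wb2) / np.sum(ss)))
    return scores

# Synthetic example: 20 controls, 20 cases, 6 metabolites, 2 hypothetical pathways.
rng = np.random.default_rng(0)
metabolites = ["m1", "m2", "m3", "m4", "m5", "m6"]
pathway_map = {"pathway_A": ["m1", "m2", "m3"], "pathway_B": ["m4", "m5", "m6"]}
X = rng.normal(size=(40, 6))
y = np.array([0.0] * 20 + [1.0] * 20)
X[20:, :3] += 2.0  # perturb only pathway_A in the case group

blocks = build_blocks(X, metabolites, pathway_map)
scaled, block_slices, start = [], {}, 0
for name, B in blocks.items():
    scaled.append(scale_block(B))
    block_slices[name] = slice(start, start + B.shape[1])
    start += B.shape[1]

W, T, q = pls1_nipals(np.hstack(scaled), y, n_components=2)
scores = block_importance(W, T, q, block_slices)
print(scores)
```

Because the weight vectors have unit length, the squared scores sum to the number of blocks, so a score above 1 marks a block that contributes more than average to group separation; in this toy setup the perturbed block should score higher than the unperturbed one.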
format Article
author Deng, Lingli
Guo, Fanjing
Cheng, Kian-Kai
Zhu, Jiangjiang
Gu, Haiwei
Raftery, Daniel
Dong, Jiyang
author_sort Deng, Lingli
title Identifying significant metabolic pathways using multi-block partial least-squares analysis
publisher American Chemical Society
publishDate 2020
url http://eprints.utm.my/id/eprint/87246/
http://dx.doi.org/10.1021/acs.jproteome.9b00793
_version_ 1683230713835421696