Group-shrinkage feature selection with a spatial network for mining DNA methylation data
Main Authors: | , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: | 2023 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/170639 |
Institution: | Nanyang Technological University |
Summary: | Identifying disease-related biomarkers from high-dimensional DNA methylation data helps reduce early screening costs and infer pathogenesis mechanisms. Good discovery results have been achieved through methods that exploit the spatial correlation of methylation sites, group-based regularization, and network constraints. However, these methods have key limitations: they cannot exclude isolated differential sites, and they consider only the ordering of adjacent sites. We therefore propose a group-shrinkage feature selection algorithm that encourages the selection of clustered sites and discourages the selection of isolated differential sites. Specifically, a network-guided group-shrinkage strategy penalizes weakly correlated, isolated methylation sites through a network structure constraint. The spatial network is constructed from spatial correlation information of DNA methylation sites, which accounts for the uneven distribution of sites. Simulations and applications demonstrate that the proposed method outperforms advanced regularization methods, especially in rejecting isolated methylation sites; this study thus provides an efficient and clinically valuable method for biomarker candidate discovery in DNA methylation data. Additionally, the proposed method exhibits enhanced reliability because it introduces biological prior knowledge into a regularization-based feature selection framework, and it could promote further research on integrating biological prior knowledge with classical feature selection methods, thus facilitating their clinical application. Our source code will be released at https://github.com/SJTUBME-QianLab/Group-shrinkage-Spatial-Network once this manuscript is accepted for publication. |
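The summary describes penalizing isolated sites through a spatial network built from site positions. The sketch below is only an illustration of that general idea, not the authors' algorithm: it assumes a hypothetical exponential distance-decay kernel (`bandwidth`, `cutoff` are made-up parameters) and substitutes a simple weighted L1 soft-thresholding step for the paper's group-shrinkage penalty. All function names are invented for this example.

```python
import math

def spatial_network(positions, bandwidth=500.0, cutoff=2000.0):
    """Weighted adjacency over sites; closer sites get larger weights.
    The kernel and both parameters are assumptions for illustration."""
    n = len(positions)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist = abs(positions[i] - positions[j])  # genomic distance in bp
            if i != j and dist <= cutoff:
                weights[i][j] = math.exp(-dist / bandwidth)
    return weights

def isolation_penalties(weights):
    """Per-site penalty factor: 1.0 for a site with no neighbors,
    smaller for sites that are well connected in the network."""
    return [1.0 / (1.0 + sum(row)) for row in weights]

def weighted_soft_threshold(coefs, penalties, lam):
    """Shrink each coefficient toward zero by lam * penalty
    (a stand-in for the paper's group-shrinkage step)."""
    shrunk = []
    for b, p in zip(coefs, penalties):
        magnitude = max(abs(b) - lam * p, 0.0)
        shrunk.append(math.copysign(magnitude, b))
    return shrunk

# Three clustered sites and one isolated site, all with equal raw effect.
positions = [100, 150, 220, 50000]
coefs = [0.5, 0.5, 0.5, 0.5]
penalties = isolation_penalties(spatial_network(positions))
shrunk = weighted_soft_threshold(coefs, penalties, lam=0.6)
```

With these toy inputs the isolated site receives the full penalty and is shrunk to zero, while the clustered sites keep nonzero coefficients. Because the weights depend on actual genomic distance rather than on site order, unevenly spaced sites are handled without assuming that array neighbors are spatial neighbors.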