Network-based screening for ultra-high dimensional survival data subject to semi-competing risks

Bibliographic Details
Main Author: Chin, Nicholas Wei Lun
Other Authors: Xiang Liming
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/156912
Institution: Nanyang Technological University
Description
Summary: With the current proliferation of scientific data of unprecedented magnitude and complexity, ultrahigh dimensional data has become common in many biological studies. Biomarker identification is a key concern for early disease detection, and the ultrahigh dimensionality of the data makes the problem considerably harder. Feature screening has become increasingly important in scientific research, but very few studies consider two types of survival endpoints, incorporate gene-gene dependencies, and account for outliers. In this paper, we enhance joint correlation rank (JCR) screening by utilising Google’s PageRank matrix to incorporate covariate-covariate network information. A nonparanormal approach is also adopted to make the screening more robust to outliers. Through a series of simulations, we highlight its improved performance in identifying active covariates accurately. For illustration, the proposed method is applied to colon cancer data, where it is assessed on prediction performance.