Dual-dropout graph convolutional network for predicting synthetic lethality in human cancers

Bibliographic Details
Main Authors: CAI, Ruichu, CHEN, Xuexin, FANG, Yuan, WU, Min, HAO, Yuexing
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
Online Access: https://ink.library.smu.edu.sg/sis_research/5127
https://ink.library.smu.edu.sg/context/sis_research/article/6130/viewcontent/BIOINFO20_DDGCN_av.pdf
Institution: Singapore Management University
Description
Summary:
Motivation: Synthetic lethality (SL) is a promising form of gene interaction for cancer therapy, because it identifies specific genes to target in cancer cells without disrupting normal cells. As high-throughput wet-lab settings are often costly and face various challenges, computational approaches have become a practical complement. In particular, predicting SLs can be formulated as a link prediction task on a graph of interacting genes. Although matrix factorization techniques have been widely adopted in link prediction, they map genes to latent representations in isolation, without aggregating information from neighboring genes. Graph convolutional networks (GCNs) can capture such neighborhood dependency in a graph. However, applying GCNs to SL prediction remains challenging because SL interactions are extremely sparse, which makes overfitting more likely.
Results: In this article, we propose a novel dual-dropout GCN (DDGCN) for learning more robust gene representations for SL prediction. We employ both coarse-grained node dropout and fine-grained edge dropout, because standard dropout in a vanilla GCN is often inadequate for reducing overfitting on sparse graphs. In particular, coarse-grained node dropout efficiently and systematically enforces dropout at the node (gene) level, while fine-grained edge dropout further fine-tunes the dropout at the interaction (edge) level. We further present a theoretical framework to justify our model architecture. Finally, we conduct extensive experiments on human SL datasets, and the results demonstrate the superior performance of our model in comparison with state-of-the-art methods.
Availability and implementation: DDGCN is implemented in Python 3.7 and is open-source and freely available at https://github.com/CXX1113/Dual-DropoutGCN.
Supplementary information: Supplementary data are available at Bioinformatics online.
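Illustrative note: the summary describes two levels of dropout on the gene graph, node-level and edge-level, feeding a link prediction task. The sketch below is one possible reading of that idea on a toy graph. It is not the authors' implementation (see the repository linked above for that). All names here (`node_dropout`, `edge_dropout`, `normalize`, `propagate`, `score_pairs`), the dropout rates, the single linear propagation step and the inner-product scoring are assumptions made for illustration only.

```python
import numpy as np


def node_dropout(adj, rate, rng):
    """Coarse-grained dropout: remove every edge touching a dropped node."""
    keep = rng.random(adj.shape[0]) >= rate   # True for nodes that survive
    return adj * np.outer(keep, keep)         # zero rows and columns of dropped nodes


def edge_dropout(adj, rate, rng):
    """Fine-grained dropout: remove individual edges, keeping the matrix symmetric."""
    upper = np.triu(rng.random(adj.shape) >= rate, k=1)  # sample each pair once
    return adj * (upper | upper.T)


def normalize(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 commonly used in GCNs."""
    a = adj + np.eye(adj.shape[0])            # self-loops keep every degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def propagate(adj, features, weight):
    """One linear graph-convolution step: aggregate neighbors, then project."""
    return normalize(adj) @ features @ weight


def score_pairs(emb):
    """Link prediction scores as the sigmoid of embedding inner products."""
    return 1.0 / (1.0 + np.exp(-(emb @ emb.T)))


rng = np.random.default_rng(0)

# Toy symmetric SL graph over 5 hypothetical genes (1 = known SL pair).
A = np.array([
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 0, 0, 1, 0],
], dtype=float)

X = np.eye(5)                                 # one-hot gene features
W = rng.normal(scale=0.5, size=(5, 3))        # untrained projection weights

# Apply both dropout levels before propagation (training-time behavior).
A_dropped = edge_dropout(node_dropout(A, 0.2, rng), 0.2, rng)
embeddings = propagate(A_dropped, X, W)
scores = score_pairs(embeddings)
```

In this sketch both dropout steps only ever remove edges, so the perturbed graph is always a subgraph of the original. A trained model would learn `W` from the known SL pairs and would disable both dropout steps at prediction time.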