Disease gene classification with metagraph representations

This chapter exploits metagraphs, network-based representations of proteins, in a protein-protein interaction network to identify candidate disease-causing proteins. Protein-protein interaction (PPI) networks are effective tools for studying the functional roles of proteins in the dev...

Bibliographic Details
Main Authors: KIRCALI ATA, Sezin, FANG, Yuan, WU, Min, LI, Xiao-Li, XIAO, Xiaokui
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2018
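Subjects: Disease protein prediction; Metagraph; Protein representations; Protein-protein interaction; Uniprot keywords; Databases and Information Systems; Medicine and Health Sciences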
Online Access: https://ink.library.smu.edu.sg/sis_research/4230
https://ink.library.smu.edu.sg/context/sis_research/article/5233/viewcontent/Disease_gene_manuscript.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-5233
record_format dspace
spelling sg-smu-ink.sis_research-5233
2019-02-07T04:11:57Z
Disease gene classification with metagraph representations
KIRCALI ATA, Sezin
FANG, Yuan
WU, Min
LI, Xiao-Li
XIAO, Xiaokui
2018-07-01T07:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/4230
info:doi/10.1007/978-1-4939-8561-6_16
https://ink.library.smu.edu.sg/context/sis_research/article/5233/viewcontent/Disease_gene_manuscript.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Disease protein prediction
Metagraph
Protein representations
Protein-protein interaction
Uniprot keywords
Databases and Information Systems
Medicine and Health Sciences
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Disease protein prediction
Metagraph
Protein representations
Protein-protein interaction
Uniprot keywords
Databases and Information Systems
Medicine and Health Sciences
description This chapter exploits metagraphs, network-based representations of proteins, in a protein-protein interaction network to identify candidate disease-causing proteins. Protein-protein interaction (PPI) networks are effective tools for studying the functional roles of proteins in the development of various diseases. However, they are insufficient without the support of additional biological knowledge about proteins, such as their molecular functions and biological processes. To enhance PPI networks, we also utilize the biological properties of individual proteins. More specifically, we integrate keywords from the UniProt database that describe protein properties into the PPI network and construct a novel heterogeneous PPI-Keyword (PPIK) network consisting of both proteins and keywords. As proteins with similar functional duties or involved in the same metabolic pathway tend to have similar topological characteristics, we propose to represent them with metagraphs. Compared to the traditional network motif or subgraph, a metagraph can capture topological arrangements through not only protein-protein interactions but also protein-keyword associations. We feed these novel metagraph representations into classifiers for disease protein prediction and conduct experiments on three different PPI databases. The results show that the proposed method consistently improves disease protein prediction performance across various classifiers, by 15.3% in AUC on average. It outperforms the diffusion-based (e.g., RWR) and module-based baselines by 13.8–32.9% in overall disease protein prediction. In breast cancer protein prediction, it outperforms RWR, PRINCE, and the module-based baselines by 6.6–14.2%. Finally, our predictions also exhibit better correlations with literature findings from the PubMed database.
format text
author KIRCALI ATA, Sezin
FANG, Yuan
WU, Min
LI, Xiao-Li
XIAO, Xiaokui
title Disease gene classification with metagraph representations
publisher Institutional Knowledge at Singapore Management University
publishDate 2018
url https://ink.library.smu.edu.sg/sis_research/4230
https://ink.library.smu.edu.sg/context/sis_research/article/5233/viewcontent/Disease_gene_manuscript.pdf
_version_ 1770574494946033664