Analysis of correlation-based biomolecular networks from different omics data by fitting stochastic block models

Background: Biological entities such as genes, promoters, mRNA, metabolites or proteins do not act alone, but in concert in their network context. Modules, i.e., groups of nodes with similar topological properties in these networks, characterize important biological functions of the underlying biomol...

Bibliographic Details
Main Authors: Baum, Katharina, Rajapakse, Jagath Chandana, Azuaje, Francisco
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2020
Subjects: Engineering::Computer science and engineering; Biomolecular Networks; Co-expression Networks
Online Access: https://hdl.handle.net/10356/142617
Institution: Nanyang Technological University
id sg-ntu-dr.10356-142617
record_format dspace
spelling sg-ntu-dr.10356-142617 2020-06-25T08:25:59Z
Published version
Journal Article
Baum, K., Rajapakse, J. C., & Azuaje, F. (2019). Analysis of correlation-based biomolecular networks from different omics data by fitting stochastic block models. F1000Research, 8, 465-. doi:10.12688/f1000research.18705.1
2046-1402
2-s2.0-85072696389
en
© 2019 Baum K et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Engineering::Computer science and engineering
Biomolecular Networks
Co-expression Networks
description Background: Biological entities such as genes, promoters, mRNA, metabolites or proteins do not act alone, but in concert in their network context. Modules, i.e., groups of nodes with similar topological properties in these networks, characterize important biological functions of the underlying biomolecular system. Edges in such molecular networks represent regulatory and physical interactions, and comparing them between conditions provides valuable information on differential molecular mechanisms. However, biological data is inherently noisy, and network reduction techniques can propagate errors, particularly to the level of edges. We aim to improve the analysis of networks of biological molecules by deriving modules together with edge relevance estimations that are based on global network characteristics.
Methods: The key challenge we address here is investigating the capability of stochastic block models (SBMs) for representing and analyzing different types of biomolecular networks. Fitting SBMs to such networks both delivers modules of the networks and enables the derivation of edge confidence scores, an approach that has not yet been investigated for biomolecular networks. We apply SBM-based analysis independently to three correlation-based networks of breast cancer data originating from high-throughput measurements of different molecular layers: transcriptomics, proteomics, or metabolomics. The networks were reduced by thresholding for correlation significance or by requirements on scale-freeness.
Results and discussion: We find that the networks are best represented by the hierarchical version of the SBM, and many of the predicted blocks have a biologically and phenotypically relevant functional annotation. The edge confidence scores are overall in concordance with the biological evidence given by the measurements. We conclude that biomolecular networks can be appropriately represented and analyzed by fitting SBMs. Because the SBM-derived edge confidence scores are based on global network connectivity characteristics and take potential hierarchies within the biomolecular networks into account, they could be used as additional, integrated features in network-based data comparisons. Their tight relationship to edge existence probabilities can be exploited to predict missing or spurious edges in order to improve the network representation of the underlying biological system.
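As a rough illustration of two ideas named in the description above (reducing a correlation network by a significance threshold, and relating blocks to edge existence probabilities), here is a minimal Python sketch. It is not the authors' pipeline: this record does not say which software or test they used. The Fisher z-transform test, the plain (non-hierarchical) Bernoulli SBM, the hand-assigned block labels, the toy profiles, and all function names are assumptions made for the example.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p_value(r, n):
    """Two-sided p-value for r != 0 (Fisher z-transform, normal approximation, n > 3)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def correlation_network(profiles, alpha=0.05):
    """Keep an edge wherever the pairwise correlation is significant at level alpha."""
    edges = set()
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if fisher_p_value(r, len(profiles[a])) < alpha:
            edges.add((a, b))
    return edges


def block_edge_probabilities(edges, blocks):
    """Maximum-likelihood edge probability per block pair for a plain Bernoulli SBM.

    The block assignment is taken as given here; an actual SBM fit would infer it.
    """
    sizes = {}
    for label in blocks.values():
        sizes[label] = sizes.get(label, 0) + 1
    counts = {}
    for u, v in edges:
        key = tuple(sorted((blocks[u], blocks[v])))
        counts[key] = counts.get(key, 0) + 1
    probs = {}
    labels = sorted(sizes)
    for i, r in enumerate(labels):
        for s in labels[i:]:
            # Number of node pairs that could carry an edge between blocks r and s.
            pairs = sizes[r] * (sizes[r] - 1) // 2 if r == s else sizes[r] * sizes[s]
            if pairs:
                probs[(r, s)] = counts.get((r, s), 0) / pairs
    return probs


# Hypothetical toy measurements: four molecules across eight samples.
profiles = {
    "g1": [1, 2, 3, 4, 5, 6, 7, 8],
    "g2": [2, 4, 6, 8, 10, 12, 14, 17],
    "m1": [1, 5, 5, 1, 1, 5, 5, 1],
    "m2": [2, 10, 10.5, 2, 2, 10, 10, 2],
}
blocks = {"g1": "A", "g2": "A", "m1": "B", "m2": "B"}  # assigned by hand

edges = correlation_network(profiles)
probs = block_edge_probabilities(edges, blocks)
print(sorted(edges))
print(probs)
```

In this simplified setting, a low block-pair probability for an observed edge would flag it as potentially spurious, and a high probability for an absent edge would flag it as potentially missing. The hierarchical SBM reported in the description refines this idea by nesting blocks.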
author2 School of Computer Science and Engineering
format Article
author Baum, Katharina
Rajapakse, Jagath Chandana
Azuaje, Francisco
author_sort Baum, Katharina
title Analysis of correlation-based biomolecular networks from different omics data by fitting stochastic block models
publishDate 2020
url https://hdl.handle.net/10356/142617
_version_ 1681056786021351424