Establishing edge labels for benchmarking gene co-expression networks of Saccharomyces cerevisiae and Homo sapiens

Bibliographic Details
Main Author: Antony Velankanni Jenet Princy
Other Authors: Marek Mutwil
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/176362
Institution: Nanyang Technological University
Description
Summary: Relying on a single type of correlation to construct Gene Co-expression Networks (GCNs) with ensemble approaches hinders a comprehensive understanding of gene co-expression. To address this, Lim et al. (unpublished) propose the Two-Tier Ensemble Aggregation (TEA) GCN, which combines GCNs generated using different correlation coefficients and different dataset partitions, and which was developed for Arabidopsis thaliana. This project extends the evaluation of TEA-GCN to Saccharomyces cerevisiae and Homo sapiens by labelling edges and benchmarking TEA-GCN against other state-of-the-art ensemble methodologies. To this end, we generated labelled edges from gene lists in public databases covering metabolic pathways, Gene Ontology, and transcription factors. These edges were further refined to obtain the best evaluation data for assessing network performance with Receiver Operating Characteristic and Precision-Recall curves. We show that TEA-GCN outperformed the state-of-the-art GCNs for both species, in addition to Arabidopsis thaliana. This demonstrates that the robustness of TEA-GCN’s methodology generalises across the diverse transcriptional programs of fungal, plant, and mammalian species, and of both unicellular and multicellular organisms. Furthermore, this work has produced the first comprehensive benchmarking data for Homo sapiens and Saccharomyces cerevisiae GCNs, which will be instrumental to the development of more advanced GCN methods.
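The edge-labelling and evaluation steps described in the summary can be illustrated with a minimal Python sketch. This is not the project's actual pipeline: the function names (`label_edges`, `auroc`, `average_precision`), the toy gene sets, and the rule that two annotated genes sharing no gene list form a negative edge are all assumptions made for illustration. The record does not state how negative edges were defined or refined.

```python
from itertools import combinations


def label_edges(gene_sets):
    """Label gene pairs from annotation gene lists (hypothetical helper).

    gene_sets maps an annotation term (e.g. a pathway or GO term) to a set
    of gene IDs. A pair is labelled 1 if both genes share at least one term,
    and 0 if both genes are annotated but share none (an assumed rule).
    """
    positives = set()
    for genes in gene_sets.values():
        positives.update(combinations(sorted(genes), 2))
    annotated = sorted(set().union(*gene_sets.values()))
    return {pair: int(pair in positives) for pair in combinations(annotated, 2)}


def auroc(scores, labels):
    """Area under the ROC curve: the probability that a random positive edge
    scores higher than a random negative edge (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Average precision, a common summary of the Precision-Recall curve:
    the mean of precision values at the rank of each positive edge."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy gene lists (invented IDs, not real annotations).
gene_sets = {
    "pathway_A": {"g1", "g2", "g3"},
    "pathway_B": {"g4", "g5"},
}
edge_labels = label_edges(gene_sets)

# Invented co-expression scores from a hypothetical network.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
print(auroc(scores, labels), average_precision(scores, labels))
```

Comparing networks then amounts to scoring the same labelled edges with each network's co-expression strengths and comparing the resulting curve summaries.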