Feature importance network reveals novel functional relationships between biological features in Arabidopsis thaliana
This thesis consists of two projects. The first is on the development of an online database for analysing diurnal plant gene expression. This was my first project which took up around one year of my PhD. The second is on using machine learning to identify novel biological relationships amongst Arabi...
Main Author: | Ng, Jonathan Wei Xiong |
---|---|
Other Authors: | Marek Mutwil |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | Science::Biological sciences |
Online Access: | https://hdl.handle.net/10356/164425 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-164425 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1644252023-02-28T18:38:31Z Feature importance network reveals novel functional relationships between biological features in Arabidopsis thaliana Ng, Jonathan Wei Xiong Marek Mutwil School of Biological Sciences mutwil@ntu.edu.sg Science::Biological sciences This thesis consists of two projects. The first is on the development of an online database for analysing diurnal plant gene expression. This was my first project which took up around one year of my PhD. The second is on using machine learning to identify novel biological relationships amongst Arabidopsis thaliana genes. This is my main PhD project which took up the bulk of my time. Almost all organisms coordinate some aspects of their biology through the diurnal cycle. Photosynthetic organisms, and plants especially, have established complex programs that coordinate physiological, metabolic and developmental processes with the changing light. The diurnal regulation of the underlying transcriptional processes is observed when groups of functionally related genes (gene modules) are expressed at a specific time of the day. However, studying the diurnal regulation of these gene modules in the plant kingdom was hampered by the large amount of data required for the analyses. To meet this need, I used gene expression data from 17 diurnal studies spanning the whole Archaeplastida kingdom (Plantae kingdom in the broad sense) to make an online diurnal database. I have equipped the database with tools that allow user-friendly cross-species comparisons of gene expression profiles, entire co-expression networks, co-expressed clusters (involved in specific biological processes), time-specific gene expression, and others. I exemplify how these tools can be used by studying three important biological questions: (i) the evolution of cell division, (ii) the diurnal control of gene modules in algae and (iii) the conservation of diurnally controlled modules across species. 
The database is freely available at https://diurnal.plant.tools/. Understanding how the different cellular components are working together to form a living cell requires multidisciplinary approaches combining molecular and computational biology. Machine learning shows great potential in life sciences, as it can find novel relationships between biological features. Here, I constructed a dataset of 11,801 gene features for 31,522 A. thaliana genes and developed a machine learning workflow to identify linked features. The detected linked features are visualised as a Feature Importance Network (FIN), which can be mined to reveal a variety of novel biological insights pertaining to gene function. I demonstrate how FIN can be used to generate novel insights into gene function. To make this network easily accessible to the scientific community, I present the FINder database, available at finder.plant.tools (https://finder.plant.tools). Doctor of Philosophy 2023-01-25T06:36:15Z 2023-01-25T06:36:15Z 2022 Thesis-Doctor of Philosophy Ng, J. W. X. (2022). Feature importance network reveals novel functional relationships between biological features in Arabidopsis thaliana. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/164425 https://hdl.handle.net/10356/164425 10.32657/10356/164425 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Science::Biological sciences |
spellingShingle |
Science::Biological sciences Ng, Jonathan Wei Xiong Feature importance network reveals novel functional relationships between biological features in Arabidopsis thaliana |
description |
This thesis consists of two projects. The first is on the development of an online database for analysing diurnal plant gene expression. This was my first project, which took up around one year of my PhD. The second is on using machine learning to identify novel biological relationships amongst Arabidopsis thaliana genes. This is my main PhD project, which took up the bulk of my time.
Almost all organisms coordinate some aspects of their biology through the diurnal cycle. Photosynthetic organisms, and plants especially, have established complex programs that coordinate physiological, metabolic and developmental processes with the changing light. The diurnal regulation of the underlying transcriptional processes is observed when groups of functionally related genes (gene modules) are expressed at a specific time of the day. However, studying the diurnal regulation of these gene modules in the plant kingdom was hampered by the large amount of data required for the analyses. To meet this need, I used gene expression data from 17 diurnal studies spanning the whole Archaeplastida kingdom (Plantae kingdom in the broad sense) to make an online diurnal database. I have equipped the database with tools that allow user-friendly cross-species comparisons of gene expression profiles, entire co-expression networks, co-expressed clusters (involved in specific biological processes), time-specific gene expression, and others. I exemplify how these tools can be used by studying three important biological questions: (i) the evolution of cell division, (ii) the diurnal control of gene modules in algae and (iii) the conservation of diurnally controlled modules across species. The database is freely available at https://diurnal.plant.tools/.
Understanding how the different cellular components work together to form a living cell requires multidisciplinary approaches combining molecular and computational biology. Machine learning shows great potential in life sciences, as it can find novel relationships between biological features. Here, I constructed a dataset of 11,801 gene features for 31,522 A. thaliana genes and developed a machine learning workflow to identify linked features. The detected linked features are visualised as a Feature Importance Network (FIN), which can be mined to reveal a variety of novel biological insights pertaining to gene function. I demonstrate how FIN can be used to generate novel insights into gene function. To make this network easily accessible to the scientific community, I present the FINder database, available at https://finder.plant.tools. |
author2 |
Marek Mutwil |
author_facet |
Marek Mutwil Ng, Jonathan Wei Xiong |
format |
Thesis-Doctor of Philosophy |
author |
Ng, Jonathan Wei Xiong |
author_sort |
Ng, Jonathan Wei Xiong |
title |
Feature importance network reveals novel functional relationships between biological features in Arabidopsis thaliana |
title_short |
Feature importance network reveals novel functional relationships between biological features in Arabidopsis thaliana |
title_full |
Feature importance network reveals novel functional relationships between biological features in Arabidopsis thaliana |
title_fullStr |
Feature importance network reveals novel functional relationships between biological features in Arabidopsis thaliana |
title_full_unstemmed |
Feature importance network reveals novel functional relationships between biological features in Arabidopsis thaliana |
title_sort |
feature importance network reveals novel functional relationships between biological features in arabidopsis thaliana |
publisher |
Nanyang Technological University |
publishDate |
2023 |
url |
https://hdl.handle.net/10356/164425 |
_version_ |
1759854997869166592 |