LdClusterView : a system for automated analysis and visualization of genomics data
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2018 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/73950 |
Institution: | Nanyang Technological University |
Summary: | In the study of genetics, researchers explore billions of deoxyribonucleic acid (DNA) bases to
identify biologically interesting patterns. To help researchers explore this voluminous data,
bioinformatics scientists have developed genome browsers that provide a
platform for better understanding the data. Similarly, in this project, Singapore Immunity
Network (SIgN) aimed to develop an interactive web-based visualization platform for
researchers. The visualizations created were LdClusterView, an improvement on current
genome browsers, and the Biostatistical Network Tool (BNT), a tool to identify genes of interest for
further analysis.
Most genome browsers visualize the relationships between different biological layers
through multiple graphical plots stacked on top of each other, with a common horizontal axis
representing the chromosome length. However, this layout shows only the spatial relationship between
different biological data at various regions of the chromosome and does not depict the
complex relationships between genetic variations.
LdClusterView extended the basic layout of stacked plots by incorporating a dendrogram and a
Sankey plot to describe the relationships between the stacked plots. These additions
illustrated the relationships between the plots and the relationships between the
internal elements of the plots, respectively. However, because the web
application could not display a large amount of data at once, only one gene could be displayed at a time.
Therefore, another web application, BNT, was created to complement LdClusterView.
BNT explored an emerging method of associating gene information with other types of
biological data by analysing the data through non-parametric tests, plots and a sub-network
graph in the form of a Minimum Spanning Tree (MST), in order to identify interesting gene candidates for
further exploration in LdClusterView.
Both applications were implemented with HTML, CSS, JavaScript and the D3 library. Both were
designed to be easy for researchers to use when exploring the data and producing
visualizations for reporting purposes. |
---|
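The summary mentions that BNT presents a sub-network graph as a Minimum Spanning Tree (MST). As a rough illustration of that idea only, the sketch below builds an MST with Kruskal's algorithm. It is not the project's code (the record states the applications were written in JavaScript with D3), and the gene names, the edge weights, and the choice of Kruskal's algorithm are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# An edge is (weight, node_a, node_b). The weight is a hypothetical
# dissimilarity between two genes; the record does not say how BNT
# weights its network.
Edge = Tuple[float, str, str]


def minimum_spanning_tree(nodes: List[str], edges: List[Edge]) -> List[Edge]:
    """Return the MST edges using Kruskal's algorithm with union-find."""
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(n: str) -> str:
        # Follow parent links to the component root, shortening the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree: List[Edge] = []
    for weight, a, b in sorted(edges):  # cheapest edges first
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two separate components
            parent[root_a] = root_b
            tree.append((weight, a, b))
    return tree


# Hypothetical genes and pairwise distances, invented for illustration.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
distances = [
    (0.2, "GENE_A", "GENE_B"),
    (0.9, "GENE_A", "GENE_C"),
    (0.4, "GENE_B", "GENE_C"),
    (0.7, "GENE_B", "GENE_D"),
    (0.3, "GENE_C", "GENE_D"),
]

mst = minimum_spanning_tree(genes, distances)
for weight, a, b in mst:
    print(f"{a} -- {b} ({weight})")
```

The tree keeps one fewer edge than there are genes, so a dense network reduces to its strongest links, which is one plausible reason an MST is useful for picking out candidate genes.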