LdClusterView : a visualization for genomics data

Bibliographic Details
Main Author: Gupta, Aakash
Other Authors: Zheng Jie
Format: Final Year Project
Language: English
Published: 2017
Subjects:
Online Access: http://hdl.handle.net/10356/70272
Institution: Nanyang Technological University
Description
Summary: The need to process large, heterogeneous, and possibly incomplete or conflicting data at scale makes the analysis of haplotype data a challenging task. Moreover, with genome sequences nearing completion and attention shifting back to research analysis, effective display of genomic sequences has become essential: billions of genomic DNA letters shown on screen as plain text are cumbersome and difficult to understand. It is therefore of paramount importance to be able to collect and digest the large amount of data about biological systems that is accumulating in the literature. Visualization has proven helpful in gaining a better understanding of such data. Researchers also wish to view all facets of genotype and haplotype data, including the spatial distribution of loci along a chromosome, the frequencies of haplotypes in different subgroups, and possibly the correlation among the haplotypes that occur. This emphasizes the need for a dynamic visualization that can address such large and complex data sets at many different levels. As a solution, the Singapore Immunology Network (SIgN) aims to provide a customizable and highly interactive display of a requested portion of a genome. Beyond kick-starting the project, SIgN aims to release it in the public domain so that collaborators from all over the world can contribute to and expand it. As a foundation, three kinds of plots have been developed to support better analysis of genomic sequences: the Manhattan Plot, the Genes Plot, and the Leaf Nodes Plot.