Efficient loss-less compression for genetic data

Bibliographic Details
Main Author: Tye, Yong Meng
Other Authors: Anupam Chattopadhyay
Format: Final Year Project
Language: English
Published: 2016
Online Access: http://hdl.handle.net/10356/66707
Institution: Nanyang Technological University
Description
Summary: As the use of technology grows rapidly, the amount of data created also increases exponentially. In particular, the output of DNA sequencing is growing at an accelerating rate. Efficient compression significantly reduces storage and maintenance costs. This project will therefore look into available compression algorithms that work better on such data than general-purpose compression tools. The first algorithm to be examined is logic synthesis, which takes a binary string as input, converts it into a logic circuit, and produces an optimized logic circuit as output. It will be applied to a segment of DNA sequences to determine whether it works well with such data. The second algorithm comes from the Fqzcomp program, which won first prize in the Sequence Squeeze competition because it offered the best compression ratio on DNA sequences. It will be examined, and suggestions will be proposed to make it more efficient.