Error studies for DNA data storage

Bibliographic Details
Main Author: Ng, Yi Mei
Other Authors: Erry Gunawan
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Subjects:
Online Access: https://hdl.handle.net/10356/136748
Institution: Nanyang Technological University
Description
Summary: Due to its large storage capacity and its ability to preserve data, deoxyribonucleic acid (DNA) has become an attractive medium for storing the growing amount of information produced today. Many studies have explored ways to tap the potential of DNA storage. However, due to the complex nature of DNA, errors such as deletions, insertions and substitutions can occur during the synthesis, sequencing or storage stages. Data recovery therefore becomes a challenge, because the data retrieved directly from the medium are altered by these errors. Error studies of DNA storage are crucial for understanding the error characteristics of the channel before deciding which error-correcting codes to implement. In this final report, comprehensive error studies were conducted using various analysis methods. The error studies are based on experimental sequencing results obtained with Illumina HiSeq sequencing technology. An experiment with 5000 oligos was simulated based on the results of the error analyses, considering mainly substitution errors and using the substitution error probabilities. Subsequently, a Reed-Solomon code was implemented to correct the generated errors, and the error performance after decoding was evaluated.
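
The simulation step described in the summary can be illustrated with a minimal sketch of a substitution-only channel. This is not the report's code: the oligo length, the random seed, the function names and every value in the substitution matrix SUB_PROB are assumptions chosen for illustration, and the report's own probabilities came from its sequencing data. Only the count of 5000 oligos is taken from the summary.

```python
import random

BASES = "ACGT"

# Hypothetical substitution matrix: SUB_PROB[x][y] is the probability that
# base x is read back as base y. Illustrative values only, not measured data.
SUB_PROB = {
    "A": {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01},
    "C": {"A": 0.01, "C": 0.97, "G": 0.01, "T": 0.01},
    "G": {"A": 0.01, "C": 0.01, "G": 0.97, "T": 0.01},
    "T": {"A": 0.01, "C": 0.01, "G": 0.01, "T": 0.97},
}


def random_oligo(length, rng):
    """Generate a random oligo of the given length."""
    return "".join(rng.choice(BASES) for _ in range(length))


def substitute(oligo, rng):
    """Pass an oligo through the substitution-only channel."""
    out = []
    for base in oligo:
        row = SUB_PROB[base]
        weights = [row[b] for b in BASES]
        out.append(rng.choices(BASES, weights=weights)[0])
    return "".join(out)


def simulate(num_oligos=5000, length=100, seed=0):
    """Return the observed per-base substitution rate over all oligos."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(num_oligos):
        original = random_oligo(length, rng)
        received = substitute(original, rng)
        # Insertions and deletions are not modelled, so lengths always match.
        errors += sum(a != b for a, b in zip(original, received))
    return errors / (num_oligos * length)


if __name__ == "__main__":
    print(f"observed substitution rate: {simulate():.4f}")
```

Because the channel only substitutes bases, the received oligo keeps its length, which is what would let a block code such as Reed-Solomon be applied to the simulated reads afterwards. That decoding step is not shown here.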