Coding for DNA data storage

DNA has recently become an attractive medium for long-term data archiving owing to its extremely high density (zettabytes per gram), durability, and extremely low power consumption. Previous works have designed and implemented several prototypes for this emerging data storage technique where da...

Bibliographic Details
Main Author: Wang, Yixin
Other Authors: Erry Gunawan
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2022
Subjects: Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/155241
Institution: Nanyang Technological University
id sg-ntu-dr.10356-155241
record_format dspace
spelling sg-ntu-dr.10356-155241 2023-07-04T16:33:09Z Coding for DNA data storage Wang, Yixin Erry Gunawan School of Electrical and Electronic Engineering EGUNAWAN@ntu.edu.sg Engineering::Electrical and electronic engineering Doctor of Philosophy 2022-02-14T02:14:47Z 2022-02-14T02:14:47Z 2021 Thesis-Doctor of Philosophy Wang, Y. (2021). Coding for DNA data storage. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/155241 https://hdl.handle.net/10356/155241 10.32657/10356/155241 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering
spellingShingle Engineering::Electrical and electronic engineering
Wang, Yixin
Coding for DNA data storage
description DNA has recently become an attractive medium for long-term data archiving owing to its extremely high density (zettabytes per gram), durability, and extremely low power consumption. Previous works have designed and implemented several prototypes of this emerging storage technique in which data were encoded, stored, and retrieved without errors, validating the feasibility of DNA data storage. However, current DNA storage systems remain limited on several measurable performance metrics, including achieved information capacity, net information density, and scalability of random access, while the storage channel itself is only partially understood. This research aims to characterize this storage channel and to design efficient encoding and decoding algorithms for building DNA storage systems that are effective, efficient, and scalable. Specifically, error control codes tailored to the DNA storage scenario are designed to provide resilience against the inevitable errors that arise throughout the storage process, i.e., DNA synthesis, PCR amplification, sample preparation, storage, and DNA sequencing. In addition, new constrained codes are proposed as a pre-processing step that converts data into a format suitable for storage in DNA. These codes enforce several biomedical constraints, because DNA strands satisfying these constraints are more stable and more resistant to mutation. Overall, this work comprehensively studies DNA data storage technology with a focus on code, algorithm, and system design. It not only offers new design solutions for DNA storage, in the form of several high-performing codes, algorithms, and system designs, but also opens new angles on data reconstruction strategies by investigating the error characteristics of the DNA storage channel. The results presented here are expected to help advance DNA data storage toward a more efficient and practical technology.
author2 Erry Gunawan
author_facet Erry Gunawan
Wang, Yixin
format Thesis-Doctor of Philosophy
author Wang, Yixin
author_sort Wang, Yixin
title Coding for DNA data storage
title_short Coding for DNA data storage
title_full Coding for DNA data storage
title_fullStr Coding for DNA data storage
title_full_unstemmed Coding for DNA data storage
title_sort coding for dna data storage
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/155241
_version_ 1772825127419904000