On the number of DNA sequence profiles for practical values of read lengths

Bibliographic Details
Main Authors: Chang, Zuling; Chrisnata, Johan; Ezerman, Martianus Frederic; Kiah, Han Mao
Other Authors: School of Physical and Mathematical Sciences
Format: Conference or Workshop Item
Language: English
Published: 2019
Online Access: https://hdl.handle.net/10356/103253
http://hdl.handle.net/10220/48592
Institution: Nanyang Technological University
Description
Summary: A recent study by one of the authors has demonstrated the relevance of profile vectors in DNA-based data storage. We provide exact values and lower bounds on the number of profile vectors for finite values of alphabet size q, read length ℓ, and word length n. Consequently, we demonstrate that for q ≥ 3 and n = q a ℓ, a = o(ℓ), the number of profile vectors is at least q^{κn} for some constant 0 < κ ≤ 1. In addition to enumeration results, we provide a set of efficient encoding and decoding algorithms for a family of profile vectors.
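
Illustrative note: the quantity counted in the summary can be sketched by brute force for very small parameters. The sketch below assumes the usual definition of a profile vector (for a word of length n over an alphabet of size q, the vector recording how many times each length-ℓ substring occurs); this record does not state the definition, and the function names `profile_vector` and `count_profile_vectors` are hypothetical. It is not the authors' enumeration method, which the summary does not describe, and it is only feasible when q^n is tiny.

```python
from collections import Counter
from itertools import product


def profile_vector(word, ell):
    """Count how often each length-ell substring (ell-gram) occurs in word.

    Assumed definition: only ell-grams that actually occur are stored;
    every other ell-gram implicitly has count zero.
    """
    return Counter(word[i:i + ell] for i in range(len(word) - ell + 1))


def count_profile_vectors(q, ell, n):
    """Brute-force count of distinct profile vectors over all q**n words."""
    profiles = set()
    for word in product(range(q), repeat=n):
        counts = profile_vector(word, ell)
        # frozenset of (ell-gram, count) pairs is a hashable stand-in
        # for the full profile vector.
        profiles.add(frozenset(counts.items()))
    return len(profiles)
```

For example, `count_profile_vectors(2, 2, 3)` returns 7 rather than 8, because the binary words 010 and 101 contain the same 2-grams (01 and 10, once each) and so share a profile vector.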