Rates of DNA sequence profiles for practical values of read lengths
A recent study by one of the authors has demonstrated the importance of profile vectors in DNA-based data storage. We provide exact values and lower bounds on the number of profile vectors for finite values of alphabet size q, read length ℓ, and word length n. Consequently, we demonstrate that for q...
Main Authors: | , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: | 2019 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/101858 http://hdl.handle.net/10220/48560 |
Institution: | Nanyang Technological University |
Summary: | A recent study by one of the authors has demonstrated the importance of profile vectors in DNA-based data storage. We provide exact values and lower bounds on the number of profile vectors for finite values of alphabet size q, read length ℓ, and word length n. Consequently, we demonstrate that for q ≥ 2 and n ≤ q^(ℓ/2−1), the number of profile vectors is at least q^(κn) with κ very close to 1. In addition to enumeration results, we provide a set of efficient encoding and decoding algorithms for certain families of profile vectors. |
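The quantity counted in the summary can be illustrated with a small brute-force sketch. It assumes the usual definition, which this record does not spell out: the profile vector of a word records how often each substring of the read length occurs in it. The names `profile_vector`, `count_profile_vectors`, `rate`, and the parameter `ell` (the read length) are hypothetical, and the enumeration over all q^n words is only practical for very small parameters. It is not the paper's encoding or decoding algorithm.

```python
from itertools import product
from math import log


def profile_vector(word, ell):
    """Count how often each length-ell substring occurs in `word` (a tuple)."""
    counts = {}
    for i in range(len(word) - ell + 1):
        gram = word[i:i + ell]
        counts[gram] = counts.get(gram, 0) + 1
    # Sorted (substring, count) pairs give a hashable, canonical form;
    # substrings that never occur are implicitly zero.
    return tuple(sorted(counts.items()))


def count_profile_vectors(q, ell, n):
    """Brute-force count of distinct profile vectors over all q**n words."""
    words = product(range(q), repeat=n)
    return len({profile_vector(w, ell) for w in words})


def rate(q, ell, n):
    """Exponent kappa such that the count equals q**(kappa * n)."""
    return log(count_profile_vectors(q, ell, n), q) / n
```

For example, with q = 2, read length 2, and n = 3, the words 010 and 101 share a profile vector, so the 8 binary words yield only 7 distinct profile vectors.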