QCKer: An x86-AVX/AVX2 implementation of Q-gram counting filter for DNA sequence alignment

This paper presents an implementation of the q-gram counting filter using x86 AVX/AVX2 SIMD instructions. The research yielded three findings. First, to eliminate inconsistency between the theoretical and experimental results, synthetic reads are generated using DNA...

Bibliographic Details
Main Authors: Pernez, Joven L., Uy, Roger Luis, Borja, Kaizen Vinz A., Maghirang, Jan Carlo G.
Format: text
Published: Animo Repository 2019
Subjects: Simultaneous multithreading processors; Nucleotide sequence; Gene mapping; Computer Sciences
Online Access: https://animorepository.dlsu.edu.ph/faculty_research/4037
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:faculty_research-4948
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:faculty_research-4948 2021-08-13T07:12:46Z
2019-12-19T08:00:00Z text https://animorepository.dlsu.edu.ph/faculty_research/4037 info:doi/10.1145/3383783.3383806 Faculty Research Work Animo Repository Simultaneous multithreading processors Nucleotide sequence Gene mapping Computer Sciences
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
topic Simultaneous multithreading processors
Nucleotide sequence
Gene mapping
Computer Sciences
description This paper presents an implementation of the q-gram counting filter using x86 AVX/AVX2 SIMD instructions. The research yielded three findings. First, to eliminate inconsistency between the theoretical and experimental results, synthetic reads are generated using only the DNA character "T", because randomly generated synthetic reads make the number of seed instances variable and therefore unpredictable. Second, three implementation parameters, namely prefetching, multithreading, and the AVX instruction sets, are switched on and off to measure their effect on speed. The results show a 2% speedup with prefetching, a 2.7% speedup with the AVX instruction sets, a 100.41% speedup with multithreading, and a 112.25% speedup when all three are used. Multithreading therefore has the largest effect among these parameters. Third, the x86-AVX implementation is compared with Razers3, an existing read mapper that uses a q-gram counting filter. Considering the filter alone, the x86-AVX implementation is 12x faster than Razers3 for a small seed size of 4. Razers3, however, outperforms the x86-AVX implementation for longer seeds (i.e., a seed size of 12), which is attributed to Razers3 being optimized for q-grams of length 12 or higher. Based on these findings, real datasets are recommended over synthetic datasets, as is a multithreaded implementation. Future work could compare the multithreaded implementation with an FPGA implementation. © 2019 ACM.
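The description names the q-gram counting filter but does not show it. The following is a minimal scalar sketch in Python, not the authors' AVX/AVX2 code. It assumes the standard q-gram lemma threshold (read length - q + 1 - q * errors) and counts shared q-grams per diagonal, which is exact for mismatches only; insertions and deletions shift hits onto neighboring diagonals, which this sketch ignores. All function names are hypothetical.

```python
from collections import defaultdict


def qgrams(seq, q):
    """Return every overlapping substring of length q (the seeds)."""
    return [seq[i:i + q] for i in range(len(seq) - q + 1)]


def build_index(reference, q):
    """Map each q-gram to the list of positions where it starts in the reference."""
    index = defaultdict(list)
    for pos, gram in enumerate(qgrams(reference, q)):
        index[gram].append(pos)
    return index


def qgram_threshold(read_len, q, max_errors):
    """q-gram lemma: minimum number of q-grams a read shares with a true match."""
    return read_len - q + 1 - q * max_errors


def count_filter(read, index, q, max_errors):
    """Return reference start positions (diagonals) that pass the counting filter."""
    threshold = qgram_threshold(len(read), q, max_errors)
    if threshold <= 0:
        raise ValueError("threshold is not positive; the filter cannot reject anything")
    counts = defaultdict(int)
    for offset, gram in enumerate(qgrams(read, q)):
        # Assumption: mismatches only, so a hit at reference position pos
        # votes for the single diagonal pos - offset.
        for pos in index.get(gram, ()):
            counts[pos - offset] += 1
    return sorted(d for d, c in counts.items() if c >= threshold)
```

A read made only of "T" has a fully predictable seed set: a read of length n yields n - q + 1 copies of the same q-gram, which is consistent with the reason the description gives for using such reads. Calling `count_filter(read, build_index(reference, q), q, max_errors)` returns the candidate start positions that a verification step would then align in full.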
format text
author Pernez, Joven L.
Uy, Roger Luis
Borja, Kaizen Vinz A.
Maghirang, Jan Carlo G.
author_sort Pernez, Joven L.
title QCKer: An x86-AVX/AVX2 implementation of Q-gram counting filter for DNA sequence alignment
publisher Animo Repository
publishDate 2019
url https://animorepository.dlsu.edu.ph/faculty_research/4037
_version_ 1767196015262171136