EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem

Finding DNA motifs is a widely studied area in the field of Computational Biology. Motifs signify different information that is useful for biologists. There are several variations of the motif finding problem, and one of these is called the (l, d)-motif search or Planted Motif Search problem (PMS)....

Full description

Saved in:
Bibliographic Details
Main Authors: Fernandez, Proceso L, Jr, Ronquillo, Mark Joseph D
Format: text
Published: Archīum Ateneo 2017
Subjects:
Online Access:https://archium.ateneo.edu/discs-faculty-pubs/72
https://kkgpublications.com/ijtes-volume3-issue3-article5/
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Ateneo De Manila University
id ph-ateneo-arc.discs-faculty-pubs-1071
record_format eprints
spelling ph-ateneo-arc.discs-faculty-pubs-10712020-05-06T07:56:40Z EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem Fernandez, Proceso L, Jr Ronquillo, Mark Joseph D Finding DNA motifs is a widely studied area in the field of Computational Biology. Motifs signify different information that is useful for biologists. There are several variations of the motif finding problem, and one of these is called the (l, d)-motif search or Planted Motif Search problem (PMS). In this paper, we propose the EMS-GT2 algorithm, an extension of the Exact Motif Search-Generate and Test (EMS- GT) which is an exact enumerative algorithm for PMS. In EMS-GT2, we incorporated a new speedup technique that is based on an important property that we have discovered, which we prove in this paper, and which has enabled a more efficient block-processing of candidate motifs. Our C++ implementation of EMS-GT2 running on synthetic data for several PMS challenge instances demonstrates that it is competitive with both the EMS-GT and qPMS9, the two current best exact solutions for PMS. In particular, EMS-GT2 is able to reduce the run-times of EMS-GT by 20.3%, 15.8% and 22.6% for the (l, d) challenge instances (13, 4), (15, 5) and (17, 6) respectively. It also outperforms qPMS9, having runtime reductions of 91.6%, 79.3%, 82.0%, 59.4% and 9.7% for the (9, 2), (11, 3), (13, 4), (15, 5) and (17, 6) synthetic challenge instances respectively. 2017-06-01T07:00:00Z text https://archium.ateneo.edu/discs-faculty-pubs/72 https://kkgpublications.com/ijtes-volume3-issue3-article5/ Department of Information Systems & Computer Science Faculty Publications Archīum Ateneo Planted (l d)-Motif Problem Bit-Based Exact Enumerative Algorithm Computer Sciences Theory and Algorithms
institution Ateneo De Manila University
building Ateneo De Manila University Library
country Philippines
collection archium.Ateneo Institutional Repository
topic Planted (l
d)-Motif Problem
Bit-Based
Exact Enumerative Algorithm
Computer Sciences
Theory and Algorithms
spellingShingle Planted (l
d)-Motif Problem
Bit-Based
Exact Enumerative Algorithm
Computer Sciences
Theory and Algorithms
Fernandez, Proceso L, Jr
Ronquillo, Mark Joseph D
EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem
description Finding DNA motifs is a widely studied area in the field of Computational Biology. Motifs signify different information that is useful for biologists. There are several variations of the motif finding problem, and one of these is called the (l, d)-motif search or Planted Motif Search problem (PMS). In this paper, we propose the EMS-GT2 algorithm, an extension of the Exact Motif Search-Generate and Test (EMS- GT) which is an exact enumerative algorithm for PMS. In EMS-GT2, we incorporated a new speedup technique that is based on an important property that we have discovered, which we prove in this paper, and which has enabled a more efficient block-processing of candidate motifs. Our C++ implementation of EMS-GT2 running on synthetic data for several PMS challenge instances demonstrates that it is competitive with both the EMS-GT and qPMS9, the two current best exact solutions for PMS. In particular, EMS-GT2 is able to reduce the run-times of EMS-GT by 20.3%, 15.8% and 22.6% for the (l, d) challenge instances (13, 4), (15, 5) and (17, 6) respectively. It also outperforms qPMS9, having runtime reductions of 91.6%, 79.3%, 82.0%, 59.4% and 9.7% for the (9, 2), (11, 3), (13, 4), (15, 5) and (17, 6) synthetic challenge instances respectively.
format text
author Fernandez, Proceso L, Jr
Ronquillo, Mark Joseph D
author_facet Fernandez, Proceso L, Jr
Ronquillo, Mark Joseph D
author_sort Fernandez, Proceso L, Jr
title EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem
title_short EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem
title_full EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem
title_fullStr EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem
title_full_unstemmed EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem
title_sort ems-gt2: an improved exact solution for the (l, d)-planted motif problem
publisher Archīum Ateneo
publishDate 2017
url https://archium.ateneo.edu/discs-faculty-pubs/72
https://kkgpublications.com/ijtes-volume3-issue3-article5/
_version_ 1681506576294215680