EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem
Finding DNA motifs is a widely studied area in the field of Computational Biology. Motifs signify different information that is useful for biologists. There are several variations of the motif finding problem, and one of these is called the (l, d)-motif search or Planted Motif Search problem (PMS)....
Saved in:
Main Authors: | , |
---|---|
Format: | text |
Published: |
Archīum Ateneo
2017
|
Subjects: | |
Online Access: | https://archium.ateneo.edu/discs-faculty-pubs/72 https://kkgpublications.com/ijtes-volume3-issue3-article5/ |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Ateneo De Manila University |
id |
ph-ateneo-arc.discs-faculty-pubs-1071 |
---|---|
record_format |
eprints |
spelling |
ph-ateneo-arc.discs-faculty-pubs-10712020-05-06T07:56:40Z EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem Fernandez, Proceso L, Jr Ronquillo, Mark Joseph D Finding DNA motifs is a widely studied area in the field of Computational Biology. Motifs signify different information that is useful for biologists. There are several variations of the motif finding problem, and one of these is called the (l, d)-motif search or Planted Motif Search problem (PMS). In this paper, we propose the EMS-GT2 algorithm, an extension of the Exact Motif Search-Generate and Test (EMS- GT) which is an exact enumerative algorithm for PMS. In EMS-GT2, we incorporated a new speedup technique that is based on an important property that we have discovered, which we prove in this paper, and which has enabled a more efficient block-processing of candidate motifs. Our C++ implementation of EMS-GT2 running on synthetic data for several PMS challenge instances demonstrates that it is competitive with both the EMS-GT and qPMS9, the two current best exact solutions for PMS. In particular, EMS-GT2 is able to reduce the run-times of EMS-GT by 20.3%, 15.8% and 22.6% for the (l, d) challenge instances (13, 4), (15, 5) and (17, 6) respectively. It also outperforms qPMS9, having runtime reductions of 91.6%, 79.3%, 82.0%, 59.4% and 9.7% for the (9, 2), (11, 3), (13, 4), (15, 5) and (17, 6) synthetic challenge instances respectively. 2017-06-01T07:00:00Z text https://archium.ateneo.edu/discs-faculty-pubs/72 https://kkgpublications.com/ijtes-volume3-issue3-article5/ Department of Information Systems & Computer Science Faculty Publications Archīum Ateneo Planted (l d)-Motif Problem Bit-Based Exact Enumerative Algorithm Computer Sciences Theory and Algorithms |
institution |
Ateneo De Manila University |
building |
Ateneo De Manila University Library |
country |
Philippines |
collection |
archium.Ateneo Institutional Repository |
topic |
Planted (l d)-Motif Problem Bit-Based Exact Enumerative Algorithm Computer Sciences Theory and Algorithms |
spellingShingle |
Planted (l d)-Motif Problem Bit-Based Exact Enumerative Algorithm Computer Sciences Theory and Algorithms Fernandez, Proceso L, Jr Ronquillo, Mark Joseph D EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem |
description |
Finding DNA motifs is a widely studied area in the field of Computational Biology. Motifs signify different information that is useful for biologists. There are several variations of the motif finding problem, and one of these is called the (l, d)-motif search or Planted Motif Search problem (PMS). In this paper, we propose the EMS-GT2 algorithm, an extension of the Exact Motif Search-Generate and Test (EMS- GT) which is an exact enumerative algorithm for PMS. In EMS-GT2, we incorporated a new speedup technique that is based on an important property that we have discovered, which we prove in this paper, and which has enabled a more efficient block-processing of candidate motifs. Our C++ implementation of EMS-GT2 running on synthetic data for several PMS challenge instances demonstrates that it is competitive with both the EMS-GT and qPMS9, the two current best exact solutions for PMS. In particular, EMS-GT2 is able to reduce the run-times of EMS-GT by 20.3%, 15.8% and 22.6% for the (l, d) challenge instances (13, 4), (15, 5) and (17, 6) respectively. It also outperforms qPMS9, having runtime reductions of 91.6%, 79.3%, 82.0%, 59.4% and 9.7% for the (9, 2), (11, 3), (13, 4), (15, 5) and (17, 6) synthetic challenge instances respectively. |
format |
text |
author |
Fernandez, Proceso L, Jr Ronquillo, Mark Joseph D |
author_facet |
Fernandez, Proceso L, Jr Ronquillo, Mark Joseph D |
author_sort |
Fernandez, Proceso L, Jr |
title |
EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem |
title_short |
EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem |
title_full |
EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem |
title_fullStr |
EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem |
title_full_unstemmed |
EMS-GT2: An Improved Exact Solution for the (l, d)-Planted Motif Problem |
title_sort |
ems-gt2: an improved exact solution for the (l, d)-planted motif problem |
publisher |
Archīum Ateneo |
publishDate |
2017 |
url |
https://archium.ateneo.edu/discs-faculty-pubs/72 https://kkgpublications.com/ijtes-volume3-issue3-article5/ |
_version_ |
1681506576294215680 |