Ensemble of deep recurrent neural networks for identifying enhancers via dinucleotide physicochemical properties
Enhancers are short deoxyribonucleic acid fragments that play an important role in the genetic process of gene expression. Because they may be located far from the gene they act upon, enhancers are difficult to identify. There are many published works focused on identif...
Main Authors: | Tan, Kok Keng; Le, Nguyen Quoc Khanh; Yeh, Hui-Yuan; Chua, Matthew Chin Heng |
---|---|
Other Authors: | School of Humanities |
Format: | Article |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/142258 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-142258 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-142258; 2020-06-18T02:40:54Z; Ensemble of deep recurrent neural networks for identifying enhancers via dinucleotide physicochemical properties; Tan, Kok Keng; Le, Nguyen Quoc Khanh; Yeh, Hui-Yuan; Chua, Matthew Chin Heng; School of Humanities; Medical Humanities Research Cluster; Science::Biological sciences Enhancer DNA Gene Expression; Enhancers are short deoxyribonucleic acid fragments that play an important role in the genetic process of gene expression. Because they may be located far from the gene they act upon, enhancers are difficult to identify. Many published works have focused on identifying enhancers from their sequence information; however, the resulting performance still requires improvement. Using deep learning methods, this study proposes a model ensemble of classifiers for predicting enhancers based on deep recurrent neural networks. The input features of the deep ensemble networks were generated from six types of dinucleotide physicochemical properties, which outperformed the other features. In summary, our model, which used this ensemble approach, could identify enhancers with a sensitivity of 75.5%, specificity of 76%, accuracy of 75.5%, and MCC of 0.51. For classifying enhancers into strong or weak sequences, our model reached a sensitivity of 83.15%, specificity of 45.61%, accuracy of 68.49%, and MCC of 0.312. Compared to the benchmark result, our results had higher performance in terms of most measurement metrics. The results showed that deep model ensembles hold the potential for improving on the best results achieved to date using shallow machine learning methods.; Published version; 2020-06-18T02:40:54Z; 2020-06-18T02:40:54Z; 2019; Journal Article; Tan, K. K., Le, N. Q. K., Yeh, H.-Y., & Chua, M. C. H. (2019). Ensemble of deep recurrent neural networks for identifying enhancers via dinucleotide physicochemical properties. Cells, 8(7), 767.; doi:10.3390/cells8070767; 2073-4409; https://hdl.handle.net/10356/142258; 10.3390/cells8070767; 31340596; 7; 8; en; Cells; © 2019 The Author(s). Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).; application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Science::Biological sciences Enhancer DNA Gene Expression |
description |
Enhancers are short deoxyribonucleic acid fragments that play an important role in the genetic process of gene expression. Because they may be located far from the gene they act upon, enhancers are difficult to identify. Many published works have focused on identifying enhancers from their sequence information; however, the resulting performance still requires improvement. Using deep learning methods, this study proposes a model ensemble of classifiers for predicting enhancers based on deep recurrent neural networks. The input features of the deep ensemble networks were generated from six types of dinucleotide physicochemical properties, which outperformed the other features. In summary, our model, which used this ensemble approach, could identify enhancers with a sensitivity of 75.5%, specificity of 76%, accuracy of 75.5%, and MCC of 0.51. For classifying enhancers into strong or weak sequences, our model reached a sensitivity of 83.15%, specificity of 45.61%, accuracy of 68.49%, and MCC of 0.312. Compared to the benchmark result, our results had higher performance in terms of most measurement metrics. The results showed that deep model ensembles hold the potential for improving on the best results achieved to date using shallow machine learning methods. |
author2 |
School of Humanities |
author_facet |
School of Humanities; Tan, Kok Keng; Le, Nguyen Quoc Khanh; Yeh, Hui-Yuan; Chua, Matthew Chin Heng |
format |
Article |
author |
Tan, Kok Keng; Le, Nguyen Quoc Khanh; Yeh, Hui-Yuan; Chua, Matthew Chin Heng |
author_sort |
Tan, Kok Keng |
title |
Ensemble of deep recurrent neural networks for identifying enhancers via dinucleotide physicochemical properties |
publishDate |
2020 |
url |
https://hdl.handle.net/10356/142258 |
_version_ |
1681058793046147072 |