Bug or not? Bug Report classification using N-gram IDF
© 2017 IEEE. Previous studies have found that a significant number of bug reports are misclassified between bugs and nonbugs, and that manually classifying bug reports is a time-consuming task. To address this problem, we propose a bug report classification model with N-gram IDF, a theoretical exte...
Main Authors: | Pannavat Terdchanakul, Hideaki Hata, Passakorn Phannachitta, Kenichi Matsumoto |
---|---|
Format: | Conference Proceeding |
Published: | 2018 |
Subjects: | Computer Science Engineering |
Online Access: | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85040599854&origin=inward http://cmuir.cmu.ac.th/jspui/handle/6653943832/57043 |
Institution: | Chiang Mai University |
id |
th-cmuir.6653943832-57043 |
---|---|
record_format |
dspace |
spelling |
th-cmuir.6653943832-57043; 2018-09-05T03:38:05Z; Bug or not? Bug Report classification using N-gram IDF; Pannavat Terdchanakul; Hideaki Hata; Passakorn Phannachitta; Kenichi Matsumoto; Computer Science Engineering; 2018-09-05T03:34:18Z; 2018-09-05T03:34:18Z; 2017-11-02; Conference Proceeding; 2-s2.0-85040599854; 10.1109/ICSME.2017.14; https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85040599854&origin=inward; http://cmuir.cmu.ac.th/jspui/handle/6653943832/57043 |
institution |
Chiang Mai University |
building |
Chiang Mai University Library |
country |
Thailand |
collection |
CMU Intellectual Repository |
topic |
Computer Science Engineering |
description |
© 2017 IEEE. Previous studies have found that a significant number of bug reports are misclassified between bugs and nonbugs, and that manually classifying bug reports is time-consuming. To address this problem, we propose a bug report classification model based on N-gram IDF, a theoretical extension of Inverse Document Frequency (IDF) for handling words and phrases of any length. N-gram IDF enables us to extract key terms of any length from texts; these key terms can be used as features to classify bug reports. We build classification models with logistic regression and random forest using features from N-gram IDF and from topic modeling, which is widely used in various software engineering tasks. On a publicly available dataset, our N-gram IDF-based models outperform the topic-based models in all of the evaluated cases. Our models show promising results and have the potential to be extended to other software engineering tasks. |
format |
Conference Proceeding |
author |
Pannavat Terdchanakul Hideaki Hata Passakorn Phannachitta Kenichi Matsumoto |
title |
Bug or not? Bug Report classification using N-gram IDF |
publishDate |
2018 |
url |
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85040599854&origin=inward http://cmuir.cmu.ac.th/jspui/handle/6653943832/57043 |
_version_ |
1681424805209833472 |
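
The description field above mentions extracting key terms of any length and using them as classification features. The following is only a rough sketch of that idea, not the authors' method: it applies ordinary IDF, log(|D| / df), to word n-grams, which is a simplification of the N-gram IDF weighting named in the record. The sample reports, the function names, the `max_n` cutoff, and the "appears in at least two documents" filter are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_ngrams(text, max_n=3):
    """Return the set of word n-grams (n = 1..max_n) found in one document."""
    tokens = text.lower().split()
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def document_frequencies(docs, max_n=3):
    """Count, for each n-gram, how many documents contain it."""
    df = Counter()
    for doc in docs:
        df.update(word_ngrams(doc, max_n))
    return df


def idf_scores(df, total_docs):
    """Ordinary IDF per n-gram (a stand-in, not the paper's N-gram IDF)."""
    return {gram: math.log(total_docs / count) for gram, count in df.items()}


def to_features(doc, vocabulary, max_n=3):
    """Binary presence vector of a document over the chosen key terms."""
    grams = word_ngrams(doc, max_n)
    return [1 if term in grams else 0 for term in vocabulary]


# Hypothetical bug-tracker entries, invented for this sketch.
reports = [
    "app crashes with null pointer exception on startup",
    "null pointer exception when saving file",
    "please add dark mode to settings",
    "add export option to settings page",
]

df = document_frequencies(reports)
scores = idf_scores(df, len(reports))

# Assumed filter: keep terms of any length that occur in two or more reports.
vocabulary = sorted(gram for gram, count in df.items() if count >= 2)
features = [to_features(report, vocabulary) for report in reports]
```

The resulting `features` rows could then be passed, together with bug/nonbug labels, to a classifier such as logistic regression or random forest (for example from scikit-learn); that step is left out here to keep the sketch dependency-free.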