MAHAKIL: Diversity based oversampling approach to alleviate the class imbalance issue in software defect prediction
© 2018 ACM. This study presents MAHAKIL, a novel and efficient synthetic over-sampling approach for software defect datasets that is based on the chromosomal theory of inheritance. Exploiting this theory, MAHAKIL interprets two distinct sub-classes as parents and generates a new instance that inherits different traits from each parent and contributes to the diversity within the data distribution.
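The parent-and-child idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the record does not say how MAHAKIL forms its sub-classes or combines parents, so the sketch assumes (hypothetically) that minority instances are ranked by Euclidean distance from the class centroid, split into two groups, and blended pairwise by averaging. The names `centroid` and `blend_parents` are invented for illustration.

```python
import math


def centroid(rows):
    """Component-wise mean of a list of equal-length numeric rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def blend_parents(minority, n_new):
    """Return n_new synthetic rows built by averaging paired 'parents'.

    Assumptions (not taken from the record): distance from the centroid
    stands in for the diversity measure, and a child is the plain average
    of its two parents.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority instances to form parents")

    center = centroid(minority)
    # Rank instances from farthest to nearest the centroid.
    ranked = sorted(minority, key=lambda row: math.dist(row, center), reverse=True)

    # Split the ranking into two equally sized parent groups.
    half = len(ranked) // 2
    group_a, group_b = ranked[:half], ranked[half:2 * half]

    synthetic = []
    partners = group_b
    while len(synthetic) < n_new:
        # Each child takes the midpoint of one parent from each group.
        children = [
            [(a + b) / 2 for a, b in zip(pa, pb)]
            for pa, pb in zip(group_a, partners)
        ]
        synthetic.extend(children)
        # Next generation: pair the new children with the first group again.
        partners = children
    return synthetic[:n_new]


if __name__ == "__main__":
    defective = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [4.0, 4.0]]
    for row in blend_parents(defective, 3):
        print(row)
```

Because every child is an average of two existing points, the synthetic rows stay inside the range already covered by the minority class while filling in regions between dissimilar instances.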
Main Authors: | Kwabena E. Bennin, Jacky Keung, Passakorn Phannachitta, Akito Monden, Solomon Mensah |
---|---|
Format: | Conference Proceeding |
Published: | 2018 |
Subjects: | Computer Science |
Online Access: | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85049405348&origin=inward http://cmuir.cmu.ac.th/jspui/handle/6653943832/58500 |
Institution: | Chiang Mai University |
id |
th-cmuir.6653943832-58500 |
---|---|
record_format |
dspace |
spelling |
Record date: 2018-09-05T04:25:37Z; publication date: 2018-05-27; ISSN: 02705257; Scopus EID: 2-s2.0-85049405348; DOI: 10.1145/3180155.3182520 |
institution |
Chiang Mai University |
building |
Chiang Mai University Library |
country |
Thailand |
collection |
CMU Intellectual Repository |
topic |
Computer Science |
description |
© 2018 ACM. This study presents MAHAKIL, a novel and efficient synthetic over-sampling approach for software defect datasets that is based on the chromosomal theory of inheritance. Exploiting this theory, MAHAKIL interprets two distinct sub-classes as parents and generates a new instance that inherits different traits from each parent and contributes to the diversity within the data distribution. We extensively compare MAHAKIL with five other sampling approaches using 20 releases of defect datasets from the PROMISE repository and five prediction models. Our experiments indicate that MAHAKIL improves the prediction performance for all the models and achieves better and more significant pf (probability of false alarm) values than the other oversampling approaches, based on robust statistical tests. |
format |
Conference Proceeding |
author |
Kwabena E. Bennin Jacky Keung Passakorn Phannachitta Akito Monden Solomon Mensah |
title |
MAHAKIL: Diversity based oversampling approach to alleviate the class imbalance issue in software defect prediction |
publishDate |
2018 |