Evaluating Defect Prediction using a Massive Set of Metrics

To evaluate the performance of a within-project defect prediction approach, researchers normally use precision, recall, and F-measure scores. However, the machine learning literature offers a large number of evaluation metrics (e.g., Matthews Correlation Coefficient, G-means), and these metrics evaluate an approach from different aspects. In this paper, we investigate the performance of within-project defect prediction approaches on a large number of evaluation metrics. We choose 6 state-of-the-art approaches that are widely used in the defect prediction literature: naive Bayes, decision tree, logistic regression, kNN, random forest, and Bayesian network. We evaluate these 6 approaches on 14 evaluation metrics (e.g., G-mean, F-measure, balance, MCC, J-coefficient, and AUC). Our goal is to explore a practical and comprehensive way of evaluating prediction approaches. We evaluate the performance of the defect prediction approaches on 10 defect datasets from the PROMISE repository. The results show that Bayesian network achieves noteworthy performance: it achieves the best recall, FN-R, G-mean1, and balance on 9 of the 10 datasets, and the best F-measure and J-coefficient on 7 of the 10 datasets.
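As a concrete illustration (not taken from the paper itself), the sketch below shows how several of the metrics named in the abstract are commonly computed from a binary confusion matrix, using the standard formulations from the defect prediction literature; the exact definitions used in the paper may differ in detail.

```python
import math

def prediction_metrics(tp, fp, tn, fn):
    """Compute common defect prediction metrics from a binary
    confusion matrix (defective = positive class)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # a.k.a. probability of detection (pd)
    pf = fp / (fp + tn)              # probability of false alarm
    specificity = tn / (tn + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    # G-mean1: geometric mean of recall and precision
    g_mean1 = math.sqrt(recall * precision)
    # balance: normalized distance from the ideal ROC point (pf=0, pd=1)
    balance = 1 - math.sqrt(pf ** 2 + (1 - recall) ** 2) / math.sqrt(2)
    # J-coefficient (Youden's J): recall + specificity - 1
    j_coefficient = recall + specificity - 1
    # Matthews Correlation Coefficient
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": precision, "recall": recall, "pf": pf,
        "specificity": specificity, "f_measure": f_measure,
        "g_mean1": g_mean1, "balance": balance,
        "j_coefficient": j_coefficient, "mcc": mcc,
    }
```

A perfect classifier (no false positives or false negatives) scores 1.0 on every metric above, while the different metrics diverge as soon as errors are unevenly distributed between the two classes, which is exactly why evaluating on many metrics can change the ranking of approaches.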


Bibliographic Details
Main Authors: XUAN, Xiao; LO, David; XIA, Xin; TIAN, Yuan
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University, 2015
Subjects: Defect Prediction; Evaluation Metric; Machine Learning; Software Engineering
Online Access:https://ink.library.smu.edu.sg/sis_research/3081
https://ink.library.smu.edu.sg/context/sis_research/article/4081/viewcontent/Defect_prediction_metrics_xuan_2015_afv.pdf
Institution: Singapore Management University
DOI: 10.1145/2695664.2695959
Date Published: 2015-04-01
Collection: Research Collection School Of Computing and Information Systems
License: http://creativecommons.org/licenses/by-nc-nd/4.0/