Increasing the accuracy of software development effort estimation using projects clustering

Bibliographic Details
Main Authors: Bardsiri, V. Khatibi; Khatibi, E.; Jawawi, D. N. A.; Hashim, S. Z. M.
Format: Article
Published: 2012
Online Access: http://eprints.utm.my/id/eprint/47096/
http://dx.doi.org/10.1049/iet-sen.2011.0210
Institution: Universiti Teknologi Malaysia
Description
Summary: Software development effort is one of the most important metrics that must be correctly estimated in software projects. Analogy-based estimation (ABE) and artificial neural networks (ANN) are the most popular methods in this field. Both suffer from the inconsistent and irrelevant projects present in software project datasets. In this paper, a new hybrid method is proposed to increase the accuracy of development effort estimation by combining fuzzy clustering with ABE and ANN. The proposed method reduces the effect of irrelevant and inconsistent projects on the estimates through a new framework in which all projects are clustered. The framework improves both the quality of training in ANN and the consistency of historical data in ABE. Two large, real datasets are used to evaluate the proposed method, and the results are compared with those of eight other estimation methods. The results show that the proposed method outperformed the other methods on both datasets. The mean magnitude of relative error (MMRE) and the percentage of prediction at the 0.25 level, PRED(0.25), improved by an average of 51% and 127%, respectively, on the first dataset, and by 52% and 94% on the second.
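
The summary reports results in terms of MMRE and PRED(0.25) without defining them. As a minimal sketch, assuming the definitions commonly used in effort estimation (magnitude of relative error taken relative to the actual effort, and PRED(0.25) as the share of projects whose error is at most 0.25), the two metrics could be computed as follows. The function names and sample effort values are hypothetical and are not taken from the paper.

```python
def mre(actual, predicted):
    """Magnitude of relative error for one project (actual effort must be > 0)."""
    return abs(actual - predicted) / actual


def mmre(actuals, predictions):
    """Mean magnitude of relative error over all projects (lower is better)."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)


def pred(actuals, predictions, level=0.25):
    """Fraction of projects whose MRE is at most `level` (higher is better)."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    hits = sum(1 for e in errors if e <= level)
    return hits / len(errors)


# Hypothetical effort values (e.g. person-hours), for illustration only.
actual_effort = [100.0, 200.0, 400.0]
estimated_effort = [110.0, 140.0, 400.0]

print(f"MMRE       = {mmre(actual_effort, estimated_effort):.3f}")
print(f"PRED(0.25) = {pred(actual_effort, estimated_effort):.3f}")
```

Because a lower MMRE and a higher PRED(0.25) both indicate better estimates, an "improvement" in the summary presumably means a decrease in the former and an increase in the latter.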