A Learning-to-Rank Based Fault Localization Approach using Likely Invariants

Debugging is a costly process that consumes much of developer time and energy. To help reduce debugging effort, many studies have proposed various fault localization approaches. These approaches take as input a set of test cases (some failing, some passing) and produce a ranked list of program eleme...

Full description

Saved in:
Bibliographic Details
Main Authors: LE, Tien-Duy B., David LO, LE GOUES, Claire, GRUNSKE, Lars
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2016
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/3453
https://ink.library.smu.edu.sg/context/sis_research/article/4454/viewcontent/164___A_Learning_to_Rank_Based_Fault_Localization_Approach_using_Likely_Invariants__ISSTA2016_.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Debugging is a costly process that consumes much of developer time and energy. To help reduce debugging effort, many studies have proposed various fault localization approaches. These approaches take as input a set of test cases (some failing, some passing) and produce a ranked list of program elements that are likely to be the root cause of the failures (i.e., failing test cases). In this work, we propose Savant, a new fault localization approach that employs a learning-to-rank strategy, using likely invariant diffs and suspiciousness scores as features, to rank methods based on their likelihood to be a root cause of a failure. Savant has four steps: method clustering & test case selection, invariant mining, feature extraction, and method ranking. At the end of these four steps, Savant produces a short ranked list of potentially buggy methods. We have evaluated Savant on 357 real-life bugs from 5 programs from the Defects4J benchmark. Out of these bugs, averaging over 100 repeated trials with different seeds to randomly break ties, we find that on average Savant can identify correct buggy methods for 63.03, 101.72, and 122 bugs at top 1, 3, and 5 positions in the ranked lists that Savant produces. We have compared Savant against several state-of-the-art fault localization baselines that work on program spectra. We show that Savant can successfully locate 57.73%, 56.69%, and 43.13% more bugs at top 1, top 3, and top 5 positions than the best performing baseline, respectively.