Real world projects, real faults: Evaluating spectrum based fault localization techniques on Python projects

Spectrum Based Fault Localization (SBFL) is a statistical approach to identify faulty code within a program given program spectra (i.e., records of program elements executed by passing and failing test cases). Several SBFL techniques have been proposed over the years, but most evaluations of those...

Bibliographic Details
Main Authors: RATNADIRA WIDYASARI; PRANA, Gede Artha Azriadi; AGUS HARYONO, Stefanus; WANG, Shaowei; LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2022
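Subjects: Spectrum-based fault localization; Python; testing and debugging; empirical study; Software Engineering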
Online Access: https://ink.library.smu.edu.sg/sis_research/7321
https://ink.library.smu.edu.sg/context/sis_research/article/8324/viewcontent/RealWorldProblems_Faults_av.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-8324
record_format dspace
spelling sg-smu-ink.sis_research-8324 2022-09-29T05:51:45Z
2022-11-01T07:00:00Z text application/pdf
info:doi/10.1007/s10664-022-10189-4
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Spectrum-based fault localization
Python
testing and debugging
empirical study
Software Engineering
description Spectrum Based Fault Localization (SBFL) is a statistical approach to identify faulty code within a program given program spectra (i.e., records of program elements executed by passing and failing test cases). Several SBFL techniques have been proposed over the years, but most evaluations of those techniques were done only on Java and C programs, and frequently involve artificial faults. Considering the current popularity of Python, indicated by the results of the Stack Overflow survey among developers in 2020, it becomes increasingly important to understand how SBFL techniques perform on Python projects. However, this remains an understudied topic. In this work, our objective is to analyze the effectiveness of popular SBFL techniques in real-world Python projects. We also aim to compare our observed performance on Python to previously reported performance on Java. Using the recently built bug benchmark BugsInPy as our fault dataset, we apply five popular SBFL techniques (Tarantula, Ochiai, O-P, Barinel, and DStar) and analyze their performance. We subsequently compare our results with results from Java and C projects reported in earlier related works. We find that 1) the real faults in BugsInPy are harder to identify using SBFL techniques compared to the real faults in Defects4J, indicated by the lower performance of the evaluated SBFL techniques on BugsInPy; 2) older techniques such as Tarantula, Barinel, and Ochiai consistently outperform newer techniques (i.e., O-P and DStar) in a variety of metrics and debugging scenarios; 3) claims in preceding studies done on artificial faults in C and Java (such as "O-P outperforms Tarantula") do not hold on real Python faults; 4) lower-performing techniques can outperform higher-performing techniques in some cases, emphasizing the potential benefit of combining SBFL techniques. Our results yield insight into how popular SBFL techniques perform on real Python faults and emphasize the importance of conducting SBFL evaluations on real faults.
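As a rough illustration of how the five techniques named in the description turn a program spectrum into a suspiciousness score, the sketch below uses the formulas as they are commonly stated in the SBFL literature. It is not the study's implementation: the `Spectrum` class, the helper names, the zero-denominator handling, the choice of `star=2` for DStar, and the toy data are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Spectrum:
    """Test counts for one program element (hypothetical container)."""
    ef: int  # failing tests that execute the element
    ep: int  # passing tests that execute the element
    nf: int  # failing tests that do not execute the element
    np: int  # passing tests that do not execute the element


def _div(num, den):
    # Assumption of this sketch: an undefined ratio counts as 0.0.
    return num / den if den else 0.0


def tarantula(s):
    fail_ratio = _div(s.ef, s.ef + s.nf)
    pass_ratio = _div(s.ep, s.ep + s.np)
    return _div(fail_ratio, fail_ratio + pass_ratio)


def ochiai(s):
    return _div(s.ef, math.sqrt((s.ef + s.nf) * (s.ef + s.ep)))


def op2(s):
    # O-P: ef - ep / (total passing tests + 1)
    return s.ef - s.ep / (s.ep + s.np + 1)


def barinel(s):
    # 1 - ep / (ep + ef), written as ef / (ef + ep) so that an element
    # executed by no test scores 0.0 instead of 1.0.
    return _div(s.ef, s.ef + s.ep)


def dstar(s, star=2):
    den = s.ep + s.nf
    if den == 0:
        # Assumption: executed only by failing tests -> maximally suspicious.
        return math.inf if s.ef else 0.0
    return s.ef ** star / den


def rank(spectra, formula):
    """Return element names ordered from most to least suspicious."""
    return sorted(spectra, key=lambda name: formula(spectra[name]), reverse=True)


# Toy spectrum: 2 failing and 3 passing tests over three made-up elements.
spectra = {
    "line_a": Spectrum(ef=2, ep=1, nf=0, np=2),
    "line_b": Spectrum(ef=1, ep=3, nf=1, np=0),
    "line_c": Spectrum(ef=0, ep=2, nf=2, np=1),
}

for formula in (tarantula, ochiai, op2, barinel, dstar):
    print(formula.__name__, rank(spectra, formula))
```

On this toy spectrum every formula produces the same ordering; the description's fourth finding concerns real faults where the techniques disagree, which is where combining them could help.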
format text
author RATNADIRA WIDYASARI,
PRANA, Gede Artha Azriadi
AGUS HARYONO, Stefanus
WANG, Shaowei
LO, David
title Real world projects, real faults: Evaluating spectrum based fault localization techniques on Python projects
publisher Institutional Knowledge at Singapore Management University
publishDate 2022
_version_ 1770576311391092736