Overfitting in semantics-based automated program repair
The primary goal of Automated Program Repair (APR) is to automatically fix buggy software, to reduce the manual bug-fix burden that presently rests on human developers. Existing APR techniques can be generally divided into two families: semantics- vs. heuristics-based. Semantics-based APR uses symbo...
Main Authors: | LE, Dinh Xuan Bach; THUNG, Ferdian; LO, David; LE GOUES, Claire |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2018 |
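Subjects: | Automated program repair; Program synthesis; Symbolic execution; Patch overfitting; Software Engineering |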
Online Access: | https://ink.library.smu.edu.sg/sis_research/3986 https://ink.library.smu.edu.sg/context/sis_research/article/4988/viewcontent/Overfitting_in_semantics_based_automated_program_repair_afv.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-4988 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-4988; 2020-01-15T14:57:07Z; 2018-10-01T07:00:00Z; text; application/pdf; info:doi/10.1007/s10664-017-9577-2; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Automated program repair; Program synthesis; Symbolic execution; Patch overfitting; Software Engineering |
description |
The primary goal of Automated Program Repair (APR) is to automatically fix buggy software, reducing the manual bug-fix burden that presently rests on human developers. Existing APR techniques can be generally divided into two families: semantics-based and heuristics-based. Semantics-based APR uses symbolic execution and test suites to extract semantic constraints, and uses program synthesis to synthesize repairs that satisfy the extracted constraints. Heuristics-based APR generates large populations of repair candidates via source manipulation, and searches for the best among them. Both families largely rely on a primary assumption that a program is correctly patched if the generated patch leads the program to pass all provided test cases. Patch correctness is thus an especially pressing concern. A repair technique may generate overfitting patches, which lead a program to pass all existing test cases but fail to generalize beyond them. In this work, we revisit the overfitting problem with a focus on semantics-based APR techniques, complementing previous studies of the overfitting problem in heuristics-based APR. We perform our study using the IntroClass and Codeflaws benchmarks, two datasets well-suited for assessing repair quality, to systematically characterize and understand the nature of overfitting in semantics-based APR. We find that, as in heuristics-based APR, overfitting also occurs in semantics-based APR, in various ways. |
format |
text |
author |
LE, Dinh Xuan Bach; THUNG, Ferdian; LO, David; LE GOUES, Claire |
author_sort |
LE, Dinh Xuan Bach |
title |
Overfitting in semantics-based automated program repair |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2018 |
url |
https://ink.library.smu.edu.sg/sis_research/3986 https://ink.library.smu.edu.sg/context/sis_research/article/4988/viewcontent/Overfitting_in_semantics_based_automated_program_repair_afv.pdf |
_version_ |
1770574112140296192 |
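
The description above defines an overfitting patch as one that passes all existing test cases but fails to generalize beyond them. The following is a minimal toy sketch of that definition in Python. It is not the procedure or tooling used in the study: the median-of-three task, the function names (`reference_median`, `overfitting_patch`, `passes`, `overfits`), and the split into repair tests and held-out tests are all assumptions made for illustration.

```python
from itertools import permutations


def reference_median(a, b, c):
    """Hypothetical ground-truth behaviour: the median of three numbers."""
    return sorted([a, b, c])[1]


def overfitting_patch(a, b, c):
    """Hypothetical patch that only handles the orderings seen during repair."""
    if a <= b <= c:
        return b
    if c <= b <= a:
        return b
    # Wrong in general: correct only when `a` happens to be the median.
    return a


def passes(patch, tests):
    """True if the patch returns the expected output on every test."""
    return all(patch(*args) == expected for args, expected in tests)


def overfits(patch, repair_tests, held_out_tests):
    """A patch overfits if it passes the repair tests but not the held-out ones."""
    return passes(patch, repair_tests) and not passes(patch, held_out_tests)


# Tests available to the (hypothetical) repair tool.
REPAIR_TESTS = [
    ((1, 2, 3), 2),
    ((3, 2, 1), 2),
    ((2, 1, 3), 2),
    ((2, 3, 1), 2),
]

# Independent tests: every ordering of (1, 2, 3), labelled by the reference.
HELD_OUT_TESTS = [
    (args, reference_median(*args)) for args in permutations((1, 2, 3))
]

if __name__ == "__main__":
    print("passes repair tests:", passes(overfitting_patch, REPAIR_TESTS))
    print("passes held-out tests:", passes(overfitting_patch, HELD_OUT_TESTS))
    print("overfits:", overfits(overfitting_patch, REPAIR_TESTS, HELD_OUT_TESTS))
```

Under these assumptions, the patch satisfies every repair test yet returns the wrong value for inputs such as `(1, 3, 2)`, which only the held-out tests exercise.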