Invalidator: Automated patch correctness assessment via semantic and syntactic reasoning

Automated program repair (APR) has been gaining ground recently. However, a significant challenge that still remains is test overfitting, in which APR-generated patches plausibly pass the validation test suite but fail to generalize. A common practice to assess the correctness of APR-generated patch...

Bibliographic Details
Main Authors:	LE-CONG, Tranh; LUONG, Duc Minh; LE, Xuan Bach D.; LO, David; TRAN, Nhat-Hoa; QUANG-HUY, Bui; HUYNH, Quyet-Thang
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2023
Subjects:	Automated Patch Correctness Assessment; Automated Program Repair; Code Representations; Overfitting problem; Program Invariants; Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/7800 https://ink.library.smu.edu.sg/context/sis_research/article/8803/viewcontent/Invalidator_av.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-8803
record_format	dspace
spelling	sg-smu-ink.sis_research-8803 2023-04-04T03:02:50Z 2023-01-01T08:00:00Z text application/pdf info:doi/10.1109/TSE.2023.3255177 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Automated Patch Correctness Assessment; Automated Program Repair; Code Representations; Overfitting problem; Program Invariants; Software Engineering
description	Automated program repair (APR) has been gaining ground recently. However, a significant challenge that still remains is test overfitting, in which APR-generated patches plausibly pass the validation test suite but fail to generalize. A common practice to assess the correctness of APR-generated patches is to judge whether they are equivalent to the ground truth, i.e., developer-written patches, by either generating additional test cases or employing manual human inspection. The former often requires generating at least one test that shows behavioral differences between the APR-patched and developer-patched programs. Searching for this test, however, can be difficult because the search space can be enormous. Meanwhile, the latter is prone to human bias and requires repetitive and expensive manual effort. In this paper, we propose a novel technique, Invalidator, to automatically assess the correctness of APR-generated patches via semantic and syntactic reasoning. Invalidator leverages program invariants to reason about program semantics while also capturing program syntax through language semantics learned from a large code corpus using a pre-trained language model. Given a buggy program and the developer-patched program, Invalidator infers likely invariants on both programs. Then, Invalidator determines that an APR-generated patch overfits if: (1) it violates correct specifications or (2) it maintains erroneous behaviors from the original buggy program. In case our approach fails to determine an overfitting patch based on invariants, Invalidator utilizes a model trained on labeled patches to assess patch correctness based on program syntax. The benefit of Invalidator is threefold. First, Invalidator leverages both semantic and syntactic reasoning to enhance its discriminative capability. Second, Invalidator does not require new test cases to be generated, but instead relies only on the current test suite and uses invariant inference to generalize program behaviors. Third, Invalidator is fully automated. We conducted our experiments on a dataset of 885 patches generated on real-world programs in Defects4J. Experimental results show that Invalidator correctly classified 79% of overfitting patches, detecting 23% more overfitting patches than the best baseline. Invalidator also substantially outperforms the best baselines by 14% and 19% in terms of Accuracy and F-Measure, respectively.
format	text
author	LE-CONG, Tranh; LUONG, Duc Minh; LE, Xuan Bach D.; LO, David; TRAN, Nhat-Hoa; QUANG-HUY, Bui; HUYNH, Quyet-Thang
author_sort	LE-CONG, Tranh
title	Invalidator: Automated patch correctness assessment via semantic and syntactic reasoning
publisher	Institutional Knowledge at Singapore Management University
publishDate	2023
url	https://ink.library.smu.edu.sg/sis_research/7800 https://ink.library.smu.edu.sg/context/sis_research/article/8803/viewcontent/Invalidator_av.pdf
_version_	1770576516099342336
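
The two-stage decision rule summarized in the description field can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on stated assumptions: invariants are modeled as plain strings in sets, the "correct specifications" are taken to be the invariants inferred on the developer-patched program, and the "erroneous behaviors" are taken to be the invariants that hold on the buggy program but not on the developer-patched one. The names classify_patch and syntactic_score and the 0.5 threshold are hypothetical.

```python
from typing import Callable, Optional, Set


def classify_patch(
    buggy_inv: Set[str],
    dev_inv: Set[str],
    patched_inv: Set[str],
    syntactic_score: Optional[Callable[[], float]] = None,
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> str:
    """Flag an APR-generated patch as overfitting, or leave it unflagged.

    Assumption: each set holds the likely invariants inferred on one
    program version, written as comparable strings.
    """
    # Assumed "erroneous behaviors": hold on the buggy program only.
    error_behaviors = buggy_inv - dev_inv
    # Rule (1): a correct specification no longer holds on the patch.
    violated = dev_inv - patched_inv
    # Rule (2): an erroneous behavior still holds on the patch.
    retained = error_behaviors & patched_inv
    if violated or retained:
        return "overfitting"
    # Fallback: a syntax-based classifier, stubbed here as a callable
    # that returns the probability that the patch overfits.
    if syntactic_score is not None and syntactic_score() >= threshold:
        return "overfitting"
    return "not flagged"


buggy = {"x >= 0", "ret == null"}
fixed = {"x >= 0", "ret != null"}
print(classify_patch(buggy, fixed, {"x >= 0", "ret == null"}))
print(classify_patch(buggy, fixed, {"x >= 0", "ret != null"}))
```

In this toy run the first patch keeps the buggy invariant and is flagged by the invariant rules; the second matches the developer-patched invariants and, with no syntactic classifier supplied, is left unflagged.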

Similar Items