When automated program repair meets regression testing - an extensive study on two million patches

In recent years, Automated Program Repair (APR) has been extensively studied in academia and even drawn wide attention from the industry. However, APR techniques can be extremely time consuming since (1) a large number of patches can be generated for a given bug, and (2) each patch needs to be execu...

Bibliographic Details
Main Authors:	Lou, Yiling; Yang, Jun; Benton, Samuel; Hao, Dan; Tan, Lin; Chen, Zhenpeng; Zhang, Lu; Zhang, Lingming
Other Authors:	School of Computer Science and Engineering
Format:	Article
Language:	English
Published:	2025
Subjects:	Computer and Information Science; Patch validation; Program repair
Online Access:	https://hdl.handle.net/10356/182504
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-182504
record_format	dspace
spelling	sg-ntu-dr.10356-182504; 2025-02-07T15:38:06Z; Published version; 2025-02-05T04:13:04Z; 2024; Journal Article; Lou, Y., Yang, J., Benton, S., Hao, D., Tan, L., Chen, Z., Zhang, L. & Zhang, L. (2024). When automated program repair meets regression testing - an extensive study on two million patches. ACM Transactions on Software Engineering and Methodology, 33(7), 180-. https://dx.doi.org/10.1145/3672450; 1049-331X; https://hdl.handle.net/10356/182504; 10.1145/3672450; 2-s2.0-85206219383; en; © 2024 the Owner/Author(s), licensed under a Creative Commons Attribution International 4.0 License; application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science; Patch validation; Program repair
description	In recent years, Automated Program Repair (APR) has been extensively studied in academia and even drawn wide attention from the industry. However, APR techniques can be extremely time consuming since (1) a large number of patches can be generated for a given bug, and (2) each patch needs to be executed on the original tests to ensure its correctness. In the literature, various techniques (e.g., based on learning, mining, and constraint solving) have been proposed/studied to reduce the number of patches. Intuitively, every patch can be treated as a software revision during regression testing; thus, traditional Regression Test Selection (RTS) techniques can be leveraged to only execute the tests affected by each patch (as the other tests would keep the same outcomes) to further reduce patch execution time. However, few APR systems actually adopt RTS and there is still a lack of systematic studies demonstrating the benefits of RTS and the impact of different RTS strategies on APR. To this end, this article presents the first extensive study of widely used RTS techniques at different levels (i.e., class/method/statement levels) for 12 state-of-the-art APR systems on over 2M patches. 
Our study reveals various practical guidelines for bridging the gap between APR and regression testing, including: (1) the number of patches widely used for measuring APR efficiency can incur skewed conclusions, and the use of inconsistent RTS configurations can further skew the conclusions; (2) all studied RTS techniques can substantially improve APR efficiency and should be considered in future APR work; (3) method- and statement-level RTS outperform class-level RTS substantially and should be preferred; (4) RTS techniques can substantially outperform state-of-the-art test prioritization techniques for APR, and combining them can further improve APR efficiency; and (5) traditional Regression Test Prioritization (RTP) widely studied in regression testing performs even better than APR-specific test prioritization when combined with most RTS techniques. Furthermore, we also present the detailed impact of different patch categories and patch validation strategies on our findings.
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering; Lou, Yiling; Yang, Jun; Benton, Samuel; Hao, Dan; Tan, Lin; Chen, Zhenpeng; Zhang, Lu; Zhang, Lingming
format	Article
author	Lou, Yiling; Yang, Jun; Benton, Samuel; Hao, Dan; Tan, Lin; Chen, Zhenpeng; Zhang, Lu; Zhang, Lingming
author_sort	Lou, Yiling
title	When automated program repair meets regression testing - an extensive study on two million patches
publishDate	2025
url	https://hdl.handle.net/10356/182504
_version_	1823807386847019008
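
The abstract's central idea, treating each patch as a software revision and running only the tests it affects, can be illustrated with a small sketch. This is not the study's tooling: the function name `select_tests`, the coverage map, and the test and class names are all hypothetical, and the toy data only shows why method-level selection can pick fewer tests than class-level selection. Statement-level selection would follow the same pattern with finer-grained coverage entries.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation of a code element: (class name, method name).
Method = Tuple[str, str]


def select_tests(coverage: Dict[str, Set[Method]],
                 patched: Set[Method],
                 level: str) -> Set[str]:
    """Return the tests whose recorded coverage overlaps what the patch touches.

    level="class":  a test is selected if it covers any method of a patched class.
    level="method": a test is selected only if it covers a patched method itself.
    """
    if level == "class":
        patched_classes = {cls for cls, _ in patched}
        return {test for test, covered in coverage.items()
                if any(cls in patched_classes for cls, _ in covered)}
    if level == "method":
        return {test for test, covered in coverage.items()
                if covered & patched}
    raise ValueError(f"unsupported level: {level}")


# Toy coverage data (assumed, for illustration only).
coverage = {
    "testAdd": {("Calc", "add")},
    "testSub": {("Calc", "sub")},
    "testParse": {("Parser", "parse")},
    "testEval": {("Parser", "parse"), ("Calc", "add")},
}

# A candidate patch that modifies only Calc.add.
patch = {("Calc", "add")}

class_level = select_tests(coverage, patch, "class")
method_level = select_tests(coverage, patch, "method")

print(sorted(class_level))
print(sorted(method_level))
```

In this toy setup, class-level selection also picks `testSub` because it touches the patched class, while method-level selection skips it. Tests outside the selected set are assumed to keep their previous outcomes and need not be rerun for that patch.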
