Neural code generation for robust automatic program repair
Saved in:
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2024
Online Access: https://hdl.handle.net/10356/173910
Institution: Nanyang Technological University
Summary: Automatic program repair (APR) is crucial for reducing manual debugging effort for
developers and improving software reliability. Consequently, it has gained increasing
attention as an essential technique in software development to boost developers’
productivity. Conventional search-based techniques typically rely on heuristic rules
or a redundancy assumption to mine fix patterns, which continuously generate
code patches until the resultant program meets the pre-defined test specifications.
However, these approaches often yield a substantial number of low-quality patch
candidates, leading to ineffective APR systems. To address these limitations,
inspired by the surge of deep learning (DL) based approaches for natural language
processing (NLP), we focus on robust APR methods in real-world scenarios via
neural code generation. In this thesis, we leverage recent advances in deep learning
and deploy novel transformer-based frameworks to automate the program repair
process in a data-driven manner. This thesis presents numerous contributions
toward the development of robust automatic program repair.
First, we propose CodeT5, a novel code-aware, encoder-decoder pre-trained programming language model that supports both code understanding and generation
tasks. Specifically, we build on the unified encoder-decoder Transformer architecture of T5 and incorporate code-specific knowledge for better code representation
and understanding. Furthermore, we propose a novel identifier-aware pre-training
task that enables the model to distinguish which code tokens are identifiers and to
recover them when they are masked. In addition, we exploit the user-written
code comments with a bimodal dual-generation task for better natural language
(NL)-programming language (PL) alignment. Comprehensive experiments show
that CodeT5 significantly outperforms prior methods on understanding tasks such
as code defect detection and clone detection, and generation tasks across various
directions including PL-NL, NL-PL, and PL-PL. Further analysis reveals that our
model can better capture semantic information from code.
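The identifier-aware pre-training objective can be illustrated with a small sketch. This is a simplified view, not the actual CodeT5 implementation: the tokenization and the per-token identifier labels (which CodeT5 obtains from code analysis) are assumed to be given, and each unique identifier is hidden behind a T5-style sentinel token that the model must recover.

```python
def mask_identifiers(tokens, is_identifier):
    """Replace each unique identifier with a T5-style sentinel token; the
    target asks the model to recover the identifier behind each sentinel."""
    sentinel_of, masked, order = {}, [], []
    for tok, is_id in zip(tokens, is_identifier):
        if is_id:
            if tok not in sentinel_of:
                sentinel_of[tok] = f"<extra_id_{len(sentinel_of)}>"
                order.append(tok)
            masked.append(sentinel_of[tok])  # same identifier, same sentinel
        else:
            masked.append(tok)
    target = " ".join(f"{sentinel_of[t]} {t}" for t in order)
    return " ".join(masked), target

tokens = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
is_ident = [False, True, False, True, False, True,
            False, False, False, True, False, True]
src, tgt = mask_identifiers(tokens, is_ident)
# src: def <extra_id_0> ( <extra_id_1> , <extra_id_2> ) : return <extra_id_1> + <extra_id_2>
# tgt: <extra_id_0> add <extra_id_1> a <extra_id_2> b
```

Training on such (src, tgt) pairs forces the model to distinguish identifier tokens from keywords and operators, which is the intuition behind the identifier-aware task described above.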
Second, we investigate the effectiveness of leveraging bug-fix patterns for automatic
program repair. We propose a novel Retrieval-Augmented Patch Generation
framework (RAP-Gen) by explicitly leveraging relevant fix patterns retrieved from
a codebase of previous bug-fix pairs. Specifically, we build a hybrid patch retriever
to account for both lexical and semantic matching based on the raw source code in a
language-agnostic manner, which does not rely on any code-specific features. In addition, we adapt our code-aware language model CodeT5 as the foundation model
to facilitate both patch retrieval and generation tasks in a unified manner. Notably,
RAP-Gen is a generic APR framework that can flexibly integrate different patch
retrievers and generators to repair various types of bugs. We thoroughly evaluate
RAP-Gen on three benchmarks in two programming languages, including the TFix
benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java,
where the bug localization information may or may not be provided. Experimental results show that RAP-Gen significantly outperforms previous state-of-the-art
(SoTA) approaches on all benchmarks, e.g., boosting the accuracy of T5-large on
TFix from 49.70% to 54.15% (repairing 478 more bugs) and repairing 15 more bugs
on 818 Defects4J bugs. Further analysis reveals that our patch retriever can search
for relevant fix patterns to guide the APR systems.
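The hybrid retrieval idea, scoring candidate bug-fix pairs by both lexical and semantic similarity to the query bug, can be sketched as follows. This is a toy stand-in, not the RAP-Gen retriever itself: Jaccard token overlap plays the lexical role, and hashed character trigrams stand in for a learned dense encoder (RAP-Gen uses CodeT5-based embeddings for the semantic side).

```python
import math

def lexical_score(query, doc):
    """Jaccard overlap between whitespace token sets (lexical matching)."""
    qs, ds = set(query.split()), set(doc.split())
    return len(qs & ds) / len(qs | ds) if qs | ds else 0.0

def trigram_embed(text, dim=64):
    """Toy stand-in for a dense encoder: hashed character-trigram counts."""
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dim] += 1.0
    return vec

def semantic_score(query, doc):
    """Cosine similarity between the stand-in embeddings (semantic matching)."""
    a, b = trigram_embed(query), trigram_embed(doc)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_fix(query_bug, bug_fix_pairs, alpha=0.7):
    """Return the fix whose bug best matches a weighted hybrid score."""
    best = max(bug_fix_pairs,
               key=lambda p: alpha * lexical_score(query_bug, p[0])
                             + (1 - alpha) * semantic_score(query_bug, p[0]))
    return best[1]

codebase = [("if x = 1 :", "if x == 1 :"),
            ("for i in rang(10)", "for i in range(10)")]
fix = retrieve_fix("if y = 2 :", codebase)
# fix is "if x == 1 :" — the retrieved pattern then guides patch generation
```

Because both scores operate on raw source text, the retriever is language-agnostic in the sense described above: nothing in it depends on code-specific features of any one programming language.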
Third, we focus on the novel task of low-resource APR. Recent DL-based models have demonstrated promising results by learning from
large-scale bug-fix examples in a data-driven manner. However, in practical scenarios, software bugs follow an imbalanced distribution, and the fixing knowledge
learned by APR models often captures only the patterns of frequent error types,
making it inapplicable to rare error types. To address this limitation, we propose Meta-APR, a new meta-learning framework integrated with code
pretrained language models to generate fixes for low-resource bugs with limited
training samples. Extensive experimental results on three benchmarks in various
programming languages verify the superiority of our method over existing DL-based
APR approaches.
Last but not least, we explore xCodeEval, the largest executable multilingual
multitask benchmark to date, consisting of 25M document-level coding examples
from about 7.5K unique problems covering up to 17 programming languages with
execution-level parallelism. We propose a novel APR task to synthesize a fix for a
detected program bug. Specifically, given a defective program, the objective of this
task is to generate a correct fix that passes all the unit tests. Detailed experiments
demonstrate that our proposed APR task offers a fresh perspective for examining
and analyzing large language model (LLM)-based APR, facilitating comprehensive
and, to some extent, interpretable investigations of their repair performance.
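Execution-based patch validation, the core of the proposed APR task, can be sketched minimally as follows. The helper names and the `solve` entry point are assumptions for illustration; xCodeEval's actual harness executes multilingual programs in a sandboxed environment with execution-level parallelism.

```python
def passes_all_tests(program_src, unit_tests):
    """Execute a candidate program and check every (input, expected) pair."""
    namespace = {}
    exec(program_src, namespace)      # CAUTION: sandbox untrusted code in real use
    solve = namespace["solve"]        # assumed entry-point name
    return all(solve(inp) == expected for inp, expected in unit_tests)

def validate_patches(candidates, unit_tests):
    """Return the first candidate fix that passes all unit tests, else None."""
    for src in candidates:
        try:
            if passes_all_tests(src, unit_tests):
                return src
        except Exception:
            continue                  # crashing candidates are simply rejected
    return None

buggy = "def solve(x):\n    return x - 1   # off-by-one bug\n"
fixed = "def solve(x):\n    return x + 1\n"
tests = [(0, 1), (41, 42)]
best = validate_patches([buggy, fixed], tests)
# best is the `fixed` candidate: it alone satisfies every unit test
```

Grounding correctness in test execution rather than textual match against a reference fix is what makes this style of evaluation comparatively interpretable: a patch is accepted exactly when the program's observable behavior is repaired.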
This thesis strives for robust neural code generation across multiple languages
and tasks, facilitating real-world APR to alleviate manual debugging effort
for everyone regardless of coding background.