Refining ChatGPT-generated code: Characterizing and mitigating code quality issues

Since its introduction in November 2022, ChatGPT has rapidly gained popularity due to its remarkable language understanding and human-like responses. ChatGPT, based on the GPT-3.5 architecture, has shown great promise for revolutionizing various research fields, including code generation. How...


Bibliographic Details
Main Authors: LIU, Yue; LE-CONG, Thanh; RATNADIRA WIDYASARI; LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects: Automated code generation; ChatGPT; code analysis; Artificial Intelligence and Robotics; Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/9242
https://ink.library.smu.edu.sg/context/sis_research/article/10242/viewcontent/3643674.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-10242
record_format dspace
spelling sg-smu-ink.sis_research-10242 2024-09-02T06:46:09Z Refining ChatGPT-generated code: Characterizing and mitigating code quality issues LIU, Yue LE-CONG, Thanh RATNADIRA WIDYASARI, LO, David Since its introduction in November 2022, ChatGPT has rapidly gained popularity due to its remarkable language understanding and human-like responses. ChatGPT, based on the GPT-3.5 architecture, has shown great promise for revolutionizing various research fields, including code generation. However, the reliability and quality of code generated by ChatGPT remain unexplored, raising concerns about potential risks associated with the widespread use of ChatGPT-driven code generation. In this article, we systematically study the quality of 4,066 ChatGPT-generated programs implemented in two popular programming languages, i.e., Java and Python, for 2,033 programming tasks. The goal of this work is threefold. First, we analyze the correctness of ChatGPT on code generation tasks and uncover the factors that influence its effectiveness, including task difficulty, programming language, the time at which tasks were introduced, and program size. Second, we identify and characterize potential issues with the quality of ChatGPT-generated code. Last, we provide insights into how these issues can be mitigated. Experiments highlight that out of 4,066 programs generated by ChatGPT, 2,756 programs are deemed correct, 1,082 programs provide wrong outputs, and 177 programs contain compilation or runtime errors. Additionally, we analyze other characteristics of the generated code, such as code style and maintainability, through static analysis tools and find that 1,930 ChatGPT-generated code snippets suffer from maintainability issues. Subsequently, we investigate ChatGPT’s self-repairing ability and its interaction with static analysis tools to fix the errors uncovered in the previous step. Experiments suggest that ChatGPT can partially address these challenges, improving code quality by more than 20%, but there are still limitations and opportunities for improvement. Overall, our study provides valuable insights into the current limitations of ChatGPT and offers a roadmap for future research and development efforts to enhance the code generation capabilities of artificial intelligence models such as ChatGPT. 2024-06-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/9242 info:doi/10.1145/3643674 https://ink.library.smu.edu.sg/context/sis_research/article/10242/viewcontent/3643674.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Automated code generation ChatGPT code analysis Artificial Intelligence and Robotics Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Automated code generation
ChatGPT
code analysis
Artificial Intelligence and Robotics
Software Engineering
spellingShingle Automated code generation
ChatGPT
code analysis
Artificial Intelligence and Robotics
Software Engineering
LIU, Yue
LE-CONG, Thanh
RATNADIRA WIDYASARI,
LO, David
Refining ChatGPT-generated code: Characterizing and mitigating code quality issues
description Since its introduction in November 2022, ChatGPT has rapidly gained popularity due to its remarkable language understanding and human-like responses. ChatGPT, based on the GPT-3.5 architecture, has shown great promise for revolutionizing various research fields, including code generation. However, the reliability and quality of code generated by ChatGPT remain unexplored, raising concerns about potential risks associated with the widespread use of ChatGPT-driven code generation. In this article, we systematically study the quality of 4,066 ChatGPT-generated programs implemented in two popular programming languages, i.e., Java and Python, for 2,033 programming tasks. The goal of this work is threefold. First, we analyze the correctness of ChatGPT on code generation tasks and uncover the factors that influence its effectiveness, including task difficulty, programming language, the time at which tasks were introduced, and program size. Second, we identify and characterize potential issues with the quality of ChatGPT-generated code. Last, we provide insights into how these issues can be mitigated. Experiments highlight that out of 4,066 programs generated by ChatGPT, 2,756 programs are deemed correct, 1,082 programs provide wrong outputs, and 177 programs contain compilation or runtime errors. Additionally, we analyze other characteristics of the generated code, such as code style and maintainability, through static analysis tools and find that 1,930 ChatGPT-generated code snippets suffer from maintainability issues. Subsequently, we investigate ChatGPT’s self-repairing ability and its interaction with static analysis tools to fix the errors uncovered in the previous step. Experiments suggest that ChatGPT can partially address these challenges, improving code quality by more than 20%, but there are still limitations and opportunities for improvement. Overall, our study provides valuable insights into the current limitations of ChatGPT and offers a roadmap for future research and development efforts to enhance the code generation capabilities of artificial intelligence models such as ChatGPT.
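The mitigation step described above, pairing ChatGPT's self-repair ability with static analysis feedback, can be pictured as a lint-then-reprompt loop. The Python sketch below is only a minimal illustration of that idea, not the authors' implementation: it assumes the flake8 linter is available on PATH, and ask_chatgpt is a hypothetical stand-in for whichever chat-completion client is used.

import subprocess
import tempfile
from pathlib import Path


def lint_python(code: str) -> str:
    """Run flake8 on a snippet and return its reported issues as text."""
    with tempfile.TemporaryDirectory() as tmp:
        snippet = Path(tmp) / "snippet.py"
        snippet.write_text(code)
        # flake8 exits non-zero when it finds issues, so no check=True here.
        result = subprocess.run(
            ["flake8", str(snippet)], capture_output=True, text=True
        )
        return result.stdout


def build_repair_prompt(code: str, warnings: str) -> str:
    """Turn analyzer output into a follow-up prompt asking the model for a fix."""
    return (
        "A static analyzer flagged the following code.\n\n"
        f"Code:\n{code}\n\n"
        f"Analyzer output:\n{warnings}\n\n"
        "Please return a corrected version that resolves these issues."
    )


def repair_once(code: str, ask_chatgpt) -> str:
    """One round of repair: lint the code and re-prompt only if issues remain."""
    warnings = lint_python(code)
    if not warnings.strip():
        return code  # analyzer is satisfied; nothing to fix
    return ask_chatgpt(build_repair_prompt(code, warnings))

Looping such a round until the analyzer reports nothing is a natural extension, though the abstract notes that this kind of repair only partially addresses the uncovered issues.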
format text
author LIU, Yue
LE-CONG, Thanh
RATNADIRA WIDYASARI,
LO, David
author_facet LIU, Yue
LE-CONG, Thanh
RATNADIRA WIDYASARI,
LO, David
author_sort LIU, Yue
title Refining ChatGPT-generated code: Characterizing and mitigating code quality issues
title_short Refining ChatGPT-generated code: Characterizing and mitigating code quality issues
title_full Refining ChatGPT-generated code: Characterizing and mitigating code quality issues
title_fullStr Refining ChatGPT-generated code: Characterizing and mitigating code quality issues
title_full_unstemmed Refining ChatGPT-generated code: Characterizing and mitigating code quality issues
title_sort refining chatgpt-generated code: characterizing and mitigating code quality issues
publisher Institutional Knowledge at Singapore Management University
publishDate 2024
url https://ink.library.smu.edu.sg/sis_research/9242
https://ink.library.smu.edu.sg/context/sis_research/article/10242/viewcontent/3643674.pdf
_version_ 1814047842665234432