Deep code comment generation with hybrid lexical and syntactical information

During software maintenance, developers spend a lot of time understanding the source code. Existing studies show that code comments help developers comprehend programs and reduce additional time spent on reading and navigating source code. Unfortunately, these comments are often mismatched, missing...

Bibliographic Details
Main Authors: HU, Xing, LI, Ge, XIA, Xin, LO, David, JIN, Zhi
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2019
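Subjects: Comment generation; Deep learning; Program comprehension; Programming Languages and Compilers; Software Engineering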
Online Access: https://ink.library.smu.edu.sg/sis_research/4407
https://ink.library.smu.edu.sg/context/sis_research/article/5410/viewcontent/Hu2019_Article_DeepCodeCommentGenerationWithH.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-5410
record_format dspace
spelling sg-smu-ink.sis_research-5410 2020-05-26T04:55:09Z
2019-01-01T08:00:00Z text application/pdf
info:doi/10.1007/s10664-019-09730-9
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Comment generation
Deep learning
Program comprehension
Programming Languages and Compilers
Software Engineering
description During software maintenance, developers spend a lot of time understanding source code. Existing studies show that code comments help developers comprehend programs and reduce the additional time spent reading and navigating source code. Unfortunately, these comments are often mismatched, missing or outdated in software projects, so developers have to infer the functionality from the source code. This paper proposes a new approach named Hybrid-DeepCom to automatically generate code comments for the functional units of the Java language, namely, Java methods. The generated comments aim to help developers understand the functionality of Java methods. Hybrid-DeepCom applies Natural Language Processing (NLP) techniques to learn from a large code corpus and generates comments from the learned features. It formulates the comment generation task as a machine translation problem. Hybrid-DeepCom exploits a deep neural network that combines the lexical and structural information of Java methods for better comment generation. We conduct experiments on a large-scale Java corpus built from 9,714 open source projects on GitHub. We evaluate the experimental results on both machine translation metrics and information retrieval metrics. Experimental results demonstrate that our method Hybrid-DeepCom outperforms the state-of-the-art by a substantial margin. In addition, we evaluate the influence of out-of-vocabulary tokens on comment generation. The results show that reducing the number of out-of-vocabulary tokens effectively improves accuracy.
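The abstract does not say how out-of-vocabulary (OOV) tokens are reduced. As an illustration only, the sketch below assumes one common technique: splitting compound identifiers on camelCase and underscore boundaries before building the vocabulary. The function names, the regular expression and the toy token lists are hypothetical and are not taken from the record.

```python
import re
from collections import Counter

# Assumed splitting rule (not from the record): acronym runs, capitalized
# words, lowercase words and digit runs each become one sub-token.
_PARTS = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def split_identifier(token):
    """Split an identifier into lowercase sub-tokens; leave other tokens as-is."""
    parts = _PARTS.findall(token)
    return [p.lower() for p in parts] if parts else [token]


def split_all(token_lists):
    """Apply split_identifier to every token of every method."""
    return [[p for t in tokens for p in split_identifier(t)] for tokens in token_lists]


def build_vocab(token_lists, size):
    """Keep the `size` most frequent tokens of the training methods."""
    counts = Counter(t for tokens in token_lists for t in tokens)
    return {t for t, _ in counts.most_common(size)}


def oov_rate(token_lists, vocab):
    """Fraction of tokens that are not in the vocabulary."""
    total = sum(len(tokens) for tokens in token_lists)
    unknown = sum(1 for tokens in token_lists for t in tokens if t not in vocab)
    return unknown / total if total else 0.0


# Toy data: two "training" methods and one unseen method, already tokenized.
train = [["getFileName", "(", ")"], ["setUserName", "(", "name", ")"]]
unseen = [["getUserName", "(", ")"]]

raw_rate = oov_rate(unseen, build_vocab(train, 100))
split_rate = oov_rate(split_all(unseen), build_vocab(split_all(train), 100))
print(raw_rate, split_rate)
```

In this toy case the unseen identifier `getUserName` is unknown as a whole token, but each of its parts already occurs in the training methods, so the OOV rate drops after splitting. Whether Hybrid-DeepCom handles OOV tokens this way is not stated in the record.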
format text
author HU, Xing
LI, Ge
XIA, Xin
LO, David
JIN, Zhi
author_sort HU, Xing
title Deep code comment generation with hybrid lexical and syntactical information
publisher Institutional Knowledge at Singapore Management University
publishDate 2019
url https://ink.library.smu.edu.sg/sis_research/4407
https://ink.library.smu.edu.sg/context/sis_research/article/5410/viewcontent/Hu2019_Article_DeepCodeCommentGenerationWithH.pdf