Automatic generation of pull request descriptions

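Enabled by the pull-based development model, developers can easily contribute to a project through pull requests (PRs). When creating a PR, developers can add a free-form description to describe what changes are made in this PR and/or why. Such a description is helpful for reviewers and other developers to gain a quick understanding of the PR without touching the details and may reduce the possibility of the PR being ignored or rejected. However, developers sometimes neglect to write descriptions for PRs. For example, in our collected dataset with over 333K PRs, more than 34% of the PR descriptions are empty. To alleviate this problem, we propose an approach to automatically generate PR descriptions based on the commit messages and the added source code comments in the PRs. We regard this problem as a text summarization problem and solve it using a novel sequence-to-sequence model. To cope with out-of-vocabulary words in software artifacts and bridge the gap between the training loss function of the sequence-to-sequence model and the evaluation metric ROUGE, which has been shown to correspond to human evaluation, we integrate the pointer generator and directly optimize for ROUGE using reinforcement learning and a special loss function. We build a dataset with over 41K PRs and evaluate our approach on this dataset through ROUGE and a human evaluation. Our evaluation results show that our approach outperforms two baselines by significant margins.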

Bibliographic Details
Main Authors:	LIU, Zhongxin; XIA, Xin; TREUDE, Christoph; LO, David; LI, Shanping
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2019
Subjects:	Document Generation; Pull Request; Sequence to Sequence Learning; Databases and Information Systems; Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/7948 https://ink.library.smu.edu.sg/context/sis_research/article/8951/viewcontent/Liu2019PullRequestDesc.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-8951
record_format	dspace
spelling	sg-smu-ink.sis_research-8951 2023-08-03T05:56:47Z 2019-11-01T07:00:00Z text application/pdf info:doi/10.1109/ASE.2019.00026 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Document Generation; Pull Request; Sequence to Sequence Learning; Databases and Information Systems; Software Engineering
description	Enabled by the pull-based development model, developers can easily contribute to a project through pull requests (PRs). When creating a PR, developers can add a free-form description to describe what changes are made in this PR and/or why. Such a description is helpful for reviewers and other developers to gain a quick understanding of the PR without touching the details and may reduce the possibility of the PR being ignored or rejected. However, developers sometimes neglect to write descriptions for PRs. For example, in our collected dataset with over 333K PRs, more than 34% of the PR descriptions are empty. To alleviate this problem, we propose an approach to automatically generate PR descriptions based on the commit messages and the added source code comments in the PRs. We regard this problem as a text summarization problem and solve it using a novel sequence-to-sequence model. To cope with out-of-vocabulary words in software artifacts and bridge the gap between the training loss function of the sequence-to-sequence model and the evaluation metric ROUGE, which has been shown to correspond to human evaluation, we integrate the pointer generator and directly optimize for ROUGE using reinforcement learning and a special loss function. We build a dataset with over 41K PRs and evaluate our approach on this dataset through ROUGE and a human evaluation. Our evaluation results show that our approach outperforms two baselines by significant margins.
format	text
author	LIU, Zhongxin; XIA, Xin; TREUDE, Christoph; LO, David; LI, Shanping
author_sort	LIU, Zhongxin
title	Automatic generation of pull request descriptions
publisher	Institutional Knowledge at Singapore Management University
publishDate	2019
url	https://ink.library.smu.edu.sg/sis_research/7948 https://ink.library.smu.edu.sg/context/sis_research/article/8951/viewcontent/Liu2019PullRequestDesc.pdf
_version_	1773551437771440128

Similar Items