CC2Vec: Distributed representations of code changes

Existing work on software patches often use features specific to a single task. These works often rely on manually identified features, and human effort is required to identify these features for each task. In this work, we propose CC2Vec, a neural network model that learns a representation of code...

Full description

Saved in:

Bibliographic Details
Main Authors:	HOANG, Thong, KANG, Hong Jin, LAWALL, Julia, LO, David
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2020
Subjects:	Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/5500 https://ink.library.smu.edu.sg/context/sis_research/article/6503/viewcontent/3377811.3380361.pdf
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Singapore Management University
Language:	English

Description
Summary:	Existing work on software patches often use features specific to a single task. These works often rely on manually identified features, and human effort is required to identify these features for each task. In this work, we propose CC2Vec, a neural network model that learns a representation of code changes guided by their accompanying log messages, which represent the semantic intent of the code changes. CC2Vec models the hierarchical structure of a code change with the help of the attention mechanism and usesmultiple comparison functions to identify the differences between the removed and added code. To evaluate if CC2Vec can produce a distributed representation of code changes that is general and useful for multiple tasks on software patches, we use the vectors produced by CC2Vec for three tasks: log message generation, bug fixing patch identification, and just-in-time defect prediction. In all tasks, the models using CC2Vec outperform the state-of-the-art techniques.

CC2Vec: Distributed representations of code changes

Similar Items