TreeCaps: Tree-Structured Capsule Networks for program source code processing

Program comprehension is a fundamental task in software development and maintenance processes. Software developers often need to understand a large amount of existing code before they can develop new features or fix bugs in existing programs. Being able to process programming language code automatic...

Bibliographic Details
Main Authors: JAYASUNDARA, Vinoj, BUI, Duy Quoc Nghi, JIANG, Lingxiao, LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2019
Subjects: Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/4816
https://ink.library.smu.edu.sg/context/sis_research/article/5819/viewcontent/ml4systems19treecaps.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-5819
record_format dspace
spelling sg-smu-ink.sis_research-5819 2022-04-21T04:03:43Z TreeCaps: Tree-Structured Capsule Networks for program source code processing JAYASUNDARA, Vinoj BUI, Duy Quoc Nghi JIANG, Lingxiao LO, David Program comprehension is a fundamental task in software development and maintenance processes. Software developers often need to understand a large amount of existing code before they can develop new features or fix bugs in existing programs. Being able to process programming language code automatically and provide summaries of code functionality accurately can significantly help developers to reduce time spent in code navigation and understanding, and thus increase productivity. Different from natural language articles, source code in programming languages often follows rigid syntactical structures and there can exist dependencies among code elements that are located far away from each other through complex control flows and data flows. Existing studies on tree-based convolutional neural networks (TBCNN) and gated graph neural networks (GGNN) are not able to capture essential semantic dependencies among code elements accurately. In this paper, we propose novel tree-based capsule networks (TreeCaps) and relevant techniques for processing program code in an automated way that encodes code syntactical structures and captures code dependencies more accurately. Based on evaluation on programs written in different programming languages, we show that our TreeCaps-based approach can outperform other approaches in classifying the functionalities of many programs. 2019-12-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/4816 https://ink.library.smu.edu.sg/context/sis_research/article/5819/viewcontent/ml4systems19treecaps.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Software Engineering
spellingShingle Software Engineering
JAYASUNDARA, Vinoj
BUI, Duy Quoc Nghi
JIANG, Lingxiao
LO, David
TreeCaps: Tree-Structured Capsule Networks for program source code processing
description Program comprehension is a fundamental task in software development and maintenance. Software developers often need to understand a large amount of existing code before they can develop new features or fix bugs in existing programs. Processing programming language code automatically and summarizing its functionality accurately can significantly reduce the time developers spend navigating and understanding code, and thus increase productivity. Unlike natural language articles, source code often follows rigid syntactic structures, and dependencies can exist among code elements that are located far from each other, connected through complex control flows and data flows. Existing approaches based on tree-based convolutional neural networks (TBCNN) and gated graph neural networks (GGNN) cannot accurately capture essential semantic dependencies among code elements. In this paper, we propose novel tree-based capsule networks (TreeCaps) and related techniques for automated processing of program code that encode syntactic structures and capture code dependencies more accurately. Based on an evaluation on programs written in different programming languages, we show that our TreeCaps-based approach can outperform other approaches in classifying the functionalities of many programs.
format text
author JAYASUNDARA, Vinoj
BUI, Duy Quoc Nghi
JIANG, Lingxiao
LO, David
author_facet JAYASUNDARA, Vinoj
BUI, Duy Quoc Nghi
JIANG, Lingxiao
LO, David
author_sort JAYASUNDARA, Vinoj
title TreeCaps: Tree-Structured Capsule Networks for program source code processing
title_sort treecaps: tree-structured capsule networks for program source code processing
publisher Institutional Knowledge at Singapore Management University
publishDate 2019
url https://ink.library.smu.edu.sg/sis_research/4816
https://ink.library.smu.edu.sg/context/sis_research/article/5819/viewcontent/ml4systems19treecaps.pdf
_version_ 1770575053918830592