An exploratory study on code attention in BERT

Many recent works in software engineering have introduced deep neural models based on the Transformer architecture or use Transformer-based pre-trained language models (PLMs) trained on code. Although these models achieve state-of-the-art results in many downstream tasks such as code summarization and bug detection, they are based on Transformers and PLMs, which are mainly studied in the Natural Language Processing (NLP) field. Current studies rely on the reasoning and practices from NLP when applying these models to code, despite the differences between natural languages and programming languages, and there is limited literature explaining how code is modeled. Here, we investigate the attention behavior of PLMs on code and compare it with natural language. We pre-trained BERT, a Transformer-based PLM, on code and explored what kind of information it learns, both semantic and syntactic. We ran several experiments to analyze the attention values of code constructs on each other and what BERT learns in each layer. Our analyses show that BERT pays more attention to syntactic entities, specifically identifiers and separators, in contrast to the [CLS] token, which is the most attended token in NLP. This observation motivated us to leverage identifiers to represent the code sequence instead of the [CLS] token for code clone detection. Our results show that employing embeddings from identifiers increases the F1-score of BERT by 605% in its lower layers and by 4% in its upper layers. When identifier embeddings are used in CodeBERT, a code-based PLM, the F1-score of clone detection improves by 21–24%. These findings can benefit the research community by using code-specific representations instead of the common embeddings used in NLP, and open new directions for developing smaller models with similar performance.
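
As a rough illustration of the representation change described in the abstract (an identifier-based code embedding in place of the [CLS] vector), the sketch below mean-pools identifier-token embeddings from a pre-trained model. It is a minimal, hypothetical example assuming the Hugging Face transformers API and the public microsoft/codebert-base checkpoint; the hard-coded identifier set and the pooling choice are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch: build a [CLS]-based and an identifier-based embedding
# for the same code snippet, assuming the Hugging Face transformers API and
# the public microsoft/codebert-base checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")
model.eval()

code = "int add(int a, int b) { return a + b; }"
identifiers = {"add", "a", "b"}  # assumed here; in practice taken from a lexer/parser

inputs = tokenizer(code, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_size)

# Conventional sequence representation: the embedding of the first ([CLS]-style) token.
cls_embedding = hidden[0]

# Alternative explored in the paper: mean-pool the embeddings of identifier tokens.
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
id_positions = [i for i, tok in enumerate(tokens) if tok.lstrip("Ġ") in identifiers]
identifier_embedding = hidden[id_positions].mean(dim=0)

print(cls_embedding.shape, identifier_embedding.shape)  # both torch.Size([768])
```

In practice, identifier positions would come from a lexer or parser, and the layer-wise comparison in the abstract suggests evaluating such pooled representations at both lower and upper layers (for example, by requesting output_hidden_states=True from the model).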


Bibliographic Details
Main Authors: SHARMA, Rishab, CHEN, Fuxiang, FARD, Fatemeh H., LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2022
Subjects: Pre-trained language models; BERT; CodeBERT; Attention; Databases and Information Systems
Online Access: https://ink.library.smu.edu.sg/sis_research/7694
https://ink.library.smu.edu.sg/context/sis_research/article/8697/viewcontent/An_Exploratory.pdf
id sg-smu-ink.sis_research-8697
record_format dspace
spelling sg-smu-ink.sis_research-8697 2023-01-10T03:12:52Z
publishDate 2022-05-01T07:00:00Z
format text application/pdf
doi info:doi/10.1145/3524610.3527921
license http://creativecommons.org/licenses/by-nc-nd/4.0/
collection Research Collection School Of Computing and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Pre-trained language models
BERT
CodeBERT
Attention
Databases and Information Systems
description Many recent works in software engineering have introduced deep neural models based on the Transformer architecture or use Transformer-based pre-trained language models (PLMs) trained on code. Although these models achieve state-of-the-art results in many downstream tasks such as code summarization and bug detection, they are based on Transformers and PLMs, which are mainly studied in the Natural Language Processing (NLP) field. Current studies rely on the reasoning and practices from NLP when applying these models to code, despite the differences between natural languages and programming languages, and there is limited literature explaining how code is modeled. Here, we investigate the attention behavior of PLMs on code and compare it with natural language. We pre-trained BERT, a Transformer-based PLM, on code and explored what kind of information it learns, both semantic and syntactic. We ran several experiments to analyze the attention values of code constructs on each other and what BERT learns in each layer. Our analyses show that BERT pays more attention to syntactic entities, specifically identifiers and separators, in contrast to the [CLS] token, which is the most attended token in NLP. This observation motivated us to leverage identifiers to represent the code sequence instead of the [CLS] token for code clone detection. Our results show that employing embeddings from identifiers increases the F1-score of BERT by 605% in its lower layers and by 4% in its upper layers. When identifier embeddings are used in CodeBERT, a code-based PLM, the F1-score of clone detection improves by 21–24%. These findings can benefit the research community by using code-specific representations instead of the common embeddings used in NLP, and open new directions for developing smaller models with similar performance.
format text
author SHARMA, Rishab
CHEN, Fuxiang
FARD, Fatemeh H.
LO, David
author_sort SHARMA, Rishab
title An exploratory study on code attention in BERT
title_sort exploratory study on code attention in bert
publisher Institutional Knowledge at Singapore Management University
publishDate 2022
url https://ink.library.smu.edu.sg/sis_research/7694
https://ink.library.smu.edu.sg/context/sis_research/article/8697/viewcontent/An_Exploratory.pdf
_version_ 1770576415573409792