Sentiment analysis for software engineering: How far can pre-trained transformer models go?

Bibliographic Details
Main Authors: ZHANG, Ting, XU, Bowen, THUNG, Ferdian, AGUS HARYONO, Stefanus, LO, David, JIANG, Lingxiao
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/5535
https://ink.library.smu.edu.sg/context/sis_research/article/6538/viewcontent/icsme20SA4SE.pdf
Institution: Singapore Management University
Description
Summary: Extensive research has been conducted on sentiment analysis for software engineering (SA4SE). Researchers have invested much effort in developing customized tools (e.g., SentiStrength-SE, SentiCR) to classify the sentiment polarity of Software Engineering (SE)-specific content (e.g., discussions in Stack Overflow and code review comments). Even so, there is still much room for improvement. Recently, pre-trained Transformer-based models (e.g., BERT, XLNet) have brought considerable breakthroughs in the field of natural language processing (NLP). In this work, we conducted a systematic evaluation of five existing SA4SE tools and variants of four state-of-the-art pre-trained Transformer-based models on six SE datasets. Our work is the first to fine-tune pre-trained Transformer-based models for the SA4SE task. Empirically, across all six datasets, our fine-tuned pre-trained Transformer-based models outperform the existing SA4SE tools by 6.5-35.6% in terms of macro/micro-averaged F1 scores.
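
The summary describes fine-tuning pre-trained Transformer-based models for sentiment polarity classification on SE texts. The following Python sketch is only an illustration of what such a fine-tuning setup might look like using the Hugging Face transformers and datasets libraries; it is not the authors' released code, and the checkpoint name, file names, column names, and hyperparameters are assumed placeholders.

# Minimal sketch (illustrative, not the authors' code): fine-tune a pre-trained
# Transformer for three-class sentiment polarity (negative / neutral / positive).
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)
from datasets import load_dataset

MODEL_NAME = "bert-base-uncased"  # placeholder; could be an XLNet/RoBERTa/ALBERT checkpoint
NUM_LABELS = 3                    # negative, neutral, positive

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=NUM_LABELS)

# Hypothetical CSV files with "text" and "label" columns (label in {0, 1, 2}).
dataset = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

def tokenize(batch):
    # Tokenize and pad/truncate each SE text snippet to a fixed length.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sa4se-finetuned",   # placeholder output directory
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["test"])
trainer.train()
predictions = trainer.predict(dataset["test"])  # logits for computing macro/micro-averaged F1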