Picaso: Enhancing API recommendations with relevant stack overflow posts

While having options can be liberating, too many options can lead to a sub-optimal choice. Software engineering is no exception. APIs have become essential to making software developers' lives easier. APIs help developers implement a function...

Bibliographic Details
Main Authors: IRSAN, Ivana Clairine, ZHANG, Ting, THUNG, Ferdian, KIM, Kisub, LO, David
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2023
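Subjects: API recommendation; Multi-source analytic; Multi-Sources; Pre-trained model; Query expansion; Sequence generation; Software developer; Software engineering domain; Stack overflow; Suboptimal solution; Databases and Information Systems; Software Engineering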
Online Access: https://ink.library.smu.edu.sg/sis_research/8572
https://ink.library.smu.edu.sg/context/sis_research/article/9575/viewcontent/picaso.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-9575
record_format dspace
spelling sg-smu-ink.sis_research-9575 2024-01-25T08:59:36Z
2023-05-01T07:00:00Z
text
application/pdf
info:doi/10.1109/MSR59073.2023.00025
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic API recommendation
Multi-source analytic
Multi-Sources
Pre-trained model
Query expansion
Sequence generation
Software developer
Software engineering domain
Stack overflow
Suboptimal solution
Databases and Information Systems
Software Engineering
description While having options can be liberating, too many options can lead to a sub-optimal choice. Software engineering is no exception. APIs have become essential to making software developers' lives easier: they help developers implement a function faster and more efficiently. However, given the large number of open-source libraries available, choosing the right APIs is not a simple task. Previous studies on API recommendation use a natural language query to identify which API would suit a given task. However, these studies consider only one source of input, i.e., GitHub or Stack Overflow, independently. No existing approach uses Stack Overflow to help generate better API sequence recommendations from queries obtained from GitHub. Therefore, in this study, we aim to provide a framework that improves API sequence recommendation by leveraging information from Stack Overflow. We propose Picaso, which uses contrastive learning to train a sentence embedding model and a cross-encoder model, building a classification model that finds a semantically similar Stack Overflow post for a given annotation (i.e., code comment). Picaso then uses the Stack Overflow post's title to expand the query, and fine-tunes CodeBERT on the expanded queries, resulting in an API sequence generation model. In our experiments, incorporating the Stack Overflow information into CodeBERT improved the BLEU-4 score of API sequence generation by 10.8%.
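The query-expansion step in the description can be illustrated with a minimal sketch. It is not Picaso's implementation: the bag-of-words cosine below stands in for the contrastively trained sentence embedding model, the token-set Jaccard overlap stands in for the cross-encoder, and the function names, the `top_k` and `min_score` parameters, and the choice to append the title after a space are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bow_cosine(a, b):
    """Stand-in for sentence-embedding similarity: cosine over token counts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Stand-in for the cross-encoder score: token-set Jaccard overlap."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def expand_query(annotation, so_titles, top_k=2, min_score=0.2):
    """Append the best-matching Stack Overflow title to a code comment.

    Hypothetical two-stage pipeline: retrieve top_k candidates with the
    cheap similarity, rerank them with the pairwise score, and expand the
    query only if the best candidate clears min_score.
    """
    # Stage 1: candidate retrieval (embedding-model stand-in).
    candidates = sorted(
        so_titles, key=lambda t: bow_cosine(annotation, t), reverse=True
    )[:top_k]
    if not candidates:
        return annotation
    # Stage 2: rerank the candidates (cross-encoder stand-in).
    best = max(candidates, key=lambda t: jaccard(annotation, t))
    if jaccard(annotation, best) < min_score:
        return annotation  # no sufficiently similar post; keep the query as-is
    return f"{annotation} {best}"


titles = [
    "How do I read a text file line by line in Java?",
    "Sorting a list of integers in Python",
    "How to parse JSON in Java",
]
expanded = expand_query("read file line by line", titles)
```

In this sketch the expanded string would then serve as the input text when fine-tuning a sequence generation model; a query with no close match is passed through unchanged rather than padded with an unrelated title.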
format text
author IRSAN, Ivana Clairine
ZHANG, Ting
THUNG, Ferdian
KIM, Kisub
LO, David
title Picaso: Enhancing API recommendations with relevant stack overflow posts
publisher Institutional Knowledge at Singapore Management University
publishDate 2023
url https://ink.library.smu.edu.sg/sis_research/8572
https://ink.library.smu.edu.sg/context/sis_research/article/9575/viewcontent/picaso.pdf
_version_ 1789483278468644864