An empirical study of API misuses of data-centric libraries

Developers rely on third-party library Application Programming Interfaces (APIs) when developing software. However, libraries typically come with assumptions and API usage constraints, whose violation results in API misuse. API misuses may result in crashes or incorrect behavior. Even though API mis...

Main Authors: GALAPPATHTHI, Akalanda, NADI, Sarah, TREUDE, Christoph
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects: API misuse; data-centric libraries; empirical study; Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/9781
https://ink.library.smu.edu.sg/context/sis_research/article/10781/viewcontent/API_Misuses_av.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-10781
record_format dspace
spelling sg-smu-ink.sis_research-10781 2024-12-16T02:05:55Z
An empirical study of API misuses of data-centric libraries
GALAPPATHTHI, Akalanda
NADI, Sarah
TREUDE, Christoph
Developers rely on third-party library Application Programming Interfaces (APIs) when developing software. However, libraries typically come with assumptions and API usage constraints, whose violation results in API misuse. API misuses may result in crashes or incorrect behavior. Even though API misuse is a well-studied area, a recent study of API misuse of deep learning libraries showed that the nature of these misuses and their symptoms are different from misuses of traditional libraries, and as a result highlighted potential shortcomings of current misuse detection tools. We speculate that these observations may not be limited to deep learning API misuses but may stem from the data-centric nature of these APIs. Data-centric libraries often deal with diverse data structures, intricate processing workflows, and a multitude of parameters, which can make them inherently more challenging to use correctly. Therefore, understanding the potential misuses of these libraries is important to avoid unexpected application behavior. To this end, this paper contributes an empirical study of API misuses of five data-centric libraries that cover areas such as data processing, numerical computation, machine learning, and visualization. We identify misuses of these libraries by analyzing data from both Stack Overflow and GitHub. Our results show that many of the characteristics of API misuses observed for deep learning libraries extend to misuses of the data-centric library APIs we study. We also find that developers tend to misuse APIs from data-centric libraries, regardless of whether the API directive appears in the documentation. Overall, our work exposes the challenges of API misuse in data-centric libraries, rather than only focusing on deep learning libraries. Our collected misuses and their characterization lay groundwork for future research to help reduce misuses of these libraries.
2024-10-01T07:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/9781
info:doi/10.1145/3674805.3686685
https://ink.library.smu.edu.sg/context/sis_research/article/10781/viewcontent/API_Misuses_av.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
API misuse
data-centric libraries
empirical study
Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic API misuse
data-centric libraries
empirical study
Software Engineering