Active learning of discriminative subgraph patterns for API misuse detection
A common cause of bugs and vulnerabilities is the violation of usage constraints associated with Application Programming Interfaces (APIs). API misuses are common in software projects, and while techniques have been proposed to detect such misuses, studies have shown that they fail to reliab...
Main Authors: | KANG, Hong Jin; LO, David |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2022 |
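Subjects: | API-Misuse Detection; Discriminative Subgraph Mining; Graph Classification; Active Learning; Software Engineering |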
Online Access: | https://ink.library.smu.edu.sg/sis_research/7635 https://ink.library.smu.edu.sg/context/sis_research/article/8638/viewcontent/2204.09945.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-8638 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-8638; 2023-01-10T03:55:43Z; Active learning of discriminative subgraph patterns for API misuse detection; KANG, Hong Jin; LO, David; 2022-02-01T08:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/7635; info:doi/10.1109/TSE.2021.3069978; https://ink.library.smu.edu.sg/context/sis_research/article/8638/viewcontent/2204.09945.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; API-Misuse Detection; Discriminative Subgraph Mining; Graph Classification; Active Learning; Software Engineering |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
API-Misuse Detection; Discriminative Subgraph Mining; Graph Classification; Active Learning; Software Engineering |
description |
A common cause of bugs and vulnerabilities is the violation of usage constraints associated with Application Programming Interfaces (APIs). API misuses are common in software projects, and while techniques have been proposed to detect such misuses, studies have shown that they fail to reliably detect misuses while reporting many false positives. One limitation of prior work is the inability to reliably identify correct patterns of usage. Many approaches mistake a usage pattern’s frequency for correctness. Because many alternative usage patterns are uncommon but correct, anomaly detection-based techniques have limited success in identifying misuses. We address these challenges and propose ALP (Actively Learned Patterns), reformulating API misuse detection as a classification problem. After representing programs as graphs, ALP mines discriminative subgraphs. While still incorporating frequency information, we use limited human supervision to reduce the reliance on the assumption relating frequency and correctness. The principles of active learning are incorporated to shift human attention away from the most frequent patterns. Instead, ALP samples informative and representative examples while minimizing labeling effort. In our empirical evaluation, ALP substantially outperforms prior approaches on both MUBench, an API misuse benchmark, and a new dataset that we constructed from real-world software projects. |
format |
text |
author |
KANG, Hong Jin; LO, David |
author_sort |
KANG, Hong Jin |
title |
Active learning of discriminative subgraph patterns for API misuse detection |
title_sort |
active learning of discriminative subgraph patterns for api misuse detection |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2022 |
url |
https://ink.library.smu.edu.sg/sis_research/7635 https://ink.library.smu.edu.sg/context/sis_research/article/8638/viewcontent/2204.09945.pdf |
_version_ |
1770576397816823808 |