Searching Connected API Subgraph via Text Phrases

Reusing APIs of existing libraries is a common practice during software development, but searching suitable APIs and their usages can be time-consuming [6]. In this paper, we study a new and more practical approach to help users find usages of APIs given only simple text phrases, when users have lim...

Full description

Saved in:
Bibliographic Details
Main Authors: CHAN, Wing-Kwan, CHENG, Hong, LO, David
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2012
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/1580
http://dx.doi.org/10.1145/2393596.2393606
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Reusing APIs of existing libraries is a common practice during software development, but searching suitable APIs and their usages can be time-consuming [6]. In this paper, we study a new and more practical approach to help users find usages of APIs given only simple text phrases, when users have limited knowledge about an API library. We model API invocations as an API graph and aim to find an optimum connected subgraph that meets users' search needs. The problem is challenging since the search space in an API graph is very huge. We start with a greedy subgraph search algorithm which returns a connected subgraph containing nodes with high textual similarity to the query phrases. Two refinement techniques are proposed to improve the quality of the returned subgraph. Furthermore, as the greedy subgraph search algorithm relies on online query of shortest path between two graph nodes, we propose a space-efficient compressed shortest path indexing scheme that can efficiently recover the exact shortest path. We conduct extensive experiments to show that the proposed subgraph search approach for API recommendation is very effective in that it boosts the average F1-measure of the state-of-the-art approach, Portfolio [15], on two groups of real-life queries by 64% and 36% respectively.