Exploiting text mining for Java package mappings

Developers often need to utilize method(s) that serve a functionality from more than one program library in order to obtain the latest optimized functionality or to seek a desired functionality. For example, a developer may be utilizing the array feature from the program library “org.json”. Therefor...

Bibliographic Details
Main Author: Ong, Kent Long Xiong
Other Authors: Liu Yang
Format: Final Year Project
Language: English
Published: 2017
Subjects:
Online Access: http://hdl.handle.net/10356/70045
Institution: Nanyang Technological University
id sg-ntu-dr.10356-70045
record_format dspace
spelling sg-ntu-dr.10356-70045 2023-03-03T20:57:01Z Exploiting text mining for Java package mappings Ong, Kent Long Xiong Liu Yang School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing Bachelor of Engineering (Computer Science) 2017-04-10T05:23:46Z 2017-04-10T05:23:46Z 2017 Final Year Project (FYP) http://hdl.handle.net/10356/70045 en Nanyang Technological University 67 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
description Developers often need methods that provide a given functionality from more than one program library, either to obtain the latest optimized functionality or to find a desired functionality. For example, a developer may be using the array feature of the program library “org.json”, and may therefore require methods from the package “org.json.JSONArray” to perform array operations, but “org.json” may no longer be under active development. Consequently, he/she may wish to search for methods in another analogical program library (i.e. gson) that perform operations on arrays, such as methods from the package “com.google.gson.JsonArray”. As a result, a mapping between these packages is required. Such mappings are called package mappings. Due to the large number of package mappings, defining them manually is tedious and error-prone. To relieve developers of this tiresome process, an automatic technique to create a database of likely package mappings is desired. Therefore, this report proposes the use of Term Frequency-Inverse Document Frequency (TF-IDF) to perform package mappings between analogical Java program libraries. TF-IDF makes use of package names and their descriptions from Java documentation to measure similarity and define the package mappings between analogical program libraries. We used Application Programming Interface (API) mappings between four pairs of analogical program libraries as ground truth to evaluate our approach. Our results indicate that the mappings inferred the right analogical API within the top-10 recommended results over 50% of the time. With this result, we also present a web application (http://similarpackage.appspot.com/) which can recommend analogical packages for 71,775 packages of 117 pairs of analogical Java program libraries with diverse functionalities.
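The TF-IDF matching described in the abstract can be illustrated with a minimal sketch. This is not the report's implementation: the class name PackageMappingSketch, the tokenizer, the smoothed IDF formula, the use of cosine similarity, and the candidate descriptions in demoCandidates() are all assumptions made for illustration. Each package is treated as one short document made of its name plus a description, and candidates are ranked by similarity to the query package.

```java
import java.util.*;

public class PackageMappingSketch {

    // Split a package name plus description into lowercase word tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Document frequency: number of documents that contain each term.
    static Map<String, Integer> documentFrequency(Collection<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) df.merge(term, 1, Integer::sum);
        }
        return df;
    }

    // TF-IDF vector: raw term count times a smoothed IDF,
    // log((1 + N) / (1 + df)) + 1 (one common variant, assumed here).
    static Map<String, Double> tfidf(List<String> doc, Map<String, Integer> df, int n) {
        Map<String, Double> vec = new HashMap<>();
        for (String term : doc) vec.merge(term, 1.0, Double::sum);
        for (Map.Entry<String, Double> e : vec.entrySet()) {
            double idf = Math.log((1.0 + n) / (1.0 + df.getOrDefault(e.getKey(), 0))) + 1.0;
            e.setValue(e.getValue() * idf);
        }
        return vec;
    }

    // Cosine similarity between two sparse vectors.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            na += e.getValue() * e.getValue();
        }
        for (double v : b.values()) nb += v * v;
        return (na == 0 || nb == 0) ? 0.0 : dot / Math.sqrt(na * nb);
    }

    // Rank candidate packages (name -> description) by similarity to the
    // query text and return the k best names, most similar first.
    static List<String> rank(String queryText, Map<String, String> candidates, int k) {
        Map<String, List<String>> tokens = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : candidates.entrySet()) {
            tokens.put(e.getKey(), tokenize(e.getKey() + " " + e.getValue()));
        }
        List<String> query = tokenize(queryText);
        List<List<String>> corpus = new ArrayList<>(tokens.values());
        corpus.add(query);
        Map<String, Integer> df = documentFrequency(corpus);
        int n = corpus.size();
        Map<String, Double> queryVec = tfidf(query, df, n);
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, List<String>> e : tokens.entrySet()) {
            scores.put(e.getKey(), cosine(queryVec, tfidf(e.getValue(), df, n)));
        }
        List<String> names = new ArrayList<>(candidates.keySet());
        names.sort((x, y) -> Double.compare(scores.get(y), scores.get(x)));
        return new ArrayList<>(names.subList(0, Math.min(k, names.size())));
    }

    // Hypothetical candidate set: the descriptions are paraphrased
    // placeholders, not text taken from the gson documentation.
    static Map<String, String> demoCandidates() {
        Map<String, String> c = new LinkedHashMap<>();
        c.put("com.google.gson.JsonObject", "represents an object type in Json, name value pairs");
        c.put("com.google.gson.stream.JsonWriter", "writes a JSON encoded value to a stream");
        c.put("com.google.gson.JsonArray", "represents an array type in Json, an ordered list of elements");
        return c;
    }

    public static void main(String[] args) {
        String query = "org.json.JSONArray an ordered sequence of values in a JSON array";
        System.out.println(rank(query, demoCandidates(), 10));
    }
}
```

Here rank(query, candidates, 10) mirrors the top-10 recommendation setting mentioned in the abstract. A real system would likely also need stop-word removal and splitting of camel-case identifiers, which this sketch omits.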
author2 Liu Yang
author_facet Liu Yang
Ong, Kent Long Xiong
format Final Year Project
author Ong, Kent Long Xiong
author_sort Ong, Kent Long Xiong
title Exploiting text mining for Java package mappings
publishDate 2017
url http://hdl.handle.net/10356/70045
_version_ 1759855367356940288