Exploiting text mining for Java package mappings
Developers often need to utilize method(s) that serve a functionality from more than one program library in order to obtain the latest optimized functionality or to seek a desired functionality. For example, a developer may be utilizing the array feature from the program library “org.json”. Therefor...
Main Author: | Ong, Kent Long Xiong |
---|---|
Other Authors: | Liu Yang |
Format: | Final Year Project |
Language: | English |
Published: | 2017 |
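Subjects: | DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing |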
Online Access: | http://hdl.handle.net/10356/70045 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-70045 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-70045; 2023-03-03T20:57:01Z; Exploiting text mining for Java package mappings; Ong, Kent Long Xiong; Liu Yang; School of Computer Science and Engineering; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing; Bachelor of Engineering (Computer Science); 2017-04-10T05:23:46Z; 2017-04-10T05:23:46Z; 2017; Final Year Project (FYP); http://hdl.handle.net/10356/70045; en; Nanyang Technological University; 67 p.; application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing |
spellingShingle |
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing; Ong, Kent Long Xiong; Exploiting text mining for Java package mappings |
description |
Developers often need to use method(s) that serve a given functionality from more than one program library, either to obtain the latest optimized functionality or to find a desired functionality. For example, a developer may be using the array feature of the program library “org.json” and therefore require method(s) from the package “org.json.JSONArray” to perform array operations, but “org.json” may no longer be under active development. Consequently, he/she may wish to search for method(s) in an analogical program library (e.g. gson) that perform operations on arrays, such as method(s) from the package “com.google.gson.JsonArray”. A mapping between these packages is therefore required. Such mappings are called package mappings. Because the number of package mappings is large, defining them manually is tedious and error-prone. To relieve developers of this process, an automatic technique for building a database of likely package mappings is desirable. This report therefore proposes using Term Frequency-Inverse Document Frequency (TF-IDF) to map packages between analogical Java program libraries. The approach applies TF-IDF to package names and their descriptions from the Java documentation to measure similarity and define the package mappings between analogical program libraries. We used Application Programming Interface (API) mappings between four pairs of analogical program libraries as ground truth to evaluate our approach. Our results indicate that the correct analogical API appeared within the top-10 recommended results over 50% of the time. Building on this result, we also present a web application (http://similarpackage.appspot.com/) that can recommend analogical packages for 71,775 packages across 117 pairs of analogical Java program libraries with diverse functionalities. |
author2 |
Liu Yang |
author_facet |
Liu Yang; Ong, Kent Long Xiong |
format |
Final Year Project |
author |
Ong, Kent Long Xiong |
author_sort |
Ong, Kent Long Xiong |
title |
Exploiting text mining for Java package mappings |
title_sort |
exploiting text mining for java package mappings |
publishDate |
2017 |
url |
http://hdl.handle.net/10356/70045 |
_version_ |
1759855367356940288 |
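
The abstract describes ranking candidate packages by TF-IDF similarity over package names and descriptions. The record does not give the report's exact formulation, so the following is only a minimal sketch of the general idea under stated assumptions: the tokenizer, the smoothed IDF formula, the use of cosine similarity, the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_mappings`) and the one-line package descriptions are all illustrative choices, not taken from the report or from the libraries' actual documentation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a package name plus description into lowercase word tokens."""
    # Break identifiers such as "JsonArray" or "JSONArray" at case boundaries
    # so that both yield the tokens "json" and "array".
    text = re.sub(r"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])", " ", text)
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (token -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Assumption: smoothed IDF, ln((1 + N) / (1 + df)) + 1.
            tok: (cnt / total) * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1)
            for tok, cnt in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_mappings(source, target, top_k=10):
    """For each source package, list the top_k most similar target packages."""
    names = list(source) + list(target)
    docs = [f"{name} {desc}" for name, desc in {**source, **target}.items()]
    vec = dict(zip(names, tfidf_vectors(docs)))
    ranking = {}
    for src in source:
        scored = [(tgt, cosine(vec[src], vec[tgt])) for tgt in target]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        ranking[src] = scored[:top_k]
    return ranking


# Hypothetical descriptions written for this example only.
SOURCE = {
    "org.json.JSONArray": "An ordered sequence of values stored in an array",
}
TARGET = {
    "com.google.gson.JsonArray": "An array of elements kept in an ordered list",
    "com.google.gson.JsonObject": "An object made of name and value pairs",
    "com.google.gson.stream.JsonWriter": "Writes an encoded value to a stream one token at a time",
}

for src, candidates in rank_mappings(SOURCE, TARGET, top_k=3).items():
    for tgt, score in candidates:
        print(f"{src} -> {tgt}: {score:.3f}")
```

With these toy inputs, the candidate sharing the most distinctive tokens with the source package ranks first. A top-10 evaluation like the one the abstract reports would then check whether the ground-truth API appears anywhere in the returned list for each source package.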