Source code classification using latent semantic indexing with structural and frequency term weighting
In recent years, the amount of open source software has increased. Hence, the demand for automatic software classification is also increasing. Latent Semantic Indexing (LSI) is an information retrieval approach that is utilized in classifying source code programs. This research proposes a L...
Main Authors: | Yusof, Yuhanis; Alhersh, Taha; Mahmuddin, Massudi; Mohamed Din, Aniza |
---|---|
Format: | Article |
Language: | English |
Published: | Medwell Publishing, 2012 |
Subjects: | QA76 Computer software |
Online Access: | http://repo.uum.edu.my/9501/1/2.pdf http://repo.uum.edu.my/9501/ http://medwelljournals.com/abstract/?doi=rjasci.2012.266.271 |
Institution: | Universiti Utara Malaysia |
id |
my.uum.repo.9501 |
---|---|
record_format |
eprints |
spelling |
my.uum.repo.9501 2014-03-24T03:12:53Z http://repo.uum.edu.my/9501/ Source code classification using latent semantic indexing with structural and frequency term weighting Yusof, Yuhanis Alhersh, Taha Mahmuddin, Massudi Mohamed Din, Aniza QA76 Computer software In recent years, the amount of open source software has increased. Hence, the demand for automatic software classification is also increasing. Latent Semantic Indexing (LSI) is an information retrieval approach that is utilized in classifying source code programs. This research proposes a Latent Semantic Indexing classifier that integrates structural information and term frequency in its weighting scheme. The content terms are identified by extracting words from the source code program. Based on the experiment undertaken, the LSI classifier is noted to generate higher precision and recall than the C4.5 algorithm. Furthermore, it is also learned that the use of structural information in the weighting scheme contributes to better classification. Medwell Publishing 2012 Article PeerReviewed application/pdf en http://repo.uum.edu.my/9501/1/2.pdf Yusof, Yuhanis and Alhersh, Taha and Mahmuddin, Massudi and Mohamed Din, Aniza (2012) Source code classification using latent semantic indexing with structural and frequency term weighting. Research Journal of Applied Sciences, 7 (5). pp. 266-271. ISSN 1815-932X http://medwelljournals.com/abstract/?doi=rjasci.2012.266.271 |
institution |
Universiti Utara Malaysia |
building |
UUM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Utara Malaysia |
content_source |
UUM Institutional Repository |
url_provider |
http://repo.uum.edu.my/ |
language |
English |
topic |
QA76 Computer software |
spellingShingle |
QA76 Computer software Yusof, Yuhanis Alhersh, Taha Mahmuddin, Massudi Mohamed Din, Aniza Source code classification using latent semantic indexing with structural and frequency term weighting |
description |
In recent years, the amount of open source software has increased. Hence, the demand for automatic software classification is also increasing. Latent Semantic Indexing (LSI) is an information retrieval approach that is utilized in classifying source code programs. This research proposes a Latent Semantic Indexing classifier that integrates structural information and term frequency in its weighting scheme. The content terms are identified by extracting words from the source code program. Based on the experiment undertaken, the LSI classifier is noted to generate higher precision and recall than the C4.5 algorithm. Furthermore, it is also learned that the use of structural information in the weighting scheme contributes to better classification. |
format |
Article |
author |
Yusof, Yuhanis Alhersh, Taha Mahmuddin, Massudi Mohamed Din, Aniza |
author_facet |
Yusof, Yuhanis Alhersh, Taha Mahmuddin, Massudi Mohamed Din, Aniza |
author_sort |
Yusof, Yuhanis |
title |
Source code classification using latent semantic indexing with structural and frequency term weighting |
title_short |
Source code classification using latent semantic indexing with structural and frequency term weighting |
title_full |
Source code classification using latent semantic indexing with structural and frequency term weighting |
title_fullStr |
Source code classification using latent semantic indexing with structural and frequency term weighting |
title_full_unstemmed |
Source code classification using latent semantic indexing with structural and frequency term weighting |
title_sort |
source code classification using latent semantic indexing with structural and frequency term weighting |
publisher |
Medwell Publishing |
publishDate |
2012 |
url |
http://repo.uum.edu.my/9501/1/2.pdf http://repo.uum.edu.my/9501/ http://medwelljournals.com/abstract/?doi=rjasci.2012.266.271 |
_version_ |
1644280124474916864 |
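
The abstract describes an LSI classifier whose term weights combine structural information with term frequency, but this record does not give the actual weighting scheme. The following is only a minimal sketch of the general idea, under stated assumptions: the structural weights in `STRUCT_WEIGHT`, the `//` comment convention, the toy training programs, and the nearest-neighbour decision rule are all hypothetical choices for illustration and are not taken from the paper.

```python
import re
from collections import Counter

import numpy as np

# Assumed structural weights (illustrative values, not from the paper):
# a term found in code counts more than the same term found in a comment.
STRUCT_WEIGHT = {"code": 2.0, "comment": 1.0}


def weighted_terms(source: str) -> Counter:
    """Sum the structural weight over every term occurrence (frequency x structure)."""
    weights = Counter()
    for line in source.splitlines():
        code, _, comment = line.partition("//")
        for kind, text in (("code", code), ("comment", comment)):
            for term in re.findall(r"[A-Za-z]+", text):
                weights[term.lower()] += STRUCT_WEIGHT[kind]
    return weights


def fit_lsi(sources, k):
    """Build the weighted term-document matrix and keep its rank-k truncated SVD."""
    bags = [weighted_terms(s) for s in sources]
    vocab = sorted(set().union(*bags))
    index = {term: i for i, term in enumerate(vocab)}
    A = np.zeros((len(vocab), len(bags)))
    for j, bag in enumerate(bags):
        for term, w in bag.items():
            A[index[term], j] = w
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    # Rows of Vt[:k].T are the training documents in the latent space.
    return index, U[:, :k], S[:k], Vt[:k].T


def classify(source, model, labels):
    """Fold a new program into the latent space and return the nearest document's label."""
    index, Uk, Sk, docs = model
    q = np.zeros(len(index))
    for term, w in weighted_terms(source).items():
        if term in index:  # terms unseen during training are ignored
            q[index[term]] = w
    q_hat = (Uk.T @ q) / Sk  # assumes the k retained singular values are non-zero
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(q_hat) + 1e-12
    sims = (docs @ q_hat) / norms  # cosine similarity to each training document
    return labels[int(np.argmax(sims))]


# Toy training set (hypothetical): two categories with disjoint vocabularies.
TRAIN = [
    "connect query table // sql database",
    "select row table query // database index",
    "draw pixel canvas // render image",
    "render shape pixel draw // image color",
]
LABELS = ["database", "graphics", "graphics", "graphics"]
LABELS[1] = "database"
LABELS[2] = "graphics"
MODEL = fit_lsi(TRAIN, k=2)
```

Calling `classify("query table row // database", MODEL, LABELS)` folds the new program into the two-dimensional latent space and returns the label of the most similar training program. A comparison against C4.5, as reported in the abstract, would need a real labelled corpus and is outside the scope of this sketch.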