Term co-occurrence evolution study

Huge volumes of data are created continuously, and all of this data is stored somewhere in its raw form. In this project, we introduce a prototype application that uses a series of algorithms to convert this raw data into a form that we can study. The project focuses on the evolution of term co-occurrence over time...

Bibliographic Details
Main Author:	Tan, Bernard Mao Sheng
Other Authors:	Sun Aixin
Format:	Final Year Project
Language:	English
Published:	2014
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	http://hdl.handle.net/10356/58954
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-58954
record_format	dspace
spelling	sg-ntu-dr.10356-58954 2023-03-03T20:36:51Z Term co-occurrence evolution study Tan, Bernard Mao Sheng Sun Aixin School of Computer Engineering DRNTU::Engineering::Computer science and engineering Bachelor of Engineering (Computer Science) 2014-04-17T03:10:42Z 2014-04-17T03:10:42Z 2014 2014 Final Year Project (FYP) http://hdl.handle.net/10356/58954 en Nanyang Technological University 37 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering
spellingShingle	DRNTU::Engineering::Computer science and engineering Tan, Bernard Mao Sheng Term co-occurrence evolution study
description	Huge volumes of data are created continuously, and all of this data is stored somewhere in its raw form. In this project, we introduce a prototype application that uses a series of algorithms to convert this raw data into a form that we can study. The project focuses on the evolution of term co-occurrence over time. To implement the application, research was done to identify ways of transforming the raw data into forms that are easier to manipulate. Various API libraries are used, covering natural language processing, multi-threading and data indexing. Because the project focuses on studying term co-occurrence evolution, the prototype is designed with a graphical user interface and with real-time performance in mind. The application lets the user run analyses directly, and each analysis completes within seconds. The result is displayed in two forms: a line chart and a detailed table. Users can manipulate the line chart directly by dynamically selecting the co-occurring terms they are interested in. To make the analysis result clearer, the application includes ranking algorithms that rank the terms in the result by their interestingness. By default, when the analysis is complete, the application ranks the terms, plots the line chart with the top 5 most interesting terms and sorts the entries in the detailed table. Because it handles huge amounts of data, the application needs to be optimised and fast. For this reason, the data is preprocessed, and multi-threading is added to the analysis process to use the system's computing power and speed up the analysis. Although the objective of identifying interesting co-occurring terms is achieved, improvements and additional features could be introduced to extend the application's potential. Recommendations include better multi-threading logic and better ranking algorithms.
author2	Sun Aixin
author_facet	Sun Aixin Tan, Bernard Mao Sheng
format	Final Year Project
author	Tan, Bernard Mao Sheng
author_sort	Tan, Bernard Mao Sheng
title	Term co-occurrence evolution study
title_short	Term co-occurrence evolution study
title_full	Term co-occurrence evolution study
title_fullStr	Term co-occurrence evolution study
title_full_unstemmed	Term co-occurrence evolution study
title_sort	term co-occurrence evolution study
publishDate	2014
url	http://hdl.handle.net/10356/58954
_version_	1759856321340899328
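
The analysis outlined in the description (counting which terms co-occur with a chosen term in each time period, then ranking them by interestingness) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation: the function names, the sample documents, whitespace tokenisation, and the use of the spread between the highest and lowest per-period count as the "interestingness" score are all hypothetical, since the record does not say which ranking algorithms the application uses.

```python
from collections import Counter, defaultdict


def cooccurrence_by_period(docs, target):
    """For each term, count per period the documents that also contain `target`.

    `docs` is an iterable of (period, text) pairs; tokenisation is a plain
    lowercase whitespace split, which is an assumption made for brevity.
    """
    series = defaultdict(Counter)  # term -> Counter mapping period -> count
    for period, text in docs:
        terms = set(text.lower().split())
        if target in terms:
            for term in terms - {target}:
                series[term][period] += 1
    return series


def rank_terms(series, periods, top_n=5):
    """Rank terms by the spread (max - min) of their counts across periods.

    The spread is a stand-in score chosen for illustration only.
    Ties are broken alphabetically so the ordering is deterministic.
    """
    def spread(term):
        counts = [series[term][p] for p in periods]
        return max(counts) - min(counts)

    return sorted(series, key=lambda t: (-spread(t), t))[:top_n]


# Hypothetical sample data: (year, document text).
docs = [
    (2012, "storm flood warning"),
    (2012, "storm damage"),
    (2013, "storm flood rescue"),
    (2013, "storm flood warning"),
    (2013, "flood insurance"),
]
periods = [2012, 2013]
series = cooccurrence_by_period(docs, "storm")
top = rank_terms(series, periods)
```

Each entry of `series` is a per-period count that could be drawn as one line on a chart, and `top` lists the terms that would be plotted by default under this stand-in score.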

Similar Items