Semantic text analytic services on the cloud

Bibliographic Details
Main Author: Ng, Wei Kok.
Other Authors: Chan Chee Keong
Format: Final Year Project
Language: English
Published: 2013
Online Access: http://hdl.handle.net/10356/54449
Institution: Nanyang Technological University
Description
Summary: Text analytics has applications in many areas. When Big Data is involved, however, the limitations of individual machines make it necessary to implement text analytic services on the cloud. Even so, computing over large volumes of data is difficult: an implementation that works on 10 machines would require major changes before it could scale to hundreds or even thousands of machines. The Hadoop framework provides reliability and availability on unreliable commodity hardware. Hadoop also scales linearly, even across orders of magnitude, which gives it an advantage over other forms of distributed computing. By implementing GATE, a widely used text analytics tool, on the Hadoop framework using MapReduce, this project aims to enable text analysis on Big Data with the linear scaling that Hadoop provides. Performance analysis of the implementation shows linear scaling as the number of machines increases. Because GATE can be used for a multitude of text analytics purposes, this implementation will allow the analysis of Big Data in many areas of interest.
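
The MapReduce pattern mentioned in the summary can be illustrated with a minimal single-process sketch. This is not the project's implementation and it uses neither the Hadoop nor the GATE API: the `annotate` function is a hypothetical stand-in for a GATE pipeline (it simply treats capitalized tokens as "entities"), and `map_phase`, `reduce_phase`, and `run_job` are illustrative names. The point is only that each document is annotated independently in the map step, which is why the work can in principle be spread across many machines.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def annotate(text: str) -> list[str]:
    # Hypothetical stand-in for a text analytics pipeline such as GATE:
    # every token that starts with an uppercase letter counts as an "entity".
    return [tok.strip(".,;:!?") for tok in text.split() if tok[:1].isupper()]


def map_phase(doc_id: str, text: str) -> Iterator[tuple[str, int]]:
    # Map step: annotate one document in isolation and emit (entity, 1) pairs.
    # No state is shared between documents, so mappers can run in parallel.
    for entity in annotate(text):
        if entity:
            yield entity, 1


def reduce_phase(key: str, values: Iterable[int]) -> tuple[str, int]:
    # Reduce step: combine all counts emitted for the same entity.
    return key, sum(values)


def run_job(docs: dict[str, str]) -> dict[str, int]:
    # Simulates the shuffle that a framework like Hadoop performs between
    # map and reduce: group the emitted values by key, then reduce each group.
    grouped: dict[str, list[int]] = defaultdict(list)
    for doc_id, text in docs.items():
        for key, value in map_phase(doc_id, text):
            grouped[key].append(value)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = {
        "d1": "Hadoop runs GATE.",
        "d2": "GATE annotates text on Hadoop.",
    }
    print(run_job(sample))
```

In a real deployment the grouping in `run_job` would be handled by the framework across machines, and the annotator would be replaced by an actual text analytics pipeline.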