Tools for analysis of large-scale networks (I) algorithms, analytics and visualization

Bibliographic Details
Main Author: Zhang, Xinyi
Other Authors: Cong Gao
Format: Final Year Project
Language: English
Published: 2017
Online Access: http://hdl.handle.net/10356/72828
Institution: Nanyang Technological University
Description
Summary: Rapid development of social networking Internet applications has led to an unprecedented increase in the size and quality of datasets. A tool for analysing large-scale networks can make such data far easier to work with and contribute to data analytics. This project continued work on the large-scale analysis tool developed by a previous student, Chua Chee Ann. That earlier tool successfully implemented many basic analytical functions, such as data search, topic modelling on the retrieved data, and a graphical user interface. Among the social media sites, Twitter was chosen, and 16.5 GB of raw tweets was used as the dataset. Two approaches were taken to improve the overall performance of the tool. First, an existing data structure, the grid file, was analysed and applied to the dataset, which proved to effectively improve query time for different types of queries. Second, multiprocessing was implemented to reduce data processing time, specifically for the topic modelling function.