Knowledge discovery from forum data

Advancement in information retrieval and data mining techniques has provided more and more useful mechanisms for the retrieval of most relevant information from documents, as well as for knowledge discovery from the same. The knowledge embedded in online forums, a kind of knowledge-rich data source,...

Full description

Saved in:
Bibliographic Details
Main Author: Li, Jun
Other Authors: Sun Aixin
Format: Theses and Dissertations
Language:English
Published: 2015
Subjects:
Online Access:http://hdl.handle.net/10356/62935
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-62935
record_format dspace
spelling sg-ntu-dr.10356-629352019-12-10T12:32:33Z Knowledge discovery from forum data Li, Jun Sun Aixin Wee Kim Wee School of Communication and Information DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval Advancement in information retrieval and data mining techniques has provided more and more useful mechanisms for the retrieval of most relevant information from documents, as well as for knowledge discovery from the same. The knowledge embedded in online forums, a kind of knowledge-rich data source, has yet be fully utilized because of the limited search functionalities provided by most existing forum platforms. This project provides a prototype solution to improve search functions of online forums. More specifically, a multithreaded Crawler and a Parser have been implemented to download and parse the posts published in a local forum in HTML format. A Topic Modeler which is built based on the MALLET package is used to generate the high-level topics of the forum data. An Indexer and a Searcher are then developed based on Lucene, to support searching over the forum data. A web search interface which supports sophisticated search requests and search result facet visualization is developed for users to discover knowledge in online forums. As the result, the solution provided by this project allows users to search relevant information by simple (e.g. single-keyword) as well as sophisticated queries. It also shows users a high-level view of the search results in aggregative and multi-facet visualized form. Furthermore, it enables users to understand the high-level topics of the search results by topic modeling. This search interface helps users to find the relevant information more effectively and efficiently. This study ends with a few limitations identified but not tackled due to the project scope and time constraint. Nevertheless, recommendations on addressing these limitations are made as future work. Master of Science (Information Studies) 2015-05-04T03:23:49Z 2015-05-04T03:23:49Z 2015 2015 Thesis http://hdl.handle.net/10356/62935 en Nanyang Technological University 68 p. application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
spellingShingle DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
Li, Jun
Knowledge discovery from forum data
description Advancement in information retrieval and data mining techniques has provided more and more useful mechanisms for the retrieval of most relevant information from documents, as well as for knowledge discovery from the same. The knowledge embedded in online forums, a kind of knowledge-rich data source, has yet be fully utilized because of the limited search functionalities provided by most existing forum platforms. This project provides a prototype solution to improve search functions of online forums. More specifically, a multithreaded Crawler and a Parser have been implemented to download and parse the posts published in a local forum in HTML format. A Topic Modeler which is built based on the MALLET package is used to generate the high-level topics of the forum data. An Indexer and a Searcher are then developed based on Lucene, to support searching over the forum data. A web search interface which supports sophisticated search requests and search result facet visualization is developed for users to discover knowledge in online forums. As the result, the solution provided by this project allows users to search relevant information by simple (e.g. single-keyword) as well as sophisticated queries. It also shows users a high-level view of the search results in aggregative and multi-facet visualized form. Furthermore, it enables users to understand the high-level topics of the search results by topic modeling. This search interface helps users to find the relevant information more effectively and efficiently. This study ends with a few limitations identified but not tackled due to the project scope and time constraint. Nevertheless, recommendations on addressing these limitations are made as future work.
author2 Sun Aixin
author_facet Sun Aixin
Li, Jun
format Theses and Dissertations
author Li, Jun
author_sort Li, Jun
title Knowledge discovery from forum data
title_short Knowledge discovery from forum data
title_full Knowledge discovery from forum data
title_fullStr Knowledge discovery from forum data
title_full_unstemmed Knowledge discovery from forum data
title_sort knowledge discovery from forum data
publishDate 2015
url http://hdl.handle.net/10356/62935
_version_ 1681035551689408512