Knowledge discovery from forum data
Advances in information retrieval and data mining techniques have provided increasingly useful mechanisms for retrieving the most relevant information from documents, as well as for discovering knowledge from them. The knowledge embedded in online forums, a kind of knowledge-rich data source,...
Main Author: | Li, Jun |
---|---|
Other Authors: | Sun Aixin |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2015 |
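Subjects: | DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval |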
Online Access: | http://hdl.handle.net/10356/62935 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-62935 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-62935 2019-12-10T12:32:33Z Knowledge discovery from forum data Li, Jun Sun Aixin Wee Kim Wee School of Communication and Information DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval Advances in information retrieval and data mining techniques have provided increasingly useful mechanisms for retrieving the most relevant information from documents, as well as for discovering knowledge from them. The knowledge embedded in online forums, a kind of knowledge-rich data source, has yet to be fully utilized because of the limited search functionality provided by most existing forum platforms. This project provides a prototype solution to improve the search functions of online forums. More specifically, a multithreaded Crawler and a Parser have been implemented to download and parse the posts published in a local forum in HTML format. A Topic Modeler built on the MALLET package is used to generate the high-level topics of the forum data. An Indexer and a Searcher are then developed on top of Lucene to support searching over the forum data. A web search interface that supports sophisticated search requests and faceted visualization of search results is developed for users to discover knowledge in online forums. As a result, the solution provided by this project allows users to search for relevant information with simple (e.g. single-keyword) as well as sophisticated queries. It also shows users a high-level view of the search results in an aggregated, multi-facet visualized form. Furthermore, it enables users to understand the high-level topics of the search results through topic modeling. This search interface helps users find relevant information more effectively and efficiently. The study ends with a few limitations that were identified but not tackled because of the project scope and time constraints. Nevertheless, recommendations on addressing these limitations are made as future work.
Master of Science (Information Studies) 2015-05-04T03:23:49Z 2015-05-04T03:23:49Z 2015 2015 Thesis http://hdl.handle.net/10356/62935 en Nanyang Technological University 68 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval |
spellingShingle |
DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval Li, Jun Knowledge discovery from forum data |
description |
Advances in information retrieval and data mining techniques have provided increasingly useful mechanisms for retrieving the most relevant information from documents, as well as for discovering knowledge from them. The knowledge embedded in online forums, a kind of knowledge-rich data source, has yet to be fully utilized because of the limited search functionality provided by most existing forum platforms. This project provides a prototype solution to improve the search functions of online forums. More specifically, a multithreaded Crawler and a Parser have been implemented to download and parse the posts published in a local forum in HTML format. A Topic Modeler built on the MALLET package is used to generate the high-level topics of the forum data. An Indexer and a Searcher are then developed on top of Lucene to support searching over the forum data. A web search interface that supports sophisticated search requests and faceted visualization of search results is developed for users to discover knowledge in online forums. As a result, the solution provided by this project allows users to search for relevant information with simple (e.g. single-keyword) as well as sophisticated queries. It also shows users a high-level view of the search results in an aggregated, multi-facet visualized form. Furthermore, it enables users to understand the high-level topics of the search results through topic modeling. This search interface helps users find relevant information more effectively and efficiently. The study ends with a few limitations that were identified but not tackled because of the project scope and time constraints. Nevertheless, recommendations on addressing these limitations are made as future work. |
author2 |
Sun Aixin |
author_facet |
Sun Aixin Li, Jun |
format |
Theses and Dissertations |
author |
Li, Jun |
author_sort |
Li, Jun |
title |
Knowledge discovery from forum data |
title_short |
Knowledge discovery from forum data |
title_full |
Knowledge discovery from forum data |
title_fullStr |
Knowledge discovery from forum data |
title_full_unstemmed |
Knowledge discovery from forum data |
title_sort |
knowledge discovery from forum data |
publishDate |
2015 |
url |
http://hdl.handle.net/10356/62935 |
_version_ |
1681035551689408512 |
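
The parse, index, search and facet steps named in the description can be illustrated with a small sketch. This is not the thesis's implementation, which the record says is built on Lucene and MALLET; it is a minimal standard-library Python stand-in, and the names `PostTextParser`, `ForumIndex`, the `board` and `author` fields, and the sample posts are all hypothetical. Crawling and topic modeling are left out.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser
import re


class PostTextParser(HTMLParser):
    """Collects the visible text of one HTML post (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def text(self):
        # Join the text fragments and collapse runs of whitespace.
        return " ".join(" ".join(self.chunks).split())


def parse_post(html):
    parser = PostTextParser()
    parser.feed(html)
    parser.close()
    return parser.text()


def tokenize(text):
    # Lowercase alphanumeric tokens; a real analyzer would do much more.
    return re.findall(r"[a-z0-9]+", text.lower())


class ForumIndex:
    """Toy inverted index standing in for the Indexer and Searcher."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of posts containing it
        self.posts = {}                   # post id -> stored fields

    def add(self, post_id, html, board, author):
        text = parse_post(html)
        self.posts[post_id] = {"text": text, "board": board, "author": author}
        for term in tokenize(text):
            self.postings[term].add(post_id)

    def search(self, query):
        """AND query: sorted ids of posts that contain every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)

    def facets(self, post_ids, field):
        """Count the hits per value of one stored field (e.g. board)."""
        return Counter(self.posts[p][field] for p in post_ids)


# Hypothetical sample data, for illustration only.
index = ForumIndex()
index.add(1, "<p>Best <b>laptop</b> for students?</p>", "Hardware", "alice")
index.add(2, "<p>Laptop battery drains fast</p>", "Hardware", "bob")
index.add(3, "<p>Cheap laptop deals this week</p>", "Marketplace", "alice")

hits = index.search("laptop")
print(hits)
print(index.facets(hits, "board"))
```

The facet counts are what a faceted result view would aggregate over: the same hit list is grouped by a stored field instead of being shown as a flat ranking. A production system would rely on Lucene's own analyzers, scoring and faceting rather than this set-intersection shortcut.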