Intelligent forum search: knowledge discovery through co-occurrence analysis in the forum document set

Bibliographic Details
Main Author: Zhang, Danyang
Other Authors: Sun Aixin
Format: Final Year Project
Language: English
Published: 2015
Online Access: http://hdl.handle.net/10356/62806
Institution: Nanyang Technological University
Description
Summary: In many search-engine use cases, users must work with large document collections from unfamiliar domains. Without familiarity with the domain, searching and browsing such documents is frustrating. Users need a way to quickly grasp the key terms and key topics particular to that domain. Searching for documents one does not yet know about, that is, discovering new knowledge in an unfamiliar domain, is the problem this project addresses. We developed an intelligent search engine that lets users extract key terms and key phrases from an entirely new domain of texts by leveraging co-occurrence analysis. Specifically, we extended the existing Lucene search engine core and implemented the RAKE phrase extraction algorithm, document clustering analysis, and co-occurrence analysis for both terms and phrases. We applied the engine to a local forum, where co-occurrence analysis proved rich and effective for suggesting query terms and query phrases.
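
As a rough illustration of how co-occurrence analysis can drive query term suggestions, the following minimal Python sketch counts how often pairs of terms appear in the same document and suggests the terms that most often accompany a query term. It is not the project's implementation, which extended the Lucene core. The function names (`tokenize`, `build_cooccurrence`, `suggest_terms`), the tiny stopword list, the sample posts, and the choice of whole-document co-occurrence rather than a sliding window are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Tiny illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"a", "an", "the", "is", "in", "of", "to", "and", "for", "on", "at"}


def tokenize(text):
    """Lowercase the text, split it into alphanumeric tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def build_cooccurrence(docs):
    """Count, for every pair of distinct terms, the number of documents containing both."""
    cooc = defaultdict(Counter)
    for doc in docs:
        # Use the set of terms so each pair is counted at most once per document.
        terms = sorted(set(tokenize(doc)))
        for a, b in combinations(terms, 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def suggest_terms(cooc, query_term, k=3):
    """Return up to k terms that co-occur most often with query_term.

    Ties are broken alphabetically so the result is deterministic.
    """
    counts = cooc.get(query_term.lower(), Counter())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical forum posts, invented for this example.
    posts = [
        "Best laptop for gaming",
        "Gaming laptop price",
        "Laptop battery price",
    ]
    cooc = build_cooccurrence(posts)
    print(suggest_terms(cooc, "laptop", k=2))
```

Raw counts favour terms that are frequent everywhere; a fuller version might normalise them (for example with pointwise mutual information) and apply the same counting to extracted phrases instead of single terms.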