Ontology-based blog discovery and classification

Bibliographic Details
Main Author: Sun, Aixin.
Other Authors: School of Computer Engineering
Format: Research Report
Language: English
Published: 2010
Online Access: http://hdl.handle.net/10356/42266
Institution: Nanyang Technological University
Description
Summary: Blogs are a new medium emerging on the Internet and represent a new source of information. However, the huge number of blogs, their noisy content, and their highly dynamic nature prevent many existing techniques from being applied to blogs directly. In this project we investigate techniques for organizing blogs into pre-defined categories for easy management, summarizing the major topics in blogs so as to minimize the impact of noise, and conducting content analysis to identify the linkage between blogs and news articles for event detection. These are all new research topics that have not been well studied in the literature. Over the course of the project, we developed techniques for classifying blogs using the tags that describe them, coupled with tag expansion; summarizing blog posts through their comments to capture readers' understanding of the posts; profiling blogs by selecting more representative blog posts to enable more efficient blog content analysis; and detecting events through blog and news search. Due to the lack of manpower support, the research focused mainly on algorithms rather than on implementing a prototype system. The techniques and their experimental evaluation have been presented at well-recognized conferences, including the ACM SIGIR conference and the ACM International Conference on Information and Knowledge Management (CIKM).