Classification problem of community question answering (CQA) services website

Bibliographic Details
Main Author: Sun, Fei.
Other Authors: School of Computer Engineering
Format: Final Year Project
Language: English
Published: 2011
Online Access: http://hdl.handle.net/10356/44137
Institution: Nanyang Technological University
Description
Summary: Community Question Answering (CQA) service systems hold large amounts of data, and more becomes available every day. This massive amount of information is easily accessible, but seeking value in such a huge collection requires organization; much of the work of organizing documents can be automated through text classification. This project studies methods for classifying text documents into categories. Recent approaches to text classification have used various techniques. In this project, Naïve Bayes and Support Vector Machine are used for the experiments. The project aims to describe the differences between these two methods. By comparing their classification performance, I found that Support Vector Machine outperforms Naïve Bayes on the data set I crawled from wikianswers.com. Furthermore, an approach to text classification called hierarchical classification is introduced, and an experiment is carried out to evaluate the performance of the model.