Classification problem of community question answering (CQA) services website
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2011 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/44137 |
Institution: | Nanyang Technological University |
Summary: | Community Question Answering (CQA) service systems hold a large and growing volume of data, and this information is easily accessible. Extracting value from such a large collection requires organization, and much of the work of organizing documents can be automated through text classification. This project studies methods for classifying text documents into categories. Recent approaches to text classification have used a variety of techniques; in this project, Naïve Bayes and Support Vector Machine are used in the experiments. The project aims to describe the differences between these two methods. Comparing their classification performance, I found that Support Vector Machine outperforms Naïve Bayes on the data set I crawled from wikianswers.com. Furthermore, an approach to text classification called hierarchical classification is introduced, and an experiment is carried out to evaluate the performance of the model. |