Web blog content classification

The internet is a vast source of valuable information of all kinds. Weblogs, or blogs, have become one of the most popular and fastest-growing communication tools on the internet. With many people sharing and discussing different topics, a great deal of valuable information can be obtained...


Bibliographic Details
Main Author: Wong, Hsiu Yun.
Other Authors: Wang Lipo
Format: Final Year Project
Language: English
Published: 2009
Subjects: DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems
Online Access: http://hdl.handle.net/10356/17934
Institution: Nanyang Technological University
id sg-ntu-dr.10356-17934
record_format dspace
spelling sg-ntu-dr.10356-17934 2023-07-07T15:47:06Z Web blog content classification Wong, Hsiu Yun. Wang Lipo School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems The internet is a vast source of valuable information of all kinds. Weblogs, or blogs, have become one of the most popular and fastest-growing communication tools on the internet. With many people sharing and discussing different topics, a great deal of valuable information can be obtained from them. However, with many blog providers offering free space for people to host their blogs or discuss topics, the number of people “blogging” has grown exponentially. The problem is that bloggers can usually post an entry on any topic at any time. Although some blogs categorize the entries posted, or provide different categories for users to post under, most of the categories are too general, or users simply post under a category that is not related at all. As a result, time is wasted going through irrelevant entries, and users often spend more time searching for what they want to read than reading it. Therefore, a weblog content classification program was created to make it easier for users to browse and read the entries that interest them. However, to classify the content more specifically into the user's desired categories, the user's judgment on the keywords related to each category is required. For this project, the open-source web scraping tool Web-Harvest was integrated with the program to extract the required blog contents for classification. Bachelor of Engineering 2009-06-18T02:18:18Z 2009-06-18T02:18:18Z 2009 2009 Final Year Project (FYP) http://hdl.handle.net/10356/17934 en Nanyang Technological University 110 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems
spellingShingle DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems
Wong, Hsiu Yun.
Web blog content classification
description The internet is a vast source of valuable information of all kinds. Weblogs, or blogs, have become one of the most popular and fastest-growing communication tools on the internet. With many people sharing and discussing different topics, a great deal of valuable information can be obtained from them. However, with many blog providers offering free space for people to host their blogs or discuss topics, the number of people “blogging” has grown exponentially. The problem is that bloggers can usually post an entry on any topic at any time. Although some blogs categorize the entries posted, or provide different categories for users to post under, most of the categories are too general, or users simply post under a category that is not related at all. As a result, time is wasted going through irrelevant entries, and users often spend more time searching for what they want to read than reading it. Therefore, a weblog content classification program was created to make it easier for users to browse and read the entries that interest them. However, to classify the content more specifically into the user's desired categories, the user's judgment on the keywords related to each category is required. For this project, the open-source web scraping tool Web-Harvest was integrated with the program to extract the required blog contents for classification.
author2 Wang Lipo
author_facet Wang Lipo
Wong, Hsiu Yun.
format Final Year Project
author Wong, Hsiu Yun.
author_sort Wong, Hsiu Yun.
title Web blog content classification
title_sort web blog content classification
publishDate 2009
url http://hdl.handle.net/10356/17934
_version_ 1772827991126048768