Automatic summarizer for web documents
As the world globalizes, the internet is being used around the world. As a result, the volume of text in web documents is growing exponentially. It is impractical to read through all the text information online just to find and sift out what you need. Using unsupervised clustering algorithms,...
Main Author: | Chia, Pei Qi |
---|---|
Other Authors: | Mao Kezhi |
Format: | Final Year Project |
Language: | English |
Published: | 2014 |
Subjects: | DRNTU::Engineering |
Online Access: | http://hdl.handle.net/10356/61087 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-61087 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-61087 2023-07-07T17:09:44Z Automatic summarizer for web documents Chia, Pei Qi Mao Kezhi School of Electrical and Electronic Engineering DRNTU::Engineering Bachelor of Engineering 2014-06-04T08:04:22Z 2014-06-04T08:04:22Z 2014 2014 Final Year Project (FYP) http://hdl.handle.net/10356/61087 en Nanyang Technological University 77 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering |
spellingShingle |
DRNTU::Engineering Chia, Pei Qi Automatic summarizer for web documents |
description |
As the world globalizes, the internet is being used around the world. As a result, the volume of
text in web documents is growing exponentially. It is impractical to read through all the text
information online just to find and sift out what you need. Using unsupervised clustering
algorithms, the author has created an automatic summarizer that condenses long documents
into short summaries.
This thesis discusses the natural language processing techniques and data mining concepts
used within the software, with a primary focus on lemmatization, which allows words of
similar meaning to be grouped together, and on the clustering algorithms Hierarchical
Agglomerative Clustering and K-means. The methodology uses a top-down, incremental approach
to design and build a reliable and functional summarizer.
This thesis also explains the functionalities of the summarizer, together with the tests
implemented for greater confidence. The summarizer is then observed and evaluated on its
flexibility with different text inputs and on the logic of its output summaries.
The thesis concludes by suggesting greater use of natural language processing to help
computers 'understand' text information, and the possibility of using a soft clustering
approach. All in all, the objective of the project is met, and the thesis provides the
reader with the necessary knowledge to develop a summarizer using the clustering process described. |
author2 |
Mao Kezhi |
author_facet |
Mao Kezhi Chia, Pei Qi |
format |
Final Year Project |
author |
Chia, Pei Qi |
author_sort |
Chia, Pei Qi |
title |
Automatic summarizer for web documents |
title_short |
Automatic summarizer for web documents |
title_full |
Automatic summarizer for web documents |
title_fullStr |
Automatic summarizer for web documents |
title_full_unstemmed |
Automatic summarizer for web documents |
title_sort |
automatic summarizer for web documents |
publishDate |
2014 |
url |
http://hdl.handle.net/10356/61087 |
_version_ |
1772825618360041472 |
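
The abstract above describes summarizing a document by normalizing words and clustering with K-means. The following is a minimal sketch of that general idea, not the thesis's implementation, which this record does not show. It assumes sentences are represented as bag-of-words counts, compared by cosine similarity, and that the sentence closest to each cluster centroid is taken as the summary. The `normalize` function is a crude suffix stripper standing in for real lemmatization, and all names here (`STOPWORDS`, `seed`, `summarize`, `SAMPLE`) are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (assumption, not from the thesis).
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "on", "for", "it", "as"}


def normalize(word):
    """Crude suffix stripping; a stand-in for real lemmatization (assumption)."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def vectorize(sentence):
    """Bag-of-words counts over normalized, non-stopword tokens."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return Counter(normalize(t) for t in tokens if t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def mean(vectors):
    """Component-wise mean of sparse vectors."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {t: c / len(vectors) for t, c in total.items()}


def seed(vectors, k):
    """Deterministic farthest-first seeding: start at sentence 0, then add the least similar."""
    chosen = [0]
    while len(chosen) < k:
        rest = [i for i in range(len(vectors)) if i not in chosen]
        chosen.append(min(rest, key=lambda i: max(cosine(vectors[i], vectors[j]) for j in chosen)))
    return [dict(vectors[i]) for i in chosen]


def kmeans(vectors, k, iterations=10):
    """Plain K-means using cosine similarity instead of Euclidean distance."""
    centroids = seed(vectors, k)
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        assignment = [max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = mean(members)
    return assignment, centroids


def summarize(text, k=2):
    """Return up to k sentences, one per cluster, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    k = min(k, len(sentences))
    if k == 0:
        return []
    vectors = [vectorize(s) for s in sentences]
    assignment, centroids = kmeans(vectors, k)
    chosen = set()
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            chosen.add(max(members, key=lambda i: cosine(vectors[i], centroids[c])))
    return [sentences[i] for i in sorted(chosen)]


SAMPLE = (
    "Cats purr when they are happy. Cats sleep most of the day. "
    "Stocks fell sharply on Monday. Stocks rose again on Tuesday."
)
```

Calling `summarize(SAMPLE, k=2)` returns a list of at most two sentences drawn from the input. A hierarchical agglomerative variant, which the abstract also mentions, would replace `kmeans` with repeated merging of the most similar clusters; that step is not sketched here.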