#TITLE_ALTERNATIVE#

Abstract: Advances in technology support computerization in many fields, for example recording, computing, and illustration. This creates a need for data available in large volumes, which can yield knowledge which is...

Bibliographic Details
Main Author: Kusmaya
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/8443
Institution: Institut Teknologi Bandung
id id-itb.:8443
spelling id-itb.:8443 2017-09-27T15:37:08Z #TITLE_ALTERNATIVE# Kusmaya Indonesia Theses INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/8443
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Abstract: Advances in technology support computerization in many fields, for example recording, computing, and illustration. This creates a need for data available in large volumes, from which useful knowledge can be extracted. Such data must be analyzed, whether for exploration or for decision making. The importance of data analysis has driven the growth of research on data mining.

Comparative Text Mining (CTM) is one of many text mining techniques. Its specific function is to find the common themes across all collections and the special themes of an individual document. One use of CTM is summarizing reviews. Summarization is an automatic process that produces a shorter version of a document (50 percent of the original or less) that remains useful to the user. With a summary, the user is expected to grasp a document's content without having to read the whole document.

Clustering is a method that can analyze and group documents automatically. Clustering techniques generally treat a document as a set of words without regard to their order, a representation known as a bag of words. Suffix Tree Clustering (STC) is the first algorithm to use phrases (multi-word terms), which makes its processing simpler than that of other algorithms. STC is an incremental algorithm with linear complexity, O(n), and it meets the criteria for clustering web documents.

This thesis aims to study and verify the performance of the STC algorithm by applying it to a CTM case. In the experiments, observations are made of how the parameters influence the optimization of the results, by comparing the themes produced by CTM with the themes produced by STC.
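The contrast between a bag of words and phrase-based clustering can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes whitespace tokenization, and it replaces the suffix tree with a plain dictionary of word n-grams, so it does not reproduce STC's linear-time behaviour. The function names (`bag_of_words`, `shared_phrases`), the parameters, and the sample reviews are hypothetical.

```python
from collections import Counter, defaultdict


def bag_of_words(text):
    """Order-free representation of a document: word -> count."""
    return Counter(text.lower().split())


def shared_phrases(docs, min_len=2, max_len=3, min_docs=2):
    """Map each phrase (word n-gram) to the set of documents containing it.

    Assumption: a dictionary of n-grams stands in for the suffix tree.
    Only phrases shared by at least `min_docs` documents are kept; these
    play the role of phrase-based candidate clusters.
    """
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for n in range(min_len, max_len + 1):
            for i in range(len(words) - n + 1):
                index[" ".join(words[i:i + n])].add(doc_id)
    return {phrase: ids for phrase, ids in index.items() if len(ids) >= min_docs}


# Hypothetical review snippets, invented for illustration only.
docs = [
    "battery life is great",
    "great battery life overall",
    "life is great battery aside",
]

# Word order is lost in a bag of words: both sentences look identical.
same_bag = bag_of_words("dog bites man") == bag_of_words("man bites dog")

# Phrases keep word order, so documents group by the phrases they share.
clusters = shared_phrases(docs)
for phrase in sorted(clusters):
    print(phrase, sorted(clusters[phrase]))
```

In this sketch each shared phrase acts as a label for the group of documents that contain it, which is the intuition behind using phrases rather than isolated words.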
format Theses
author Kusmaya
title #TITLE_ALTERNATIVE#
url https://digilib.itb.ac.id/gdl/view/8443
_version_ 1820664413571514368