#TITLE_ALTERNATIVE#

Abstract: Advances in technology support computerization in many fields, e.g., recording, computing, and illustration. This creates a need for data to be available in large volumes, which can yield knowledge that is usef...

Bibliographic Details
Main Author: (NIM 235 04 036), Kusmaya
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/8444
Institution: Institut Teknologi Bandung
id id-itb.:8444
spelling id-itb.:8444 2017-09-27T15:37:09Z #TITLE_ALTERNATIVE# (NIM 235 04 036), Kusmaya Indonesia Theses INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/8444 text
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Abstract: Advances in technology support computerization in many fields, e.g., recording, computing, and illustration. This creates a need for data to be available in large volumes, from which useful knowledge can be extracted. Analysis of the data is needed, whether for exploration or for decision making. The importance of data analysis has driven the growth of research on data mining.
Comparative Text Mining (CTM) is one of many text mining techniques; its specific function is to find the common themes across all collections and the special themes of a document. One use of CTM, for example, is summarizing reviews. Summarization is an automatic process that yields a shorter version of a document (50 percent or less of the original) that remains useful to the user. Through summarization, the user is expected to be able to grasp a document's content without having to read the whole document.
Clustering is a method that can analyze and group documents automatically. In general, word-based clustering techniques treat a document as a set of words with no regard to their order, known as a bag of words. Suffix Tree Clustering (STC) is the first algorithm that uses phrases (multi-word terms), so that its process is simpler than that of other algorithms. STC is an incremental algorithm, its complexity is linear, O(n), and it fulfills the criteria for clustering web documents.
This thesis aims to study and demonstrate the performance of the STC algorithm by applying it to the CTM case. In the experiments, observations are made to see how the parameters influence the optimization that may result, by comparing the themes yielded by CTM with the themes yielded by STC.
format Theses
author (NIM 235 04 036), Kusmaya
spellingShingle (NIM 235 04 036), Kusmaya
#TITLE_ALTERNATIVE#
author_facet (NIM 235 04 036), Kusmaya
author_sort (NIM 235 04 036), Kusmaya
title #TITLE_ALTERNATIVE#
title_short #TITLE_ALTERNATIVE#
title_full #TITLE_ALTERNATIVE#
title_fullStr #TITLE_ALTERNATIVE#
title_full_unstemmed #TITLE_ALTERNATIVE#
title_sort #title_alternative#
url https://digilib.itb.ac.id/gdl/view/8444
_version_ 1820664413855678464
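
The abstract's contrast between bag-of-words clustering and phrase-based clustering can be illustrated with a minimal sketch. This is not the thesis's implementation and not a suffix tree: it naively enumerates word n-grams as a stand-in for the shared phrases that STC would identify. The function names, the sample documents, and the `max_len` and `min_docs` parameters are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def bag_of_words(text):
    """Word counts with word order discarded."""
    return Counter(text.lower().split())


def shared_phrases(docs, max_len=3, min_docs=2):
    """Map each phrase (2..max_len consecutive words) to the documents containing it.

    Naive n-gram enumeration for illustration only; STC would obtain
    shared phrases from a suffix tree instead.
    """
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for n in range(2, max_len + 1):
            for i in range(len(words) - n + 1):
                index[" ".join(words[i:i + n])].add(doc_id)
    # Keep only phrases shared by at least min_docs documents.
    return {p: ids for p, ids in index.items() if len(ids) >= min_docs}


# Hypothetical sample documents.
docs = [
    "cat ate cheese",
    "mouse ate cheese too",
    "cheese ate cat",
]

# Documents 0 and 2 contain the same words in a different order,
# so their bags of words are identical.
same_bag = bag_of_words(docs[0]) == bag_of_words(docs[2])

# Phrase-based grouping keeps word order: only documents 0 and 1
# share the phrase "ate cheese".
base_clusters = shared_phrases(docs)
print(same_bag, base_clusters)
```

In this toy data, a bag-of-words view cannot tell documents 0 and 2 apart, while the phrase view groups documents 0 and 1 and leaves document 2 out, which is the kind of distinction the abstract attributes to using phrases.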