#TITLE_ALTERNATIVE#
Main Author: | |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/8443 |
Institution: | Institut Teknologi Bandung |
Summary: | Abstract: <br />
Advances in technology support computerization in many fields, e.g. recording, computing, and illustration. This leads to a need for data to be available in large volumes, from which useful knowledge can be derived. Such data must be analysed, whether for exploration or for decision making. The importance of data analysis has driven the expansion of research on data mining.
<br />
Comparative Text Mining (CTM) is one of many techniques in text mining. Its particular function is to find the common themes across all collections and the special themes of an individual document. One use of CTM is summarizing reviews. Summarization is an automatic process that yields a shorter version of a document (50 percent of the original or less) that remains useful to the user. With a summary, the user is expected to grasp a document's content without having to read the whole document.
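The split between common and special themes can be illustrated with a toy sketch. It uses plain set arithmetic over words and is not the CTM method studied in the thesis, which the abstract does not specify. The function names and the sample reviews are hypothetical.

```python
def vocabulary(texts):
    """Return the set of lowercase words used anywhere in a group of texts."""
    return {word for text in texts for word in text.lower().split()}


def compare_groups(groups):
    """Split words into those shared by every group and those unique to one.

    `groups` maps a name to a list of texts. This is a word-level stand-in
    for "themes", assumed here purely for illustration.
    """
    vocabs = {name: vocabulary(texts) for name, texts in groups.items()}
    # Common: words that occur in every group.
    common = set.intersection(*vocabs.values())
    # Special: words that occur in exactly one group.
    special = {}
    for name, vocab in vocabs.items():
        others = set().union(*(v for n, v in vocabs.items() if n != name))
        special[name] = vocab - others
    return common, special


# Hypothetical review snippets for two products.
reviews = {
    "laptop_a": ["battery lasts long", "screen is bright"],
    "laptop_b": ["battery drains fast", "screen is dim"],
}
common, special = compare_groups(reviews)
```

Here `common` holds the words both review sets share, while `special` holds the words that distinguish each set. A real comparative summary would work with themes rather than single words.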
<br />
Clustering is a method that can analyse and group documents automatically. Clustering techniques generally treat a document as a set of words without regard to their order, known as a bag of words. Suffix Tree Clustering (STC) is the first algorithm to use phrases (multi-word terms), which makes its process simpler than that of other algorithms. STC is an incremental algorithm with linear complexity, O(n), and it fulfils the criteria for clustering web documents.
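The idea of grouping documents by shared phrases, rather than by unordered words, can be sketched as follows. This is a minimal illustration under stated assumptions: it enumerates phrases by brute force instead of building a suffix tree, so it is not linear time and not the STC implementation used in the thesis. The function names and the `max_len` and `min_docs` parameters are hypothetical.

```python
from collections import defaultdict


def phrases(text, max_len=3):
    """Yield every contiguous word sequence of up to max_len words."""
    words = text.lower().split()
    for start in range(len(words)):
        for length in range(1, max_len + 1):
            if start + length <= len(words):
                yield tuple(words[start:start + length])


def base_clusters(docs, min_docs=2, max_len=3):
    """Map each phrase to the set of document indices that contain it.

    Only phrases shared by at least min_docs documents are kept. A suffix
    tree would find the same shared phrases without enumerating them all.
    """
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for phrase in phrases(text, max_len):
            index[phrase].add(doc_id)
    return {p: ids for p, ids in index.items() if len(ids) >= min_docs}


# Small hypothetical corpus.
docs = [
    "cat ate cheese",
    "mouse ate cheese too",
    "cat ate mouse too",
]
clusters = base_clusters(docs)
```

Word order matters here. Documents 1 and 2 both contain "mouse" and "too", so a bag-of-words view sees them as overlapping on both words, but the phrase "mouse too" occurs only in document 2 and therefore forms no shared cluster.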
<br />
This thesis aims to study and demonstrate the performance of the STC algorithm by applying it to a CTM case. In the experiments, observations are made of how the parameters influence the optimization that can be achieved, by comparing the themes yielded by CTM with the themes yielded by STC.
|