SINGLE PASS FUZZY MEANS CLUSTERING FOR TOPIC DETECTION
Nowadays, information can be found from various sources, social media for instance. Social media provides massive scale information which can be used for many purposes. The main issue is how to transform this information into knowledge. In this research, information was processed to detect current to...
Main Author: | Veronica Claudia Muljana, Maria |
---|---|
Format: | Theses |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/47981 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:47981 |
---|---|
spelling |
id-itb.:47981; 2020-06-25T01:38:32Z; SINGLE PASS FUZZY MEANS CLUSTERING FOR TOPIC DETECTION; Veronica Claudia Muljana, Maria; Indonesia; Theses; Single Pass Fuzzy Means, clustering, bag of words; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/47981; text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesian |
description |
Information is now available from many sources, social media among them. Social media provides information at massive scale that can be used for many purposes. The main challenge is how to transform this information into knowledge. In this research, social media data was processed to detect current topics.
Topic detection can be done with several methods, one of which is clustering. This research focuses on increasing the efficiency of a clustering algorithm for text data. Text data is retrieved from Twitter and processed with the clustering algorithm to produce bags of words, where each bag of words is a cluster center and represents a particular topic. The proposed algorithm, Single Pass Fuzzy Means, is based on Fuzzy C-Means clustering and applies a similarity threshold when building clusters.
Text data were transformed using the Vector Space Model with TF-IDF weighting, then clustered using Single Pass Fuzzy Means. Experiments showed that the clusters produced by Single Pass Fuzzy Means are of similar quality to those of Fuzzy C-Means, and that its processing time is shorter.
|
format |
Theses |
author |
Veronica Claudia Muljana, Maria |
title |
SINGLE PASS FUZZY MEANS CLUSTERING FOR TOPIC DETECTION |
url |
https://digilib.itb.ac.id/gdl/view/47981 |
_version_ |
1822271592355332096 |
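
The abstract names the pipeline (Vector Space Model, TF-IDF weighting, a Fuzzy C-Means style clustering with a similarity threshold) but does not give its details. The Python sketch below is one possible reading, not the thesis's actual algorithm. The function names, the smoothed IDF formula, the use of cosine similarity, the default `threshold=0.3`, the fuzzifier `m=2.0`, and the running weighted-mean center update are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into L2-normalized sparse TF-IDF dicts."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption): ln((1 + n) / (1 + df)) + 1.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse term -> weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def single_pass_fuzzy(vectors, threshold=0.3, m=2.0):
    """Visit each vector once (hypothetical reading of the method).

    A vector opens a new cluster when no existing center reaches the
    similarity threshold; otherwise every center moves toward it in
    proportion to a Fuzzy C-Means membership. Requires m > 1.
    """
    centers = []  # each center is a sparse "bag of words"
    weights = []  # accumulated membership**m per center
    for x in vectors:
        sims = [cosine(x, c) for c in centers]
        if not centers or max(sims) < threshold:
            centers.append(dict(x))
            weights.append(1.0)
            continue
        # FCM membership computed from cosine distance d = 1 - similarity.
        dists = [max(1.0 - s, 1e-9) for s in sims]
        inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
        total = sum(inv)
        for k, v in enumerate(inv):
            w = (v / total) ** m
            weights[k] += w
            step = w / weights[k]  # incremental weighted-mean update
            c = centers[k]
            for t in set(c) | set(x):
                c[t] = c.get(t, 0.0) + step * (x.get(t, 0.0) - c.get(t, 0.0))
    return centers


def top_terms(center, k=3):
    """Return the k highest-weighted terms of a cluster center."""
    ranked = sorted(center.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]


if __name__ == "__main__":
    # Made-up toy tweets, already tokenized; not data from the thesis.
    docs = [
        ["flood", "jakarta", "rain"],
        ["rain", "flood", "jakarta", "traffic"],
        ["football", "match", "goal"],
        ["goal", "football", "league"],
    ]
    for center in single_pass_fuzzy(tfidf_vectors(docs)):
        print(top_terms(center))
```

Unlike standard Fuzzy C-Means, this sketch does not fix the number of clusters in advance and does not iterate to convergence, which is where a single-pass variant would be expected to save time. How the thesis actually balances the threshold against cluster quality is not stated in the abstract.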