CLASSIFICATION AND CLUSTERING TO IDENTIFY SPOKEN DIALECTS IN INDONESIAN

This thesis uses Support Vector Machines to identify eight spoken dialects of Indonesian. The eight dialects, chosen based on previous research, are Aceh, Bali, Batak, Betawi, Jawa, Minangkabau, Sulawesi, and Sunda. ...

Bibliographic Details
Main Author: IBRAHIM, JACQUELINE
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/22664
Institution: Institut Teknologi Bandung
id id-itb.:22664
spelling id-itb.:22664 2017-10-09T10:28:08Z
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description This thesis uses Support Vector Machines to identify eight spoken dialects of Indonesian. The eight dialects, chosen based on previous research, are Aceh, Bali, Batak, Betawi, Jawa, Minangkabau, Sulawesi, and Sunda. The speech data comes from speakers who live in Bandung. Note that the dialects heard may not be very distinct because of the influence of the environment. The speech data is segmented into 4-second segments, and MFCC, spectral flux, and spectral centroid features are then extracted. The resulting data, in ARFF format, is given a dialect attribute that serves as the label for each dialect. Experiments and testing are conducted with the all-at-once and one-against-one techniques. The kernel function used is a linear kernel. The highest average result, 55%, is obtained with the one-against-one technique using the MFCC, spectral flux, and spectral centroid features. With the MFCC feature alone, the result is lower, at 53.5%. Thus, using the three features together gives a better result than using the MFCC feature alone.
format Final Project
author IBRAHIM, JACQUELINE
title CLASSIFICATION AND CLUSTERING TO IDENTIFY SPOKEN DIALECTS IN INDONESIAN
url https://digilib.itb.ac.id/gdl/view/22664
_version_ 1821120840749547520
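
The data preparation steps named in the description (4-second segmentation, spectral centroid and spectral flux extraction, ARFF output with a dialect label) can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual code. It assumes a synthetic signal at a toy 16 Hz sample rate, treats each 4-second segment as a single analysis frame, uses a naive DFT from the standard library, and omits MFCC for brevity. All function names, the ARFF attribute names, and the choice of "Sunda" as the label are hypothetical, and spectral flux is computed here as the squared difference between successive magnitude spectra, which is only one of several common definitions.

```python
import cmath
import math

# Dialect names as listed in the record; everything else here is illustrative.
DIALECTS = ["Aceh", "Bali", "Batak", "Betawi", "Jawa",
            "Minangkabau", "Sulawesi", "Sunda"]


def segment(samples, sample_rate, seconds=4):
    """Split samples into consecutive 4-second segments; a short tail is dropped."""
    size = sample_rate * seconds
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def magnitude_spectrum(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (fine for tiny toy frames)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectral_centroid(mags, sample_rate, frame_len):
    """Magnitude-weighted mean frequency in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / frame_len * m for k, m in enumerate(mags)) / total


def spectral_flux(prev_mags, mags):
    """Squared difference between two successive magnitude spectra."""
    return sum((m - p) ** 2 for p, m in zip(prev_mags, mags))


def to_arff(rows, dialects):
    """Render (centroid, flux, dialect) rows as ARFF text with a nominal label."""
    lines = ["@RELATION dialects",
             "@ATTRIBUTE spectral_centroid NUMERIC",
             "@ATTRIBUTE spectral_flux NUMERIC",
             "@ATTRIBUTE dialect {" + ",".join(dialects) + "}",
             "@DATA"]
    lines += [f"{c:.3f},{f:.3f},{d}" for c, f, d in rows]
    return "\n".join(lines)


# Toy signal: 4 s of a 2 Hz sine, 4 s of a 4 Hz sine, then a short tail.
RATE = 16
samples = ([math.sin(2 * math.pi * 2 * t / RATE) for t in range(4 * RATE)]
           + [math.sin(2 * math.pi * 4 * t / RATE) for t in range(4 * RATE)]
           + [0.0] * 10)

segments = segment(samples, RATE)
rows = []
prev = None
for seg in segments:
    mags = magnitude_spectrum(seg)
    centroid = spectral_centroid(mags, RATE, len(seg))
    flux = 0.0 if prev is None else spectral_flux(prev, mags)
    rows.append((centroid, flux, "Sunda"))  # hypothetical label for this recording
    prev = mags

text = to_arff(rows, DIALECTS)
print(text)
```

In practice the features would more likely be computed over many short frames within each segment and then aggregated, and a file in this format could be loaded by a tool that reads ARFF to train the linear-kernel SVM mentioned in the description.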