A systematic approach for Malay language dialect identification by using CNN / Mohd Azman Hanif Sulaiman … [et al.]

Bibliographic Details
Main Authors: Sulaiman, Mohd Azman Hanif; Abd Aziz, Nurhakimah; Zabidi, Azlee; Jantan, Zuraidah; Mohd Yassin, Ihsan; Megat Ali, Megat Syahirul Amin; Eskandari, Farzad
Format: Article
Language: English
Published: Universiti Teknologi MARA 2021
Online Access: https://ir.uitm.edu.my/id/eprint/52057/1/52057.pdf
https://ir.uitm.edu.my/id/eprint/52057/
https://jeesr.uitm.edu.my/
Institution: Universiti Teknologi MARA
Description
Summary: As Malaysia moves towards the Industrial Revolution (IR 4.0), computer systems have become part of everyday life, leading to increased man-machine interaction. Verbal communication is a convenient means of interacting with computers. Speech recognition systems need to be robust enough to cater for various languages and dialects in order to interact better with humans. Dialects within a spoken language present a challenge, because a speech recognition system must translate verbal commands into a machine understanding of the meaning underlying the spoken words. In this paper, work on Malay language dialect identification is presented using a Convolutional Neural Network (CNN) trained on Mel Frequency Cepstral Coefficient (MFCC) features. Data was collected from 12 native speakers. Each speaker was instructed to utter 10 carefully selected words that emphasize the nuances of the eastern, northern and central (standard) Malay dialects. The MFCC features were then extracted from the recorded audio samples and converted to graphical form. The resulting images were used to train a custom CNN to differentiate between the various spoken words and their dialects. Results demonstrate that the CNN was able to differentiate between the spoken words effectively, with excellent accuracy (between 85% and 100%).
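The feature-extraction step described in the summary (MFCC features extracted from audio and converted to graphical form) can be illustrated with a minimal sketch. This is not the authors' implementation: the frame length, hop size, number of mel filters, number of coefficients, the synthetic test tone and all function names are assumptions chosen for illustration. The sketch uses only the Python standard library so that it runs anywhere; a real pipeline would more likely rely on a dedicated audio library.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of one Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs cepstral coefficients per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        # Log mel-band energies, floored to avoid log(0).
        log_e = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
                 for row in bank]
        # DCT-II of the log energies gives the cepstral coefficients.
        features.append([
            sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)
        ])
    return features

def to_grayscale(features):
    """Rescale the MFCC matrix to 0..255 so it can be treated as an image."""
    flat = [v for row in features for v in row]
    lo, hi = min(flat), max(flat)
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [[int(round((v - lo) * scale)) for v in row] for row in features]

# Hypothetical input: a 440 Hz tone sampled at 8 kHz stands in for a recorded word.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(1024)]
feats = mfcc(tone, SAMPLE_RATE)
image = to_grayscale(feats)
print(len(image), "frames x", len(image[0]), "coefficients")
```

Each row of `image` corresponds to one audio frame and each column to one cepstral coefficient. A matrix of this kind is one plausible reading of the "graphical form" that the summary says was used as CNN input; the record does not specify the actual image format.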