Cognitive-inspired speaker affective state profiling

Human behavior is influenced by emotion, and humans express their affective state through numerous channels: non-verbal communication (facial expression, gestures, eye gaze, and body posture) as well as verbal communication. In verbal communication itself, there are lots of underlying informat...

Bibliographic Details
Main Author: Norhaslinda Kamaruddin
Other Authors: Quek Hiok Chai
Format: Theses and Dissertations
Language: English
Published: 2013
Online Access: https://hdl.handle.net/10356/51057
Institution: Nanyang Technological University
id sg-ntu-dr.10356-51057
record_format dspace
spelling sg-ntu-dr.10356-51057 2023-03-04T00:41:31Z Cognitive-inspired speaker affective state profiling Norhaslinda Kamaruddin Quek Hiok Chai School of Computer Engineering Centre for Computational Intelligence Abdul Wahab Abdul Rahman DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition DRNTU::Engineering::Computer science and engineering::Computer applications::Social and behavioral sciences DOCTOR OF PHILOSOPHY (SCE) 2013-01-03T04:16:50Z 2013-01-03T04:16:50Z 2012 2012 Thesis Kamaruddin, N. (2012). Cognitive-inspired speaker affective state profiling. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/51057 10.32657/10356/51057 en 269 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
DRNTU::Engineering::Computer science and engineering::Computer applications::Social and behavioral sciences
description Human behavior is influenced by emotion, and humans express their affective state through numerous channels: non-verbal communication (facial expression, gestures, eye gaze, and body posture) as well as verbal communication. In verbal communication itself, a great deal of underlying information is transmitted through acoustic features and the semantic meaning of the words and sentences used. Despite the evident complexity of such interaction, a listener can still correctly perceive the emotion conveyed by the interlocutor. This is due to the human cognitive ability to dissect and infer the information with high accuracy and then react with appropriate behavioral responses and feedback. Hence, this research work introduces a novel technique for discriminating emotion to facilitate the understanding of speaker affective state, based on the hypothesis that emotion is propagated through speech and can be quantified. Speech emotion recognition is a growing multi-disciplinary research field and is gaining momentum due to the increased need to improve the quality of human-computer interaction. Numerous researchers apply various feature extraction methods coupled with classifiers to produce acceptable accuracy. Nonetheless, the performance of such a system is bound to cultural influence, which results in unpromising outcomes once speech influenced by an unknown culture is introduced. Culture is usually regarded as a trivial and inconsequential parameter that needs minimal consideration in speech emotion recognition. Hence, in this work, the intricate relationship of cultural influence, in terms of intra-cultural and inter-cultural effects, is studied in detail.
Two speech emotion datasets, NTU_American and NTU_Asian, representing the American and Asian cultural influence on speech emotion respectively, were collected and, together with the standard Berlin speech emotion dataset, were used to understand the speech emotion recognition system and the culture bias. The work is then extended to investigate speaker affective state profiling using the Valence-Arousal (VA) analysis approach, which enables a visualization tool to be used for intra-cultural and inter-cultural assessments. The strength of this VA approach is that it facilitates the observation of new findings and supports dynamic, data-driven generation of an affective space model that can empirically verify the affective space model agreed upon by psychologists. The proposed approach is developed to complement the discrete-class classification system, which is rigid and lacks explainable components. The results show considerable potential for future practical applications of such an analysis system, which enables researchers, engineers, scientists, psychologists, medical practitioners, and intelligent system developers to visualize emotions from a common viewpoint.
author2 Quek Hiok Chai
format Theses and Dissertations
author Norhaslinda Kamaruddin
title Cognitive-inspired speaker affective state profiling
publishDate 2013
url https://hdl.handle.net/10356/51057
_version_ 1759857381625298944