Deep-learning for conversational speech using semantic textual analysis
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/156632 |
Summary: | Automatic Speech Recognition (ASR) systems are now widely used in
software applications across many domains. They are usually embedded in
an application to supply user input to its main functionality, and so
act as a cornerstone of these applications, especially potentially
life-saving ones. However, most ASR systems today work effectively only
on formal speech input and leave considerable room for improvement in
understanding colloquial speech.
Focusing on English speech in the Singaporean context, this project
aims to provide a solution for generating formal semantic equivalents of
conversational sentences derived from speech. The acoustic and
language models of existing ASR systems can then be trained with these
mappings from conversational to formal text, acquiring better
comprehension and performance on informal speech input.
Furthermore, this project analyses the semantic similarity
performance of deep-learning models in generating semantically similar
formal sentences, along with their use of deep-learning techniques. The
experimental results show that the PEGASUS model performs better
overall. This report presents the proposed solution framework and
lays out in detail the components of the project implementation. |
---|