Deep-learning for conversational speech using semantic textual analysis

Bibliographic Details
Main Author: Suthakar, Shiny Gladdys
Other Authors: Chng Eng Siong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/156632
Description
Summary: Automatic Speech Recognition (ASR) systems today have a prominent and widespread impact across software applications in many domains. They are usually embedded in applications to supply user input to the main functionality, and so act as a cornerstone of those applications, especially potentially life-saving ones. However, most ASR systems today work effectively only on formal speech input and still have considerable room for improvement in understanding colloquial speech. Focusing on English speech in the Singaporean context, this project aims to provide a solution for generating formal semantic equivalents of conversational sentences derived from speech. The acoustic and language models of existing ASR systems can then be trained on these mappings from conversational to formal text, gaining better comprehension and performance on informal speech input. Furthermore, this project analyses the semantic similarity performance of deep-learning models, in terms of how well they generate semantically similar formal sentences, and examines their use of deep-learning techniques. The experimental results show that the PEGASUS model performs better overall. This report presents the proposed solution framework and details the components of the project implementation.