Development of a Mandarin learning tool for children using speech recognition model
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/177146 |
Institution: | Nanyang Technological University |
Summary: | This project report evaluates the performance of speech recognition and
generation models on short Mandarin phrases and children's voices. It
introduces a Mandarin learning application prototype framework that leverages these
models, which have been fine-tuned to recognise nuances in children's voices and short
Chinese phrases. The primary goal of this study was to forge a developmental pathway for
a learning tool designed to significantly enhance the educational experience of children.
The proposed tool framework focuses on improving pronunciation, intonation, and
understanding of Chinese characters (汉字) through a structured pedagogical approach.
Central to this project is an extensive adaptation of the Whisper model, engineered to overcome the
inherent variability in children's speech patterns and the tonal complexity of Mandarin.
Our approach followed a systematic methodology comprising the assembly of a children's
audio dataset, model performance testing with a focus on children's voices, and fine-tuning
to improve the model's accuracy on short Mandarin phrases.
The prototype framework serves as a proof of concept, demonstrating the capabilities of
the model in a structured educational context. It outlines the envisioned interactive
modules aimed at reinforcing pronunciation, intonation, and character recognition,
fostering a comprehensive learning experience.
The project demonstrated the Whisper model's performance at recognising
short phrases spoken by both adults and children. These results underpin the
enhancements made to the model to better serve the unique needs of young learners and
short-phrase recognition, culminating in an educational application prototype
framework. This prototype harnesses speech technology to facilitate language learning,
thereby showcasing the potential of integrating speech recognition and generation
technologies into educational tools. The findings lay a crucial groundwork for future
research and development in this field. |
---|