Seeing sounds: sound visualization and speech to text conversion using laptop and microphone array
Main Author:
Other Authors:
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2020
Subjects:
Online Access: https://hdl.handle.net/10356/145083
Institution: Nanyang Technological University
Summary: A large population of people suffers from hearing loss and the inconveniences caused by the lack of an auditory sense. Different types of hearing aids are available on the market, but due to the discomfort associated with prolonged use and the stigma of being recognized as handicapped, they are often not used or are abandoned after several years. This project therefore explores a way of using mixed reality and sound localization to highlight sound sources in a live video stream, helping hearing-impaired users identify sounds. Speech-to-text conversion is also implemented to support conversation. Two approaches were explored to map sound sources onto the video stream: a mathematical-formula approach and a machine-learning approach. In the mathematical approach, the pseudo-inverse and the dot product were used to find a best-fit equation relating sound-source coordinates to image coordinates; this equation was then used to map each sound source onto the video stream. The machine-learning approach achieved better mapping accuracy over a wider range of distances than the mathematical approach, but its prediction speed was too slow for use in this project. Overall, sound tracking and speech-to-text conversion were successfully achieved to a certain extent. A future improvement could be to use a smartphone camera as the platform, for better mobility and convenience.
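The pseudo-inverse mapping described in the summary can be sketched as an ordinary least-squares fit. The following is a minimal illustration, not the project's actual implementation: the coordinate values are invented, and the report's real calibration data and equation are not reproduced here.

```python
import numpy as np

# Hypothetical calibration data (made up for illustration).
# Each row: (x, y, z) position of a localized sound source,
# e.g. as reported by a microphone array.
source_coords = np.array([
    [0.5, 0.2, 1.0],
    [0.1, 0.4, 1.2],
    [0.8, 0.1, 0.9],
    [0.3, 0.6, 1.5],
])

# Corresponding (u, v) pixel coordinates observed in the video stream.
image_coords = np.array([
    [320, 180],
    [150, 260],
    [500, 140],
    [260, 330],
])

# Append a bias column so the linear map can include an offset term.
A = np.hstack([source_coords, np.ones((len(source_coords), 1))])

# Least-squares best fit via the Moore-Penrose pseudo-inverse:
# M minimizes ||A @ M - image_coords||.
M = np.linalg.pinv(A) @ image_coords

# Map a new sound-source position to image coordinates with a dot product.
new_source = np.array([0.4, 0.3, 1.1, 1.0])
u, v = new_source @ M
```

In practice the fitted matrix `M` would be computed once from calibration measurements and then applied per frame, so only the cheap dot product runs in the live video loop.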