The augmented human — seeing sounds
Field | Value
---|---
Main Author | |
Other Authors | |
Format | Final Year Project
Language | English
Published | Nanyang Technological University, 2024
Subjects | |
Online Access | https://hdl.handle.net/10356/175150
Institution | Nanyang Technological University
Summary | “Dinner table syndrome” describes the difficulties faced by those with hearing impairments in engaging in conversations with other speaking participants, which may lead to fatigue and social exclusion. This project designed and implemented an application which provides accessibility to hearing-impaired users in multi-speaker conversations, through automated captioning and speaker identification. Through a user-friendly web interface, a user uploads a video and adjusts settings for its processing. Speech separation and speech recognition models are then applied to produce an annotated video with captions differentiated by speaker. Four video annotation interfaces were proposed, of which floating captions tagged to each speaker’s face were preferred. The user is then able to monitor the processing of the video, and download the annotated video and a transcript with captions divided by speaker identity when processing is complete. Design choices made in the implementation of the final application are evaluated.
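The summary describes a two-stage pipeline: a speech separation model splits the audio into per-speaker tracks, speech recognition transcribes each track, and the results are merged into speaker-tagged captions and a transcript. The sketch below shows only that data flow; `separate_speakers` and `recognize` are hypothetical stand-ins for the project's models, not its actual API.

```python
from dataclasses import dataclass

@dataclass
class Caption:
    speaker: str  # speaker identity label
    start: float  # segment start time, in seconds
    end: float    # segment end time, in seconds
    text: str     # recognized speech for this segment

def separate_speakers(audio):
    """Hypothetical stand-in for a speech separation model.
    Would return one (speaker_label, audio_track) pair per speaker."""
    raise NotImplementedError

def recognize(track):
    """Hypothetical stand-in for a speech recognition model.
    Would return timed segments as (start, end, text) tuples."""
    raise NotImplementedError

def annotate(audio, separate=separate_speakers, asr=recognize):
    """Separate the mixture, transcribe each speaker's track,
    and merge all captions into chronological order."""
    captions = []
    for speaker, track in separate(audio):
        for start, end, text in asr(track):
            captions.append(Caption(speaker, start, end, text))
    # Sorting by start time interleaves the speakers so the
    # transcript reads in conversational order.
    return sorted(captions, key=lambda c: c.start)

def transcript(captions):
    """Render a transcript with captions divided by speaker identity."""
    return "\n".join(
        f"[{c.start:6.2f}s] {c.speaker}: {c.text}" for c in captions
    )
```

With toy separation and recognition functions plugged in, `annotate` yields a time-ordered caption list and `transcript` renders it one speaker-tagged line per segment, which matches the downloadable transcript the summary describes.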