The augmented human — seeing sounds

“Dinner table syndrome” describes the difficulty that people with hearing impairments face in following conversations among several speaking participants, which can lead to fatigue and social exclusion. This project designed and implemented an application that makes multi-speaker conversations accessible to hearing-impaired users through automated captioning and speaker identification. Through a user-friendly web interface, the user uploads a video and adjusts its processing settings. Speech separation and speech recognition models are then applied to produce an annotated video whose captions are differentiated by speaker. Four video annotation interfaces were proposed, of which floating captions tagged to each speaker’s face were preferred. The user can monitor the processing of the video and, once it is complete, download the annotated video together with a transcript whose captions are divided by speaker identity. Design choices made in the implementation of the final application are evaluated.
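
The report's source code is not reproduced in this record; the following is a minimal Python sketch of the processing pipeline the abstract describes, in which separate_speakers and transcribe_segments are hypothetical placeholders standing in for the speech separation and speech recognition models, not the project's actual implementation.

    # Illustrative sketch only. The two model wrappers below are hypothetical
    # placeholders for the speech separation and speech recognition models
    # described in the abstract.
    from dataclasses import dataclass

    @dataclass
    class Caption:
        speaker: int      # index of the separated speaker stream
        start: float      # segment start time, in seconds
        end: float        # segment end time, in seconds
        text: str

    def separate_speakers(audio_path: str) -> list[str]:
        """Hypothetical wrapper around a speech separation model.
        Returns one single-speaker audio file per detected speaker."""
        raise NotImplementedError("replace with a real separation model")

    def transcribe_segments(audio_path: str) -> list[tuple[float, float, str]]:
        """Hypothetical wrapper around a speech recognition model.
        Returns (start, end, text) segments for one speaker stream."""
        raise NotImplementedError("replace with a real ASR model")

    def annotate(audio_path: str) -> list[Caption]:
        """Produce speaker-labelled captions for a multi-speaker recording."""
        captions: list[Caption] = []
        for speaker_id, stream in enumerate(separate_speakers(audio_path)):
            for start, end, text in transcribe_segments(stream):
                captions.append(Caption(speaker_id, start, end, text))
        # Interleave the per-speaker captions chronologically so the transcript
        # reads as a dialogue while still being attributable by speaker.
        return sorted(captions, key=lambda c: c.start)

Sorting the merged captions by start time mirrors the output described above: a transcript divided by speaker identity that can still be read in conversational order, and per-speaker timing that an interface could tag to each speaker's face.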


Bibliographic Details
Main Author: Lim, Nicole Sze Ting
Other Authors: Cham Tat Jen
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science; Assistive technology; Speech separation; Speech recognition; Hearing impairment; Software engineering; Web application
Online Access:https://hdl.handle.net/10356/175150
Institution: Nanyang Technological University
School: School of Computer Science and Engineering
Supervisor: Cham Tat Jen (ASTJCham@ntu.edu.sg)
Degree: Bachelor's degree
Project code: SCSE23-0038
Citation: Lim, N. S. T. (2024). The augmented human — seeing sounds. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175150