MAGOR speech to text transcription

This project presents the development and deployment of MAGOR, a web application designed to facilitate audio and video transcription services without relying on internet connectivity. This project introduces a locally deployed speech-to-text system that mitigates security risks commonly assoc...

Bibliographic Details
Main Author:	Lim, Yao Xian
Other Authors:	Chng Eng Siong
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science
Online Access:	https://hdl.handle.net/10356/181070
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-181070
record_format	dspace
spelling	sg-ntu-dr.10356-1810702024-11-13T08:30:38Z MAGOR speech to text transcription Lim, Yao Xian Chng Eng Siong College of Computing and Data Science ASESChng@ntu.edu.sg Computer and Information Science This project presents the development and deployment of MAGOR, a web application designed to facilitate audio and video transcription services without relying on internet connectivity. This project introduces a locally deployed speech-to-text system that mitigates security risks commonly associated with cloud-based solutions. By leveraging Docker for containerization, MAGOR ensures secure, isolated processing of audio and video data, which is crucial for environments with strict data privacy requirements. The core objective was to dockerize MAGOR and the associated Automatic Speech Recognition (ASR) gateway, ensuring seamless integration between them. This allows users to upload video files via MAGOR, which are then processed by the local ASR gateway to perform speech-to-text translations. The system is designed to support transcription differentiation for up to eight speakers, with distinct colours used to identify each speaker in the transcription. The application leverages modern technologies including React and Node.js to ensure a responsive and efficient user experience. Significant efforts were made to enhance the reliability of the system, including the successful implementation of the ASR request tracking feature, which indicates whether a recording has been successfully processed by the ASR gateway. Additionally, the statistics tab was developed to provide comprehensive insights into the usage and performance metrics of the system, demonstrating the project’s success in achieving its goals. Finally, important maintenance tasks, such as bug fixes, are also covered in this project. Bachelor's degree 2024-11-13T08:30:38Z 2024-11-13T08:30:38Z 2024 Final Year Project (FYP) Lim, Y. X. (2024). MAGOR speech to text transcription. 
Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181070 https://hdl.handle.net/10356/181070 en SCSE23-0814 application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science
spellingShingle	Computer and Information Science Lim, Yao Xian MAGOR speech to text transcription
description	This project presents the development and deployment of MAGOR, a web application that provides audio and video transcription without relying on internet connectivity. It introduces a locally deployed speech-to-text system that mitigates security risks commonly associated with cloud-based solutions. By using Docker for containerization, MAGOR keeps the processing of audio and video data secure and isolated, which is crucial for environments with strict data privacy requirements. The core objective was to dockerize MAGOR and the associated Automatic Speech Recognition (ASR) gateway and to ensure seamless integration between them. Users can then upload video files via MAGOR, which the local ASR gateway processes to produce speech-to-text transcriptions. The system is designed to distinguish up to eight speakers, with a distinct colour identifying each speaker in the transcription. The application is built with React and Node.js to provide a responsive and efficient user experience. Significant effort went into improving the reliability of the system, including an ASR request tracking feature that indicates whether a recording has been successfully processed by the ASR gateway. Additionally, a statistics tab was developed to provide insights into the usage and performance metrics of the system. Finally, maintenance tasks such as bug fixes are also covered in this project.
author2	Chng Eng Siong
author_facet	Chng Eng Siong Lim, Yao Xian
format	Final Year Project
author	Lim, Yao Xian
author_sort	Lim, Yao Xian
title	MAGOR speech to text transcription
title_short	MAGOR speech to text transcription
title_full	MAGOR speech to text transcription
title_fullStr	MAGOR speech to text transcription
title_full_unstemmed	MAGOR speech to text transcription
title_sort	magor speech to text transcription
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/181070
_version_	1816858987032739840

Similar Items