MAGOR speech to text transcription

This project presents the development and deployment of MAGOR, a web application for audio and video transcription that does not rely on internet connectivity. It introduces a locally deployed speech-to-text system that mitigates security risks commonly associated with cloud-based solutions.

Bibliographic Details
Main Author: Lim, Yao Xian
Other Authors: Chng Eng Siong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science
Online Access: https://hdl.handle.net/10356/181070
Institution: Nanyang Technological University
id sg-ntu-dr.10356-181070
record_format dspace
spelling sg-ntu-dr.10356-181070 2024-11-13T08:30:38Z MAGOR speech to text transcription Lim, Yao Xian Chng Eng Siong College of Computing and Data Science ASESChng@ntu.edu.sg Computer and Information Science Bachelor's degree 2024-11-13T08:30:38Z 2024-11-13T08:30:38Z 2024 Final Year Project (FYP) Lim, Y. X. (2024). MAGOR speech to text transcription. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181070 en SCSE23-0814 application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
spellingShingle Computer and Information Science
Lim, Yao Xian
MAGOR speech to text transcription
description This project presents the development and deployment of MAGOR, a web application for audio and video transcription that does not rely on internet connectivity. It introduces a locally deployed speech-to-text system that mitigates security risks commonly associated with cloud-based solutions. By using Docker for containerization, MAGOR processes audio and video data in a secure, isolated environment, which is crucial where data privacy requirements are strict. The core objective was to dockerize MAGOR and the associated Automatic Speech Recognition (ASR) gateway and to ensure seamless integration between them. Users upload video files via MAGOR, and the local ASR gateway then transcribes the speech to text. The system is designed to distinguish up to eight speakers, with a distinct colour identifying each speaker in the transcription. The application uses modern technologies, including React and Node.js, to provide a responsive and efficient user experience. Significant effort went into improving the reliability of the system, including an ASR request tracking feature that indicates whether a recording has been successfully processed by the ASR gateway. In addition, a statistics tab was developed to report usage and performance metrics of the system. Finally, the project covers maintenance tasks such as bug fixes.
author2 Chng Eng Siong
author_facet Chng Eng Siong
Lim, Yao Xian
format Final Year Project
author Lim, Yao Xian
author_sort Lim, Yao Xian
title MAGOR speech to text transcription
title_sort magor speech to text transcription
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/181070
_version_ 1816858987032739840