MAGOR speech to text transcription
This project presents the development and deployment of MAGOR, a web application designed to facilitate audio and video transcription services without relying on internet connectivity. This project introduces a locally deployed speech-to-text system that mitigates security risks commonly assoc...
Main Author: | Lim, Yao Xian |
---|---|
Other Authors: | Chng Eng Siong |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science |
Online Access: | https://hdl.handle.net/10356/181070 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-181070 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-181070; 2024-11-13T08:30:38Z; MAGOR speech to text transcription; Lim, Yao Xian; Chng Eng Siong; College of Computing and Data Science; ASESChng@ntu.edu.sg; Computer and Information Science; Bachelor's degree; 2024; Final Year Project (FYP); Lim, Y. X. (2024). MAGOR speech to text transcription. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181070; en; SCSE23-0814; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science |
description |
This project presents the development and deployment of MAGOR, a web application that
provides audio and video transcription without relying on internet connectivity. Because the
speech-to-text system is deployed locally, it mitigates the security risks commonly associated
with cloud-based solutions. By using Docker for containerization, MAGOR ensures secure,
isolated processing of audio and video data, which is crucial for environments with strict
data privacy requirements. The core objective was to dockerize MAGOR and the associated
Automatic Speech Recognition (ASR) gateway and to ensure seamless integration between them.
Users upload video files via MAGOR, and the local ASR gateway then processes them to produce
speech-to-text transcriptions. The system is designed to differentiate up to eight speakers,
with a distinct colour identifying each speaker in the transcription.
The application is built with React and Node.js to provide a responsive and efficient user
experience. To improve the reliability of the system, an ASR request tracking feature was
implemented that indicates whether a recording has been successfully processed by the ASR
gateway. Additionally, a statistics tab was developed to provide insights into the usage and
performance metrics of the system, demonstrating the project’s success in achieving its
goals.
Finally, the project also covers important maintenance tasks, such as bug fixes. |
author2 |
Chng Eng Siong |
author_facet |
Chng Eng Siong Lim, Yao Xian |
format |
Final Year Project |
author |
Lim, Yao Xian |
title |
MAGOR speech to text transcription |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/181070 |