Boundary detection of instructional video by speech
In the realm of education and online learning, accessing relevant information efficiently from instructional videos can be challenging due to the lack of structured navigation aids. This research proposes a novel method to enhance learning experiences by automatically generating meaningful timestamps, accompanied by succinct descriptions, within instructional videos. The approach converts audio to text using speech-to-text technology, then applies Natural Language Processing (NLP) techniques to identify key moments in the transcribed content. Various methods, including SpaCy and MPNet, were explored to analyze semantic nuances and transitions in the video content, but these yielded unsatisfactory results. Large Language Models (LLMs) were therefore adopted for their ability to discern sentence semantics and intent. The study used datasets from HowTo100M and YouTube, evaluating the accuracy of the proposed method through metrics such as precision, recall, and missing steps. Results are promising, with the model showing competitive performance, particularly in precision and recall for certain instructional tasks. The final product includes a user-friendly interface that lets users access timestamps and descriptions for educational content. Overall, this research advances the accessibility and usability of instructional videos, enhancing learning experiences for users worldwide.
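To make the embedding-based baseline mentioned in the abstract concrete (the MPNet approach explored before the project moved to LLMs), the sketch below segments a transcript by flagging points where consecutive sentences are semantically dissimilar. It is a minimal illustration only: the checkpoint name `all-mpnet-base-v2`, the similarity threshold, the `detect_boundaries` helper, and the sample transcript are assumptions for demonstration, not the thesis's actual implementation.

```python
# Illustrative sketch of sentence-embedding boundary detection on a transcript.
# Assumes the sentence-transformers MPNet checkpoint "all-mpnet-base-v2";
# the threshold and helper function are hypothetical, not from the thesis.
import numpy as np
from sentence_transformers import SentenceTransformer


def detect_boundaries(sentences, threshold=0.35):
    """Return indices i where a step boundary is assumed between sentence i and i+1."""
    model = SentenceTransformer("all-mpnet-base-v2")  # assumed MPNet checkpoint
    emb = model.encode(sentences, normalize_embeddings=True)
    # With unit-normalised embeddings, the dot product of consecutive rows
    # is the cosine similarity between adjacent sentences.
    sims = np.sum(emb[:-1] * emb[1:], axis=1)
    return [i for i, s in enumerate(sims) if s < threshold]


transcript = [
    "First, preheat the oven to 180 degrees.",
    "Grease the baking tin while you wait.",
    "Next, whisk the eggs and sugar together.",
    "Fold in the flour gently.",
]
print(detect_boundaries(transcript))  # low-similarity gaps become candidate timestamps
```

Per the abstract, this kind of purely similarity-based segmentation underperformed, which motivated the switch to LLMs that can reason about sentence semantics and intent.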
Saved in:

Main Author: | Tan, Brandon Jun Kai |
---|---|
Other Authors: | Yeo Chai Kiat |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science; Boundary detection |
Online Access: | https://hdl.handle.net/10356/175004 |
Institution: | Nanyang Technological University |
Language: | English |
id | sg-ntu-dr.10356-175004 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-175004 2024-04-19T15:44:56Z Boundary detection of instructional video by speech Tan, Brandon Jun Kai Yeo Chai Kiat School of Computer Science and Engineering ASCKYEO@ntu.edu.sg Computer and Information Science Boundary detection In the realm of education and online learning, accessing relevant information efficiently from instructional videos can be challenging due to the lack of structured navigation aids. This research proposes a novel method to enhance learning experiences by automatically generating meaningful timestamps, accompanied by succinct descriptions, within instructional videos. The approach converts audio to text using speech-to-text technology, then applies Natural Language Processing (NLP) techniques to identify key moments in the transcribed content. Various methods, including SpaCy and MPNet, were explored to analyze semantic nuances and transitions in the video content, but these yielded unsatisfactory results. Large Language Models (LLMs) were therefore adopted for their ability to discern sentence semantics and intent. The study used datasets from HowTo100M and YouTube, evaluating the accuracy of the proposed method through metrics such as precision, recall, and missing steps. Results are promising, with the model showing competitive performance, particularly in precision and recall for certain instructional tasks. The final product includes a user-friendly interface that lets users access timestamps and descriptions for educational content. Overall, this research advances the accessibility and usability of instructional videos, enhancing learning experiences for users worldwide. Bachelor's degree 2024-04-18T06:29:50Z 2024-04-18T06:29:50Z 2024 Final Year Project (FYP) Tan, B. J. K. (2024). Boundary detection of instructional video by speech. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175004 en SCSE23-0437 application/pdf Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Computer and Information Science; Boundary detection |
spellingShingle | Computer and Information Science; Boundary detection; Tan, Brandon Jun Kai; Boundary detection of instructional video by speech |
description | In the realm of education and online learning, accessing relevant information efficiently from instructional videos can be challenging due to the lack of structured navigation aids. This research proposes a novel method to enhance learning experiences by automatically generating meaningful timestamps, accompanied by succinct descriptions, within instructional videos. The approach converts audio to text using speech-to-text technology, then applies Natural Language Processing (NLP) techniques to identify key moments in the transcribed content. Various methods, including SpaCy and MPNet, were explored to analyze semantic nuances and transitions in the video content, but these yielded unsatisfactory results. Large Language Models (LLMs) were therefore adopted for their ability to discern sentence semantics and intent. The study used datasets from HowTo100M and YouTube, evaluating the accuracy of the proposed method through metrics such as precision, recall, and missing steps. Results are promising, with the model showing competitive performance, particularly in precision and recall for certain instructional tasks. The final product includes a user-friendly interface that lets users access timestamps and descriptions for educational content. Overall, this research advances the accessibility and usability of instructional videos, enhancing learning experiences for users worldwide. |
author2 | Yeo Chai Kiat |
author_facet | Yeo Chai Kiat; Tan, Brandon Jun Kai |
format | Final Year Project |
author | Tan, Brandon Jun Kai |
author_sort | Tan, Brandon Jun Kai |
title | Boundary detection of instructional video by speech |
title_short | Boundary detection of instructional video by speech |
title_full | Boundary detection of instructional video by speech |
title_fullStr | Boundary detection of instructional video by speech |
title_full_unstemmed | Boundary detection of instructional video by speech |
title_sort | boundary detection of instructional video by speech |
publisher | Nanyang Technological University |
publishDate | 2024 |
url | https://hdl.handle.net/10356/175004 |
_version_ | 1800916414731649024 |