Boundary detection of instructional video by speech
Main Author:
Other Authors:
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2024
Subjects:
Online Access: https://hdl.handle.net/10356/175004
Institution: Nanyang Technological University
Summary: In education and online learning, finding relevant information in instructional videos is often difficult because they lack structured navigation aids. This research proposes a method to improve the learning experience by automatically generating meaningful timestamps, each accompanied by a succinct description, for instructional videos. The approach converts the audio track to text with speech-to-text technology, then applies Natural Language Processing (NLP) techniques to identify key moments in the transcribed content. Several approaches, including spaCy and MPNet embeddings, were first explored to analyse semantic nuances and transitions in the transcript, but these yielded unsatisfactory results. Consequently, Large Language Models (LLMs) were adopted for their ability to discern sentence semantics and intent. The study used datasets from HowTo100M and YouTube, evaluating the proposed method with metrics such as precision, recall, and missing steps. Results are promising, with the model showing competitive performance, particularly in precision and recall on certain instructional tasks. The final product includes a user-friendly interface that lets users browse the generated timestamps and descriptions for educational content. Overall, this research contributes to the accessibility and usability of instructional videos, enhancing learning experiences for users worldwide.
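The abstract mentions that MPNet embeddings were explored for detecting semantic transitions in the transcript before the project moved on to LLMs. The thesis code is not part of this record, so the following is only an illustrative sketch of that general embedding-based idea: it uses the publicly available `all-mpnet-base-v2` model from the `sentence-transformers` library and flags a candidate boundary wherever adjacent transcript sentences are dissimilar. The `candidate_boundaries` helper, the similarity threshold of 0.45, and the example transcript are assumptions for demonstration, not values or code from the study.

```python
# Illustrative sketch (not the thesis implementation): flag candidate step
# boundaries in a transcript by measuring how much the topic shifts between
# adjacent sentences, using MPNet sentence embeddings.
from sentence_transformers import SentenceTransformer, util


def candidate_boundaries(sentences, threshold=0.45):
    """Return indices of sentences that likely start a new instructional step.

    `threshold` is an assumed cosine-similarity cutoff chosen for this sketch;
    it would need tuning on real transcripts.
    """
    model = SentenceTransformer("all-mpnet-base-v2")  # pretrained MPNet encoder
    embeddings = model.encode(sentences, convert_to_tensor=True)
    boundaries = []
    for i in range(len(embeddings) - 1):
        sim = util.cos_sim(embeddings[i], embeddings[i + 1]).item()
        if sim < threshold:  # low similarity -> likely semantic transition
            boundaries.append(i + 1)
    return boundaries


# Hypothetical transcript sentences for a cooking video:
transcript = [
    "First, whisk the eggs with a pinch of salt.",
    "Make sure the bowl is completely dry before you start.",
    "Next, heat the pan over medium heat and add some butter.",
    "Once the butter foams, pour in the eggs.",
]
print(candidate_boundaries(transcript))
```

In practice such similarity-based segmentation is sensitive to the threshold and sentence length, which is consistent with the abstract's observation that embedding-only approaches gave unsatisfactory results and motivated the switch to LLMs for judging sentence semantics and intent.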