Boundary detection of instructional video by speech

In the realm of education and online learning, accessing relevant information efficiently from instructional videos can be challenging due to the lack of structured navigation aids. This research proposes a novel method to enhance learning experiences by automatically generating meaningful timestamps, accompanied by succinct descriptions, within instructional videos. The approach converts audio to text using speech-to-text technology, then applies Natural Language Processing (NLP) techniques to identify key moments within the transcribed content. Several methodologies, including SpaCy and MPNet, were first explored to analyze semantic nuances and transitions in the video content, but these yielded poor results. Large Language Models (LLMs) were therefore adopted for their ability to discern sentence semantics and intent. The study used datasets from HowTo100M and YouTube, evaluating the accuracy of the proposed method through metrics such as precision, recall, and missing steps. Results are promising, with the model exhibiting competitive performance, particularly in precision and recall for certain instructional tasks. The final product includes a user-friendly interface that lets users access timestamps and descriptions for educational content. Overall, this research advances the accessibility and usability of instructional videos, enhancing learning experiences for users worldwide.
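
The full report is linked below rather than reproduced here, but the pipeline the abstract describes (transcribe the audio, embed the transcript sentences, and mark a step boundary wherever the semantics shift) can be sketched in a few lines. The sketch is illustrative only and is not the author's implementation: the openai-whisper package, the all-mpnet-base-v2 sentence-transformers model, the detect_boundaries helper, the 0.35 similarity threshold, and the howto_clip.mp4 input are all assumptions standing in for details the record does not give. It corresponds to the embedding-based baseline the abstract says was explored (and found lacking) before the LLM-based approach was adopted.

    import whisper                                            # assumed: openai-whisper package
    import numpy as np
    from sentence_transformers import SentenceTransformer     # assumed: MPNet encoder

    def transcribe(video_path):
        """Convert the video's audio track into timed transcript segments."""
        model = whisper.load_model("base")
        result = model.transcribe(video_path)
        return [(seg["start"], seg["text"].strip()) for seg in result["segments"]]

    def detect_boundaries(segments, threshold=0.35):
        """Flag segment indices where adjacent sentences diverge semantically."""
        encoder = SentenceTransformer("all-mpnet-base-v2")    # MPNet sentence encoder
        embeddings = encoder.encode([text for _, text in segments],
                                    normalize_embeddings=True)
        boundaries = [0]                                       # the first segment always starts a step
        for i in range(1, len(embeddings)):
            # With normalized vectors, cosine similarity is just the dot product.
            if float(np.dot(embeddings[i - 1], embeddings[i])) < threshold:
                boundaries.append(i)
        return boundaries

    if __name__ == "__main__":
        segments = transcribe("howto_clip.mp4")                # hypothetical input file
        for idx in detect_boundaries(segments):
            start, text = segments[idx]
            print(f"{start:7.1f}s  {text}")

Each printed line pairs a timestamp with the first sentence of a candidate step, which is the kind of navigation aid the project's interface exposes to learners.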


Bibliographic Details
Main Author: Tan, Brandon Jun Kai
Other Authors: Yeo Chai Kiat
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science; Boundary detection
Online Access: https://hdl.handle.net/10356/175004
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-175004
School: School of Computer Science and Engineering
Supervisor email: ASCKYEO@ntu.edu.sg
Degree: Bachelor's degree
Project code: SCSE23-0437
File format: application/pdf
Date deposited: 2024-04-18
Collection: DR-NTU (NTU Library)
Citation: Tan, B. J. K. (2024). Boundary detection of instructional video by speech. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175004