Collecting and annotating videos that teach MS PowerPoint
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/171932 |
Institution: | Nanyang Technological University |
Summary: | The central aim of this project is to generate a comprehensive dataset for training an
artificial intelligence (AI) system that can operate Microsoft PowerPoint autonomously.
The project comprises several phases. First, videos that teach Microsoft PowerPoint are
identified and then downloaded in a Jupyter Notebook using the pytube library. Videos
that lack closed captions are then transcribed with the Whisper model. Next, in the
annotation phase, the keystrokes and mouse clicks are labeled using sequence labeling
in Doccano. The project then moves to the model training phase, in which two neural
network models, T5 and FLAN-T5, are compared on their ability to interpret narrated
instructions and translate them into the corresponding mouse and keyboard actions, to
determine which model performs better. Because the dataset obtained from YouTube was
limited, data augmentation with ChatGPT was used to improve model training. |
---|
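
The step from annotation to model training can be illustrated with a minimal sketch. It assumes a Doccano-style JSONL export in which each record has a `text` field and a `label` list of `[start, end, label]` spans. The label names `CLICK` and `KEY`, the helper `spans_to_pair`, and the `LABEL(text)` target format are hypothetical choices made for illustration, not the scheme used in the project.

```python
import json


def spans_to_pair(record):
    """Turn one annotated record into a (source, target) text pair.

    Hypothetical helper: the source is the narrated instruction and the
    target is a flat string of actions, suitable as a text-to-text example.
    """
    text = record["text"]
    # Sort spans by start offset so actions follow the narration order.
    spans = sorted(record["label"], key=lambda span: span[0])
    actions = [f"{label}({text[start:end]})" for start, end, label in spans]
    return text, " ; ".join(actions)


# One example line in the assumed export format (labels are illustrative).
line = (
    '{"text": "Click Insert then press Ctrl+M", '
    '"label": [[24, 30, "KEY"], [6, 12, "CLICK"]]}'
)
source, target = spans_to_pair(json.loads(line))
print(source)
print(target)
```

Pairs of this kind could then be fed to a text-to-text model such as T5 or FLAN-T5, with the narration as input and the action string as the expected output.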