Predicting MS Powerpoint mouse/keyboard actions

Bibliographic Details
Main Author: Chong, Kass Min
Other Authors: Li Boyang
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/175104
Institution: Nanyang Technological University
Description
Summary: This project explores the application of Generative Pre-trained Transformer (GPT) models, specifically GPT-2 and GPT-3, to predicting the textual instructions that correspond to user actions in Microsoft PowerPoint, such as mouse movements and keyboard inputs. Through extensive experimentation and implementation, we observed that soft prompting with GPT-2 and in-context learning with GPT-3 both exceed the baseline performance established through hyperparameter tuning of the GPT-2 model. The improvement is particularly notable in two areas: the prediction of user intentions and the prediction of procedural instructions. The study therefore underscores the efficacy of these techniques in augmenting the capabilities of the employed models. By illustrating the potential of AI-driven solutions to streamline interactions with software applications, this work lays a foundation for a shift in user experience within productivity tools, driven by seamless natural language commands.