Learning to anticipate and forecast human actions from videos

Bibliographic Details
Main Author: Peh, Eric Zheng Quan
Other Authors: Soh Cheong Boon
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/158618
Description
Summary: Action anticipation and forecasting aims to predict future actions from videos containing past and current observations. In this project, we develop new methods based on the encoder-decoder architecture with Transformer models to anticipate and forecast future human actions from videos. The model observes a video for several seconds (or minutes) and then encodes information from the video to predict plausible human actions that will happen in the future. Temporal information is extracted from the videos by deep neural networks. The performance of these models is then evaluated on standard action forecasting datasets such as Breakfast and 50Salads.