Determining human intention in videos I

Human intention is a temporal sequence of human actions to achieve a goal. Determining human intentions is highly useful in many situations. It can enable better human-robot collaboration whereby robots are required to help human users. It is also useful in analysing human behaviours in dynamic env...

Bibliographic Details
Main Author: Hoong, Jia Qi
Other Authors: Cham Tat Jen
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Subjects: Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/158063
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-158063
record_format dspace
spelling sg-ntu-dr.10356-158063; 2022-05-26T07:23:32Z; Determining human intention in videos I; Hoong, Jia Qi; Cham Tat Jen; School of Computer Science and Engineering; ASTJCham@ntu.edu.sg; Engineering::Computer science and engineering; Human intention is a temporal sequence of human actions performed to achieve a goal. Determining human intentions is highly useful in many situations. It can enable better human-robot collaboration in which robots are required to help human users. It is also useful for analysing human behaviour in dynamic environments, such as monitoring mobile patients in hospitals or athletes in tournaments. In this work, we focus on predicting future actions from past observations in egocentric videos, a task known as egocentric action anticipation. Egocentric videos record human actions from a first-person perspective. This research shall analyse a deep learning framework proposed by Furnari and Farinella [1]. The framework is a multimodal network consisting of (1) Rolling-Unrolling LSTM models for anticipating actions from egocentric videos using multimodal features and (2) a Modality ATTention (MATT) mechanism for fusing multimodal predictions. Moreover, the multimodal network shall be extended to other modalities, specifically monocular depth, for egocentric action anticipation.; Bachelor of Engineering (Computer Science); 2022-05-26T07:23:31Z; 2022-05-26T07:23:31Z; 2022; Final Year Project (FYP); Hoong, J. Q. (2022). Determining human intention in videos I. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/158063; https://hdl.handle.net/10356/158063; en; SCSE21-0253; application/pdf; Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
spellingShingle Engineering::Computer science and engineering
Hoong, Jia Qi
Determining human intention in videos I
description Human intention is a temporal sequence of human actions performed to achieve a goal. Determining human intentions is highly useful in many situations. It can enable better human-robot collaboration in which robots are required to help human users. It is also useful for analysing human behaviour in dynamic environments, such as monitoring mobile patients in hospitals or athletes in tournaments. In this work, we focus on predicting future actions from past observations in egocentric videos, a task known as egocentric action anticipation. Egocentric videos record human actions from a first-person perspective. This research shall analyse a deep learning framework proposed by Furnari and Farinella [1]. The framework is a multimodal network consisting of (1) Rolling-Unrolling LSTM models for anticipating actions from egocentric videos using multimodal features and (2) a Modality ATTention (MATT) mechanism for fusing multimodal predictions. Moreover, the multimodal network shall be extended to other modalities, specifically monocular depth, for egocentric action anticipation.
author2 Cham Tat Jen
author_facet Cham Tat Jen
Hoong, Jia Qi
format Final Year Project
author Hoong, Jia Qi
author_sort Hoong, Jia Qi
title Determining human intention in videos I
title_sort determining human intention in videos i
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/158063
_version_ 1734310169976766464