Survey and design of embodied AI simulator for the research of generalizing task-planning in 3D environment via ActioNet

Bibliographic Details
Main Author: Duan, Jiafei
Other Authors: Wen Bihan
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Online Access: https://hdl.handle.net/10356/149171
Institution: Nanyang Technological University
Description
Summary: With the emerging paradigm shift from “internet AI” to “embodied AI”, AI algorithms and agents no longer learn only from images, videos, or curated text-based datasets from the internet. Instead, they learn through physical interactions with a dynamic environment, whether real or simulated. This project aims to advance research in embodied AI through three portions. The first portion presents ActioNet, an interactive end-to-end platform for data collection and augmentation of a task-based dataset in a 3D environment. The ActioNet platform and dataset facilitate the learning of hierarchical task planning for artificial agents in embodied AI simulators. To deepen the understanding of the field, the second portion presents a survey of embodied AI, from its simulators to its research tasks. This survey paper is the first modern and extensive survey of the field. It provides a detailed benchmarking of nine modern embodied AI simulators and introduces a pyramidal hierarchy of embodied AI research tasks, giving new insight into the field. Lastly, building on the insights and knowledge gained from the previous portions, the third portion proposes SPECIAL, the Simulator for Physics Enriched Conditions in Artificially synthesised environments for causal Learning. SPECIAL is a state-of-the-art embodied AI simulation framework that can synthesise three new research task datasets: containment, stability, and contact, which are all fundamental physical interactions. To my knowledge, the SPECIAL dataset is the largest complex physics scenario dataset, consisting of over 60k individual scene instances with up to 8 million frames. The project also proposes and constructs a SPECIAL model to train AI systems to learn causal reasoning and intuitive physics in a virtual environment.
The first portion of the project, on ActioNet, has been published in the International Conference on Image Processing (ICIP 2020), while the second portion has been submitted to the Computer Vision and Image Understanding Journal. The dataset and results from the third portion are being used to prepare a submission to the British Machine Vision Conference 2021. Notably, this project has been shortlisted as one of the top 7 finalists for the EEE FYP Challenge 2021.