Machine learning for human-machine interaction

Bibliographic Details
Main Author: Zhou, Yufeng
Other Authors: Tan Yap Peng
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access:https://hdl.handle.net/10356/173233
Institution: Nanyang Technological University
Description
Summary: The rise of collaborative robots has transformed human lives, introducing more flexible and efficient robotic interaction and automation. This has opened new possibilities for human-robot collaboration, which is poised to play a crucial role in future manufacturing industries and robots-for-humans applications. Recent research has produced promising approaches in various domains, including procedure learning from instructional videos, key action segmentation, and human action anticipation. However, these approaches typically rely on extensive data collection and are not directly applicable to human-robot collaboration, where examples are limited and situational requirements vary. Existing robots are often constrained by narrow, task-specific designs, lacking the adaptability to handle different scenarios and the intelligence required for effective collaboration with humans. Visual imitation learning, building on the rapid development of deep learning, provides a framework for learning complex manipulation skills from human demonstrations. This dissertation addresses these challenges by studying a series of machine vision and learning algorithms that enable robots to learn and model the key steps involved in manufacturing and robots-for-humans tasks, and that equip robots with the capability to understand human intentions and behaviors. Specifically, it studies a self-supervised visual representation module and analyzes the feasibility of a robust, customizable learning agent that can adapt to different human-robot collaboration scenarios and tasks. The main research findings of this dissertation are as follows:

(1) To derive meaningful visual representations from the provided demonstrations, a visual representation model is pre-trained. The representation is obtained by combining existing techniques, namely Time Contrastive Learning (TCL) and the Masked Autoencoder (MAE), on a large-scale dataset of frames drawn from Internet and egocentric videos. This pre-trained representation serves as an efficient perception head for the subsequent stages of the imitation learning process, establishing a solid foundation for robust and effective perception-based learning.

(2) Multiple established imitation policies are used to assess the efficacy of a robot learning agent integrated with the efficient self-supervised representation module. This evaluation is conducted on simulated robot manipulation tasks, with the aim of achieving proficient performance from a limited number of demonstrations.

By validating human-robot collaboration through the integration of machine vision and learning techniques, this dissertation explores the potential of collaborative robots in industrial manufacturing and robots-for-humans applications.
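The abstract names Time Contrastive Learning and the Masked Autoencoder as the ingredients of the self-supervised representation module but gives no implementation details. As a rough illustration only, the PyTorch sketch below combines a triplet-style time-contrastive loss with a pixel-level masked-reconstruction term on placeholder frames; the encoder architecture, masking ratio, loss weighting, and data pipeline are all assumptions, not the dissertation's actual design.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class TinyEncoder(nn.Module):
        """Stand-in convolutional encoder producing a compact per-frame embedding."""

        def __init__(self, dim=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, dim),
            )

        def forward(self, x):
            return self.net(x)


    def time_contrastive_loss(anchor, positive, negative, margin=0.5):
        """Triplet-style TCL: frames close in time are pulled together, distant ones pushed apart."""
        d_pos = F.pairwise_distance(anchor, positive)
        d_neg = F.pairwise_distance(anchor, negative)
        return F.relu(d_pos - d_neg + margin).mean()


    encoder = TinyEncoder()
    decoder = nn.Linear(128, 3 * 64 * 64)  # toy pixel decoder for the MAE-style term
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-4)

    # Placeholder batch standing in for frames sampled from Internet / egocentric videos:
    # anchor and positive come from nearby timesteps of one clip, negative from a distant one.
    anchor_img = torch.randn(8, 3, 64, 64)
    positive_img = torch.randn(8, 3, 64, 64)
    negative_img = torch.randn(8, 3, 64, 64)

    # MAE-style masking, approximated at the pixel level: hide ~75% of the anchor frame
    # from the encoder and score the reconstruction only on the hidden pixels.
    visible = (torch.rand(8, 1, 64, 64) > 0.75).float()
    z_anchor = encoder(anchor_img * visible)
    recon = decoder(z_anchor).view_as(anchor_img)
    hidden = 1.0 - visible
    mae_loss = (((recon - anchor_img) ** 2) * hidden).sum() / hidden.sum().clamp(min=1.0)

    tcl_loss = time_contrastive_loss(z_anchor, encoder(positive_img), encoder(negative_img))
    loss = tcl_loss + mae_loss  # equal weighting is an arbitrary choice here

    opt.zero_grad()
    loss.backward()
    opt.step()

The point of the sketch is only the shape of the objective: one term that exploits temporal structure in video, one that forces reconstruction from heavily masked inputs, both training the same encoder that later acts as the perception head.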
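Finding (2) describes plugging the frozen representation into multiple established imitation policies and training from a limited number of demonstrations. The sketch below shows only the simplest such setup, behavior cloning with a frozen perception head; the encoder, the 7-dimensional action space, network sizes, and training loop are illustrative assumptions rather than the policies actually evaluated.

    import torch
    import torch.nn as nn

    # Frozen perception head: stands in for the pre-trained TCL/MAE encoder.
    encoder = nn.Sequential(
        nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 128),
    )
    encoder.requires_grad_(False)
    encoder.eval()

    # Lightweight policy head trained by behavior cloning on a handful of demonstrations.
    policy = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 7))
    opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

    # A "limited number of demonstrations": (frame, expert action) pairs, random placeholders here.
    demo_frames = torch.randn(32, 3, 64, 64)
    demo_actions = torch.randn(32, 7)  # e.g. a 7-DoF arm command (assumed action space)

    for _ in range(100):  # a few gradient steps of behavior cloning
        with torch.no_grad():
            feats = encoder(demo_frames)  # features from the frozen perception head
        pred = policy(feats)
        loss = nn.functional.mse_loss(pred, demo_actions)
        opt.zero_grad()
        loss.backward()
        opt.step()

Keeping the encoder frozen is what makes the representation reusable across tasks: only the small policy head is re-fit when the collaboration scenario changes.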