Machine learning for human-machine interaction

The rise of collaborative robots has transformed human lives, introducing more flexible and efficient robotic interaction and automation. This has opened new possibilities for human-robot collaboration, which is poised to play a crucial role in future manufacturing industries and robots-for-humans applications. Recent research has produced promising approaches in related domains, including procedure learning from instructional videos, key action segmentation, and human action anticipation. However, these approaches typically rely on extensive data collection and are not directly applicable to human-robot collaboration, where examples are limited and situational requirements vary. Existing robots are often constrained by narrow, task-specific designs, lacking the adaptability to different scenarios and the intelligence required for effective collaboration with humans. Vision-based imitation learning, building on the rapid development of deep learning, provides a framework for learning complex manipulation skills from human demonstrations. This dissertation addresses these challenges by studying a series of machine vision and learning algorithms that enable robots to learn and model the key steps involved in manufacturing and robots-for-humans tasks and that empower robots to understand human intentions and behaviors. Specifically, it studies a self-supervised visual representation module and analyzes the feasibility of a robust, customizable learning agent that can adapt to different human-robot collaboration scenarios and tasks.

The main research findings of this dissertation are as follows:

(1) To derive meaningful visual representations from the provided demonstrations, a visual representation model is pre-trained by combining existing techniques, namely Time Contrastive Learning (TCL) and the Masked Autoencoder (MAE), on a large-scale dataset of frames from Internet and egocentric videos. This pre-trained representation serves as an efficient perception head for the subsequent stages of the imitation learning process, establishing a solid foundation for robust and effective perception-based learning.

(2) Multiple established imitation policies are used to assess the efficacy of a robot learning agent integrated with the self-supervised representation module. The evaluation is conducted on simulated robot manipulation tasks, with the goal of achieving proficient performance from a limited number of demonstrations.

By validating human-robot collaboration through the integration of machine vision and learning techniques, this dissertation explores the potential of collaborative robots in industrial manufacturing and robots-for-humans applications.
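
Finding (1) describes pre-training a visual representation by combining a time-contrastive objective with a masked-autoencoder reconstruction objective on large-scale video frames. The sketch below is a minimal PyTorch illustration of how such a joint objective can be wired together; the TinyEncoder/TinyDecoder modules, the pixel-space masking, the loss weighting, and the synthetic two-frame "clips" are all simplifying assumptions for the example, not the dissertation's actual architecture or data.

```python
# Minimal sketch: joint self-supervised pre-training with a time-contrastive
# (InfoNCE-style) loss plus a masked-reconstruction (MAE-style) loss.
# All modules, dimensions, and data here are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyEncoder(nn.Module):
    """Toy convolutional encoder standing in for a ViT/ResNet backbone."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(self.conv(x).flatten(1))


class TinyDecoder(nn.Module):
    """Toy decoder that reconstructs a frame from its embedding."""
    def __init__(self, embed_dim: int = 128, image_size: int = 64):
        super().__init__()
        self.image_size = image_size
        self.fc = nn.Linear(embed_dim, 3 * image_size * image_size)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.fc(z).view(-1, 3, self.image_size, self.image_size)


def time_contrastive_loss(z_a, z_b, temperature: float = 0.1):
    """InfoNCE-style loss: temporally adjacent frames are positives,
    the other frames in the batch serve as negatives."""
    z_a, z_b = F.normalize(z_a, dim=-1), F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature              # (B, B) similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)


def masked_reconstruction_loss(encoder, decoder, frames, mask_ratio=0.75):
    """Mask most of the input pixels, encode the masked frame, and penalise
    reconstruction error only on the masked region (MAE-style)."""
    mask = (torch.rand_like(frames[:, :1]) < mask_ratio).float()  # 1 = masked
    recon = decoder(encoder(frames * (1.0 - mask)))
    return ((recon - frames) ** 2 * mask).sum() / mask.sum().clamp(min=1.0)


if __name__ == "__main__":
    encoder, decoder = TinyEncoder(), TinyDecoder()
    optimizer = torch.optim.AdamW(
        list(encoder.parameters()) + list(decoder.parameters()), lr=1e-4)

    for step in range(5):  # stands in for iterating over a large video corpus
        frame_t = torch.rand(16, 3, 64, 64)                     # synthetic frame t
        frame_t1 = frame_t + 0.05 * torch.randn_like(frame_t)   # synthetic frame t+1

        loss = (time_contrastive_loss(encoder(frame_t), encoder(frame_t1))
                + masked_reconstruction_loss(encoder, decoder, frame_t))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        print(f"step {step}: loss = {loss.item():.4f}")
```

An encoder trained this way plays the role of the frozen perception head that the second finding builds on.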
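
Finding (2) evaluates imitation policies built on top of that frozen representation. As a rough illustration of the setup (again with assumed, simplified components rather than the dissertation's actual policies, simulator, or demonstrations), the sketch below trains a small behaviour-cloning head on a handful of synthetic observation-action pairs while keeping the perception encoder frozen.

```python
# Minimal sketch: behaviour cloning on a few demonstrations, with the
# pre-trained visual encoder frozen and used purely as a perception head.
# The encoder, dimensions, and random "demonstrations" are placeholders.
import torch
import torch.nn as nn


class ImitationPolicy(nn.Module):
    """Small MLP head mapping frozen visual features to robot actions."""
    def __init__(self, encoder: nn.Module, embed_dim: int = 128, action_dim: int = 7):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():   # keep the perception head frozen
            p.requires_grad = False
        self.head = nn.Sequential(
            nn.Linear(embed_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            features = self.encoder(obs)
        return self.head(features)


if __name__ == "__main__":
    # Stand-in for the encoder pre-trained in the previous sketch.
    encoder = nn.Sequential(
        nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 128),
    )
    policy = ImitationPolicy(encoder)
    optimizer = torch.optim.Adam(policy.head.parameters(), lr=1e-3)

    # "Limited number of demonstrations": ten observation-action pairs.
    demo_obs = torch.rand(10, 3, 64, 64)
    demo_actions = torch.rand(10, 7)

    for epoch in range(20):
        loss = nn.functional.mse_loss(policy(demo_obs), demo_actions)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"final behaviour-cloning loss: {loss.item():.4f}")
```

In the dissertation, the policy would instead be fitted to demonstration trajectories and assessed by rolling it out in the simulated manipulation tasks, rather than on random tensors as above.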

Bibliographic Details
Main Author: Zhou, Yufeng
Other Authors: Tan Yap Peng (School of Electrical and Electronic Engineering)
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University, 2024
Subjects: Engineering::Electrical and electronic engineering
Online Access:https://hdl.handle.net/10356/173233
Institution: Nanyang Technological University
Citation: Zhou, Y. (2023). Machine learning for human-machine interaction. Master's thesis, Nanyang Technological University, Singapore.
Collection: DR-NTU (NTU Library, Nanyang Technological University, Singapore)