Towards coordinated multi-agent exploration problem via segmentation and reinforcement learning

Exploring an unknown environment by multiple autonomous robots is a major challenge in the robotics domain. The robot or agent needs to incrementally construct a model or a map representation of the environment while performing its domain tasks like surveillance, search and rescue tasks, and cleanin...

Bibliographic Details
Main Author:	Chen, Zichen
Other Authors:	Tan Ah Hwee
Format:	Thesis-Master by Research
Language:	English
Published:	Nanyang Technological University 2020
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access:	https://hdl.handle.net/10356/137152
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-137152
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
description	Exploring an unknown environment with multiple autonomous robots is a major challenge in robotics. The robot or agent needs to incrementally construct a model or map representation of the environment while performing its domain tasks, such as surveillance, search and rescue, and cleaning. What the robot should do, or where it should go next, can only be determined after the map is at least partially constructed. The typical approach takes a frontier point, located on the boundary between a known area and an unknown region, as the target location to visit. This point is selected from among the frontiers revealed whenever the robots observe the environment. However, when multiple robots are involved, the task becomes more challenging, as they have to explore the unknown environment as efficiently and quickly as possible while avoiding conflicts or interference among the robots that can reduce efficiency. One efficient way to coordinate a team of autonomous robots exploring an unknown environment is to partition the map into separate regions or segments that are allocated to the robots as targets to visit. The partitioning must be performed continually and incrementally. There is a trade-off: generating many small segments provides more detail about the environment, but may lose the representation of larger areas that are useful and relevant to the exploration task at hand. This thesis introduces a Hierarchical Adaptive Clustering (HAC) segmentation of the indoor environment that strikes a balance between fine-grained clustering and generalized segmentation during exploration. With the HAC approach, an effective multi-agent task allocation approach is developed, in which the partitioning and allocation processes can be performed continually and incrementally in real time.
Experimental results show that the HAC-based exploration method is comparable to other state-of-the-art approaches, including Frontier-based allocation and Voronoi-based exploration. The model outperforms the others in terms of meaningful topological clusters and efficient exploration. However, non-learning-based methods usually employ a fixed strategy to allocate the robots or agents to explore selected locations, and such a strategy sometimes cannot handle unpredictable and dynamic situations well. These methods can be effective in the single-robot case, but assigning multiple robots to explore different locations is challenging, since individual robots may interfere with one another, making the overall task less efficient. This thesis proposes a learning-based approach to address those issues. The algorithm, called CNN-based Multi-agent Proximal Policy Optimization (CMAPPO), allocates multiple robots to explore different environments while improving their strategies over time so that tasks are allocated more efficiently and flexibly. The algorithm combines a CNN to process multi-channel visual inputs from the observed environment, curriculum learning to improve learning efficiency, and the PPO algorithm for motivation-based reinforcement learning. Based on the evaluation, CMAPPO can learn a more efficient strategy for multiple robots (the robot is referred to as an agent in the rest of this thesis) to explore the environment than the conventional frontier-based method. This thesis introduces a novel indoor space segmentation-based exploration method that uses topological clusters of an enclosed environment to perform multi-agent exploration. Considering the dynamic situations in the environment, this thesis further develops a new end-to-end deep reinforcement learning architecture for multi-agent exploration strategy using a Convolutional Neural Network (CNN) and Proximal Policy Optimization (PPO).
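The frontier idea described in the abstract (a target point on the boundary between known and unknown space) can be illustrated with a minimal sketch. This is not the thesis's implementation: the occupancy-grid encoding (`UNKNOWN`, `FREE`, `OBSTACLE`), the function names, and the nearest-frontier selection rule are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical cell labels for this sketch (not taken from the thesis).
UNKNOWN, FREE, OBSTACLE = -1, 0, 1


def find_frontiers(grid):
    """Return (row, col) of every free cell with at least one unknown 4-neighbour."""
    free = grid == FREE
    unknown = grid == UNKNOWN
    # Pad with False so cells outside the map never count as unknown.
    padded = np.pad(unknown, 1, constant_values=False)
    near_unknown = (
        padded[:-2, 1:-1]      # neighbour above
        | padded[2:, 1:-1]     # neighbour below
        | padded[1:-1, :-2]    # neighbour to the left
        | padded[1:-1, 2:]     # neighbour to the right
    )
    rows, cols = np.nonzero(free & near_unknown)
    return [(int(r), int(c)) for r, c in zip(rows, cols)]


def nearest_frontier(grid, robot):
    """Pick the frontier closest to the robot by Manhattan distance.

    Assumption: distance ignores obstacles; a real planner would use path cost.
    """
    frontiers = find_frontiers(grid)
    if not frontiers:
        return None  # nothing left to explore
    return min(frontiers, key=lambda f: abs(f[0] - robot[0]) + abs(f[1] - robot[1]))


example_grid = np.array([
    [FREE, FREE, UNKNOWN],
    [FREE, OBSTACLE, UNKNOWN],
    [FREE, FREE, FREE],
])
print(find_frontiers(example_grid))
print(nearest_frontier(example_grid, (2, 0)))
```

With several robots, each one independently choosing its nearest frontier can send the whole team to the same place, which is the kind of interference the abstract says segmentation-based allocation and learned allocation aim to reduce.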
author2	Tan Ah Hwee
author_facet	Tan Ah Hwee Chen, Zichen
format	Thesis-Master by Research
author	Chen, Zichen
author_sort	Chen, Zichen
title	Towards coordinated multi-agent exploration problem via segmentation and reinforcement learning
publisher	Nanyang Technological University
publishDate	2020
url	https://hdl.handle.net/10356/137152
_version_	1683494148759355392
spelling	sg-ntu-dr.10356-137152 2020-10-28T08:29:18Z Towards coordinated multi-agent exploration problem via segmentation and reinforcement learning Chen, Zichen Tan Ah Hwee School of Computer Science and Engineering Centre for Computational Intelligence asahtan@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Master of Engineering 2020-03-03T06:22:51Z 2020-03-03T06:22:51Z 2019 Thesis-Master by Research Chen, Z. (2019). Towards coordinated multi-agent exploration problem via segmentation and reinforcement learning. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/137152 10.32657/10356/137152 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
