Towards coordinated multi-agent exploration problem via segmentation and reinforcement learning
Saved in:

| Main Author: | Chen, Zichen |
|---|---|
| Other Authors: | Tan Ah Hwee |
| Format: | Thesis-Master by Research |
| Language: | English |
| Published: | Nanyang Technological University, 2020 |
| Subjects: | Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |
| Online Access: | https://hdl.handle.net/10356/137152 |
| Institution: | Nanyang Technological University |
Description:
Exploring an unknown environment with multiple autonomous robots is a major challenge in the robotics domain. The robot, or agent, needs to incrementally construct a model or map representation of the environment while performing its domain tasks, such as surveillance, search and rescue, and cleaning. What the robot should do, or where it should go next, can only be determined once the map has been at least partially constructed.
The typical approach is to take a frontier point, located on the boundary between a known area and an unknown region, as the target location to visit. This point is selected from among the frontiers revealed whenever the robots observe the environment. When multiple robots are involved, however, the task becomes more challenging: the robots must explore the unknown environment as efficiently and quickly as possible while avoiding conflicts or interference that would reduce efficiency.
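To illustrate the frontier idea, the sketch below detects frontier cells on a 2D occupancy grid and picks the one nearest to a robot. This is a minimal example, not the thesis's implementation; the grid encoding (`FREE`/`OCCUPIED`/`UNKNOWN`) and the nearest-frontier selection rule are assumptions for illustration only.

```python
import numpy as np

FREE, OCCUPIED, UNKNOWN = 0, 1, -1  # assumed cell encoding

def find_frontiers(grid):
    """Return coordinates of free cells adjacent to at least one unknown cell."""
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != FREE:
                continue
            # 4-connected neighbourhood: a free cell touching unknown space is a frontier
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers

def nearest_frontier(grid, robot_pos):
    """Pick the frontier closest to the robot (squared Euclidean distance)."""
    frontiers = find_frontiers(grid)
    if not frontiers:
        return None  # no frontiers left: exploration is complete
    return min(frontiers,
               key=lambda f: (f[0] - robot_pos[0]) ** 2 + (f[1] - robot_pos[1]) ** 2)
```

Real systems typically plan a path to the chosen frontier and re-run detection as the map grows, since each observation can create and destroy frontiers.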
To coordinate a team of autonomous robots exploring an unknown environment efficiently, an effective approach is to partition the map of the environment into separate regions or segments, which are then allocated to the robots as targets to visit. The partitioning must be performed continually and incrementally. There is a trade-off: generating many small segments provides more detail about the environment, but may lose the representation of larger areas that are useful and relevant to the exploration task at hand. This thesis introduces a Hierarchical Adaptive Clustering (HAC) segmentation of the indoor environment that strikes a balance between fine-grained clustering and generalized segmentation during exploration. Building on HAC, an effective multi-agent task allocation approach is developed, in which the partitioning and allocation processes are performed continually and incrementally in real time. Experimental results show that the HAC-based exploration method is comparable with other state-of-the-art approaches, including frontier-based allocation and Voronoi-based exploration, and outperforms them in producing meaningful topological clusters and efficient exploration.
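The segment-to-robot allocation step can be sketched with a simple greedy rule. The HAC segmentation itself is not reproduced here; segments are assumed to be given as centroid coordinates, and the greedy nearest-centroid assignment is an illustrative assumption, not the thesis's allocation method.

```python
def allocate_segments(robot_positions, segment_centroids):
    """Greedily assign each robot the nearest still-unassigned segment centroid.

    robot_positions and segment_centroids are lists of (x, y) tuples.
    Returns {robot_index: segment_index}; robots left over when segments
    run out receive no assignment.
    """
    remaining = list(range(len(segment_centroids)))
    assignment = {}
    for i, (rx, ry) in enumerate(robot_positions):
        if not remaining:
            break
        # squared Euclidean distance to each unassigned centroid
        best = min(remaining,
                   key=lambda j: (segment_centroids[j][0] - rx) ** 2
                               + (segment_centroids[j][1] - ry) ** 2)
        assignment[i] = best
        remaining.remove(best)
    return assignment
```

Assigning each robot its own segment is what reduces interference: robots working in disjoint regions rarely compete for the same frontiers.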
However, non-learning methods usually employ a fixed strategy to allocate the robots or agents to selected locations, and such a strategy sometimes cannot handle unpredictable and dynamic situations well. These methods can be effective in the single-robot case, but assigning multiple robots to explore different locations is challenging, since individual robots may interfere with one another, making the overall task less efficient. This thesis proposes a learning-based approach to address these issues: CNN-based Multi-agent Proximal Policy Optimization (CMAPPO), which allocates multiple robots to explore different environments while improving their allocation strategies over time to become more efficient and flexible. The algorithm combines a CNN to process multi-channel visual inputs from the observed environment, curriculum learning to improve learning efficiency, and the PPO algorithm for motivation-based reinforcement learning. In the evaluation, CMAPPO learns a more efficient exploration strategy for multiple robots (referred to as agents in the rest of this thesis) than the conventional frontier-based method.
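The PPO component rests on the standard clipped surrogate objective; a per-sample version can be written as below. This reflects the generic PPO formulation, not the thesis's full CMAPPO architecture, and the default clip range of 0.2 is an assumption.

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective for one sample.

    ratio     = pi_new(a|s) / pi_old(a|s), the probability ratio
    advantage = estimated advantage of the taken action
    eps       = clip range (0.2 is a common default, assumed here)
    """
    unclipped = ratio * advantage
    # clipping the ratio removes the incentive to move the policy
    # too far from the one that collected the data
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return np.minimum(unclipped, clipped)
```

Taking the minimum of the clipped and unclipped terms makes the update pessimistic: large policy changes are only rewarded up to the clip boundary.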
This thesis introduces a novel indoor space segmentation-based exploration method that uses topological clusters of an enclosed environment to perform multi-agent exploration. To address dynamic situations in the environment, it further develops a new end-to-end deep reinforcement learning architecture for multi-agent exploration built on a Convolutional Neural Network (CNN) and Proximal Policy Optimization (PPO).
Record details:

| Degree: | Master of Engineering |
|---|---|
| School: | School of Computer Science and Engineering |
| Research Centre: | Centre for Computational Intelligence |
| Other Author: | Tan Ah Hwee (asahtan@ntu.edu.sg) |
| Citation: | Chen, Z. (2019). Towards coordinated multi-agent exploration problem via segmentation and reinforcement learning. Master's thesis, Nanyang Technological University, Singapore. |
| DOI: | 10.32657/10356/137152 |
| Handle: | https://hdl.handle.net/10356/137152 |
| License: | Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) |
| Deposited: | 2020-03-03 |
| Publisher: | Nanyang Technological University |