OBJECT-BASED SLAM FOR MOBILE ROBOT USING INDOOR STRUCTURAL LAYOUT AND OBJECT CLASS
SLAM, or Simultaneous Localization and Mapping, is the problem of mapping an unknown environment by an agent that must simultaneously estimate its own motion, i.e. localization, within the map. Mapping and localization have to be done simultaneously because this is a chicken-and-egg problem: one depends...
Main Author: | |
---|---|
Format: | Dissertations |
Language: | Indonesian |
Subjects: | |
Online Access: | https://digilib.itb.ac.id/gdl/view/57400 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:57400 |
---|---|
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesian |
topic |
Engineering (engineering and related activities) |
description |
SLAM, or Simultaneous Localization and Mapping, is the problem of mapping an
unknown environment by an agent that must simultaneously estimate its own motion, i.e. localization, within the map. Mapping and localization have to be done
simultaneously because this is a chicken-and-egg problem: each depends on the
other. The simultaneity is possible because there is a correlation between the error in
odometry and the error in the sensors.
One direction in recent SLAM research is semantic SLAM, i.e. SLAM
that produces a map whose components are meaningful to humans, namely a semantic
map. A semantic map conveys richer information to humans than other
maps such as point cloud maps or occupancy grids. For example, a semantic
map of an indoor environment could have objects familiar to humans as its components, e.g. tables, chairs, and walls. In this dissertation, an 'object' refers to
any typical object found in an indoor environment except for walls, each of which
is referred to specifically as a 'wall'.
Typically, objects and walls in semantic SLAM are treated as separate entities.
This research pursues the concept of semantic relationships between
those entities. For example, walls can have semantic relationships with other
walls to form a model of space-partitioning entities, i.e. rooms. This model can
be exploited for the benefit of SLAM, e.g. for data association and loop closure
detection. Specifically, the novelties of this research are (1) the development of the concept
of room and object to model indoor environments and (2) the use of that concept
to solve the SLAM problem.
Accordingly, this research sets two goals: (1) to create
a method to model an indoor environment with rooms and objects (the room-object
model), and (2) to create a SLAM method that uses the room-object model to achieve more
efficient computation time, a more accurate map, and more accurate localization
compared to methods that do not use a similar concept (object-based landmarks and
a submapping strategy). The algorithm is named RoomSLAM.
RoomSLAM consists of three modules: a sensor module that detects objects and
walls; a front-end module that performs pose prediction, room detection, and data association; and a back-end module that is responsible for map and localization
estimation. Each module runs in its own thread.
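The three-module, one-thread-per-module layout described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the queue-based hand-off, and the placeholder payloads are hypothetical and are not taken from RoomSLAM itself.

```python
import queue
import threading

STOP = object()  # sentinel that tells a downstream stage to shut down


def sensor_module(frames, detections_out):
    """Hypothetical sensor stage: would detect objects and walls per frame."""
    for frame in frames:
        detections_out.put({"frame": frame, "objects": [], "walls": []})
    detections_out.put(STOP)


def front_end_module(detections_in, constraints_out):
    """Hypothetical front end: pose prediction, room detection, association."""
    while True:
        item = detections_in.get()
        if item is STOP:
            constraints_out.put(STOP)
            return
        constraints_out.put({"frame": item["frame"], "associated": True})


def back_end_module(constraints_in, results):
    """Hypothetical back end: would run map and localization estimation."""
    while True:
        item = constraints_in.get()
        if item is STOP:
            return
        results.append(item["frame"])


def run_pipeline(frames):
    """Run the three stages, each in its own thread, linked by FIFO queues."""
    detections, constraints, results = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=sensor_module, args=(frames, detections)),
        threading.Thread(target=front_end_module, args=(detections, constraints)),
        threading.Thread(target=back_end_module, args=(constraints, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Decoupling the stages with queues is one way to let a fast front end and a slow back end run at different rates; whether RoomSLAM uses queues or another mechanism is not stated in the abstract.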
RoomSLAM uses YOLOv3 to detect objects. The outputs of YOLOv3 are combined with point cloud data from an RGB-D sensor to obtain the 3D positions of objects. To
detect walls, RoomSLAM estimates line models from sampled point clouds using
RANSAC. Data association for objects and walls is done using a nearest-neighbour
algorithm, where the 'near' criterion is computed with the Mahalanobis distance for objects
and the Euclidean distance for walls. In RoomSLAM, data association is executed
within the room where the agent is currently located. To determine which room
the agent is in, RoomSLAM calculates the distance from the agent to each room already mapped. If the agent is not in any existing room, a new room is created.
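As an illustration of nearest-neighbour data association with a Mahalanobis distance, a minimal sketch is given here. It is simplified to 2D positions, and the gate value of 3.0, the function names, and the landmark representation (mean plus 2x2 covariance) are assumptions made for this example, not details from the dissertation.

```python
import math


def mahalanobis_2d(z, mean, cov):
    """Mahalanobis distance between a 2D measurement and a landmark estimate."""
    dx, dy = z[0] - mean[0], z[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # Quadratic form r^T * cov^-1 * r using the closed-form 2x2 inverse.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)


def associate(z, landmarks, gate=3.0):
    """Return the index of the nearest landmark, or None if none is within the gate.

    `landmarks` is assumed to hold only the landmarks of the current room,
    each given as (mean, covariance).
    """
    best_idx, best_dist = None, gate
    for idx, (mean, cov) in enumerate(landmarks):
        dist = mahalanobis_2d(z, mean, cov)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx
```

Restricting `landmarks` to the current room keeps the search small regardless of total map size, which is the benefit the paragraph above attributes to room-level association. A measurement that returns None would be a candidate for a new landmark.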
In the back-end module, map and location estimation is done using graph optimization. The optimization is formulated as a non-linear least-squares problem, which
is solved using the Levenberg-Marquardt algorithm. Meanwhile, every time the agent
revisits a previously mapped room, RoomSLAM searches for earlier agent poses from which the
same objects or walls were observed, i.e. loop closure. When such a pose is found, global
trajectory correction is done using pose graph optimization.
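To make the Levenberg-Marquardt step concrete, here is a small sketch on a toy non-linear least-squares problem: estimating a 2D position from range measurements to known landmarks. The toy problem, the function names, and the damping schedule (multiply or divide by 10) are assumptions chosen for brevity; the dissertation applies the algorithm to a full SLAM graph, which is not reproduced here.

```python
import math


def residuals(p, landmarks, ranges):
    """Predicted minus measured range for each landmark."""
    return [math.hypot(p[0] - lx, p[1] - ly) - r
            for (lx, ly), r in zip(landmarks, ranges)]


def jacobian(p, landmarks):
    """Rows of d(residual)/d(x, y): unit vectors pointing from landmark to p."""
    rows = []
    for lx, ly in landmarks:
        d = max(math.hypot(p[0] - lx, p[1] - ly), 1e-12)
        rows.append(((p[0] - lx) / d, (p[1] - ly) / d))
    return rows


def levenberg_marquardt(p0, landmarks, ranges, iters=100, lam=1e-3):
    """Minimize the sum of squared range residuals over a 2D position."""
    p = (float(p0[0]), float(p0[1]))
    cost = sum(r * r for r in residuals(p, landmarks, ranges))
    for _ in range(iters):
        r = residuals(p, landmarks, ranges)
        J = jacobian(p, landmarks)
        # Gauss-Newton normal equations: H = J^T J, g = J^T r.
        h00 = sum(j[0] * j[0] for j in J)
        h01 = sum(j[0] * j[1] for j in J)
        h11 = sum(j[1] * j[1] for j in J)
        g0 = sum(j[0] * ri for j, ri in zip(J, r))
        g1 = sum(j[1] * ri for j, ri in zip(J, r))
        # Damped system (H + lam * diag(H)) * step = -g, solved in closed form.
        a, b, d = h00 * (1.0 + lam), h01, h11 * (1.0 + lam)
        det = a * d - b * b
        if det <= 1e-15:
            lam *= 10.0
            continue
        sx = (b * g1 - d * g0) / det
        sy = (b * g0 - a * g1) / det
        cand = (p[0] + sx, p[1] + sy)
        new_cost = sum(ri * ri for ri in residuals(cand, landmarks, ranges))
        if new_cost < cost:
            # Step accepted: move and trust the quadratic model more.
            p, cost = cand, new_cost
            lam = max(lam / 10.0, 1e-12)
            if math.hypot(sx, sy) < 1e-12:
                break
        else:
            # Step rejected: increase damping (closer to gradient descent).
            lam *= 10.0
            if lam > 1e12:
                break
    return p
```

The damping factor is what distinguishes this from plain Gauss-Newton: a rejected step raises it, shrinking the next step, while an accepted step lowers it so that convergence near the minimum stays fast.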
RoomSLAM is tested on public datasets, i.e. the MIT and TUM datasets. Three
sequences from the MIT dataset and four sequences from the TUM dataset are used. Each
dataset was recorded using a wheeled robot with an RGB-D camera mounted on top of it.
Experiments showed that RoomSLAM succeeded in creating maps with objects and
rooms. Hence, the first research goal is accomplished. Note, however, that the
complexity of rooms in the MIT dataset (e.g. irregular room shapes and glass walls)
is still a challenge for RoomSLAM. Estimated rooms are still found to overlap, and
the accuracy of wall mapping still has room for improvement. Relaxing the four-wall
assumption for each room might improve the mapping performance of RoomSLAM
in future work.
Experiments also showed that RoomSLAM is able to take advantage of semantic
relationships to produce an algorithm that is efficient in computation time and accurate in
mapping and localization. The efficiency is shown by the frame rate of RoomSLAM, i.e.
78 fps for the front-end module and 1 fps for the back-end module. Moreover, the experiments
showed that the optimization process is not bounded by environment size. The accuracy
of RoomSLAM is shown by the Root Mean Squared Error (RMSE) of robot
localization. The RMSE of RoomSLAM is better than those of ORB-SLAM and
RGBD-SLAM when tested on both the MIT and TUM datasets. With these results, it
is concluded that the second goal of the research is accomplished.
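For reference, a localization RMSE of the kind reported above can be computed as in the following sketch. It assumes the estimated and ground-truth trajectories are already time-associated and expressed in the same frame, and it uses 2D positions; the function name and these simplifications are assumptions for illustration, not the evaluation procedure of the dissertation.

```python
import math


def trajectory_rmse(estimated, ground_truth):
    """RMSE of position error between two equally long 2D trajectories."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        (ex - gx) ** 2 + (ey - gy) ** 2
        for (ex, ey), (gx, gy) in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower value means the estimated trajectory stays closer to the ground truth on average, with large deviations penalized more heavily than small ones.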
|
format |
Dissertations |
author |
Ismail |
title |
OBJECT-BASED SLAM FOR MOBILE ROBOT USING INDOOR STRUCTURAL LAYOUT AND OBJECT CLASS |
url |
https://digilib.itb.ac.id/gdl/view/57400 |
_version_ |
1822002634572169216 |
spelling |
id-itb.:57400 2021-08-20T11:20:07Z OBJECT-BASED SLAM FOR MOBILE ROBOT USING INDOOR STRUCTURAL LAYOUT AND OBJECT CLASS Ismail Engineering (engineering and related activities) Indonesia Dissertations SLAM, semantic SLAM, object SLAM, semantic map, structural layout INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/57400 text |