OBJECT-BASED SLAM FOR MOBILE ROBOT USING INDOOR STRUCTURAL LAYOUT AND OBJECT CLASS
Main Author: | |
---|---|
Format: | Dissertations |
Language: | Indonesian |
Subjects: | |
Online Access: | https://digilib.itb.ac.id/gdl/view/57400 |
Institution: | Institut Teknologi Bandung |
Summary: | SLAM, or Simultaneous Localization and Mapping, is the problem of mapping an
unknown environment by an agent that must simultaneously estimate its own motion within the map, i.e. localization. Mapping and localization have to be done
simultaneously because this is a chicken-and-egg problem: each depends on the
other. Solving both together is possible because the error in
odometry is correlated with the error in sensor measurements.
One direction in recent SLAM research is semantic SLAM, i.e. SLAM
that produces a map whose components are meaningful to humans, namely a semantic
map. A semantic map carries richer information for humans than other
maps such as point clouds or occupancy grids. For example, a semantic
map of an indoor environment could have objects known to humans as its components, e.g., tables, chairs, and walls. In this dissertation, an 'object' refers to
any typical object found in an indoor environment except for walls, each of which
is referred to specifically as a 'wall'.
Typically, objects and walls in semantic SLAM are treated as separate entities.
This research instead pursues the concept of semantic relationships between
those entities. For example, walls can have semantic relationships with other
walls to form a model of space-partitioning entities, i.e. rooms. This model can
be exploited to the advantage of SLAM, e.g. for data association and loop closure
detection. Specifically, the novelties of this research are (1) the development of a concept
of rooms and objects to model indoor environments and (2) the use of this concept
to solve the SLAM problem.
Accordingly, this research sets two goals: (1) to create
a method to model an indoor environment with rooms and objects (the room-object
model), and (2) to create a SLAM method that uses the room-object model to achieve shorter
computation time, a more accurate map, and more accurate localization
compared to methods that do not use a similar concept (object-based landmarks and a
submapping strategy). The algorithm is named RoomSLAM.
RoomSLAM consists of three modules: a sensor module that detects objects and
walls, a front-end module that performs pose prediction, room detection, and data association, and a back-end module that is responsible for map and localization
estimation. Each module runs in its own thread.
RoomSLAM uses YOLOv3 to detect objects. Outputs from YOLOv3 are combined with point cloud data from an RGB-D sensor to obtain the 3D positions of objects. To
detect walls, RoomSLAM estimates line models from sampled point clouds using
RANSAC. Data association for objects and walls is done with a nearest-neighbour
algorithm in which nearness is measured by the Mahalanobis distance for objects
and by the Euclidean distance for walls. In RoomSLAM, data association is executed
within the room where the agent is currently located. To determine which room
the agent is in, RoomSLAM calculates the distance from the agent to each room already mapped. If the agent is not in any existing room, a new room is created.
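
The nearest-neighbour association of objects described above can be illustrated with a minimal sketch. This is not the RoomSLAM implementation: it works in 2D rather than 3D for brevity, and the function names, the landmark dictionary layout, the class-label check, and the gate threshold of 3.0 are all assumptions made for illustration.

```python
import math


def mahalanobis_2d(z, mean, cov):
    """Mahalanobis distance between a 2D observation z and a landmark
    described by its mean and a 2x2 covariance ((a, b), (c, d))."""
    dx, dy = z[0] - mean[0], z[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    q = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(q)


def associate(z, label, landmarks, gate=3.0):
    """Return the index of the nearest landmark of the same class whose
    Mahalanobis distance is below the gate, or None if there is none.
    The gate value and the class check are assumptions of this sketch."""
    best_idx, best_dist = None, gate
    for i, lm in enumerate(landmarks):
        if lm["label"] != label:
            continue
        dist = mahalanobis_2d(z, lm["mean"], lm["cov"])
        if dist < best_dist:
            best_idx, best_dist = i, dist
    return best_idx
```

In this sketch, an observation for which `associate` returns `None` would be treated as a new landmark. Restricting `landmarks` to those of the current room mirrors the per-room association described in the text.
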
In the back-end module, map and location estimation is done using graph optimization. The optimization is formulated as a non-linear least-squares problem, which
is solved using the Levenberg-Marquardt algorithm. Meanwhile, every time the agent
revisits a previously mapped room, RoomSLAM searches for earlier agent poses from which the
same objects/walls were observed, i.e., loop closure. When such a pose is found, global
trajectory correction is done using pose graph optimization.
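
To make the idea of pose graph optimization with Levenberg-Marquardt concrete, the following toy sketch optimizes a one-dimensional pose graph with one loop-closure edge. It is only an illustration under strong assumptions: poses are scalars, the first pose is held fixed, all edges have unit weight, and the problem is in fact linear, whereas the real problem over 2D or 3D poses is non-linear. All names and parameter values are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def residuals(x, edges):
    """Each edge (i, j, z) measures the displacement x[j] - x[i] as z."""
    return [(x[j] - x[i]) - z for i, j, z in edges]


def cost(x, edges):
    return sum(r * r for r in residuals(x, edges))


def levenberg_marquardt(x_init, edges, iters=20, lam=1e-3):
    """Damped Gauss-Newton iterations; pose 0 is held fixed as the anchor."""
    x = list(x_init)
    n = len(x)
    for _ in range(iters):
        H = [[0.0] * (n - 1) for _ in range(n - 1)]
        g = [0.0] * (n - 1)
        for (i, j, _z), r in zip(edges, residuals(x, edges)):
            # Jacobian of the residual: +1 w.r.t. x[j], -1 w.r.t. x[i].
            jac = {}
            if j > 0:
                jac[j - 1] = 1.0
            if i > 0:
                jac[i - 1] = -1.0
            for a, ja in jac.items():
                g[a] += ja * r
                for b, jb in jac.items():
                    H[a][b] += ja * jb
        damped = [[H[a][b] + (lam if a == b else 0.0) for b in range(n - 1)]
                  for a in range(n - 1)]
        dx = solve_linear(damped, [-v for v in g])
        candidate = [x[0]] + [x[k + 1] + dx[k] for k in range(n - 1)]
        if cost(candidate, edges) < cost(x, edges):
            x, lam = candidate, lam * 0.1   # accept step, trust the model more
        else:
            lam *= 10.0                     # reject step, increase damping
    return x


# Three odometry edges of 1.0 each and one loop-closure edge of 2.7.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 3, 2.7)]
optimized = levenberg_marquardt([0.0, 1.0, 2.0, 3.0], edges)
```

Here the loop-closure edge disagrees with the accumulated odometry, and the optimization spreads the discrepancy over the whole trajectory instead of correcting only the last pose.
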
RoomSLAM is tested on public datasets, i.e. the MIT and TUM datasets. Three
sequences are used from the MIT dataset and four from the TUM dataset. Each
dataset was recorded using a wheeled robot with an RGB-D camera mounted on top of it.
Experiments showed that RoomSLAM succeeded in creating maps with objects and
rooms. Hence, the first research goal is accomplished. Note, however, that the
complexity of rooms in the MIT dataset (e.g. irregular room shapes and glass walls)
is still a challenge for RoomSLAM. Estimated rooms still overlap in places, and
the accuracy of wall mapping still has room for improvement. Relaxing the four-wall
assumption for each room might improve the mapping performance of RoomSLAM
in future work.
Experiments also showed that RoomSLAM is able to take advantage of semantic
relationships to yield an algorithm that is efficient in computation time and accurate in mapping and
localization. The efficiency is shown by the frame rate of RoomSLAM, i.e.
78 fps for the front-end module and 1 fps for the back-end. Moreover, RoomSLAM also
showed that its optimization process is not bounded by environment size. RoomSLAM's
accuracy is shown by the Root Mean Squared Error (RMSE) of robot
localization. The RMSE of RoomSLAM is better than those of ORB-SLAM and
RGBD-SLAM when tested on both the MIT and TUM datasets. With these results, it
is concluded that the second goal of the research is accomplished.
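
As an illustration of the accuracy metric, the RMSE of localization can be computed as sketched below. The abstract does not state which RMSE variant or alignment procedure was used, so this sketch assumes 2D positions that are already time-associated and expressed in the same frame as the ground truth; the function name is hypothetical.

```python
import math


def trajectory_rmse(estimated, ground_truth):
    """Root mean squared position error between two trajectories given as
    equal-length lists of (x, y) positions, matched index by index."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_error = sum(
        (ex - gx) ** 2 + (ey - gy) ** 2
        for (ex, ey), (gx, gy) in zip(estimated, ground_truth)
    )
    return math.sqrt(squared_error / len(estimated))
```

A lower value means the estimated trajectory stays closer to the ground truth on average, with large deviations penalized more heavily than small ones.
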
|
---|