INTEGRATION OF OBJECT DETECTION AND MOBILENET-BASED MONOCULAR DEPTH ESTIMATION FOR AN EFFICIENT SYSTEM
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/49897 |
Institution: | Institut Teknologi Bandung |
Summary: | Object detection and depth estimation are two computer vision techniques with various
real-world applications, such as autonomous vehicles. However, using both techniques
simultaneously in the real world is limited by the computational power available
within a given system. This work proposes modifications to an existing system in order
to address this constraint on computing power.
The system proposed in this work is a modified version of the system proposed by
Miclea & Nedevschi (2019). That system merges the feature extractors used in object
detection and depth estimation into a single component in order to reduce the number of
mathematical operations performed. The modifications proposed in this work are
changing the feature extractor’s architecture to MobileNet, changing the object
detection algorithm from YOLOv3 to SSD, and changing the architecture of the depth
estimation component to FastDepth.
The proposed system is trained and tested on the publicly available Cityscapes dataset.
The model obtained from the training process can process a single image
within 25 ms. The system also achieves acceptable accuracy
in both object detection and depth estimation.
From the results obtained during testing, it can be concluded that the
modifications proposed in this work reduce the number of computational
operations required by the system. This is evident from the reduction in inference time
compared to the original system. However, this comes at the cost of slightly reduced accuracy
in both components. |
---|
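
The summary attributes the speed-up to fewer mathematical operations, obtained by sharing one feature extractor between both tasks and by switching that extractor to MobileNet. The sketch below is only a rough illustration of that reasoning, not the thesis's implementation. It counts multiply-accumulate operations (MACs) for a standard convolution versus the depthwise separable convolution that MobileNet is built on, and compares a shared backbone with two separate ones. The function names, the layer shape, and the cost figures in the usage note are hypothetical and are not taken from the record.

```python
def conv_macs(h, w, c_in, c_out, k=3):
    """MACs for a standard k x k convolution (stride 1, same padding)."""
    return h * w * c_in * c_out * k * k


def separable_conv_macs(h, w, c_in, c_out, k=3):
    """MACs for a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise


def system_macs(backbone, detection_head, depth_head, shared):
    """Cost of one forward pass; without sharing, the backbone runs once per task."""
    backbone_runs = 1 if shared else 2
    return backbone_runs * backbone + detection_head + depth_head


# Hypothetical layer shape, chosen only for illustration (not from the thesis).
H, W, C_IN, C_OUT = 64, 128, 256, 256

standard = conv_macs(H, W, C_IN, C_OUT)
separable = separable_conv_macs(H, W, C_IN, C_OUT)

# The cost ratio simplifies to 1 / c_out + 1 / k**2 for any layer shape.
ratio = separable / standard
print(f"separable / standard MACs: {ratio:.3f}")
```

Under the same assumptions, `system_macs(100, 10, 20, shared=True)` and `system_macs(100, 10, 20, shared=False)` compare a merged feature extractor against two independent ones, using made-up cost units.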