Real-time semantic image segmentation via light-weight neural networks
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2020 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/140054 |
Institution: | Nanyang Technological University |
Summary: | Semantic image segmentation assigns each pixel a class label from a predefined set, producing a high-level classification of image regions. However, many semantic segmentation architectures are unsuitable for real-time applications because of their high computational cost and large number of parameters. We therefore adapt an efficient and powerful semantic image segmentation architecture, Light-Weight RefineNet, to further reduce its number of parameters while keeping competitive performance. In particular, we explore the MobileNet family as the encoder backbone, namely MobileNet-v2 and its upgraded version MobileNet-v3, and improve performance by adding a dilated convolution layer to the decoder and by addressing the exploding gradient problem with gradient clipping and Residual with Zero initialisation (ReZero). The adapted architecture with MobileNet-v2 as the encoder reduces the parameters by 15.27M, which is half of the parameters of the original RefineNet-LW-50, yet keeps similar accuracy. MobileNet-v3, although its performance may be less competitive than that of MobileNet-v2, achieves a further parameter reduction to 8.32M and a speed of 77 FPS on 625 × 468 inputs. |
---|
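
The summary names three techniques: ReZero residual connections, gradient clipping, and dilated convolution. The following is a minimal, framework-free sketch of the general idea behind each, not the project's implementation. All function names (`rezero_residual`, `clip_by_global_norm`, `effective_kernel_size`) are hypothetical and chosen for illustration only. In practice a deep-learning framework would supply these operations, for example `torch.nn.utils.clip_grad_norm_` in PyTorch.

```python
import math


def rezero_residual(x, f, alpha=0.0):
    """ReZero-style residual: x + alpha * f(x).

    alpha is a learnable scalar that starts at zero, so the block
    initially behaves as the identity mapping.
    """
    fx = f(x)
    return [xi + alpha * fi for xi, fi in zip(x, fx)]


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradients so its L2 norm is at most max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm or total == 0.0:
        return list(grads)  # already within bounds, leave unchanged
    scale = max_norm / total
    return [g * scale for g in grads]


def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by a dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)
```

With `alpha=0.0`, `rezero_residual` returns its input unchanged, which is why the initialisation helps keep early training stable. Clipping bounds the size of each update without changing its direction, and a 3 × 3 kernel with dilation 2 covers a 5 × 5 area with no extra parameters.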