Real-time semantic image segmentation via light-weight neural networks

Semantic image segmentation aims to generate a high-level classification of regions, in which each pixel is associated with a class label from a predefined set. However, not all semantic segmentation architectures are suitable for real-time applications due to their high computat...

Bibliographic Details
Main Author: Xu, Jing
Other Authors: Jiang Xudong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/140054
Institution: Nanyang Technological University
id sg-ntu-dr.10356-140054
record_format dspace
spelling sg-ntu-dr.10356-140054; 2023-07-07T18:42:34Z; Real-time semantic image segmentation via light-weight neural networks; Xu, Jing; Jiang Xudong; School of Electrical and Electronic Engineering; EXDJiang@ntu.edu.sg; Engineering::Electrical and electronic engineering; Bachelor of Engineering (Electrical and Electronic Engineering); 2020-05-26T05:52:36Z; 2020; Final Year Project (FYP); https://hdl.handle.net/10356/140054; en; A3100-191; application/pdf; Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering
description Semantic image segmentation aims to generate a high-level classification of regions, in which each pixel is associated with a class label from a predefined set. However, not all semantic segmentation architectures are suitable for real-time applications, because of their high computational cost and large number of parameters. We therefore consider the task of adapting an efficient and powerful semantic image segmentation architecture, Light Weight RefineNet, to further reduce its number of parameters while keeping competitive performance. In particular, we explore using the MobileNet family as the encoder backbone, namely MobileNet-v2 and its upgraded version MobileNet-v3, and improve their performance by adding a dilated convolution layer to the decoder and by addressing the exploding gradient problem with gradient clipping and Residual with Zero initialisation (ReZero). The adapted architecture with MobileNet-v2 as encoder reduces the parameters by 15.27M, which is half of the parameters of the original RefineNet-LW-50, yet keeps similar accuracy. MobileNet-v3, although its performance may not be as competitive as that of MobileNet-v2, achieves a further parameter reduction to 8.32M and a speed of 77 FPS on 625 × 468 inputs.
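The three techniques named in the description can be illustrated with a minimal, framework-free sketch. It is not the project's implementation: the abstract does not state the clipping variant or threshold, the dilation rate, or where ReZero is applied, so the function names, the choice of global-norm clipping, and the 1-D setting below are all assumptions made for illustration.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so their combined L2 norm is at most max_norm.

    Assumption: norm-based clipping; the abstract only says "gradient clipping".
    """
    total = math.sqrt(sum(g * g for g in grads))
    if total == 0.0 or total <= max_norm:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]


def rezero_block(x, branch, alpha):
    """ReZero residual connection: y = x + alpha * F(x).

    alpha is a learnable scalar initialised to 0, so the block starts
    out as the identity mapping.
    """
    return [xi + alpha * fi for xi, fi in zip(x, branch(x))]


def dilated_conv1d(signal, kernel, dilation):
    """Valid 1-D cross-correlation with `dilation` samples between kernel taps.

    A 1-D stand-in for a 2-D dilated convolution layer: the receptive
    field grows to (len(kernel) - 1) * dilation + 1 with no extra weights.
    """
    span = (len(kernel) - 1) * dilation + 1
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]
```

With alpha at its initial value of 0, rezero_block returns its input unchanged, which is what keeps early training stable. With a 3-tap kernel, moving from dilation 1 to dilation 2 widens the receptive field from 3 to 5 samples without adding parameters.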
author2 Jiang Xudong
author_facet Jiang Xudong
Xu, Jing
format Final Year Project
author Xu, Jing
author_sort Xu, Jing
title Real-time semantic image segmentation via light-weight neural networks
publisher Nanyang Technological University
publishDate 2020
url https://hdl.handle.net/10356/140054
_version_ 1772827613361864704