Toward achieving robust low-level and high-level scene parsing
In this paper, we address the challenging task of scene segmentation. We first discuss and compare two widely used approaches to retain detailed spatial information from pretrained CNN - "dilation" and "skip". Then, we demonstrate that the parsing performance of "skip" network can be noticeably improved by modifying the parameterization of skip layers. Furthermore, we introduce a "dense skip" architecture to retain a rich set of low-level information from pre-trained CNN, which is essential to improve the low-level parsing performance. Meanwhile, we propose a convolutional context network (CCN) and place it on top of pre-trained CNNs, which is used to aggregate contexts for high-level feature maps so that robust high-level parsing can be achieved. We name our segmentation network enhanced fully convolutional network (EFCN) based on its significantly enhanced structure over FCN. Extensive experimental studies justify each contribution separately. Without bells and whistles, EFCN achieves state-of-the-arts on segmentation datasets of ADE20K, Pascal Context, SUN-RGBD and Pascal VOC 2012.
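The "dilation" approach compared in the abstract keeps feature maps at full resolution by spacing convolution taps apart instead of downsampling. The sketch below is purely illustrative (it is not the authors' code, and uses a 1-D toy signal rather than real feature maps); it shows how a dilation factor enlarges the receptive field of a fixed-size kernel:

```python
# Illustrative sketch (not the paper's implementation): a valid-mode 1-D
# convolution whose taps are spaced `dilation` samples apart. With the same
# 3 weights, dilation=2 covers a 5-sample receptive field -- the core idea
# behind retaining spatial detail without downsampling.

def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D convolution with dilated (spaced-out) kernel taps."""
    span = (len(kernel) - 1) * dilation  # receptive field width minus one
    out = []
    for i in range(len(signal) - span):
        acc = 0.0
        for j, w in enumerate(kernel):
            acc += w * signal[i + j * dilation]
        out.append(acc)
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
k = [1.0, 1.0, 1.0]

# dilation=1 is an ordinary convolution: each output sums 3 adjacent samples
print(dilated_conv1d(x, k, dilation=1))  # [6.0, 9.0, 12.0, 15.0]
# dilation=2 sums every other sample, widening the receptive field to 5
print(dilated_conv1d(x, k, dilation=2))  # [9.0, 12.0]
```

In a 2-D segmentation network the same trick is applied along both spatial axes, which is why "dilation" preserves detailed spatial information at the cost of computing full-resolution feature maps.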
Main Authors: Shuai, Bing; Ding, Henghui; Liu, Ting; Wang, Gang; Jiang, Xudong
Other Authors: School of Electrical and Electronic Engineering; Rapid-Rich Object Search Lab
Format: Article
Language: English
Published: 2020
Subjects: Engineering::Electrical and electronic engineering; Scene Parsing; Convolution Neural Network
Online Access: https://hdl.handle.net/10356/142866
Institution: Nanyang Technological University
Citation: Shuai, B., Ding, H., Liu, T., Wang, G., & Jiang, X. (2019). Toward achieving robust low-level and high-level scene parsing. IEEE Transactions on Image Processing, 28(3), 1378-1390. doi:10.1109/TIP.2018.2878975
ISSN: 1057-7149
Funding: NRF (National Research Foundation, Singapore); MOE (Ministry of Education, Singapore)
Version: Accepted version

© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/TIP.2018.2878975