Semantic segmentation with less annotation efforts
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2020 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/140292 |
Institution: | Nanyang Technological University |
Summary: | Semantic segmentation is a pixel-wise classification task: it assigns a class label to every pixel in an image. However, one obstacle limiting its development is that pixel-wise segmentation annotations for training images are difficult and expensive to obtain. It is therefore hard to apply segmentation methods to new datasets or semantic classes, since obtaining newly annotated data is labor-intensive. The goal of this thesis is to reduce the annotation workload for semantic segmentation of images. From the perspective of the data domain, the approaches can be roughly classified into intra-domain and inter-domain approaches.
Intra-domain approaches utilize weaker forms of supervision within datasets of the same data domain. This thesis investigates image-level labels as the supervision format for weakly supervised semantic segmentation. To recover pixel-wise annotations from image-level labels, region-mining models are trained to approximate the target object regions. The goal is to train region-mining models that highlight the integral object regions instead of only the most discriminative regions. In this thesis, I investigate regularizing the region-mining model in both the forward pass and the backward pass of the training process.
Inter-domain approaches transfer pixel-wise knowledge from another domain, whose data and pixel-wise annotations are easier to generate, to the target data domain. This thesis investigates transfer from synthetic source data with pixel-wise annotations to unlabeled real-world target data. Adversarial learning approaches are applied to narrow the domain gap between the synthetic and real-world data, but they usually suffer from content misalignment. To alleviate this problem, two approaches are proposed to regularize adversarial learning: the first embeds global structure knowledge into the feature-level adversarial learning step; the second back-propagates the final task loss into the pixel-wise adversarial learning step.
This thesis presents methods from both categories to alleviate the need for pixel-wise annotations in semantic image segmentation. The experiments show that the proposed methods achieve promising segmentation performance without using pixel-wise annotations. |
---|
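As a small illustration of the "pixel-wise classification" framing in the summary, the sketch below turns per-class score maps into a label map by picking the highest-scoring class at each pixel. It is not taken from the thesis: the function name `predict_label_map`, the `scores[c][y][x]` layout, and the toy values are all assumptions made for this example.

```python
from typing import List

# Assumed layout: scores[c][y][x] is the score of class c at pixel (y, x).
Scores = List[List[List[float]]]


def predict_label_map(scores: Scores) -> List[List[int]]:
    """Return, for each pixel, the index of the highest-scoring class."""
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    label_map = []
    for y in range(height):
        row = []
        for x in range(width):
            # Per-pixel argmax over the class dimension.
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best)
        label_map.append(row)
    return label_map


# Toy 2x2 image with 3 classes (values are made up for illustration).
toy_scores = [
    [[0.9, 0.1], [0.2, 0.3]],   # class 0
    [[0.05, 0.8], [0.1, 0.3]],  # class 1
    [[0.05, 0.1], [0.7, 0.4]],  # class 2
]
print(predict_label_map(toy_scores))
```

A fully supervised model would be trained against a ground-truth label map of this same shape for every training image, which is the annotation cost the thesis aims to reduce.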