Machine learning-based image demosaicing

Bibliographic Details
Main Author:	Zhou, Huan
Other Authors:	Kai-Kuang Ma
Format:	Thesis-Master by Research
Language:	English
Published:	Nanyang Technological University 2020
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Computer graphics
Online Access:	https://hdl.handle.net/10356/144887
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-144887
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering::Computing methodologies::Computer graphics
description	This thesis focuses on exploiting data-driven convolutional neural network (CNN) methodology to further improve the performance of conventional signal processing approaches. The investigation is motivated by our goal of studying how these two methodologies can work together synergistically to improve the performance of various image processing tasks. Image demosaicing, an indispensable processing step in consumer-grade digital cameras and mobile phones, is chosen as the target application. To save cost, these devices are equipped with only one charge-coupled device (CCD) sensor, so image demosaicing is needed to restore the full-color image; the recovered image is called the demosaicked image. After decades of research on image demosaicing, a plethora of signal processing-based algorithms has been developed. However, their performance has been observed to saturate. We therefore investigate how to harness a CNN as a post-processing module that boosts the performance of a signal processing-based method. Our first contribution is the development of a new CNN structure, called the U*-Net, which is a modified version of the existing U-Net. Our second contribution addresses a fundamental question that has been completely overlooked in the machine learning area: how can a reliable and robust training dataset be constructed by selecting 'good' samples? To answer it, a set of formal procedures is developed and used to train the proposed U*-Net for the image demosaicing task.
The significance of the second contribution is as follows. It is well recognized that one of the main concerns in using machine learning methodology lies in establishing the training dataset used to train the designed network. In the past, such training datasets were constructed in an ad hoc manner. As a result, data-driven training can easily suffer from insufficient coverage of the problem being addressed. There is therefore a strong need to investigate this critical and previously ignored issue. To tackle it, the fundamental principle we adopt is to utilize the correlation among the three color components (red, green, and blue), since this is the key attribute used in the iterative residual interpolation (IRI) demosaicing method. Note that this correlation is measured only on the image's high-frequency regions, which are extracted by a set of high-pass filters along four different directions. The reliability and robustness of the proposed procedures have been assessed via Student's t-test on the demosaicked images. Two well-known benchmark test datasets, Kodak and Imax, are used to evaluate the performance of various image demosaicing algorithms. The training dataset for the proposed U*-Net is established using our suggested data selection procedures. The simulation results are documented in terms of subjective evaluation of the demosaicked image quality and two objective measurements: peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM).
The results show that the demosaicked color images are closer to the ground truth, with fewer image artifacts, and that there are additional gains of 1.31 dB on the Kodak test dataset and 1.11 dB on the Imax test dataset over what the most advanced signal processing-based method (IRI) achieves. For less sophisticated methods, the gains are higher still; for example, gradient-based threshold-free (GBTF) demosaicing gains 1.34 dB on Kodak and 4.26 dB on Imax. Finally, our work on training dataset establishment (the second contribution) can, with some modifications, also benefit other deep learning-based image processing tasks.
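The description reports its objective results as PSNR gains in dB on images recovered from single-sensor data. As a rough illustration only (this is not code from the thesis), the sketch below shows how a single-sensor capture can be simulated from a full-color image and how PSNR against the ground truth can be computed. The function names, the RGGB Bayer layout, the 8-bit peak value of 255, and the choice to pool all three color channels into one mean squared error are assumptions made for this example.

```python
import math


def bayer_mosaic(rgb):
    """Keep one colour sample per pixel, following an RGGB Bayer layout.

    `rgb` is a list of rows, each row a list of (r, g, b) tuples.
    The RGGB layout is an assumption; sensors may use other arrangements.
    """
    mosaic = []
    for y, row in enumerate(rgb):
        out_row = []
        for x, (r, g, b) in enumerate(row):
            if y % 2 == 0 and x % 2 == 0:
                out_row.append(r)   # red site: even row, even column
            elif y % 2 == 1 and x % 2 == 1:
                out_row.append(b)   # blue site: odd row, odd column
            else:
                out_row.append(g)   # green sites fill the rest
        mosaic.append(out_row)
    return mosaic


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized RGB images.

    The squared error is pooled over all pixels and all three channels
    (an assumption; per-channel averaging is another common convention).
    """
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref_px, est_px in zip(ref_row, est_row):
            for a, b in zip(ref_px, est_px):
                squared_error += (a - b) ** 2
                count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy usage: a 2x2 ground-truth image and an estimate that is off by 1 everywhere.
truth = [[(10, 20, 30), (40, 50, 60)],
         [(70, 80, 90), (100, 110, 120)]]
estimate = [[(r + 1, g + 1, b + 1) for (r, g, b) in row] for row in truth]

raw = bayer_mosaic(truth)          # what a single sensor would record
quality = psnr(truth, estimate)    # quality of a hypothetical demosaicked result
```

Under this convention, a reported gain such as 1.31 dB would correspond to the difference between the PSNR of the CNN-refined output and the PSNR of the signal processing-based output, each measured against the same ground truth; how the thesis averages PSNR across channels and test images is not stated in this record.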
author2	Kai-Kuang Ma
author_facet	Kai-Kuang Ma; Zhou, Huan
format	Thesis-Master by Research
author	Zhou, Huan
author_sort	Zhou, Huan
title	Machine learning-based image demosaicing
publisher	Nanyang Technological University
publishDate	2020
url	https://hdl.handle.net/10356/144887
spelling	sg-ntu-dr.10356-144887; 2023-07-04T16:43:11Z; Machine learning-based image demosaicing; Zhou, Huan; Kai-Kuang Ma; School of Electrical and Electronic Engineering; EKKMA@ntu.edu.sg; Engineering::Computer science and engineering::Computing methodologies::Computer graphics; Master of Engineering; 2020-12-02T05:21:13Z; 2020-12-02T05:21:13Z; 2020; Thesis-Master by Research; Zhou, H. (2020). Machine learning-based image demosaicing. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/144887; 10.32657/10356/144887; en; This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0); application/pdf; Nanyang Technological University

Similar Items