Visual signal coding and quality evaluation
Visual signal (i.e., image and video) coding compresses digital visual data to the smallest possible size by exploiting various kinds of data redundancy, so that limited network bandwidth and storage are used efficiently. It exploits the redundancy in the signal itself (statistical red...
Main Author: | Liu, Anmin |
---|---|
Other Authors: | Lin Weisi |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2012 |
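Subjects: | DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |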
Online Access: | https://hdl.handle.net/10356/47587 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-47587 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-47587 2023-03-04T00:37:39Z Visual signal coding and quality evaluation Liu, Anmin Lin Weisi School of Computer Engineering Centre for Multimedia and Network Technology DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision DOCTOR OF PHILOSOPHY (SCE) 2012-01-09T07:48:23Z 2012-01-09T07:48:23Z 2011 2011 Thesis Liu, A. M. (2011). Visual signal coding and quality evaluation. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/47587 10.32657/10356/47587 en 162 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |
spellingShingle |
DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Liu, Anmin; Visual signal coding and quality evaluation |
description |
Visual signal (i.e., image and video) coding compresses digital visual data to the smallest possible size by exploiting various kinds of data redundancy, so that limited network bandwidth and storage are used efficiently. It exploits the redundancy in the signal itself (statistical redundancy, i.e., spatial-temporal redundancy and spectral/color redundancy). Since the human visual system (HVS) is the ultimate receiver and appreciator of most processed visual signals, the redundancy due to the properties of human vision (i.e., perceptual/psycho-visual redundancy) should also be considered during coding. The effectiveness of image and video coding methods is traditionally evaluated by their rate-distortion (RD) performance, where rate is the number of bits required for the compressed visual signal (or a variant such as bits per pixel (bpp) or bits per second) and distortion is usually measured as peak signal-to-noise ratio (PSNR). However, PSNR has been found not to always agree with human judgment, and the measurement of perceptual distortion is therefore an active research area.
Firstly, in this work, we discuss the statistical redundancy of video and then propose a novel video coding scheme based on the optimal compression plane (OCP). As a data structure, video is nothing more than a three-dimensional data matrix, and the distinction among X (a spatial dimension), Y (the other spatial dimension), and T (the temporal dimension) is not strictly necessary. We ignore the physical meaning of the X, Y, and T axes during video coding: frames may be formed in the TX (or TY) plane rather than the traditional XY plane to exploit the redundancy more effectively, and better coding performance is thereby achieved.
Secondly, the model reflecting the masking characteristics of the HVS is studied, as it is fundamental to exploiting perceptual redundancy and to measuring visual distortion (quality). Just noticeable difference (JND) accounts for various masking effects of the HVS. We improve the pixel-domain JND model through better contrast masking (CM) evaluation and by appropriately accounting for the difference in CM between textural and edge regions. We also investigate the application of perceptual models (i.e., the visual attention model and the JND model) to adaptive-sampling-based low-bit-rate image coding and to JND-based histogram adjustment for visually lossless image coding.
Lastly, an effective and efficient metric for visual quality/distortion evaluation is proposed. The metric is based on the similarity between the gradient profiles of the reference and distorted signals, which accounts for both the high-level premise of the HVS (i.e., high sensitivity to image edges and structure) and the masking property. The new metric is simple to compute and highly accurate (verified with extensive cross-database tests); it is robust to various distortion types and can be easily embedded in coding systems (as well as in other visual signal processing algorithms). |
author2 |
Lin Weisi |
author_facet |
Lin Weisi; Liu, Anmin |
format |
Theses and Dissertations |
author |
Liu, Anmin |
author_sort |
Liu, Anmin |
title |
Visual signal coding and quality evaluation |
title_short |
Visual signal coding and quality evaluation |
title_full |
Visual signal coding and quality evaluation |
title_fullStr |
Visual signal coding and quality evaluation |
title_full_unstemmed |
Visual signal coding and quality evaluation |
title_sort |
visual signal coding and quality evaluation |
publishDate |
2012 |
url |
https://hdl.handle.net/10356/47587 |
_version_ |
1759854982869286912 |
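
The description above mentions two ideas that a short sketch can make concrete: PSNR as the conventional distortion measure, and treating a video as a three-dimensional matrix whose frames can be formed in the XY, TX, or TY plane. The Python below is only an illustration under stated assumptions. The function names `psnr` and `frames_along`, the nested-list layout `video[t][y][x]`, and the 8-bit peak value of 255 are hypothetical choices made here, not taken from the thesis. It does not implement the thesis's OCP scheme, which also has to decide which plane to code in; it only shows the re-slicing step and the standard PSNR formula.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    `peak` is the maximum possible pixel value (255 assumes 8-bit samples).
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between reference and distorted samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10.0 * math.log10(peak ** 2 / mse)


def frames_along(video, plane):
    """Re-slice a video stored as video[t][y][x] into frames lying in `plane`.

    "XY": one frame per time index t (the traditional view).
    "TX": one frame per row y; frame rows run over t, columns over x.
    "TY": one frame per column x; frame rows run over t, columns over y.
    """
    T, Y, X = len(video), len(video[0]), len(video[0][0])
    if plane == "XY":
        return [[[video[t][y][x] for x in range(X)] for y in range(Y)] for t in range(T)]
    if plane == "TX":
        return [[[video[t][y][x] for x in range(X)] for t in range(T)] for y in range(Y)]
    if plane == "TY":
        return [[[video[t][y][x] for y in range(Y)] for t in range(T)] for x in range(X)]
    raise ValueError("plane must be 'XY', 'TX', or 'TY'")


if __name__ == "__main__":
    # Toy volume with T=4 frames of Y=2 rows and X=3 columns.
    video = [[[t * 100 + y * 10 + x for x in range(3)] for y in range(2)] for t in range(4)]
    for plane in ("XY", "TX", "TY"):
        frames = frames_along(video, plane)
        print(plane, len(frames), "frames of", len(frames[0]), "x", len(frames[0][0]))
    print(psnr([10, 20, 30, 40], [12, 18, 30, 40]))
```

The same samples are present in all three views; only the grouping into frames changes, which is why a coder could in principle be run on whichever view exposes the most redundancy for a given sequence.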