Perceptual image coding and transcoding with subband discrete cosine transform

Bibliographic Details
Main Author: Tan, Ee Leng
Other Authors: Gan Woon Seng
Format: Theses and Dissertations
Language: English
Published: 2012
Online Access: https://hdl.handle.net/10356/48915
Institution: Nanyang Technological University
Description
Summary: One of the biggest challenges in image and video processing is evaluating and optimizing the quality of digital imaging systems with respect to the storage and transmission of visual information. Owing to its physiological and psychological mechanisms, the human visual system (HVS) cannot detect every change in an image. By exploiting these limitations, the scarce resources of a digital imaging system, such as storage space and transmission bandwidth, can be allocated to give the best visual experience. Because the discrete cosine transform (DCT) is used in many image and video standards (JPEG, MPEG-1/2/4, H.261/3), this thesis focuses on a computational model that determines the just-noticeable-difference (JND) threshold based on DCT subbands. Several well-accepted factors that influence the visibility of differences (or distortion) in an image are investigated, namely the contrast sensitivity function (CSF), luminance adaptation and contrast masking. Specifically, these visual factors are incorporated into a computational model for JND that guides an image compression algorithm toward a higher compression ratio without introducing perceivable distortion. A variant of the Type-II DCT (DCT-II), the subband discrete cosine transform (SBDCT), is applied in this thesis. One of the main differences between the two is that the two-dimensional (2-D) DCT-II operates on an N × N input sequence, whereas the 2-D SBDCT operates on subbands of an N × N input sequence. Because these subbands can be processed independently, the SBDCT is well suited to parallel implementation. Furthermore, it has been shown that approximating the 2-D DCT-II with the SBDCT in an image compression system significantly reduces computational cost while maintaining reasonable image quality in the reconstructed image.
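As a rough illustration of the subband idea described above, the following one-dimensional sketch splits a signal into averaged (low) and differenced (high) pairs of neighbouring samples, shows that the unnormalized DCT-II can be recombined exactly from those two subbands, and then approximates it from the low band alone. The pairwise average/difference split is an assumption about how the subbands are formed, all function names are hypothetical, and the thesis's fast computational structure and its seven 2-D approximations are not reproduced here.

```python
import math


def dct2(x):
    """Unnormalized N-point DCT-II: C(k) = sum_n x(n) cos((2n+1) k pi / (2N))."""
    n = len(x)
    return [
        sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n))
        for k in range(n)
    ]


def split_subbands(x):
    """Split an even-length signal into low (average) and high (difference) subbands."""
    half = len(x) // 2
    low = [(x[2 * m] + x[2 * m + 1]) / 2 for m in range(half)]
    high = [(x[2 * m] - x[2 * m + 1]) / 2 for m in range(half)]
    return low, high


def dct2_from_subbands(x):
    """Recombine the full DCT-II exactly from both subbands.

    Uses C(k) = 2 cos(k pi / 2N) C_L(k) + 2 sin(k pi / 2N) S_H(k), where C_L is a
    cosine sum over the low band and S_H is a sine sum over the high band.
    """
    n = len(x)
    low, high = split_subbands(x)
    out = []
    for k in range(n):
        theta = k * math.pi / (2 * n)
        c_low = sum(v * math.cos((2 * m + 1) * k * math.pi / n) for m, v in enumerate(low))
        s_high = sum(v * math.sin((2 * m + 1) * k * math.pi / n) for m, v in enumerate(high))
        out.append(2 * math.cos(theta) * c_low + 2 * math.sin(theta) * s_high)
    return out


def dct2_lowband_approx(x):
    """Approximate the DCT-II from the low band only (high band discarded).

    Only an N/2-point DCT-II is computed; coefficients N/2..N-1 are set to zero.
    """
    n = len(x)
    low, _ = split_subbands(x)
    c_low = dct2(low)  # N/2-point transform of the low band
    head = [2 * math.cos(k * math.pi / (2 * n)) * c_low[k] for k in range(n // 2)]
    return head + [0.0] * (n - n // 2)
```

The two subband sums in `dct2_from_subbands` do not depend on each other, which is the kind of independence that makes a parallel implementation attractive, and `dct2_lowband_approx` shows where the cost saving comes from: it needs only a half-length transform. A 2-D version would apply the same split along rows and columns.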
The first part of this thesis discusses a fast computational structure for the SBDCT, which is derived from a symmetry property found in the SBDCT. Seven approximations of the 2-D DCT-II based on the SBDCT are then obtained. The second part of this thesis focuses on the development of an image compression system that integrates the SBDCT with a computational model for JND. The JND model is adapted to the SBDCT, taking into account the computational structure of the SBDCT, to obtain an effective image compression system. The proposed image compression system is subjected to a series of subjective evaluations to validate its effectiveness in terms of compression ratio and perceived quality of the reconstructed image. With the emergence of multimedia-capable mobile devices and the availability of broadband networks, users have easy access to high-definition image and video content. Hence, an efficient method to reduce the spatial resolution of image or video content before delivery to such mobile devices is desirable. The third and final part of this thesis presents a computationally efficient image transcoding algorithm for resizing images that does not require the DCT or inverse DCT (IDCT) computations employed in many existing image resizing algorithms. Instead, the proposed algorithm applies the spatial relationship of the DCT-II coefficients between a block and its sub-blocks, as well as observations of the SBDCT, to obtain an efficient computational structure. Generally, arbitrary resizing of images in the DCT domain is achieved by first up-sampling and then down-sampling the DCT-II coefficients. To reduce computational cost without degrading visual quality, the proposed transcoding algorithm uses a subset of the up-sampled DCT-II coefficients, instead of all of them, in the down-sampling operation.
The subset of up-sampled DCT-II coefficients is selected empirically by analysing how image degradation varies with the number of up-sampled DCT-II coefficients used in the down-sampling process. To further reduce the computational cost of the proposed image transcoding algorithm, the base visibility threshold derived in the computational model for JND is used to determine the number of up-sampled DCT-II coefficients used in the down-sampling process. Subjective evaluation confirms that the JND-guided image transcoding algorithm introduces no perceptible distortion in the resized images, compared with resized images obtained using all the up-sampled DCT-II coefficients in the down-sampling operation.
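To make the idea of resizing with only a subset of coefficients concrete, here is a minimal one-dimensional sketch of generic DCT-domain resizing by zero-padding (up-sampling) or truncating (down-sampling) orthonormal DCT-II coefficients. It is not the transform-free algorithm proposed in the thesis: it computes a forward and inverse DCT explicitly so that it is self-contained, whereas a transcoder would start from coefficients that already exist. The function names and the `keep` parameter are hypothetical; `keep` stands in for the number of coefficients that an empirical or JND-based rule would select.

```python
import math


def dct2_ortho(x):
    """Orthonormal N-point DCT-II."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n))
        out.append(scale * total)
    return out


def idct2_ortho(c):
    """Inverse of dct2_ortho (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * c[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
        out.append(total)
    return out


def resize_1d(x, new_len, keep=None):
    """Resize a signal to new_len samples by padding or truncating its DCT-II.

    keep (hypothetical parameter) caps how many low-frequency coefficients are
    carried over; None means every coefficient that fits is used.
    """
    n = len(x)
    coeffs = dct2_ortho(x)
    usable = min(n, new_len)
    if keep is not None:
        usable = min(usable, keep)
    gain = math.sqrt(new_len / n)  # preserves the mean level after resizing
    resized = [gain * coeffs[k] if k < usable else 0.0 for k in range(new_len)]
    return idct2_ortho(resized)
```

For example, `resize_1d(row, 6)` maps an 8-sample row to 6 samples using the six lowest-frequency coefficients, while `resize_1d(row, 6, keep=3)` uses only three of them. Lowering `keep` reduces the work in the inverse step at the price of lost detail, which is the trade-off the JND-guided selection is meant to control.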