Learning based signal quality assessment for multimedia communications

Multimedia contents (including image/video, speech, audio, graphic and so on) can be affected by a wide variety of distortions during the process of acquisition, compression, processing, transmission, and reproduction which generally leads to loss of perceptual quality. As a result, signal quality a...

Bibliographic Details
Main Author: Manish Narwaria
Other Authors: Lin Weisi
Format: Theses and Dissertations
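Language: English
Published: 2012
Subjects: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Online Access: https://hdl.handle.net/10356/50753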
Institution: Nanyang Technological University
id sg-ntu-dr.10356-50753
spelling sg-ntu-dr.10356-50753
2023-03-04T00:45:22Z
School of Computer Engineering
Centre for Multimedia and Network Technology
DOCTOR OF PHILOSOPHY (SCE)
2012-10-16T04:58:46Z
2012
Thesis
Manish, N. (2012). Learning based signal quality assessment for multimedia communications. Doctoral thesis, Nanyang Technological University, Singapore.
https://hdl.handle.net/10356/50753
10.32657/10356/50753
en
240 p.
application/pdf
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
description Multimedia content (including image/video, speech, audio, and graphics) can be affected by a wide variety of distortions during acquisition, compression, processing, transmission, and reproduction, which generally lead to a loss of perceptual quality. As a result, signal quality assessment is an important component of today’s multimedia communication systems. In this thesis, perceptual quality assessment algorithms are proposed for three important types of multimedia signals, namely image, video, and speech. This involves two crucial stages: (a) feature extraction/detection, and (b) feature pooling. The first stage calls for investigation and analysis of appropriate and effective signal features that extract meaningful information and provide a compact representation of the signal with regard to quality. This is crucial because the selected features form the basis of the resultant quality metric. In this thesis, we discuss and provide detailed analysis of features based on singular value decomposition, the 2D mel-cepstrum, and the phase of the Fourier transform for visual quality assessment. We analyse the advantages and disadvantages of these features with regard to prediction accuracy and complexity. We also investigate mel filter bank energies as features for evaluating the quality of noise-suppressed speech and justify their effectiveness through theoretical and experimental analysis.
The second stage, on the other hand, requires determining appropriate weights for fusing the features into a single score that accurately reflects human judgement of perceptual quality. We tackle this with machine learning techniques, which have been successfully employed in numerous research areas (for example, in computer vision tasks such as object localization, tracking, and recognition) but have not been adequately explored in the literature on objective quality evaluation. Their major advantage is that they introduce a more systematic pooling methodology, thereby avoiding the unrealistic assumptions imposed by existing pooling methods. In this thesis, we demonstrate that machine learning can be effective in quality assessment if proper signal features are detected. We also provide insights into machine-learning-based feature pooling by analysing the system trained on subjective scores, which quantify human perception. The proposed algorithms have been validated on a large number of publicly available, subjectively rated databases. We have performed careful experimental analysis (including within-database and cross-database tests) and demonstrated that the proposed schemes overall perform better than several relevant methods. The better alignment with human perception confirms the effectiveness of the algorithms proposed in this thesis.
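The second stage described in the abstract, fusing features into a single score with learned weights, can be illustrated with a minimal sketch. It assumes a plain linear least-squares fit, which is a stand-in chosen for brevity and is not claimed to be the learning technique used in the thesis. The function names, the two-feature inputs, and all training values are hypothetical.

```python
# Sketch only: linear least-squares pooling of per-signal features into one
# quality score. Feature values and subjective scores are made up for
# illustration; the thesis's actual features and learner are not reproduced.

def fit_pooling_weights(features, scores):
    """Return weights w (last entry is a bias) minimising sum((x.w - y)^2)."""
    rows = [list(x) + [1.0] for x in features]  # append a constant bias term
    n = len(rows[0])
    # Normal equations A w = b, with A = X^T X and b = X^T y.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for k in range(col + 1, n):
            factor = a[k][col] / a[col][col]
            for j in range(col, n):
                a[k][j] -= factor * a[col][j]
            b[k] -= factor * b[col]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - tail) / a[i][i]
    return w


def predict_quality(weights, feature_vector):
    """Pool one feature vector into a single predicted quality score."""
    pooled = sum(w * x for w, x in zip(weights, feature_vector))
    return pooled + weights[-1]  # add the bias term


# Hypothetical training set: two distortion features per signal, paired with
# subjective scores generated by the toy rule 5 - 2*f1 - 1*f2.
TRAIN_FEATURES = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
TRAIN_SCORES = [5.0, 3.0, 4.0, 2.0, 3.5]

weights = fit_pooling_weights(TRAIN_FEATURES, TRAIN_SCORES)
print(weights)
print(predict_quality(weights, [0.25, 0.5]))
```

Because the toy scores follow an exactly linear rule, the fit recovers that rule up to floating-point error. Real subjective scores are noisy and generally nonlinear in the features, which is one reason a more flexible learner may be preferred in practice.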