Learning based signal quality assessment for multimedia communications
Multimedia contents (including image/video, speech, audio, graphic and so on) can be affected by a wide variety of distortions during the process of acquisition, compression, processing, transmission, and reproduction which generally leads to loss of perceptual quality. As a result, signal quality a...
Saved in:
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Language: | English |
Published: |
2012
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/50753 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
id |
sg-ntu-dr.10356-50753 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-507532023-03-04T00:45:22Z Learning based signal quality assessment for multimedia communications Manish Narwaria Lin Weisi School of Computer Engineering Centre for Multimedia and Network Technology DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Multimedia contents (including image/video, speech, audio, graphic and so on) can be affected by a wide variety of distortions during the process of acquisition, compression, processing, transmission, and reproduction which generally leads to loss of perceptual quality. As a result, signal quality assessment is an important component in today’s multimedia communication systems. In this thesis, perceptual quality assessment algorithms are proposed for three important types of multimedia signals, namely image, video, and speech. This involves two crucial stages: (a) feature extraction/detection, and (b) feature pooling. The first stage calls for investigation and analysis into appropriate and effective signal features to extract meaningful information and provide a compact representation of the signal with the regard of quality. This is crucial because the selected features form the basis of the resultant quality metric. In this thesis, we discuss and provide detailed analysis of features based on Singular Value Decomposition, 2D mel-cepstrum and phase of Fourier Transform for visual quality assessment. We analyse the advantages and disadvantages of these features with regards to prediction accuracy and complexity. We also investigate into mel filter bank energies as features for evaluating quality of noisesuppressed speech and provide justification for their effectiveness via theoretical and experimental analysis. On the other hand, the second stage requires the determination of appropriate weights for fusing the features into a single score that can accurately reflect the human judgement of perceptual quality. We tackle this by using machine learning techniques which have been successfully employed in numerous research areas (for example in computer vision tasks such as object localization/tracking/recognition) but have not been adequately addressed in the literature within the realm of objective quality evaluation. Their major advantage is the introduction of a more systematic pooling methodology thereby avoiding unrealistic assumptions imposed in existing pooling methods. In this thesis, we demonstrate that machine learning can be effective in quality assessment if proper signal features are detected. We also provide insights into machine learning based feature pooling by analyzing the system trained on subjective scores which quantify human perception. The proposed algorithms have been validated on a large number of subjectively rated databases which are publicly available. We have performed careful experimental analysis (including within database and cross database tests) and demonstrated that the proposed schemes overall perform better than several relevant methods. The better alignment with human perception confirms the effectiveness of the algorithms proposed in this thesis. DOCTOR OF PHILOSOPHY (SCE) 2012-10-16T04:58:46Z 2012-10-16T04:58:46Z 2012 2012 Thesis Manish, N. (2012). Learning based signal quality assessment for multimedia communications. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/50753 10.32657/10356/50753 en 240 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |
spellingShingle |
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Manish Narwaria Learning based signal quality assessment for multimedia communications |
description |
Multimedia contents (including image/video, speech, audio, graphic and so on) can be affected by a wide variety of distortions during the process of acquisition, compression, processing, transmission, and reproduction which generally leads to loss of perceptual quality. As a result, signal quality assessment is an important component in today’s multimedia communication systems. In this thesis, perceptual quality assessment algorithms are proposed for three important types of multimedia signals, namely image, video, and speech. This involves two crucial stages: (a) feature extraction/detection, and (b) feature pooling.
The first stage calls for investigation and analysis into appropriate and effective signal features to extract meaningful information and provide a compact representation of the signal with the regard of quality. This is crucial because the selected features form the
basis of the resultant quality metric. In this thesis, we discuss and provide detailed analysis of features based on Singular Value Decomposition, 2D mel-cepstrum and phase of Fourier Transform for visual quality assessment. We analyse the advantages and
disadvantages of these features with regards to prediction accuracy and complexity. We also investigate into mel filter bank energies as features for evaluating quality of noisesuppressed
speech and provide justification for their effectiveness via theoretical and experimental analysis.
On the other hand, the second stage requires the determination of appropriate weights for fusing the features into a single score that can accurately reflect the human judgement of perceptual quality. We tackle this by using machine learning techniques which have
been successfully employed in numerous research areas (for example in computer vision tasks such as object localization/tracking/recognition) but have not been adequately
addressed in the literature within the realm of objective quality evaluation. Their major advantage is the introduction of a more systematic pooling methodology thereby avoiding unrealistic assumptions imposed in existing pooling methods. In this thesis, we demonstrate that machine learning can be effective in quality assessment if proper signal features are detected. We also provide insights into machine learning based feature pooling by analyzing the system trained on subjective scores which quantify human perception.
The proposed algorithms have been validated on a large number of subjectively rated databases which are publicly available. We have performed careful experimental analysis (including within database and cross database tests) and demonstrated that the proposed schemes overall perform better than several relevant methods. The better alignment with human perception confirms the effectiveness of the algorithms proposed in this thesis. |
author2 |
Lin Weisi |
author_facet |
Lin Weisi Manish Narwaria |
format |
Theses and Dissertations |
author |
Manish Narwaria |
author_sort |
Manish Narwaria |
title |
Learning based signal quality assessment for multimedia communications |
title_short |
Learning based signal quality assessment for multimedia communications |
title_full |
Learning based signal quality assessment for multimedia communications |
title_fullStr |
Learning based signal quality assessment for multimedia communications |
title_full_unstemmed |
Learning based signal quality assessment for multimedia communications |
title_sort |
learning based signal quality assessment for multimedia communications |
publishDate |
2012 |
url |
https://hdl.handle.net/10356/50753 |
_version_ |
1759856813827686400 |