Perceptual and content based analysis for coding multimedia information

This report covers the RGM project “Perceptual and Content Based Analysis for Coding Multimedia Information” (22 October 2002 – 30 June 2006). Chapter 1 deals with the topic of content-based image retrieval (CBIR), where three new approaches are proposed: 1) a framework of knowledge-driven CBIR; 2)...

Bibliographic Details
Main Author: Xue, Ping.
Other Authors: School of Electrical and Electronic Engineering
Format: Research Report
Language: English
Published: 2008
Subjects: DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory
Online Access: http://hdl.handle.net/10356/14519
Institution: Nanyang Technological University
id sg-ntu-dr.10356-14519
record_format dspace
spelling sg-ntu-dr.10356-14519 2023-03-04T03:20:03Z Perceptual and content based analysis for coding multimedia information Xue, Ping. School of Electrical and Electronic Engineering DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory 2008-11-26T08:25:09Z 2008-11-26T08:25:09Z 2006 2006 Research Report http://hdl.handle.net/10356/14519 en 221 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory
description This report covers the RGM project “Perceptual and Content Based Analysis for Coding Multimedia Information” (22 October 2002 – 30 June 2006).
Chapter 1 deals with content-based image retrieval (CBIR), where three new approaches are proposed: 1) a framework of knowledge-driven CBIR; 2) a criterion to maximize the retrieval performance of Kernel-based Biased Discriminant Analysis (KBDA); and 3) a novel saliency-weighted region-based image retrieval (SW-RBIR) algorithm. This leads to a new image retrieval strategy, in which a suitable CBIR algorithm is selected based on the image characteristics of foreground objects and backgrounds.
Chapter 2 extends the work to video analysis. Two multi-resolution video representation schemes are proposed, based on Kernel-based Principal Component Analysis (KPCA) and Mean Shift Analysis (MSA) respectively. Both can yield efficient multi-resolution video representations simply by tuning their internal parameters. Motion vectors are extracted from the MPEG video stream at low computational cost and then processed. This approach is especially suitable for Skycam-based applications, where the image resolution is usually low.
Chapter 3 treats the model selection issue of kernel methods. A unified model selection criterion for both bi-class and multi-class Support Vector Machines (SVMs) is proposed; it is based on the gradient descent method and is conceptually simple and easy to implement. The criterion is then extended to Kernel-based Linear Discriminant Analysis (KLDA). A generalized radius-margin bound is developed for multi-class SVMs to efficiently perform both model selection and feature selection.
Chapter 4 discusses video repeat identification and video structure analysis. Effective solutions for identifying short video repeats in video collections or streams are developed for the purposes of video structure analysis, important event mining, and commercial detection and skipping.
Chapter 5 focuses on image segmentation based on the disjoint set union. A new watershed algorithm is proposed to address issues such as over-segmentation and memory overflow in some existing methods.
Chapter 6 explores the characteristics of the human visual system (HVS) and proposes an improved scheme for estimating just-noticeable distortion (JND). In general, any information below the JND can be ignored. Applying the JND gauge to image and video compression can improve coding efficiency, as bits can be allocated according to the JND thresholds in different areas. In other words, coded images and videos show improved perceptual quality compared with traditional coding methods.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
author2 School of Electrical and Electronic Engineering
author_facet School of Electrical and Electronic Engineering
Xue, Ping.
format Research Report
author Xue, Ping.
author_sort Xue, Ping.
title Perceptual and content based analysis for coding multimedia information
publishDate 2008
url http://hdl.handle.net/10356/14519
_version_ 1759853803703631872