Discovering class-specific visual patterns for visual recognition
| Main Author: | Weng, Chaoqun |
|---|---|
| Other Authors: | Yuan Junsong |
| Format: | Theses and Dissertations |
| Language: | English |
| Published: | 2017 |
| Subjects: | DRNTU::Engineering::Electrical and electronic engineering |
| Online Access: | http://hdl.handle.net/10356/72462 |
| DOI: | 10.32657/10356/72462 |
| Degree: | Doctor of Philosophy (EEE) |
| Institution: | Nanyang Technological University |
Description:
Similar to frequent patterns in data mining, a visual pattern refers to a recurring composition of visual content in images or videos, such as repetitive texture regions, common objects across images, or similar actions across videos. Such visual patterns capture the recurrent nature of visual data and can represent its essence. Finding such visual patterns is critical to image and video data analysis.
Despite recent successes in the unsupervised mining of representative visual patterns from unlabeled visual data, for visual recognition tasks the patterns mined without supervision are often not discriminative enough to distinguish among different classes. A natural way to overcome this limitation is to leverage supervised learning and discover class-specific visual patterns, which is the focus of this thesis. In particular, we target visual patterns of three different structures: (1) class-specific local spatial patterns, e.g., local texture structures that help differentiate object images of different classes; (2) class-specific spatial layout patterns, e.g., spatial grid patterns that help differentiate scene images; and (3) class-specific compositional patterns, e.g., conjunction (AND) and disjunction (OR) forms of individual visual features that help differentiate scene images and action videos.
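As a schematic illustration of the third structure (the notation here is ours, not taken from the thesis), an AND/OR pattern over binary feature activations $f_j(x) \in \{0, 1\}$ could take the form

$$
P(x) \;=\; \bigl(f_2(x) \wedge f_5(x)\bigr) \;\vee\; \bigl(f_1(x) \wedge f_9(x) \wedge f_{12}(x)\bigr),
$$

which fires when either conjunction of features is active; such composite predicates can be far more class-discriminative than any individual feature on its own.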
To discover the above class-specific visual patterns, this thesis comprises the following technical works. In the first work, we propose to mine mid-level visual phrases from low-level visual primitives, e.g., local image patches or regions, by leveraging the local spatial context of the primitives, multi-feature fusion of the primitives, and weakly supervised image-label information. In the second work, we propose to discover a class-specific spatial layout for each scene category by casting the selection as an ℓ1-regularized max-margin optimization problem (sketched after this paragraph). In the third work, we propose a novel branch-and-bound co-occurrence pattern mining algorithm that directly mines optimal conjunctions (AND) and disjunctions (OR) of individual features, at arbitrary orders and simultaneously, with minimum classification error, as weak learners for a boosting algorithm (also sketched below). Similar to the third work, in the fourth work we discover high-order AND/OR patterns of skeleton features from a depth camera for action recognition, and we integrate the discovered AND/OR patterns into an attention LSTM model for temporal modeling to further improve recognition performance.
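To make the second and third works concrete, here are two illustrative sketches; the symbols, variable names, and simplifications are ours, not the thesis's exact formulations. The layout discovery of the second work can be read as an ℓ1-regularized max-margin problem

$$
\min_{\mathbf{w},\, b}\;\; \lambda \lVert \mathbf{w} \rVert_1 \;+\; \sum_{i=1}^{N} \max\bigl(0,\; 1 - y_i\,(\mathbf{w}^{\top} \boldsymbol{\phi}(x_i) + b)\bigr),
$$

where $\boldsymbol{\phi}(x_i)$ is assumed to stack the responses of candidate spatial grid cells for image $x_i$; the ℓ1 penalty drives most cell weights to zero, and the surviving nonzero weights define the class-specific layout. For the third work, the following toy Python sketch mines an order-2 AND pattern as a boosting weak learner, using exhaustive search with a simple admissible pruning bound; the actual algorithm is a branch-and-bound search over AND/OR patterns of arbitrary order, which this simplification does not reproduce.

```python
import numpy as np

def mine_and_pattern(X, y, w):
    """Find the order-2 AND pattern (j, k) with minimum weighted error.

    X : (N, D) binary feature-activation matrix
    y : (N,) labels in {-1, +1}
    w : (N,) boosting sample weights (summing to 1)
    The weak learner is h(x) = +1 if x[j] and x[k] are both active, else -1.
    """
    D = X.shape[1]
    best_pair, best_err = None, np.inf
    for j in range(D):
        # Admissible bound: any AND pattern containing feature j misses
        # every positive sample on which feature j is inactive, so its
        # weighted error is at least the total weight of those samples.
        missed_pos = np.sum(w[(y == 1) & (X[:, j] == 0)])
        if missed_pos >= best_err:
            continue  # prune: no pair containing j can beat the incumbent
        for k in range(j + 1, D):
            fires = (X[:, j] == 1) & (X[:, k] == 1)
            pred = np.where(fires, 1, -1)
            err = np.sum(w[pred != y])
            if err < best_err:
                best_pair, best_err = (j, k), err
    return best_pair, best_err

# Toy usage: class +1 is defined by features 0 AND 3 co-occurring.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 10))
y = np.where((X[:, 0] == 1) & (X[:, 3] == 1), 1, -1)
w = np.full(200, 1.0 / 200)
print(mine_and_pattern(X, y, w))  # expect ((0, 3), 0.0)
```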
Compared with unsupervised visual pattern discovery, which usually separates pattern discovery from classification, our methods learn visual pattern discovery and visual recognition jointly. Moreover, unlike conventional visual recognition, which emphasizes classification performance alone, our class-specific visual patterns aim to capture the essence of the different visual classes, so that we can not only recognize the classes but also explain and understand why they differ, thanks to the discovered class-specific visual patterns.