Towards better fine-grained visual classification: an attention-based, hierarchical approach

Bibliographic Details
Main Author: Gao, Manrong
Other Authors: Jiang Xudong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Subjects:
Online Access: https://hdl.handle.net/10356/167399
Institution: Nanyang Technological University
Description
Summary: Unlike general object classification, which uses convolutional neural networks (CNNs), fine-grained visual classification (FGVC) is a challenging problem that involves categorizing objects that belong to different subcategories distinguished only by subtle, fine-grained details. Furthermore, most fine-grained categories inherently exhibit a hierarchical structure, as exemplified by the classification of birds into orders, families, genera, and species. This hierarchical structure can capture intricate relationships among categories at different levels, which helps reduce ambiguity in predictions. Existing attention-based approaches focus on localizing discriminative parts to learn fine-grained details at a single level of the category hierarchy, and they ignore the hierarchical information available across levels. In this paper, we explore prediction at different levels of the category hierarchy and propose a novel model that incorporates the hierarchical structure into a deep neural network. The proposed model consists of two parts: 1) a visual attention sampling module that emphasizes the most discriminative parts of the image, and 2) a hierarchical classifier with one base net and four branch nets.
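
The "one base net and four branch nets" layout described in the summary can be pictured with a toy sketch. This is not the author's implementation: the record does not give layer types, sizes, or training details, so everything below is an assumption made for illustration. In particular, the mapping of the four branches to the order, family, genus, and species levels, the class counts in `LEVELS`, the single linear layer standing in for the base net, and the random weights are all hypothetical, and the visual attention sampling module is replaced by a fixed feature vector.

```python
import math
import random

# Hypothetical class counts per level (illustrative only, not from the thesis).
LEVELS = {"order": 3, "family": 5, "genus": 8, "species": 12}


def linear(weights, vec):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def relu(vec):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in vec]


def softmax(vec):
    """Numerically stable softmax over a list of scores."""
    peak = max(vec)
    exps = [math.exp(v - peak) for v in vec]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class HierarchicalClassifier:
    """One shared base layer feeding one branch head per hierarchy level."""

    def __init__(self, in_dim, hidden_dim, levels, seed=0):
        rng = random.Random(seed)
        # Shared base: a single linear layer stands in for the base net.
        self.base = random_matrix(rng, hidden_dim, in_dim)
        # One branch per level, each with its own output size.
        self.branches = {
            name: random_matrix(rng, n_classes, hidden_dim)
            for name, n_classes in levels.items()
        }

    def forward(self, x):
        # All branches read the same shared representation.
        shared = relu(linear(self.base, x))
        return {
            name: softmax(linear(weights, shared))
            for name, weights in self.branches.items()
        }


model = HierarchicalClassifier(in_dim=6, hidden_dim=4, levels=LEVELS)
# Stand-in for features produced after attention sampling (assumed input).
features = [0.2, -0.1, 0.7, 0.0, 0.3, -0.4]
predictions = model.forward(features)

for level, probs in predictions.items():
    print(level, len(probs), round(sum(probs), 6))
```

Each branch returns its own probability distribution, so a coarse prediction (for example, the order) is available alongside the fine-grained one (the species) from a single shared representation.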