Proceedings Article

Hierarchical Part Matching for Fine-Grained Visual Categorization

TLDR
A framework named Hierarchical Part Matching (HPM) is proposed for fine-grained classification tasks; it achieves state-of-the-art classification accuracy on the Caltech-UCSD-Birds-200-2011 dataset by making full use of the ground-truth part annotations.
Abstract
Fine-grained visual categorization (FGVC), a special topic in computer vision, has attracted growing attention in recent years. Unlike traditional image classification tasks, in which objects have large inter-class variation, the visual concepts in fine-grained datasets, such as hundreds of bird species, often have very similar semantics. Because of this large inter-class similarity, it is very difficult to classify the objects without locating truly discriminative features, so it becomes more important for the algorithm to make full use of part information in order to train a robust model. In this paper, we propose a framework named Hierarchical Part Matching (HPM) to cope with fine-grained classification tasks. We extend the Bag-of-Features (BoF) model by integrating several novel modules into the image representation: foreground inference and segmentation, Hierarchical Structure Learning (HSL), and Geometric Phrase Pooling (GPP). Our experiments verify that the algorithm achieves state-of-the-art classification accuracy on the Caltech-UCSD-Birds-200-2011 dataset by making full use of the ground-truth part annotations.
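
To make the idea of a part-aware Bag-of-Features representation concrete, the sketch below shows a generic BoF encoding with max-pooling over regions. It is not the HPM method itself: foreground inference, HSL, and GPP are not reproduced here. The function names (`encode_hard`, `pool_regions`), the hard-assignment coding, the axis-aligned part boxes, and the toy data are all assumptions made for illustration only.

```python
import numpy as np

def encode_hard(descriptors, codebook):
    """Assign each local descriptor to its nearest codeword (hard quantization)."""
    # Squared Euclidean distances, shape (n_descriptors, n_codewords).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    codes = np.zeros_like(d2)
    codes[np.arange(len(descriptors)), d2.argmin(axis=1)] = 1.0
    return codes

def pool_regions(codes, positions, regions):
    """Max-pool the codes that fall inside each region, then concatenate."""
    pooled = []
    for (x0, y0, x1, y1) in regions:
        inside = ((positions[:, 0] >= x0) & (positions[:, 0] < x1) &
                  (positions[:, 1] >= y0) & (positions[:, 1] < y1))
        if inside.any():
            pooled.append(codes[inside].max(axis=0))
        else:
            # No descriptor in this region: contribute an all-zero block.
            pooled.append(np.zeros(codes.shape[1]))
    return np.concatenate(pooled)

# Toy data (hypothetical): 3 codewords, 3 descriptors, 2 part boxes.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 1.1], [0.0, 0.9]])
positions = np.array([[5, 5], [15, 5], [25, 25]])
regions = [(0, 0, 10, 10), (10, 0, 20, 10)]  # (x0, y0, x1, y1) per part

feature = pool_regions(encode_hard(descriptors, codebook), positions, regions)
print(feature)
```

In this sketch each part box contributes one block of the final vector, so descriptors from different parts are never mixed. In practice such a vector would be fed to a linear classifier; the part boxes here stand in for regions derived from part annotations.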

