Proceedings ArticleDOI

Higher-Order Integration of Hierarchical Convolutional Activations for Fine-Grained Visual Categorization

TLDR
This work proposes an end-to-end framework based on higher-order integration of hierarchical convolutional activations for FGVC that yields a more discriminative representation and achieves competitive results on widely used FGVC datasets.
Abstract
The success of fine-grained visual categorization (FGVC) relies heavily on modeling the appearance and interactions of various semantic parts. This makes FGVC very challenging because: (i) part annotation and detection require expert guidance and are very expensive; (ii) parts are of different sizes; and (iii) part interactions are complex and of higher order. To address these issues, we propose an end-to-end framework based on higher-order integration of hierarchical convolutional activations for FGVC. By treating convolutional activations as local descriptors, hierarchical convolutional activations can serve as a representation of local parts at different scales. A polynomial-kernel-based predictor is proposed to capture higher-order statistics of convolutional activations for modeling part interactions. To model inter-layer part interactions, we extend the polynomial predictor to integrate hierarchical activations via kernel fusion. Our work also provides a new perspective on combining convolutional activations from multiple layers: while hypercolumns simply concatenate maps from different layers, and the holistically-nested network uses weighted fusion to combine side outputs, our approach exploits higher-order intra-layer and inter-layer relations for better integration of hierarchical convolutional features. The proposed framework yields a more discriminative representation and achieves competitive results on widely used FGVC datasets.
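As a rough illustration of the idea described above (not the authors' code), the sketch below pools second-order statistics of convolutional activations via outer products over spatial locations, with a cross-layer term standing in for the paper's kernel fusion. The projection dimension, the signed square-root / L2 normalisation, and the choice of layers are assumptions for the sketch, not details taken from the paper.

# Minimal sketch (assumed design, PyTorch): second-order "polynomial kernel"
# pooling of convolutional activations with intra-layer and inter-layer terms.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HigherOrderPooling(nn.Module):
    def __init__(self, c1, c2, d=256, num_classes=200):
        super().__init__()
        # 1x1 convolutions project each layer's activations to a common dimension d
        self.proj1 = nn.Conv2d(c1, d, kernel_size=1)
        self.proj2 = nn.Conv2d(c2, d, kernel_size=1)
        # classifier over concatenated intra-layer and inter-layer statistics
        self.fc = nn.Linear(3 * d * d, num_classes)

    @staticmethod
    def outer_pool(a, b):
        # a, b: (N, d, H, W) -> average of outer products over all spatial locations
        n, d, h, w = a.shape
        a = a.flatten(2)                                     # (N, d, HW)
        b = b.flatten(2)
        return torch.bmm(a, b.transpose(1, 2)) / (h * w)     # (N, d, d)

    def forward(self, f1, f2):
        # f1, f2: activations from two convolutional layers of a CNN backbone
        if f1.shape[-2:] != f2.shape[-2:]:                   # align spatial sizes if needed
            f2 = F.interpolate(f2, size=f1.shape[-2:], mode='bilinear',
                               align_corners=False)
        p1, p2 = self.proj1(f1), self.proj2(f2)
        stats = torch.cat([self.outer_pool(p1, p1).flatten(1),   # intra-layer, layer 1
                           self.outer_pool(p2, p2).flatten(1),   # intra-layer, layer 2
                           self.outer_pool(p1, p2).flatten(1)],  # inter-layer term
                          dim=1)
        # signed square-root and L2 normalisation, commonly used for bilinear features
        stats = torch.sign(stats) * torch.sqrt(stats.abs() + 1e-8)
        stats = F.normalize(stats)
        return self.fc(stats)

With a VGG-16 backbone, for example, f1 and f2 could be the activations of two late convolutional layers (c1 = c2 = 512); these choices are illustrative only.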

Citations
Journal ArticleDOI

Cascading Hierarchical Networks with Multi-Task Balanced Loss for Fine-Grained Hashing

Xianxian Zeng, +1 more
20 Mar 2023
TL;DR: In this article, a cascaded hierarchical data augmentation network is proposed to improve the retrieval accuracy of fine-grained hashing, and an attention-guided method is introduced to balance the loss of multi-task learning.
Posted Content

Privileged Pooling: Better Sample Efficiency Through Supervised Attention

TL;DR: This article proposes a visual attention mechanism supervised via keypoint annotations that highlight important object parts; the supervision is required only during training and helps the model focus on discriminative regions.
Proceedings ArticleDOI

A Channel Mix Method for Fine-Grained Cross-Modal Retrieval

TL;DR: A channel mix method is developed and applied to the channels of deep activations across different modalities to enhance cross-modal information interaction for fine-grained objects and to improve intra-class separability as well as inter-class compactness across modalities.
Book ChapterDOI

Selecting Discriminative Features for Fine-Grained Visual Classification

TL;DR: An end-to-end model based on selecting discriminative features for fine-grained visual classification, without the help of part or bounding box annotations, is developed and outperforms state-of-the-art methods on the CUB-200-2011, Stanford Cars and FGVC-Aircraft datasets.
Journal ArticleDOI

Learning more discriminative clues with gradual attention for fine-grained visual categorization

TL;DR: Li et al. as mentioned in this paper proposed an end-to-end network composed of a self-calibrated convolution, a gradual attention module and a feature inverse module for fine-grained visual categorization.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
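For context, a minimal PyTorch sketch of the basic residual connection described above; the channel sizes and layer ordering here are illustrative rather than the paper's exact configuration.

# Minimal sketch of a basic residual block: the block learns a residual F(x)
# and adds it back to the input via an identity shortcut, y = F(x) + x.
import torch.nn as nn

class BasicBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)   # identity shortcut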
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, showing that a significant improvement over prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings ArticleDOI

Going deeper with convolutions

TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that set a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Journal ArticleDOI

ImageNet Large Scale Visual Recognition Challenge

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection spanning hundreds of object categories and millions of images; it has been run annually from 2010 to the present, attracting participation from more than fifty institutions.