
The Caltech-UCSD Birds-200-2011 Dataset

TLDR
CUB-200-2011, as described in this paper, is an extended version of CUB-200 that roughly doubles the number of images per category and adds new part localization annotations; all images are annotated with bounding boxes, part locations, and attribute labels.
Abstract
CUB-200-2011 is an extended version of CUB-200 [7], a challenging dataset of 200 bird species. The extended version roughly doubles the number of images per category and adds new part localization annotations. All images are annotated with bounding boxes, part locations, and attribute labels. Images and annotations were filtered by multiple users of Mechanical Turk. We introduce benchmarks and baseline experiments for multi-class categorization and part localization.
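For orientation, a minimal loading sketch, assuming the standard layout of the public CUB-200-2011 release (plain-text index files such as images.txt, image_class_labels.txt, bounding_boxes.txt, and parts/part_locs.txt); the root path below is a placeholder for a local copy.

```python
# Minimal sketch: reading CUB-200-2011 annotations into Python dicts.
# Assumes the standard release layout; adjust ROOT to your local copy.
from pathlib import Path

ROOT = Path("CUB_200_2011")  # placeholder path to the extracted dataset

def read_lines(name):
    with open(ROOT / name) as f:
        return [line.split() for line in f if line.strip()]

# image_id -> relative image path
images = {int(i): p for i, p in read_lines("images.txt")}

# image_id -> class_id (1..200)
labels = {int(i): int(c) for i, c in read_lines("image_class_labels.txt")}

# image_id -> (x, y, width, height) bounding box in pixels
boxes = {int(i): tuple(map(float, rest))
         for i, *rest in read_lines("bounding_boxes.txt")}

# (image_id, part_id) -> (x, y, visible) part location annotations
parts = {(int(i), int(p)): (float(x), float(y), int(v))
         for i, p, x, y, v in read_lines("parts/part_locs.txt")}

print(len(images), "images,", len(set(labels.values())), "classes")
```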



Citations
Proceedings Article

Spatial transformer networks

TL;DR: This work introduces a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network, and can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps.
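A minimal sketch of the idea, not the authors' implementation: a small localization network regresses affine parameters, which then drive a differentiable grid sampler. Layer sizes here are arbitrary assumptions.

```python
# Minimal sketch of a spatial transformer block (affine case) in PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialTransformer(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        # Localization network: regresses 6 affine parameters from the input.
        self.loc = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=7), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(8 * 4 * 4, 32), nn.ReLU(),
            nn.Linear(32, 6),
        )
        # Initialize to the identity transform so training starts stably.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.loc(x).view(-1, 2, 3)                  # per-sample affine matrix
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)  # differentiable resampling

x = torch.randn(2, 3, 64, 64)
print(SpatialTransformer(3)(x).shape)  # torch.Size([2, 3, 64, 64])
```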
Posted Content

CNN Features off-the-shelf: an Astounding Baseline for Recognition

TL;DR: A series of experiments conducted for different recognition tasks using the publicly available code and model of the OverFeat network which was trained to perform object classification on ILSVRC13 suggest that features obtained from deep learning with convolutional nets should be the primary candidate in most visual recognition tasks.
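A minimal sketch of that recipe under stated assumptions: OverFeat itself is not packaged in torchvision, so a pretrained ResNet-18 stands in as the frozen feature extractor, with a linear head on top (the weights enum requires a recent torchvision).

```python
# Off-the-shelf features: freeze a pretrained CNN, use its penultimate
# activations as a generic image representation, and fit a linear classifier.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()          # expose the 512-d pooled features
backbone.eval()

@torch.no_grad()
def extract(images):                 # images: (N, 3, 224, 224), normalized
    return backbone(images)          # (N, 512) feature vectors

# A linear head (or an SVM on saved features) is then trained per task.
head = nn.Linear(512, 200)           # e.g. 200 bird classes for CUB-200-2011
logits = head(extract(torch.randn(4, 3, 224, 224)))
print(logits.shape)                  # torch.Size([4, 200])
```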
Journal ArticleDOI

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

TL;DR: The Visual Genome dataset as mentioned in this paper contains over 108k images where each image has an average of 35 objects, 26 attributes, and 21 pairwise relationships between objects.
Proceedings ArticleDOI

CNN Features Off-the-Shelf: An Astounding Baseline for Recognition

TL;DR: In this paper, features extracted from the OverFeat network are used as a generic image representation to tackle the diverse range of recognition tasks of object image classification, scene recognition, fine grained recognition, attribute detection and image retrieval applied to a diverse set of datasets.
Proceedings ArticleDOI

CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features

TL;DR: CutMix as discussed by the authors augments the training data by cutting and pasting patches among training images, where the ground truth labels are also mixed proportionally to the area of the patches.
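A simplified sketch of that augmentation, not the authors' reference implementation: a random patch from a shuffled copy of the batch is pasted in place, and labels are mixed in proportion to the patch area.

```python
# Simplified CutMix: paste a random patch from a permuted batch and mix
# one-hot labels by the (clipped) patch area. Modifies the batch in place.
import numpy as np
import torch

def cutmix(images, labels_onehot, alpha=1.0):
    lam = np.random.beta(alpha, alpha)           # target mixing ratio
    idx = torch.randperm(images.size(0))         # pairing permutation
    _, _, H, W = images.shape
    # Sample a box whose area fraction is roughly (1 - lam).
    rh, rw = int(H * np.sqrt(1 - lam)), int(W * np.sqrt(1 - lam))
    cy, cx = np.random.randint(H), np.random.randint(W)
    y1, y2 = np.clip([cy - rh // 2, cy + rh // 2], 0, H)
    x1, x2 = np.clip([cx - rw // 2, cx + rw // 2], 0, W)
    images[:, :, y1:y2, x1:x2] = images[idx, :, y1:y2, x1:x2]
    # Recompute the ratio from the actual (clipped) patch area.
    lam = float(1 - (y2 - y1) * (x2 - x1) / (H * W))
    return images, lam * labels_onehot + (1 - lam) * labels_onehot[idx]
```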
References
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Journal ArticleDOI

The Pascal Visual Object Classes (VOC) Challenge

TL;DR: The state of the art in evaluated methods for both classification and detection is reviewed, along with whether the methods are statistically different, what they are learning from the images, and what they find easy or confusing.
Proceedings ArticleDOI

A discriminatively trained, multiscale, deformable part model

TL;DR: A discriminatively trained, multiscale, deformable part model for object detection, which achieves a two-fold improvement in average precision over the best performance in the 2006 PASCAL person detection challenge and outperforms the best results in the 2007 challenge in ten out of twenty categories.

Caltech-256 Object Category Dataset

TL;DR: A challenging set of 256 object categories containing a total of 30607 images is introduced and the clutter category is used to train an interest detector which rejects uninformative background regions.
Proceedings ArticleDOI

Learning to detect unseen object classes by between-class attribute transfer

TL;DR: The experiments show that by using an attribute layer it is indeed possible to build a learning object detection system that does not require any training images of the target classes, and assembled a new large-scale dataset, “Animals with Attributes”, of over 30,000 animal images that match the 50 classes in Osherson's classic table of how strongly humans associate 85 semantic attributes with animal classes.
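A toy sketch of the attribute-layer idea: per-attribute probabilities (here hard-coded, standing in for classifiers trained on seen classes) are matched against a made-up class–attribute table to score unseen classes. This is a simplification of the paper's probabilistic formulation.

```python
# Zero-shot scoring through an attribute layer (simplified).
import numpy as np

# class_attributes[c, a] = 1 if unseen class c is associated with attribute a
# (placeholder table; the real dataset uses 85 attributes over 50 classes).
class_attributes = np.array([[1, 0, 1, 1],
                             [0, 1, 1, 0]])

def predict_unseen(attribute_probs):
    """attribute_probs: per-image attribute probabilities from classifiers
    trained only on seen classes; returns the best-matching unseen class."""
    scores = attribute_probs @ class_attributes.T  # match attribute signatures
    return scores.argmax(axis=1)

probs = np.array([[0.9, 0.1, 0.8, 0.7]])
print(predict_unseen(probs))  # -> [0]
```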