Microsoft COCO: Common Objects in Context

Open Access · Posted Content

- 01 May 2014 -

arXiv: Computer Vision and Pattern Recog...

Abstract:

We present a new dataset with the goal of advancing the state of the art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object localization. Our dataset contains photos of 91 object types that would be easily recognizable by a 4-year-old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd worker involvement via novel user interfaces for category detection, instance spotting and instance segmentation. We present a detailed statistical analysis of the dataset in comparison to PASCAL, ImageNet, and SUN. Finally, we provide baseline performance analysis for bounding box and segmentation detection results using a Deformable Parts Model.
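
As a rough sanity check on the scale figures quoted in the abstract, the averages they imply can be worked out directly. The sketch below uses only the rounded totals from the abstract (2.5 million instances, 328k images, 91 object types); the variable names are illustrative, and the results are approximations rather than official dataset statistics.

```python
# Rounded totals as quoted in the abstract (not exact dataset counts).
labeled_instances = 2_500_000
images = 328_000
object_types = 91

# Implied averages; these inherit the rounding of the inputs.
instances_per_image = labeled_instances / images
instances_per_type = labeled_instances / object_types

print(f"about {instances_per_image:.1f} labeled instances per image")
print(f"about {instances_per_type:,.0f} labeled instances per object type")
```

These are simple means over the whole dataset; they say nothing about how instances are distributed across individual images or categories.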

Citations

Posted Content

Fast R-CNN

Ross Girshick

- 30 Apr 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection that builds on previous work to efficiently classify object proposals using deep convolutional networks.

Posted Content

Long-term Recurrent Convolutional Networks for Visual Recognition and Description

Jeff Donahue, +6 more

- 17 Nov 2014 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Proposes a novel recurrent convolutional architecture suitable for large-scale visual learning that is end-to-end trainable, and shows that such models have distinct advantages over state-of-the-art models for recognition or generation that are separately defined and/or optimized.

Posted Content

Recurrent Neural Network Regularization

Wojciech Zaremba, +2 more

- 08 Sep 2014 -

arXiv: Neural and Evolutionary Computing

TL;DR: This paper shows how to correctly apply dropout to LSTMs, and shows that it substantially reduces overfitting on a variety of tasks.

Proceedings Article

Conditional Random Fields as Recurrent Neural Networks

Shuai Zheng, +7 more

- 11 Feb 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: A new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Fields (CRFs)-based probabilistic graphical modelling is introduced, and top results are obtained on the challenging Pascal VOC 2012 segmentation benchmark.

References

Proceedings Article

ImageNet: A large-scale hierarchical image database

Jia Deng, +5 more

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

Proceedings Article

Histograms of oriented gradients for human detection

Navneet Dalal, +1 more

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Book Chapter

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin, +7 more

TL;DR: Presents a new dataset that aims to advance the state of the art in object recognition by placing it in the context of the broader question of scene understanding, gathering images of complex everyday scenes containing common objects in their natural context.

Proceedings Article

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick, +3 more

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.

Journal Article

The Pascal Visual Object Classes (VOC) Challenge

Mark Everingham, +4 more

- 01 Jun 2010 -

International Journal of Computer Vision

TL;DR: Reviews the state of the art in evaluated methods for both classification and detection, and analyzes whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

Related Papers (5)

Deep Residual Learning for Image Recognition

ImageNet: A large-scale hierarchical image database

ImageNet Large Scale Visual Recognition Challenge

Very Deep Convolutional Networks for Large-Scale Image Recognition

ImageNet Classification with Deep Convolutional Neural Networks