Open Access, Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
- Vol. 25, pp 1097-1105
TLDR
A deep convolutional neural network with five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers ending in a 1000-way softmax achieved state-of-the-art results on ImageNet classification.
Abstract:
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
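The abstract reports top-1 and top-5 error rates. As a minimal sketch of how such a metric is typically computed (the function name `top_k_error` and the toy scores below are illustrative assumptions, not taken from the paper), an example counts as an error when its true label is not among the k highest-scoring classes:

```python
from typing import Sequence


def top_k_error(scores: Sequence[Sequence[float]],
                labels: Sequence[int],
                k: int) -> float:
    """Fraction of examples whose true label is not among the k top-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("need at least one example")
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Hypothetical toy data: 4 examples, 6 classes (not results from the paper).
toy_scores = [
    [0.50, 0.20, 0.10, 0.10, 0.05, 0.05],  # true class 0 is ranked first
    [0.30, 0.40, 0.10, 0.10, 0.05, 0.05],  # true class 0 is ranked second
    [0.06, 0.10, 0.40, 0.30, 0.10, 0.04],  # true class 5 is ranked last
    [0.10, 0.10, 0.10, 0.60, 0.05, 0.05],  # true class 3 is ranked first
]
toy_labels = [0, 0, 5, 3]

top1 = top_k_error(toy_scores, toy_labels, k=1)
top5 = top_k_error(toy_scores, toy_labels, k=5)
```

On this toy data the second and third examples are top-1 errors, while only the third is a top-5 error, which illustrates why the top-5 rate is never higher than the top-1 rate.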
Citations
Proceedings Article
SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks
TL;DR: This work proves that the core reason Siamese trackers still have an accuracy gap is the lack of strict translation invariance, and proposes a new model architecture that performs depth-wise and layer-wise aggregations, which both improves accuracy and reduces model size.
Proceedings Article
CNN architectures for large-scale audio classification
Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, R. Channing Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold, Malcolm Slaney, Ron Weiss, Kevin W. Wilson
TL;DR: In this paper, the authors used various CNN architectures to classify the soundtracks of a dataset of 70M training videos (5.24 million hours) with 30,871 video-level labels.
Proceedings Article
Learning to Adapt Structured Output Space for Semantic Segmentation
Yi-Hsuan Tsai, Wei-Chih Hung, Samuel Schulter, Kihyuk Sohn, Ming-Hsuan Yang, Manmohan Chandraker
TL;DR: In this paper, a multi-level adversarial network is proposed to perform output space domain adaptation at different feature levels, including synthetic-to-real and cross-city scenarios.
Posted Content
Self-Normalizing Neural Networks
TL;DR: Self-normalizing neural networks (SNNs) are introduced to enable high-level abstract representations, and it is proved that activations close to zero mean and unit variance that are propagated through many network layers will converge towards zero mean and unit variance, even in the presence of noise and perturbations.
Journal Article
CellProfiler 3.0: Next-generation image processing for biology.
Claire McQuin, Allen Goodman, Vasiliy S. Chernyshev, Lee Kamentsky, Beth A. Cimini, Kyle W. Karhohs, Minh Doan, Liya Ding, Susanne M. Rafelski, Derek Thirstrup, Winfried Wiegraebe, Shantanu Singh, Tim Becker, Juan C. Caicedo, Anne E. Carpenter
TL;DR: CellProfiler 3.0 is described, a new version of the software supporting both whole-volume and plane-wise analysis of three-dimensional image stacks, increasingly common in biomedical research.
References
Journal Article
Random Forests
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Proceedings Article
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Book Chapter
Learning internal representations by error propagation
TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.
Dissertation
Learning Multiple Layers of Features from Tiny Images
TL;DR: In this paper, the authors describe how to train a multi-layer generative model of natural images, using a dataset of millions of tiny colour images.
Proceedings Article
Rectified Linear Units Improve Restricted Boltzmann Machines
Vinod Nair, Geoffrey E. Hinton
TL;DR: Restricted Boltzmann machines were developed using binary stochastic hidden units; replacing these with rectified linear units yields features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.