scispace - formally typeset
Book ChapterDOI

GradientBased Learning Applied to Document Recognition

TLDR
Various methods applied to handwritten character recognition are reviewed and compared and Convolutional Neural Networks, that are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques.
Abstract
Multilayer Neural Networks trained with the backpropagation algorithm constitute the best example of a successful Gradient-Based Learning technique. Given an appropriate network architecture, Gradient-Based Learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional Neural Networks, that are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called Graph Transformer Networks (GTN), allows such multi-module systems to be trained globally using Gradient-Based methods so as to minimize an overall performance measure. Two systems for on-line handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of Graph Transformer Networks. A Graph Transformer Network for reading bank check is also described. It uses Convolutional Neural Network character recognizers combined with global training techniques to provides record accuracy on business and personal checks. It is deployed commercially and reads several million checks per day.

read more

Citations
More filters
Proceedings ArticleDOI

A deep learning approach to identifying source code in images and video

TL;DR: A deep learning solution based on convolutional neural networks and autoencoders is developed that provides a more scalable basis for video indexing that can be incorporated into existing software search and mining tools.
Posted Content

Robust Anomaly Detection in Images using Adversarial Autoencoders

TL;DR: In this article, an adversarial autoencoder architecture is adapted, which imposes a prior distribution on the latent representation, typically placing anomalies into low likelihood-regions, which results in an anomaly detector that is significantly more robust to the presence of outliers during training.
Posted Content

Learning Deep ResNet Blocks Sequentially using Boosting Theory

TL;DR: BoostResNet as discussed by the authors proposes a boosting theory for the ResNet architecture and proves that the training error decays exponentially with the depth of the network if the weak module classifiers perform slightly better than some weak baseline.
Journal ArticleDOI

Learning Scene Illumination by Pairwise Photos from Rear and Front Mobile Cameras

TL;DR: A learning based method to recover low‐frequency scene illumination represented as spherical harmonic functions by pairwise photos from rear and front cameras on mobile devices is proposed and produces visually and quantitatively superior results compared to the state‐of‐the‐arts.
Posted Content

PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization

TL;DR: The results demonstrate that PAGE not only converges much faster than SGD in training but also achieves the higher test accuracy, validating the theoretical results and confirming the practical superiority of PAGE.
Related Papers (5)