GradientBased Learning Applied to Document Recognition

doi:10.1109/9780470544976.CH9

Book ChapterDOI

GradientBased Learning Applied to Document Recognition

- pp 306-351

TLDR

Various methods applied to handwritten character recognition are reviewed and compared and Convolutional Neural Networks, that are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques.

Abstract:

Multilayer Neural Networks trained with the backpropagation algorithm constitute the best example of a successful Gradient-Based Learning technique. Given an appropriate network architecture, Gradient-Based Learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional Neural Networks, that are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called Graph Transformer Networks (GTN), allows such multi-module systems to be trained globally using Gradient-Based methods so as to minimize an overall performance measure. Two systems for on-line handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of Graph Transformer Networks. A Graph Transformer Network for reading bank check is also described. It uses Convolutional Neural Network character recognizers combined with global training techniques to provides record accuracy on business and personal checks. It is deployed commercially and reads several million checks per day.

Citations

PDF

Open Access

More filters

Proceedings ArticleDOI

Multi-manifold deep metric learning for image set classification

Jiwen Lu, +4 more

TL;DR: A multi-manifold deep metric learning method for image set classification, which aims to recognize an object of interest from a set of image instances captured from varying viewpoints or under varying illuminations, achieves the state-of-the-art performance on five widely used datasets.

...read moreread less

Posted Content

Unsupervised Discovery of Interpretable Directions in the GAN Latent Space

Andrey Voynov, +1 more

- 10 Feb 2020 -

arXiv: Learning

TL;DR: This paper introduces an unsupervised method to identify interpretable directions in the latent space of a pretrained GAN model by a simple model-agnostic procedure, and finds directions corresponding to sensible semantic manipulations without any form of (self-)supervision.

...read moreread less

Proceedings ArticleDOI

Cross-Domain Self-Supervised Multi-task Feature Learning Using Synthetic Imagery

Zhongzheng Ren, +1 more

TL;DR: A novel multi-task deep network to learn generalizable high-level visual representations based on adversarial learning is proposed and it is demonstrated that the network learns more transferable representations compared to single-task baselines.

...read moreread less

Posted Content

FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.

Huikai Wu, +4 more

- 28 Mar 2019 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This work proposes a novel joint upsampling module named Joint Pyramid Upsampling (JPU), which achieves the state-of-the-art performance in Pascal Context dataset and ADE20K dataset while running 3 times faster.

...read moreread less

Proceedings ArticleDOI

Improved musical onset detection with Convolutional Neural Networks

Jan Schlüter, +1 more

TL;DR: It is shown that CNNs outperform the previous state-of-the-art while requiring less manual preprocessing, suggesting that even for well-understood signal processing tasks, machine learning can be superior to knowledge engineering.

...read moreread less

Collapse

GradientBased Learning Applied to Document Recognition

Citations

Multi-manifold deep metric learning for image set classification

Unsupervised Discovery of Interpretable Directions in the GAN Latent Space

Cross-Domain Self-Supervised Multi-task Feature Learning Using Synthetic Imagery

FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.

Improved musical onset detection with Convolutional Neural Networks

Related Papers (5)

Gradient-based learning applied to document recognition

Deep Residual Learning for Image Recognition

ImageNet Classification with Deep Convolutional Neural Networks

Very Deep Convolutional Networks for Large-Scale Image Recognition

ImageNet: A large-scale hierarchical image database