Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition

Posted Content (Open Access)

Max Jaderberg, +3 more

- 09 Jun 2014 -

arXiv: Computer Vision and Pattern Recog...

TLDR

This work presents a framework for the recognition of natural scene text that does not require any human-labelled data, and performs word recognition on the whole image holistically, departing from the character based recognition systems of the past.

Abstract:

In this work we present a framework for the recognition of natural scene text. Our framework does not require any human-labelled data, and performs word recognition on the whole image holistically, departing from the character based recognition systems of the past. The deep neural network models at the centre of this framework are trained solely on data produced by a synthetic text generation engine -- synthetic data that is highly realistic and sufficient to replace real data, giving us infinite amounts of training data. This excess of data exposes new possibilities for word recognition models, and here we consider three models, each one "reading" words in a different way: via 90k-way dictionary encoding, character sequence encoding, and bag-of-N-grams encoding. In the scenarios of language based and completely unconstrained text recognition we greatly improve upon state-of-the-art performance on standard datasets, using our fast, simple machinery and requiring zero data-acquisition costs.
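The three ways of "reading" a word named in the abstract can be pictured as three different training targets for the same word image. The following is a minimal sketch only: the toy lexicon, the alphabet, the maximum word length, the maximum N-gram length, and all function names are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Set

# Hypothetical toy lexicon; the dictionary model in the abstract is 90k-way.
LEXICON: List[str] = ["cafe", "exit", "open", "stop"]
WORD_TO_CLASS: Dict[str, int] = {w: i for i, w in enumerate(LEXICON)}

# Assumed character set and an extra "null" class used for padding.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"
NULL_CLASS = len(ALPHABET)
MAX_LEN = 8  # assumed maximum word length for this sketch


def dictionary_encoding(word: str) -> int:
    """Dictionary encoding: the whole word is a single class label."""
    return WORD_TO_CLASS[word]


def char_sequence_encoding(word: str, max_len: int = MAX_LEN) -> List[int]:
    """Character sequence encoding: one class per position, padded with null."""
    if len(word) > max_len:
        raise ValueError(f"word longer than {max_len} characters")
    codes = [ALPHABET.index(c) for c in word]
    return codes + [NULL_CLASS] * (max_len - len(codes))


def bag_of_ngrams(word: str, max_n: int = 3) -> Set[str]:
    """Bag-of-N-grams encoding: the unordered set of substrings up to max_n."""
    return {
        word[i:i + n]
        for n in range(1, max_n + 1)
        for i in range(len(word) - n + 1)
    }


if __name__ == "__main__":
    print(dictionary_encoding("open"))
    print(char_sequence_encoding("stop"))
    print(sorted(bag_of_ngrams("spa", max_n=2)))
```

The dictionary target can only name words in the lexicon, whereas the character sequence and N-gram targets are defined for any string over the alphabet, which is what makes unconstrained recognition possible in principle.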

Citations

Proceedings Article

Japanese Scene Character Recognition using Random Image Feature and Ensemble Scheme.

Fuma Horie, +1 more

TL;DR: A Random Image Feature (RI-Feature) method is proposed to improve ensemble learning, and the HOG feature is shown to outperform a CNN in Japanese scene character recognition.

Deep Neural Networks for Selected Natural Language Processing Tasks

Jiří Martínek

TL;DR: The report focuses on modern deep neural network classifiers, which are first introduced theoretically with the support of relevant publications and then experimentally verified on suitable datasets, achieving excellent results in natural language processing tasks.

Book Chapter

A Multi-level Progressive Rectification Mechanism for Irregular Scene Text Recognition

Qianying Liao, +6 more

TL;DR: The authors propose a multi-level progressive rectification mechanism, which consists of global and local rectification modules at the image level and a refinement rectification module at the feature level.

Proceedings Article

ReADS: A Rectified Attentional Double Supervised Network for Scene Text Recognition

Qi Song, +4 more

TL;DR: In this article, a Rectified Attentional Double Supervised Network (ReADS) is proposed to overcome the weaknesses of Connectionist Temporal Classification (CTC) and Attentional Sequence Recognition (Attn).

Dissertation

The robustness of animated text CAPTCHAs

Mohamad Tayara

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: The authors achieve state-of-the-art ImageNet classification performance with a deep convolutional neural network that consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

Journal Article

Gradient-based learning applied to document recognition

Yann LeCun, +6 more

TL;DR: Gradient-based learning can be used to synthesize a complex decision surface that classifies high-dimensional patterns such as handwritten characters, and a graph transformer network (GTN) paradigm is proposed for training multi-module recognition systems.

Posted Content

Improving neural networks by preventing co-adaptation of feature detectors

Geoffrey E. Hinton, +4 more

- 03 Jul 2012 -

arXiv: Neural and Evolutionary Computing

TL;DR: The authors randomly omit half of the feature detectors on each training case to prevent complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors.

Journal Article

DRC: a dual route cascaded model of visual word recognition and reading aloud.

Max Coltheart, +4 more

- 01 Jan 2001 -

Psychological Review

TL;DR: The DRC model is a computational realization of the dual-route theory of reading, and is the only computational model of reading that can perform the 2 tasks most commonly used to study reading: lexical decision and reading aloud.

Proceedings Article

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

Pierre Sermanet, +5 more

TL;DR: In this article, a multiscale, sliding window approach is proposed to predict object boundaries; the predicted bounding boxes are accumulated rather than suppressed in order to increase detection confidence. OverFeat is the winner of the ImageNet Large Scale Visual Recognition Challenge 2013.

Related Papers (5)

An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition

Baoguang Shi, +2 more

- 01 Nov 2017 -

IEEE Transactions on Pattern Analysis an...

End-to-end scene text recognition

Synthetic Data for Text Localisation in Natural Images

Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks

ICDAR 2013 Robust Reading Competition