Scene Text Detection and Recognition: The Deep Learning Era

Shangbang Long, +2 more

- 10 Nov 2018 -

arXiv: Computer Vision and Pattern Recog...

Abstract:

With the rise and development of deep learning, computer vision has been tremendously transformed and reshaped. As an important research area in computer vision, scene text detection and recognition has inevitably been influenced by this wave of revolution, consequently entering the era of deep learning. In recent years, the community has witnessed substantial advances in mindset, approach, and performance. This survey aims to summarize and analyze the major changes and significant progress in scene text detection and recognition in the deep learning era. Through this article, we set out to: (1) introduce new insights and ideas; (2) highlight recent techniques and benchmarks; (3) look ahead to future trends. Specifically, we emphasize the dramatic differences brought by deep learning and the grand challenges that still remain. We expect this review to serve as a reference for researchers in this field. Related resources are also collected and compiled in our GitHub repository: this https URL.

Citations

Journal Article

MASTER: Multi-aspect non-local network for scene text recognition

Ning Lu, +6 more

- 01 Sep 2021 -

Pattern Recognition

TL;DR: The authors propose MASTER, a self-attention-based scene text recognizer that encodes not only input-output attention but also self-attention, capturing feature-feature and target-target relationships inside the encoder and decoder. It trains efficiently because of high training parallelization and infers quickly because of an efficient memory-cache mechanism.

Posted Content

Decoupled Attention Network for Text Recognition

Tianwei Wang, +7 more

- 21 Dec 2019 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: The authors propose a decoupled attention network (DAN), which decouples the alignment operation from the use of historical decoding results and achieves state-of-the-art performance on multiple text recognition tasks, including offline handwritten text recognition and regular/irregular scene text recognition.

Posted Content

Text Recognition in the Wild: A Survey

Xiaoxue Chen, +4 more

- 07 May 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This literature review attempts to present the entire picture of the field of scene text recognition, providing a comprehensive reference for people entering the field and potentially inspiring future research.

Journal Article

Text Recognition in the Wild: A Survey

Xiaoxue Chen, +4 more

- 05 Mar 2021 -

ACM Computing Surveys

TL;DR: This literature review summarizes the fundamental problems and the state of the art in scene text recognition, introduces new insights and ideas, provides a comprehensive review of publicly available resources, and points out directions for future work.

Posted Content

Towards Unconstrained End-to-End Text Spotting

Siyang Qin, +4 more

- 24 Aug 2019 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This article proposes an end-to-end trainable network that simultaneously detects and recognizes text of arbitrary shape, making substantial progress on the open problem of reading irregularly shaped scene text.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: The authors propose a residual learning framework that eases the training of networks substantially deeper than those used previously. The resulting networks won 1st place on the ILSVRC 2015 classification task.

Journal Article

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997 -

Neural Computation

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced. It can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting, using an architecture with very small convolution filters, and shows that a significant improvement over prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Book Chapter

U-Net: Convolutional Networks for Biomedical Image Segmentation

Olaf Ronneberger, +2 more

TL;DR: The authors propose a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.

Proceedings Article

Histograms of oriented gradients for human detection

Navneet Dalal, +1 more

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
