Text localization, enhancement and binarization in multimedia documents

doi:10.1109/ICPR.2002.1048482

Proceedings Article

- Vol. 2, pp 1037-1040

TLDR

Presents an algorithm that localizes artificial text in images and videos using a measure of accumulated gradients and morphological post-processing; the quality of the localized text is then improved by robust multiple frame integration.

Abstract:

The systems currently available for content-based image and video retrieval work without semantic knowledge, i.e. they use image processing methods to extract low-level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. A way to include more semantic knowledge into the indexing process is to use the text included in the images and video sequences. It is rich in information but easy to use, e.g. by keyword-based queries. In this paper we present an algorithm to localize artificial text in images and videos using a measure of accumulated gradients and morphological post-processing to detect the text. The quality of the localized text is improved by robust multiple frame integration. A new technique for the binarization of the text boxes is proposed. Finally, detection and OCR results for a commercial OCR are presented.
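The localization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window size, the relative threshold, the horizontal structuring element and all function names below are assumptions chosen for illustration. The idea shown is only the general one named in the abstract: accumulate gradient magnitudes over a neighbourhood, threshold the result, and clean the mask with a morphological operation (here a horizontal closing).

```python
import numpy as np


def accumulated_gradients(gray, window=9):
    """Sum absolute horizontal gradients over a horizontal window (assumed size)."""
    gray = gray.astype(np.float64)
    grad = np.zeros_like(gray)
    grad[:, 1:] = np.abs(np.diff(gray, axis=1))
    kernel = np.ones(window)
    # Row-wise box filter: each pixel receives the gradient sum of its neighbours.
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, grad
    )


def _horizontal_morph(mask, width, reduce_fn, pad_value):
    """Combine `width` horizontally shifted copies of a boolean mask."""
    half = width // 2
    padded = np.pad(mask, ((0, 0), (half, half)), constant_values=pad_value)
    shifted = [padded[:, i:i + mask.shape[1]] for i in range(width)]
    return reduce_fn(shifted, axis=0)


def horizontal_closing(mask, width=7):
    """Dilation followed by erosion with a 1 x width structuring element (odd width)."""
    dilated = _horizontal_morph(mask, width, np.logical_or.reduce, False)
    return _horizontal_morph(dilated, width, np.logical_and.reduce, True)


def detect_text_mask(gray, window=9, rel_threshold=0.5, close_width=7):
    """Return a boolean mask of candidate text pixels.

    The relative threshold is a placeholder assumption, not the paper's rule.
    """
    acc = accumulated_gradients(gray, window)
    mask = acc > rel_threshold * acc.max()
    return horizontal_closing(mask, close_width)


# Usage on a synthetic frame: a band of vertical strokes on a flat background.
frame = np.zeros((20, 60), dtype=np.uint8)
frame[8:12, 20:40:2] = 255
text_mask = detect_text_mask(frame)
```

Text regions tend to contain many closely spaced strokes, so the accumulated gradient is high inside them and low on flat background; the closing step merges neighbouring responses into solid candidate regions from which bounding boxes could be extracted.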

Citations

Dissertation

Apprentissage neuronal de caractéristiques spatio-temporelles pour la classification automatique de séquences vidéo

Moez Baccouche

TL;DR: This thesis addresses the problem of automatic classification of video sequences. It aims to depart from the dominant methodology, which relies on manually designed features, and to propose models that are as generic and domain-independent as possible.

Posted Content

Full-Page Text Recognition: Learning Where to Start and When to Stop

Bastien Moysset, +2 more

- 27 Apr 2017 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: A new approach for full-page text recognition based on regression with Fully Convolutional Neural Networks and Multidimensional Long Short-Term Memory as contextual layers, in which only the position of the left side of each text line is predicted.

Dissertation

Camera-Captured Document Image Analysis

T. Kasar

TL;DR: This thesis designs a robust feature named connected component descriptor that is tailored for mosaicing camera-captured document images and addresses two critical issues often encountered in correspondence matching, the stability of features and robustness against false matches due to multiple instances of many characters in a document image.

Journal Article

A Novel Degraded Document Binarization Model through Vision Transformer Network

Mingming Yang, +1 more

- 01 Dec 2022 -

Information Fusion

TL;DR: Wang et al., as mentioned in this paper, proposed a dual-branched encoding feature fusion module that combines architectural components from the vision transformer framework and deep convolutional neural networks to extract features from an input document that are sensitive to both global and local characteristics.

Journal Article

Restoration of deteriorated text sections in ancient document images using a tri-level semi-adaptive thresholding technique

N. Shobha Rani, +4 more

- 23 Feb 2022 -

Automatika

TL;DR: The proposed research uses a tri-level semi-adaptive thresholding technique to restore text sections in ancient document photographs that have deteriorated through stain markings, ink seepage and document ageing.

References

Journal Article

A threshold selection method from gray level histograms

Nobuyuki Otsu

- 01 Jan 1979 -

IEEE Transactions on Systems, Man, and C...

IEEE Transactions on Pattern Analysis and Machine Intelligence

IEEE Xplore

TL;DR: This special issue aims at gathering the recent advances in learning with shared information methods and their applications in computer vision and multimedia analysis and addressing interesting real-world computer vision and multimedia applications.

Book

An introduction to digital image processing

Wayne Niblack

Journal Article

Goal-directed evaluation of binarization methods

O.D. Trier, +1 more

- 01 Dec 1995 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This paper presents a methodology for evaluation of low-level image analysis methods, using binarization (two-level thresholding) as an example, and defines the performance of the character recognition module as the objective measure.

Proceedings Article

Automatic text location in images and video frames

Anil K. Jain, +1 more

TL;DR: Compared with some traditional text location methods, this method has the following advantages: 1) low computational cost; 2) robustness to font size; and 3) high accuracy.

Related Papers (5)

A threshold selection method from gray level histograms

Adaptive document image binarization

An introduction to digital image processing

Text information extraction in images and video: a survey

Detecting text in natural scenes with stroke width transform