Proceedings ArticleDOI

Text localization, enhancement and binarization in multimedia documents

TLDR
An algorithm is presented that localizes artificial text in images and videos using a measure of accumulated gradients and morphological post-processing; the quality of the localized text is improved by robust multiple frame integration.
Abstract
The systems currently available for content-based image and video retrieval work without semantic knowledge, i.e. they use image processing methods to extract low-level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. One way to include more semantic knowledge in the indexing process is to use the text included in the images and video sequences. It is rich in information yet easy to use, e.g. by keyword-based queries. In this paper we present an algorithm to localize artificial text in images and videos using a measure of accumulated gradients and morphological post-processing to detect the text. The quality of the localized text is improved by robust multiple frame integration. A new technique for the binarization of the text boxes is proposed. Finally, detection and OCR results for a commercial OCR are presented.
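The abstract names the detection ingredients (accumulated gradients followed by morphological post-processing) but gives no parameters or formulas. The following is only a minimal sketch of that general idea, not the authors' method: the window size, the threshold, the use of horizontal absolute differences, and the row-wise closing are all assumptions chosen for illustration, and every function name is hypothetical.

```python
def accumulated_gradients(img, half_width):
    """Sum absolute horizontal differences in a sliding window along each row.

    img is a list of rows of grey values. The window covers up to
    2 * half_width + 1 pixels and is clipped at the image border.
    """
    width = len(img[0])
    result = []
    for row in img:
        # Absolute horizontal derivative; the first pixel has no left neighbour.
        grad = [0] + [abs(row[x] - row[x - 1]) for x in range(1, width)]
        result.append([
            sum(grad[max(0, x - half_width):min(width, x + half_width + 1)])
            for x in range(width)
        ])
    return result


def threshold(values, t):
    """Mark pixels whose accumulated gradient reaches t (t is an assumed parameter)."""
    return [[1 if v >= t else 0 for v in row] for row in values]


def dilate_rows(mask, half_width):
    """Horizontal dilation: a pixel is set if any pixel in its window is set."""
    width = len(mask[0])
    return [
        [1 if any(row[max(0, x - half_width):min(width, x + half_width + 1)]) else 0
         for x in range(width)]
        for row in mask
    ]


def erode_rows(mask, half_width):
    """Horizontal erosion: a pixel stays set only if its whole window is set."""
    width = len(mask[0])
    return [
        [1 if all(row[max(0, x - half_width):min(width, x + half_width + 1)]) else 0
         for x in range(width)]
        for row in mask
    ]


def localize_text_mask(img, half_width, t, close_half_width):
    """Accumulate gradients, threshold, then close small gaps (dilate, then erode)."""
    mask = threshold(accumulated_gradients(img, half_width), t)
    return erode_rows(dilate_rows(mask, close_half_width), close_half_width)


if __name__ == "__main__":
    flat = [100] * 16
    # One row with alternating dark and bright strokes, a crude stand-in for text.
    strokes = [100] * 4 + [0, 255] * 4 + [100] * 4
    for row in localize_text_mask([flat, strokes, flat], 2, 600, 1):
        print("".join(str(v) for v in row))
```

In this toy input only the row with high-contrast strokes accumulates enough gradient to pass the threshold, which is the property that makes such a measure a plausible text indicator. A real detector would also need the vertical constraints, box extraction, and multiple frame integration that the abstract mentions but does not detail.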


Citations
Book ChapterDOI

Adaptive fuzzy text segmentation in images with complex backgrounds using color and texture

TL;DR: The proposed TS method takes the localized text as input and proceeds as follows: first, the number of initial clusters is determined by analyzing the colors of the image; then the image pixels are clustered using the number of clusters defined in the first step.
Proceedings ArticleDOI

Language Model Supervision for Handwriting Recognition Model Adaptation

TL;DR: In this paper, the authors adapt HWR models trained on a source language to a target language that uses the same writing script, using only labeled data from the source language, unlabeled data in the target language, and a language model in the target language.
Proceedings ArticleDOI

Scale Space Binarization Using Edge Information Weighted by a Foreground Estimation

TL;DR: The proposed binarization algorithm uses a scale space to avoid the estimation of script-size-dependent parameters, and the use of integral images for the calculation of the mean, standard deviation and morphological operations allows for an efficient implementation of the method presented.

A Novel Method for Efficient Text Extraction from Real Time Images with Diversified Background using Haar Discrete Wavelet Transform and K-Means Clustering

TL;DR: The proposed system highlights a novel approach to extracting text from an image using the two-dimensional Haar Discrete Wavelet Transform and K-Means Clustering to accurately distinguish text from non-text areas for better text localization and extraction.
Posted Content

UDBNET: Unsupervised Document Binarization Network via Adversarial Game

TL;DR: A novel approach to document image binarization is presented by introducing a three-player min-max adversarial game, and the proposed model is reported to outperform existing state-of-the-art algorithms on the widely used DIBCO datasets.
References

IEEE transactions on pattern analysis and machine intelligence

IEEE Xplore
TL;DR: This special issue aims at gathering the recent advances in learning with shared information methods and their applications in computer vision and multimedia analysis and addressing interesting real-world computer vision and multimedia applications.
Journal ArticleDOI

Goal-directed evaluation of binarization methods

TL;DR: This paper presents a methodology for evaluation of low-level image analysis methods, using binarization (two-level thresholding) as an example, and defines the performance of the character recognition module as the objective measure.
Proceedings ArticleDOI

Automatic text location in images and video frames

TL;DR: Compared with some traditional text location methods, this method has the following advantages: 1) low computational cost; 2) robust to font size; and 3) high accuracy.