Author

Palaiahnakote Shivakumara

Bio: Palaiahnakote Shivakumara is an academic researcher from Information Technology University. The author has contributed to research in the topics of Pixel and Feature extraction. The author has an h-index of 32 and has co-authored 215 publications receiving 3,377 citations. Previous affiliations of Palaiahnakote Shivakumara include the National University of Singapore and the University of Malaya.


Papers
Proceedings ArticleDOI
01 Dec 2013
TL;DR: This paper introduces a new dataset called StreetViewText-Perspective, which contains texts in street images with a great variety of viewpoints; the proposed recognition method significantly outperforms the state of the art on perspective texts of arbitrary orientations.
Abstract: This paper presents an approach to text recognition in natural scene images. Unlike most existing works which assume that texts are horizontal and frontal parallel to the image plane, our method is able to recognize perspective texts of arbitrary orientations. For individual character recognition, we adopt a bag-of-key points approach, in which Scale Invariant Feature Transform (SIFT) descriptors are extracted densely and quantized using a pre-trained vocabulary. Following [1, 2], the context information is utilized through lexicons. We formulate word recognition as finding the optimal alignment between the set of characters and the list of lexicon words. Furthermore, we introduce a new dataset called StreetViewText-Perspective, which contains texts in street images with a great variety of viewpoints. Experimental results on public datasets and the proposed dataset show that our method significantly outperforms the state-of-the-art on perspective texts of arbitrary orientations.
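As a sketch of the bag-of-keypoints representation described in the abstract, the snippet below quantizes densely extracted SIFT descriptors against a pre-trained visual vocabulary and pools them into a normalized histogram per character. The array shapes, the random stand-in data, and the hard nearest-word assignment are illustrative assumptions; the authors' trained vocabulary and character classifier are not reproduced here.

```python
import numpy as np

def bag_of_keypoints(descriptors: np.ndarray, vocabulary: np.ndarray) -> np.ndarray:
    """Quantize 128-D SIFT descriptors against a visual vocabulary and
    return a normalized word-count histogram (the character feature)."""
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    d2 = ((descriptors ** 2).sum(1)[:, None]
          + (vocabulary ** 2).sum(1)[None, :]
          - 2.0 * descriptors @ vocabulary.T)
    words = d2.argmin(axis=1)  # hard assignment to the nearest visual word
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / max(hist.sum(), 1.0)  # L1-normalize

# Hypothetical usage: 500 dense descriptors against a 1024-word vocabulary.
rng = np.random.default_rng(0)
feature = bag_of_keypoints(rng.normal(size=(500, 128)), rng.normal(size=(1024, 128)))
print(feature.shape)  # (1024,)
```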

378 citations

Journal ArticleDOI
TL;DR: Experimental results show that the proposed method is able to handle graphics text and scene text of both horizontal and nonhorizontal orientation.
Abstract: In this paper, we propose a method based on the Laplacian in the frequency domain for video text detection. Unlike many other approaches which assume that text is horizontally-oriented, our method is able to handle text of arbitrary orientation. The input image is first filtered with Fourier-Laplacian. K-means clustering is then used to identify candidate text regions based on the maximum difference. The skeleton of each connected component helps to separate the different text strings from each other. Finally, text string straightness and edge density are used for false positive elimination. Experimental results show that the proposed method is able to handle graphics text and scene text of both horizontal and nonhorizontal orientation.
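A minimal sketch of the frequency-domain pipeline the abstract describes: the frame is Laplacian-filtered via the Fourier identity F{∇²f} = −4π²(u² + v²)·F{f}, a per-pixel maximum-difference map is computed over a sliding window, and a two-cluster k-means separates candidate text pixels. The window size and the plain scalar two-means are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
from scipy import ndimage

def fourier_laplacian(img: np.ndarray) -> np.ndarray:
    """Apply the Laplacian in the frequency domain."""
    u = np.fft.fftfreq(img.shape[0])[:, None]
    v = np.fft.fftfreq(img.shape[1])[None, :]
    return np.real(np.fft.ifft2(-4 * np.pi**2 * (u**2 + v**2) * np.fft.fft2(img)))

def max_difference(x: np.ndarray, win: int = 5) -> np.ndarray:
    """Max minus min of the filtered response in a win x win neighborhood."""
    return ndimage.maximum_filter(x, win) - ndimage.minimum_filter(x, win)

def two_means_mask(values: np.ndarray, iters: int = 20) -> np.ndarray:
    """k-means with k=2 on a scalar map; True marks the higher-mean ('text') cluster."""
    c_lo, c_hi = values.min(), values.max()
    for _ in range(iters):
        mask = np.abs(values - c_hi) < np.abs(values - c_lo)
        c_lo, c_hi = values[~mask].mean(), values[mask].mean()
    return mask

# Hypothetical usage on a synthetic grayscale frame.
frame = np.random.default_rng(2).random((120, 160))
text_mask = two_means_mask(max_difference(fourier_laplacian(frame)))
```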

278 citations

Journal ArticleDOI
TL;DR: An efficient approach for face image feature extraction, namely the (2D)^2LDA method, is presented; it achieves good recognition accuracy despite using fewer coefficients.
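Since only the TL;DR is available for this paper, here is a minimal numpy sketch of a two-directional 2D LDA ((2D)^2LDA-style) feature extractor: discriminant projections are learned along both the row and column directions of the face image matrices, and each image is projected from both sides down to a small q × d coefficient matrix. The dimensions, the ridge term, and the eigen-solver are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def lda_directions(images: np.ndarray, labels: np.ndarray, dim: int) -> np.ndarray:
    """Top-`dim` generalized eigenvectors of the column-direction
    between-class vs. within-class image scatter matrices."""
    global_mean = images.mean(axis=0)
    n_cols = images.shape[2]
    Sb = np.zeros((n_cols, n_cols))
    Sw = np.zeros((n_cols, n_cols))
    for c in np.unique(labels):
        Ac = images[labels == c]
        mc = Ac.mean(axis=0)
        d = mc - global_mean
        Sb += len(Ac) * d.T @ d          # between-class scatter
        for A in Ac:
            e = A - mc
            Sw += e.T @ e                # within-class scatter
    # Generalized eigenproblem Sb w = lambda Sw w, with a small ridge for stability.
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(n_cols), Sb))
    order = np.argsort(vals.real)[::-1]
    return vecs.real[:, order[:dim]]

def two_dir_2dlda(images: np.ndarray, labels: np.ndarray, q: int = 4, d: int = 4) -> np.ndarray:
    W = lda_directions(images, labels, d)                     # column direction (n x d)
    Z = lda_directions(images.transpose(0, 2, 1), labels, q)  # row direction (m x q)
    return np.array([Z.T @ A @ W for A in images])            # q x d coefficients per image

# Hypothetical usage: 20 face images of size 32x28 from 4 classes.
rng = np.random.default_rng(1)
X, y = rng.normal(size=(20, 32, 28)), np.repeat(np.arange(4), 5)
print(two_dir_2dlda(X, y).shape)  # (20, 4, 4)
```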

156 citations

Journal ArticleDOI
TL;DR: A new enhancement method based on the product of Laplacian and Sobel operations is presented to enhance text pixels in videos, together with a Bayesian classifier that does not assume an a priori probability for the input frame but estimates it from three probable matrices.
Abstract: Multioriented text detection in video frames is not as easy as detection of captions, graphics, or overlaid texts, which usually appear in the horizontal direction and have high contrast compared to their background. Multioriented text generally refers to scene text, which makes text detection more challenging and interesting due to the unfavorable characteristics of scene text. Therefore, conventional text detection methods may not give good results for multioriented scene text detection. Hence, in this paper, we present a new enhancement method that uses the product of Laplacian and Sobel operations to enhance text pixels in videos. To classify true text pixels, we propose a Bayesian classifier that does not assume an a priori probability for the input frame but estimates it based on three probable matrices. Three different ways of clustering are performed on the output of the enhancement method to obtain the three probable matrices. Text candidates are obtained by intersecting the output of the Bayesian classifier with the Canny edge map of the input frame. A boundary growing method, based on the concept of nearest neighbors, is introduced to traverse the multioriented scene text lines using text candidates. The robustness of the method has been tested on a variety of datasets, including our own dataset (nonhorizontal and horizontal text data) and two publicly available datasets, namely the video frames of Hua and the complex scene text data of the ICDAR 2003 competition (camera images). Experimental results show that the performance of the proposed method is encouraging compared with that of existing methods in terms of recall, precision, F-measure, and computational time.
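A hedged sketch of the enhancement step using OpenCV: the absolute Laplacian response is multiplied by the Sobel gradient magnitude so that pixels responding strongly to both operators (typical of text strokes) are boosted, and candidates are then intersected with the Canny edge map as the abstract describes. The kernel sizes, the 0.5 threshold, and the synthetic frame are illustrative assumptions; the Bayesian classification stage is not reproduced.

```python
import cv2
import numpy as np

def laplacian_sobel_product(gray: np.ndarray) -> np.ndarray:
    """Boost pixels that respond strongly to both the Laplacian and Sobel operators."""
    lap = np.abs(cv2.Laplacian(gray, cv2.CV_64F, ksize=3))
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    prod = lap * np.hypot(gx, gy)
    return prod / prod.max() if prod.max() > 0 else prod  # scale to [0, 1]

# Hypothetical usage on a synthetic grayscale frame; text candidates are the
# strong enhancement responses intersected with the frame's Canny edge map.
frame = np.random.randint(0, 256, (120, 160), dtype=np.uint8)
enhanced = laplacian_sobel_product(frame)
edges = cv2.Canny(frame, 100, 200)
candidates = (enhanced > 0.5) & (edges > 0)
```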

114 citations

Proceedings ArticleDOI
26 Jul 2009
TL;DR: The proposed text detection method employs empirical rules based on geometrical properties to eliminate false positives and outperforms three existing methods in terms of detection and false-positive rates.
Abstract: In this paper, we propose an efficient text detection method based on the Laplacian operator. The maximum gradient difference value is computed for each pixel in the Laplacian-filtered image. K-means is then used to classify all the pixels into two clusters: text and non-text. For each candidate text region, the corresponding region in the Sobel edge map of the input image undergoes projection profile analysis to determine the boundary of the text blocks. Finally, we employ empirical rules to eliminate false positives based on geometrical properties. Experimental results show that the proposed method is able to detect text of different fonts, contrast and backgrounds. Moreover, it outperforms three existing methods in terms of detection and false positive rates.
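The projection-profile step in the abstract can be sketched compactly: within a candidate region of the Sobel edge map, row and column edge-density profiles are thresholded to find the text block boundaries. The threshold fraction and the synthetic edge map below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def profile_bounds(edge_region: np.ndarray, frac: float = 0.1):
    """Return (top, bottom, left, right) where the row/column edge-pixel
    counts of a binary edge map exceed `frac` of their maximum."""
    rows = edge_region.sum(axis=1).astype(float)
    cols = edge_region.sum(axis=0).astype(float)
    r = np.where(rows > frac * rows.max())[0]
    c = np.where(cols > frac * cols.max())[0]
    return r[0], r[-1], c[0], c[-1]

# Hypothetical usage on a synthetic Sobel edge-map crop with a dense block.
region = np.zeros((40, 100), dtype=np.uint8)
region[12:28, 20:80] = np.random.randint(0, 2, (16, 60))
print(profile_bounds(region))  # approximately (12, 27, 20, 79)
```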

97 citations


Cited by
Reference EntryDOI
15 Oct 2004

2,118 citations

Journal ArticleDOI
TL;DR: The Rotation Region Proposal Networks are designed to generate inclined proposals with text orientation angle information that are adapted for bounding box regression to make the proposals more accurately fit into the text region in terms of the orientation.
Abstract: This paper introduces a novel rotation-based framework for arbitrary-oriented text detection in natural scene images. We present the Rotation Region Proposal Networks, which are designed to generate inclined proposals with text orientation angle information. The angle information is then adapted for bounding box regression to make the proposals more accurately fit into the text region in terms of the orientation. The Rotation Region-of-Interest pooling layer is proposed to project arbitrary-oriented proposals to a feature map for a text region classifier. The whole framework is built upon a region-proposal-based architecture, which ensures the computational efficiency of the arbitrary-oriented text detection compared with previous text detection systems. We conduct experiments using the rotation-based framework on three real-world scene text detection datasets and demonstrate its superiority in terms of effectiveness and efficiency over previous approaches.
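A compact sketch of the angle-aware box parameterization that rotation region proposal approaches like this one typically use: each rotated box (cx, cy, w, h, θ) is regressed relative to a rotated anchor, with the angle offset as a fifth target. The exact normalization below follows the common rotated-anchor convention and is an assumption, not necessarily this paper's implementation.

```python
import numpy as np

def encode(box: np.ndarray, anchor: np.ndarray) -> np.ndarray:
    """Regression targets of a rotated ground-truth box w.r.t. a rotated anchor."""
    (x, y, w, h, t), (xa, ya, wa, ha, ta) = box, anchor
    return np.array([(x - xa) / wa, (y - ya) / ha,
                     np.log(w / wa), np.log(h / ha),
                     (t - ta) % (2 * np.pi)])

def decode(deltas: np.ndarray, anchor: np.ndarray) -> np.ndarray:
    """Invert `encode`: recover a rotated box from predicted deltas."""
    tx, ty, tw, th, tt = deltas
    xa, ya, wa, ha, ta = anchor
    return np.array([tx * wa + xa, ty * ha + ya,
                     np.exp(tw) * wa, np.exp(th) * ha,
                     (ta + tt) % (2 * np.pi)])

# Round-trip check with a hypothetical ground-truth box and anchor.
gt = np.array([50.0, 40.0, 120.0, 30.0, np.pi / 6])
anchor = np.array([48.0, 44.0, 100.0, 28.0, 0.0])
print(np.allclose(decode(encode(gt, anchor), anchor), gt))  # True
```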

1,002 citations

Proceedings ArticleDOI
16 Jun 2012
TL;DR: A system which detects texts of arbitrary orientations in natural images using a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts; a new dataset and evaluation protocol are proposed to better evaluate the algorithm and compare it with competing algorithms.
Abstract: With the increasing popularity of practical vision systems and smart phones, text detection in natural scenes becomes a critical yet challenging task. Most existing methods have focused on detecting horizontal or near-horizontal texts. In this paper, we propose a system which detects texts of arbitrary orientations in natural images. Our algorithm is equipped with a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate our algorithm and compare it with other competing algorithms, we generate a new dataset, which includes various texts in diverse real-world scenarios; we also propose a protocol for performance evaluation. Experiments on benchmark datasets and the proposed dataset demonstrate that our algorithm compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on texts of arbitrary orientations in complex natural scenes.

750 citations