Journal ArticleDOI

Benchmark databases of handwritten Bangla-Roman and Devanagari-Roman mixed-script document images

01 Apr 2018-Multimedia Tools and Applications (Springer US)-Vol. 77, Iss: 7, pp 8441-8473
TL;DR: This paper addresses three key challenges: collection, compilation and organization of benchmark databases of images of 150 Bangla-Roman and 150 Devanagari-Roman mixed-script handwritten document pages; script-level annotation of the words on those pages; and development of bi-script and tri-script word-level script identification modules using a Modified log-Gabor filter as feature extractor.
Abstract: Handwritten document image datasets are among the basic necessities for conducting research on developing Optical Character Recognition (OCR) systems. In a multilingual country like India, handwritten documents often contain more than one script, leading to complex pattern analysis problems. In this paper, we highlight two such situations where Devanagari and Bangla scripts, the two most widely used scripts in the Indian sub-continent, are individually used along with Roman script in documents. We address three key challenges here: 1) collection, compilation and organization of benchmark databases of images of 150 Bangla-Roman and 150 Devanagari-Roman mixed-script handwritten document pages respectively, 2) script-level annotation of 18931 Bangla words, 15528 Devanagari words and 10331 Roman words in those 300 document pages, and 3) development of a bi-script and tri-script word-level script identification module using a Modified log-Gabor filter as feature extractor. The technique is statistically validated using multiple classifiers, and the Multi-Layer Perceptron (MLP) classifier is found to perform the best. Average word-level script identification accuracies of 92.32%, 95.30% and 93.78% are achieved using 3-fold cross validation for the Bangla-Roman, Devanagari-Roman and Bangla-Devanagari-Roman databases respectively. Both mixed-script document databases, along with the script-level annotations and 44790 extracted word images of the three aforementioned scripts, are freely available at https://code.google.com/p/cmaterdb/ .
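The paper's "Modified log-Gabor" variant is not specified on this page; a minimal sketch of conventional log-Gabor filtering in the frequency domain, taking the mean and standard deviation of the response magnitude per (scale, orientation) as word-level features for a downstream classifier such as an MLP, might look like this (all parameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_gabor_features(word_img, scales=(0.1, 0.25), orientations=4, sigma_ratio=0.55):
    """Sketch of log-Gabor feature extraction for a single word image.

    Builds a radial log-Gabor profile times a Gaussian angular profile in the
    frequency domain, filters the image, and returns mean/std of the response
    magnitude for each (scale, orientation) pair.
    """
    h, w = word_img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0                      # avoid log(0) at the DC component
    theta = np.arctan2(fy, fx)
    spectrum = np.fft.fft2(word_img.astype(float))
    feats = []
    for f0 in scales:
        # radial log-Gabor profile centered on frequency f0
        radial = np.exp(-(np.log(radius / f0) ** 2) / (2 * np.log(sigma_ratio) ** 2))
        radial[0, 0] = 0.0                  # zero the DC response
        for o in range(orientations):
            angle = o * np.pi / orientations
            # wrapped angular distance from the filter orientation
            d_theta = np.arctan2(np.sin(theta - angle), np.cos(theta - angle))
            angular = np.exp(-(d_theta ** 2) / (2 * (np.pi / orientations) ** 2))
            mag = np.abs(np.fft.ifft2(spectrum * radial * angular))
            feats.extend([mag.mean(), mag.std()])
    return np.array(feats)

word = rng.random((32, 96))                 # stand-in for a binarized word image
feats = log_gabor_features(word)            # 2 scales x 4 orientations x 2 stats
```

With two scales and four orientations this yields a 16-dimensional descriptor per word; the real pipeline would extract these from segmented word images of all three scripts.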
Citations
Journal ArticleDOI
TL;DR: A novel feature selection (FS) model called cooperative genetic algorithm (CGA) is proposed to select some of the most important and discriminating features from the entire feature set to improve the classification accuracy as well as the time requirement of the activity recognition mechanism.
Abstract: Recognition of human actions from visual content is a budding field of computer vision and image understanding. The problem with such a recognition system is the huge dimensionality of the feature vectors, many of whose features are irrelevant to the classification mechanism. For this reason, in this paper we propose a novel feature selection (FS) model called cooperative genetic algorithm (CGA) to select some of the most important and discriminating features from the entire feature set, improving both the classification accuracy and the time requirement of the activity recognition mechanism. In CGA, we embed concepts from cooperative game theory in the GA to create a two-way reinforcement mechanism that improves the solutions of the FS model. The proposed FS model is tested on four benchmark video datasets, namely Weizmann, KTH, UCF11 and HMDB51, and on two sensor-based UCI HAR datasets. The experiments are conducted using four state-of-the-art feature descriptors, namely HOG, GLCM, SURF, and GIST. A significant improvement in overall classification accuracy is found while considering only a very small fraction of the original feature vector.
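The cooperative game-theoretic payoff sharing that distinguishes CGA is not detailed on this page; a plain genetic-algorithm wrapper for feature selection over binary masks, using a nearest-centroid surrogate fitness with a per-feature penalty, can be sketched as follows (all parameters and the fitness surrogate are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, X, y, penalty=0.01):
    """Wrapper fitness: nearest-centroid accuracy on the selected features,
    minus a small penalty per selected feature (illustrative surrogate)."""
    if mask.sum() == 0:
        return 0.0
    Xs = X[:, mask]
    c0, c1 = Xs[y == 0].mean(axis=0), Xs[y == 1].mean(axis=0)
    pred = np.linalg.norm(Xs - c1, axis=1) < np.linalg.norm(Xs - c0, axis=1)
    return (pred == y).mean() - penalty * mask.sum()

def ga_feature_select(X, y, pop=20, gens=30, p_mut=0.05):
    n = X.shape[1]
    P = rng.random((pop, n)) < 0.5                       # random binary masks
    for _ in range(gens):
        f = np.array([fitness(ind, X, y) for ind in P])
        P = P[np.argsort(f)[::-1]]                        # elitist sort by fitness
        children = []
        while len(children) < pop // 2:
            a, b = P[rng.integers(0, pop // 2, 2)]        # parents from top half
            cut = rng.integers(1, n)
            child = np.concatenate([a[:cut], b[cut:]])    # one-point crossover
            child ^= rng.random(n) < p_mut                # bit-flip mutation
            children.append(child)
        P = np.vstack([P[: pop - len(children)], children])
    f = np.array([fitness(ind, X, y) for ind in P])
    return P[np.argmax(f)]

# synthetic data: only the first 3 of 15 features carry the label signal
X = rng.normal(size=(120, 15))
y = (X[:, :3].sum(axis=1) > 0).astype(int)
mask = ga_feature_select(X, y)
```

The returned boolean mask selects the feature subset; in CGA, the selection pressure would additionally be shaped by cooperative-game payoffs among features rather than this simple penalty term.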

42 citations

Journal ArticleDOI
TL;DR: A comprehensive survey of the techniques developed for handwritten Indic script identification is presented, including classifiers used for script identification, and their merits and demerits are discussed.
Abstract: Script identification is crucial for automating optical character recognition (OCR) in multi-script documents since OCRs are script-dependent. In this paper, we present a comprehensive survey of th...

26 citations

Journal ArticleDOI
TL;DR: In this paper, an attempt is made to analyze and classify various script identification schemes for document images; a comparison is made between these schemes, and their merits and demerits are discussed on a common platform.
Abstract: Script identification is a widely accepted technique for selecting the appropriate script-specific OCR (Optical Character Recognition) engine in multilingual document images. Extensive research has been done in this field, but it still suffers from low identification accuracy, owing to faded document images and variations in illumination and position during scanning. Noise is also a major obstacle in the script identification process; it can be minimized up to a level, but cannot be removed completely. In this paper, an attempt is made to analyze and classify various script identification schemes for document images. A comparison is also made between these schemes, and their merits and demerits are discussed on a common platform. This will help researchers understand the complexity of the issue and identify possible directions for research in this field.

21 citations


Cites methods from "Benchmark databases of handwritten ..."

  • ...[107] used modified Gabor filter-based features for classification of Bangla, Devanagari and Roman scripts....


Journal ArticleDOI
TL;DR: A novel benchmark is established, delivering state-of-the-art results on two regional handwritten character recognition tasks, and the mathematical rationale for using non-linearity in the deep learning (DL) model is examined.
Abstract: Recognition of handwritten characters in two Indic scripts, Bangla and Meitei Mayek, is a challenging task due to intricate patterns and the scarcity of standard datasets. The Convolutional Neural Network (CNN) is one of the most stable and well-known techniques for classifying objects across distinct specialties, as it has an extraordinary capability for discovering complex patterns. In this paper, we design a unique CNN architecture from scratch, which has manifold advantages over classical machine learning (ML) approaches and the unique ability to consolidate feature extraction and classification altogether. Further, we extend our work to uncover the mathematical rationale for using non-linearity in the deep learning (DL) model. Our proposed CNN architecture consists of four layer types, namely the convolutional layer (CL), non-linear activation layer (AL), pooling layer (PL), and fully connected layer (FCL), and is evaluated on the two existing accessible Bangla datasets, CMATERdb and the ISI Bangla dataset. The same model is also validated on a proposed Manipuri character dataset, called "Mayek27". Moreover, we perform an in-depth comparison of different batch sizes and optimization techniques over all the datasets to understand their behavior. The model delivers state-of-the-art results on the two regional handwritten character recognition tasks.
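The four layer types named in the abstract (CL, AL, PL, FCL) compose a standard CNN forward pass; a dependency-free NumPy sketch of one such pass, with illustrative shapes (a 28x28 input, four 3x3 kernels, 10 output classes) standing in for the paper's unspecified architecture, could look like this:

```python
import numpy as np

rng = np.random.default_rng(1)

def conv2d(x, k):
    """Valid cross-correlation of a single-channel image with one kernel (CL)."""
    kh, kw = k.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def max_pool(x, s=2):
    """Non-overlapping s x s max pooling (PL)."""
    h, w = x.shape
    return x[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s).max(axis=(1, 3))

def forward(img, kernels, W, b):
    maps = [np.maximum(conv2d(img, k), 0) for k in kernels]   # CL + ReLU (AL)
    pooled = [max_pool(m) for m in maps]                      # PL
    flat = np.concatenate([p.ravel() for p in pooled])        # flatten
    logits = flat @ W + b                                     # FCL
    e = np.exp(logits - logits.max())
    return e / e.sum()                                        # softmax over classes

img = rng.random((28, 28))                    # stand-in for a character image
kernels = rng.normal(size=(4, 3, 3))          # four hypothetical 3x3 filters
W = rng.normal(size=(4 * 13 * 13, 10)) * 0.01 # 26x26 maps pooled to 13x13
b = np.zeros(10)
probs = forward(img, kernels, W, b)
```

The ReLU non-linearity in the activation layer is what the abstract's mathematical rationale concerns: without it, stacked convolutions and the fully connected layer would collapse into a single linear map.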

20 citations

Journal ArticleDOI
TL;DR: A novel framework based on improved particle swarm optimization (PSO) algorithm to automatically construct optimal convolutional neural network (CNN) architecture has been proposed with an aim to outperform the existing techniques.
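How the framework encodes CNN architectures into particles is not given here; a minimal particle swarm optimization loop over a continuous search space, minimizing a toy objective as a stand-in for the (expensive) validation-error fitness of a candidate CNN, can be sketched as follows (all hyperparameters are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

def pso_minimize(objective, dim, n_particles=15, iters=60,
                 w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    """Plain global-best PSO: each particle tracks its personal best, and the
    swarm tracks a global best that attracts all particles."""
    X = rng.uniform(lo, hi, (n_particles, dim))     # positions
    V = np.zeros((n_particles, dim))                # velocities
    pbest = X.copy()
    pbest_val = np.array([objective(x) for x in X])
    g = pbest[np.argmin(pbest_val)].copy()          # global best position
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (g - X)
        X = np.clip(X + V, lo, hi)
        vals = np.array([objective(x) for x in X])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = X[improved], vals[improved]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, pbest_val.min()

# toy objective standing in for a CNN's validation error
sphere = lambda x: float(np.sum(x ** 2))
best_x, best_val = pso_minimize(sphere, dim=3)
```

In architecture search, each particle dimension would instead encode a design choice (e.g. filter counts or kernel sizes), with rounding or decoding applied before training the candidate network.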

19 citations

References
Journal ArticleDOI
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the splitting, and the ideas are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International Conference, 1996, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation, and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
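The abstract's key ingredients (bootstrap-sampled trees, random feature selection per split, internal out-of-bag error estimates, variable importance) map directly onto scikit-learn's implementation; a brief sketch on synthetic data (dataset parameters are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# synthetic classification problem with 8 informative features out of 20
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(
    n_estimators=100,
    max_features="sqrt",   # random feature selection at each split
    oob_score=True,        # internal (out-of-bag) generalization estimate
    random_state=0,
)
clf.fit(X_tr, y_tr)

oob = clf.oob_score_                         # internal error estimate
top5 = clf.feature_importances_.argsort()[::-1][:5]  # variable importance
```

The out-of-bag score approximates test accuracy without a held-out set, which is exactly the "internal estimates" role described in the abstract.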

79,257 citations

Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
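scikit-learn's SVC is implemented on top of LIBSVM, so the abstract's themes of parameter selection and probability estimates can be demonstrated through it; a brief sketch with grid-searched C and gamma and Platt-scaled probabilities (dataset parameters are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# probability=True enables LIBSVM's internal probability estimates
model = make_pipeline(StandardScaler(),
                      SVC(kernel="rbf", probability=True, random_state=0))

# parameter selection: grid search over C and gamma with cross-validation
grid = GridSearchCV(model,
                    {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.1]},
                    cv=3)
grid.fit(X_tr, y_tr)
proba = grid.predict_proba(X_te)   # per-class probability estimates
```

Scaling the inputs before an RBF-kernel SVM is the standard practice the LIBSVM guide itself recommends, since the kernel is sensitive to feature ranges.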

40,826 citations


"Benchmark databases of handwritten ..." refers background in this paper

  • ...SVM: The SVM classifier [7] effectively maps pattern vectors to a high-dimensional feature space where a 'best' separating hyperplane (the maximal margin hyperplane) is constructed....


Proceedings Article
02 Aug 1996
TL;DR: In this paper, a density-based notion of clusters is proposed to discover clusters of arbitrary shape, which can be used for class identification in large spatial databases and is shown to be more efficient than the well-known algorithm CLARANS.
Abstract: Clustering algorithms are attractive for the task of class identification in spatial databases. However, the application to large spatial databases raises the following requirements for clustering algorithms: minimal requirements of domain knowledge to determine the input parameters, discovery of clusters with arbitrary shape, and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, we present the new clustering algorithm DBSCAN, relying on a density-based notion of clusters which is designed to discover clusters of arbitrary shape. DBSCAN requires only one input parameter and supports the user in determining an appropriate value for it. We performed an experimental evaluation of the effectiveness and efficiency of DBSCAN using synthetic data and real data of the SEQUOIA 2000 benchmark. The results of our experiments demonstrate that (1) DBSCAN is significantly more effective in discovering clusters of arbitrary shape than the well-known algorithm CLARANS, and that (2) DBSCAN outperforms CLARANS by a factor of more than 100 in terms of efficiency.
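DBSCAN's behavior described in the abstract, recovering dense clusters while marking sparse points as noise from essentially a neighborhood radius (eps) and a minimum point count, can be shown with scikit-learn in a few lines (the data and parameter values are illustrative):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# two dense groups plus a sprinkling of uniform background noise
X, _ = make_blobs(n_samples=200, centers=[(0, 0), (8, 8)],
                  cluster_std=0.6, random_state=0)
noise = np.random.default_rng(0).uniform(-4, 12, (20, 2))
X = np.vstack([X, noise])

# eps: neighborhood radius; min_samples: density threshold for a core point
labels = DBSCAN(eps=0.9, min_samples=5).fit_predict(X)

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)  # -1 marks noise
```

Unlike k-means, the number of clusters is not an input: it emerges from the density parameters, and isolated noise points are left unassigned rather than forced into a cluster.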

17,056 citations

Journal ArticleDOI
01 Aug 1996
TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.
Abstract: Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.
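The abstract's claim, that bagging helps when the base predictor is unstable, can be checked directly by comparing a single decision tree against a bagged ensemble of trees on noisy synthetic data (all dataset parameters are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# label noise (flip_y) makes individual deep trees unstable
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           flip_y=0.1, random_state=0)

tree = DecisionTreeClassifier(random_state=0)       # unstable base learner
bag = BaggingClassifier(tree, n_estimators=50,      # 50 bootstrap replicates,
                        random_state=0)             # aggregated by plurality vote

tree_acc = cross_val_score(tree, X, y, cv=5).mean()
bag_acc = cross_val_score(bag, X, y, cv=5).mean()
```

Each ensemble member is trained on a bootstrap replicate of the learning set, and classification is by plurality vote, exactly the procedure the abstract describes; for a stable learner (e.g. nearest centroid) the gain would largely vanish.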

16,118 citations

Proceedings Article
01 Jan 1996
TL;DR: DBSCAN, a new clustering algorithm relying on a density-based notion of clusters which is designed to discover clusters of arbitrary shape, is presented which requires only one input parameter and supports the user in determining an appropriate value for it.
Abstract: Clustering algorithms are attractive for the task of class identification in spatial databases. However, the application to large spatial databases raises the following requirements for clustering algorithms: minimal requirements of domain knowledge to determine the input parameters, discovery of clusters with arbitrary shape, and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, we present the new clustering algorithm DBSCAN, relying on a density-based notion of clusters which is designed to discover clusters of arbitrary shape. DBSCAN requires only one input parameter and supports the user in determining an appropriate value for it. We performed an experimental evaluation of the effectiveness and efficiency of DBSCAN using synthetic data and real data of the SEQUOIA 2000 benchmark. The results of our experiments demonstrate that (1) DBSCAN is significantly more effective in discovering clusters of arbitrary shape than the well-known algorithm CLARANS, and that (2) DBSCAN outperforms CLARANS by a factor of more than 100 in terms of efficiency.

14,297 citations


"Benchmark databases of handwritten ..." refers methods in this paper

  • ...The feature points generated from Harris corner point detection are passed on to Density-based Spatial Clustering of Applications with Noise (DBSCAN) algorithm [19]....

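The cited pipeline, Harris corner points passed to DBSCAN, can be sketched with a small NumPy Harris detector and scikit-learn's clustering; the k, sigma, threshold, and eps values below are illustrative assumptions, and a synthetic square stands in for a document image:

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.cluster import DBSCAN

def harris_corners(img, k=0.05, sigma=1.5, thresh_rel=0.1):
    """Minimal Harris detector: returns (x, y) points whose corner response
    exceeds a fraction of the maximum response."""
    img = img.astype(float)
    Ix = np.gradient(img, axis=1)
    Iy = np.gradient(img, axis=0)
    # smoothed products of gradients (structure tensor entries)
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    # Harris response: det(M) - k * trace(M)^2
    R = Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2
    ys, xs = np.where(R > thresh_rel * R.max())
    return np.column_stack([xs, ys])

# synthetic "page": one bright square on a dark background
img = np.zeros((60, 60))
img[20:40, 20:40] = 1.0

pts = harris_corners(img)
# group nearby corner responses into spatial clusters, as in the cited method
grouping = DBSCAN(eps=3.0, min_samples=2).fit_predict(pts)
n_groups = len(set(grouping) - {-1})
```

On a real document, each cluster of corner points would correspond to a dense text region; here the groups gather around the square's vertices.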