Author

Ujjwal Bhattacharya

Bio: Ujjwal Bhattacharya is an academic researcher from the Indian Statistical Institute. The author has contributed to research in topics: Handwriting recognition & Intelligent character recognition. The author has an h-index of 25 and has co-authored 97 publications receiving 2480 citations.


Papers
Journal ArticleDOI
TL;DR: This article presents the pioneering development of two databases for handwritten numerals of the two most popular Indian scripts, a multistage cascaded recognition scheme using wavelet-based multiresolution representations and multilayer perceptron classifiers, and the application of this scheme to the recognition of mixed handwritten numerals of three Indian scripts: Devanagari, Bangla and English.
Abstract: This article primarily concerns the problem of isolated handwritten numeral recognition of major Indian scripts. The principal contributions presented here are (a) pioneering development of two databases for handwritten numerals of two most popular Indian scripts, (b) a multistage cascaded recognition scheme using wavelet based multiresolution representations and multilayer perceptron classifiers and (c) application of (b) for the recognition of mixed handwritten numerals of three Indian scripts Devanagari, Bangla and English. The present databases include respectively 22,556 and 23,392 handwritten isolated numeral samples of Devanagari and Bangla collected from real-life situations and these can be made available free of cost to researchers of other academic Institutions. In the proposed scheme, a numeral is subjected to three multilayer perceptron classifiers corresponding to three coarse-to-fine resolution levels in a cascaded manner. If rejection occurred even at the highest resolution, another multilayer perceptron is used as the final attempt to recognize the input numeral by combining the outputs of three classifiers of the previous stages. This scheme has been extended to the situation when the script of a document is not known a priori or the numerals written on a document belong to different scripts. Handwritten numerals in mixed scripts are frequently found in Indian postal mails and table-form documents.
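
The cascaded scheme described in the abstract above can be sketched as plain control flow. This is a minimal sketch, not the authors' implementation: the classifiers are stand-in callables rather than trained multilayer perceptrons, and the rejection rule (a confidence threshold on the top score) and every name used here (`cascade_predict`, `accept`, `threshold`) are assumptions, since the abstract does not specify them.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Scores = List[float]
Classifier = Callable[[Sequence[float]], Scores]


def accept(scores: Scores, threshold: float) -> Optional[int]:
    """Return the top-scoring class index, or None if it is rejected.

    Assumption: rejection means the top score falls below `threshold`.
    """
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] >= threshold else None


def cascade_predict(
    inputs_by_level: Sequence[Sequence[float]],
    stage_classifiers: Sequence[Classifier],
    combiner: Classifier,
    threshold: float = 0.9,
) -> Tuple[Optional[int], int]:
    """Run classifiers from the coarsest to the finest resolution level.

    Returns (label, stage), where `stage` is the index of the stage that
    produced the decision. If every stage rejects, the combiner receives
    the concatenated outputs of all stages as its final attempt; the label
    is None if that attempt is rejected as well.
    """
    collected: List[Scores] = []
    for stage, (features, clf) in enumerate(zip(inputs_by_level, stage_classifiers)):
        scores = clf(features)
        collected.append(scores)
        label = accept(scores, threshold)
        if label is not None:
            return label, stage
    combined_input = [s for scores in collected for s in scores]
    return accept(combiner(combined_input), threshold), len(stage_classifiers)


# Hypothetical two-class stand-ins for the three resolution levels.
coarse = lambda features: [0.55, 0.45]   # not confident, so rejected
medium = lambda features: [0.96, 0.04]   # confident, so accepted
fine = lambda features: [0.99, 0.01]     # never reached in this example
combiner = lambda features: [0.5, 0.5]   # never reached in this example

result = cascade_predict([[0.0], [0.0], [0.0]], [coarse, medium, fine], combiner)
print(result)
```

In this example the coarse stage rejects the input and the medium-resolution stage accepts it, so the finer stage and the combiner are never evaluated. That early exit is the point of a coarse-to-fine cascade.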

328 citations

Journal ArticleDOI
TL;DR: A technique is proposed that first picks samples near the decision boundary without actually knowing the position of the boundary, which results in quicker and better convergence of the training algorithm.

194 citations

Proceedings ArticleDOI
31 Aug 2005
TL;DR: Three image databases of handwritten isolated numerals of three different Indian scripts, namely Devnagari, Bangla and Oriya, are described in this paper.
Abstract: Three image databases of handwritten isolated numerals of three different Indian scripts, namely Devnagari, Bangla and Oriya, are described in this paper. Grayscale images of 22556 Devnagari numerals written by 1049 persons, 12938 Bangla numerals written by 556 persons and 5970 Oriya numerals written by 356 persons form the respective databases. These images were scanned from three different kinds of handwritten documents: postal mails, job application forms and another set of forms specially designed by the collectors for the purpose. The only restriction imposed on the writers is to write each numeral within a rectangular box. These databases avoid two common limitations: they were not developed in laboratory environments, and their samples are not non-uniformly distributed over the different classes. Also, for comparison purposes, each database has been properly divided into respective training and test sets.

132 citations

Journal ArticleDOI
TL;DR: The Histogram of Oriented Gradient (HOG) is extended and two new feature descriptors are proposed: Co-occurrence HOG (Co-HOG) and Convolutional Co-HOG (ConvCo-HOG) for accurate recognition of scene texts of different languages.

130 citations

Proceedings ArticleDOI
23 Aug 2015
TL;DR: A convolutional neural network trained for a larger class recognition problem is used for feature extraction of samples of several smaller class recognition problems of English, Devanagari, Bangla, Telugu and Oriya, each of which is an official Indian script.
Abstract: There are many scripts in the world, several of which are used by hundreds of millions of people. Handwritten character recognition studies of several of these scripts are found in the literature. Different hand-crafted feature sets have been used in these recognition studies. However, the convolutional neural network (CNN) has recently been used as an efficient unsupervised feature vector extractor. Although such a network can be used as a unified framework for both feature extraction and classification, it is more efficient as a feature extractor than as a classifier. In the present study, we performed a certain amount of training of a 5-layer CNN for a moderately large class character recognition problem. We used this CNN, trained for a larger class recognition problem, for feature extraction of samples of several smaller class recognition problems. In each case, a distinct Support Vector Machine (SVM) was used as the corresponding classifier. In particular, the CNN of the present study is trained using samples of a standard 50-class Bangla basic character database, and features have been extracted for 5 different 10-class numeral recognition problems of English, Devanagari, Bangla, Telugu and Oriya, each of which is an official Indian script. Recognition accuracies are comparable with the state-of-the-art.

117 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: A double-loop Expectation-Maximization (EM) algorithm has been introduced to the ME network structure for detection of epileptic seizure and the results confirmed that the proposed ME network structure has some potential in detecting epileptic seizures.
Abstract: Mixture of experts (ME) is a modular neural network architecture for supervised learning. A double-loop Expectation-Maximization (EM) algorithm has been introduced to the ME network structure for detection of epileptic seizure. The detection of epileptiform discharges in the EEG is an important component in the diagnosis of epilepsy. EEG signals were decomposed into frequency sub-bands using the discrete wavelet transform (DWT). These sub-band frequencies were then used as input to an ME network with two discrete outputs: normal and epileptic. In order to improve accuracy, the outputs of the expert networks were combined according to a set of local weights called the "gating function". The invariant transformations of the ME probability density functions include the permutations of the expert labels and the translations of the parameters in the gating functions. The performance of the proposed model was evaluated in terms of classification accuracies and the results confirmed that the proposed ME network structure has some potential in detecting epileptic seizures. The ME network structure achieved accuracy rates which were higher than that of the stand-alone neural network model.
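
The gating step mentioned in the abstract above, where expert outputs are combined by input-dependent weights, can be illustrated with a small sketch. It assumes a softmax gate over linear scores, which is a common choice for mixture-of-experts models but is not confirmed by the abstract; the names `gate_weights`, `mixture_output` and `gate_params` are hypothetical, and the expert outputs are fixed numbers rather than trained networks.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax: non-negative weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gate_weights(x: Sequence[float], gate_params: Sequence[Sequence[float]]) -> List[float]:
    """One linear score per expert, normalized with softmax.

    Assumption: the gate is linear in the input x, one parameter row per expert.
    """
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_params]
    return softmax(scores)


def mixture_output(
    x: Sequence[float],
    expert_outputs: Sequence[Sequence[float]],
    gate_params: Sequence[Sequence[float]],
) -> List[float]:
    """Gate-weighted sum of the experts' class-probability vectors."""
    g = gate_weights(x, gate_params)
    n_classes = len(expert_outputs[0])
    return [
        sum(g[k] * expert_outputs[k][c] for k in range(len(g)))
        for c in range(n_classes)
    ]


# Two hypothetical experts scoring the classes (normal, epileptic).
x = [1.0, 2.0]
expert_outputs = [[0.9, 0.1], [0.1, 0.9]]
gate_params = [[0.0, 0.0], [0.0, 0.0]]   # all-zero gate: equal weight per expert
mixed = mixture_output(x, expert_outputs, gate_params)
print(mixed)
```

With an all-zero gate both experts receive equal weight, so the mixed output is the plain average of the two experts. A trained gate would instead shift weight toward whichever expert is more reliable for the given input.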

1,053 citations

Journal ArticleDOI
TL;DR: This review provides a fundamental comparison and analysis of the remaining problems in the field and summarizes the fundamental problems and enumerates factors that should be considered when addressing these problems.
Abstract: This paper analyzes, compares, and contrasts technical challenges, methods, and the performance of text detection and recognition research in color imagery. It summarizes the fundamental problems and enumerates factors that should be considered when addressing these problems. Existing techniques are categorized as either stepwise or integrated and sub-problems are highlighted including text localization, verification, segmentation and recognition. Special issues associated with the enhancement of degraded text and the processing of video text, multi-oriented, perspectively distorted and multilingual text are also addressed. The categories and sub-categories of text are illustrated, benchmark datasets are enumerated, and the performance of the most representative approaches is compared. This review provides a fundamental comparison and analysis of the remaining problems in the field.

709 citations

Journal ArticleDOI
TL;DR: This work created a dataset of solar PV arrays to initiate and develop the process of automatically identifying solar PV locations using remote sensing imagery, and contains the geospatial coordinates and border vertices for over 19,000 solar panels across 601 high-resolution images from four cities in California.
Abstract: Earth-observing remote sensing data, including aerial photography and satellite imagery, offer a snapshot of the world from which we can learn about the state of natural resources and the built environment. The components of energy systems that are visible from above can be automatically assessed with these remote sensing data when processed with machine learning methods. Here, we focus on the information gap in distributed solar photovoltaic (PV) arrays, for which there is limited public data on deployments at small geographic scales. We created a dataset of solar PV arrays to initiate and develop the process of automatically identifying solar PV locations using remote sensing imagery. This dataset contains the geospatial coordinates and border vertices for over 19,000 solar panels across 601 high-resolution images from four cities in California. Dataset applications include training object detection and other machine learning algorithms that use remote sensing imagery, developing specific algorithms for predictive detection of distributed PV systems, estimating installed PV capacity, and analysis of the socioeconomic correlates of PV deployment.

633 citations

Journal ArticleDOI
TL;DR: A review of the OCR work done on Indian language scripts and the scope of future work and further steps needed for Indian script OCR development is presented.

592 citations