
Showing papers by "Ching Y. Suen" published in 2009


Proceedings ArticleDOI
28 Sep 2009
TL;DR: A novel age estimation technique that combines Active Appearance Models (AAMs) and Support Vector Machines (SVMs) to dramatically improve the accuracy of age estimation over the current state-of-the-art techniques is introduced.
Abstract: In this paper, we introduce a novel age estimation technique that combines Active Appearance Models (AAMs) and Support Vector Machines (SVMs) to dramatically improve the accuracy of age estimation over current state-of-the-art techniques. In this method, characteristics of the input face images are interpreted as feature vectors by AAMs, which are used to discriminate between childhood and adulthood prior to age estimation. Faces classified as adults are passed to the adult age-determination function, and the others are passed to the child age-determination function. Compared to published results, this method achieves the best accuracy, both in overall mean absolute error (MAE) and in the MAE for each of the two periods of human development: childhood and adulthood.

151 citations
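
The two-stage design described above (AAM parameter vectors feeding a child/adult SVM classifier, then one of two age-determination functions) can be sketched roughly as follows. This is a hypothetical sketch: the AAM fitting step is assumed to be done elsewhere, and support-vector regression together with the age-18 boundary are assumptions, not the paper's stated choices.

```python
# Hypothetical sketch of a two-stage AAM + SVM age estimation pipeline.
# AAM parameter vectors are assumed to be precomputed; SVR and the age-18
# child/adult boundary are assumptions, not the paper's exact design.
import numpy as np
from sklearn.svm import SVC, SVR

def train_two_stage(aam_features, ages, adult_threshold=18):
    """aam_features: (n_samples, n_params) AAM parameter vectors."""
    X, y = np.asarray(aam_features), np.asarray(ages)
    is_adult = (y >= adult_threshold).astype(int)

    clf = SVC(kernel="rbf").fit(X, is_adult)                   # stage 1: child vs. adult
    reg_child = SVR().fit(X[is_adult == 0], y[is_adult == 0])  # stage 2a: child ages
    reg_adult = SVR().fit(X[is_adult == 1], y[is_adult == 1])  # stage 2b: adult ages
    return clf, reg_child, reg_adult

def estimate_age(clf, reg_child, reg_adult, aam_vector):
    x = np.asarray(aam_vector).reshape(1, -1)
    reg = reg_adult if clf.predict(x)[0] == 1 else reg_child
    return float(reg.predict(x)[0])
```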


Journal ArticleDOI
TL;DR: Results of handwritten Bangla and Farsi numeral recognition on binary and gray-scale images are presented, and implementation choices for gradient direction feature extraction, together with several advanced normalization and classification methods, are compared.

129 citations
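
Gradient direction features of the kind compared in this work are commonly computed by quantizing Sobel gradient orientations into a small number of directions and pooling the magnitudes over a zone grid. The sketch below is one generic (assumed) variant, not necessarily the exact implementation evaluated in the paper.

```python
# Generic gradient-direction feature extractor (an assumed variant, not
# necessarily the implementation compared in the paper).
import numpy as np
from scipy.ndimage import sobel

def gradient_direction_features(img, zones=4, directions=8):
    """img: 2-D grayscale array, already normalized to a fixed size."""
    gx = sobel(img.astype(float), axis=1)
    gy = sobel(img.astype(float), axis=0)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    bins = np.floor(ang / (2 * np.pi) * directions).astype(int) % directions

    h, w = img.shape
    feat = np.zeros((zones, zones, directions))
    zone_row = np.minimum(np.arange(h) * zones // h, zones - 1)
    zone_col = np.minimum(np.arange(w) * zones // w, zones - 1)
    for r in range(h):
        for c in range(w):
            feat[zone_row[r], zone_col[c], bins[r, c]] += mag[r, c]
    return feat.ravel()  # zones * zones * directions dimensional vector
```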


Book ChapterDOI
29 Aug 2009
TL;DR: A new large Urdu handwriting database, which includes isolated digits, numeral strings with/without decimal points, five special symbols, 44 isolated characters, 57 Urdu words, and Urdu dates in different patterns, was designed at CENPARMI.
Abstract: A new large Urdu handwriting database, which includes isolated digits, numeral strings with/without decimal points, five special symbols, 44 isolated characters, 57 Urdu words (mostly financial related), and Urdu dates in different patterns, was designed at the Centre for Pattern Recognition and Machine Intelligence (CENPARMI). It is the first database for Urdu off-line handwriting recognition. It involves a large number of native Urdu speakers from different regions of the world. Moreover, the database is provided in different formats: true color, gray level, and binary. Experiments on Urdu digit recognition have been conducted with an accuracy of 98.61%. Methodologies for image pre-processing, gradient feature extraction, and classification using SVM are described, and a detailed error analysis of the recognition results is presented.

56 citations
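
The recognition experiment outlined above (pre-processing, gradient features, SVM classification) can be approximated with off-the-shelf components. The sketch below is only an assumed approximation of that pipeline; the feature function, kernel, and hyper-parameters are illustrative choices, not the paper's.

```python
# Rough sketch of a digit-recognition experiment: extract features from
# pre-processed images and train/evaluate an RBF SVM. The feature function
# and hyper-parameters are assumptions, not the paper's exact settings.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def train_and_evaluate(train_images, train_labels, test_images, test_labels,
                       feature_fn):
    """feature_fn maps a pre-processed image to a feature vector, e.g. a
    gradient-direction extractor like the one sketched under the previous entry."""
    X_train = np.array([feature_fn(img) for img in train_images])
    X_test = np.array([feature_fn(img) for img in test_images])
    clf = SVC(kernel="rbf", C=10, gamma="scale").fit(X_train, train_labels)
    return clf, accuracy_score(test_labels, clf.predict(X_test))
```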


Book ChapterDOI
07 Jul 2009
TL;DR: The Center for Pattern Recognition and Machine Intelligence (CENPARMI) Farsi dataset is introduced, which can be used to measure the performance of handwriting recognition and word spotting systems.
Abstract: This paper introduces the Center for Pattern Recognition and Machine Intelligence (CENPARMI) Farsi dataset, which can be used to measure the performance of handwriting recognition and word spotting systems. This dataset is unique in terms of its large number of gray-level and binary images (432,357 each), consisting of dates, words, isolated letters, isolated digits, numeral strings, special symbols, and documents. The data was collected from 400 native Farsi writers. The selection of Farsi words was based on their high frequency in financial documents. The dataset is divided into grouped and ungrouped subsets, giving users the flexibility of whether or not to use CENPARMI's pre-divided partition (60% of the images are used as the training set, 20% as the validation set, and the rest as the testing set). Finally, experiments have been conducted on the Farsi isolated digits with a recognition rate of 96.85%.

28 citations
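
For users of the ungrouped subset who want to mirror the 60/20/20 train/validation/test proportions mentioned above, a generic stratified split can be produced as follows. This is only an assumed, generic split, not CENPARMI's pre-divided partition.

```python
# Generic 60/20/20 train/validation/test split (an assumption for users of
# the ungrouped data; CENPARMI ships its own pre-divided partition).
from sklearn.model_selection import train_test_split

def split_60_20_20(images, labels, seed=0):
    X_train, X_rest, y_train, y_rest = train_test_split(
        images, labels, test_size=0.4, random_state=seed, stratify=labels)
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, random_state=seed, stratify=y_rest)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)
```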


Proceedings ArticleDOI
26 Jul 2009
TL;DR: A new isolated handwritten Farsi numeral recognition algorithm is proposed in this paper, which exploits the sparse and over-complete structure of the handwritten Farsi numeral data, represented as an over-complete dictionary learned by the K-SVD algorithm.
Abstract: A new isolated handwritten Farsi numeral recognition algorithm is proposed in this paper, which exploits the sparse and over-complete structure of the handwritten Farsi numeral data. In this research, the sparse structure is represented as an over-complete dictionary, which is learned by the K-SVD algorithm. The atoms of this dictionary are used to initialize the first layer of a Convolutional Neural Network (CNN), which is then trained to perform the classification task. Data distortion techniques are also applied to improve the generalization capability of the trained classifier. Experiments show that good results are achieved on the CENPARMI handwritten Farsi numeral database.

27 citations
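
The core idea above, learning an over-complete patch dictionary and copying its atoms into the first convolutional layer, can be sketched as follows. Since K-SVD is not available in scikit-learn, MiniBatchDictionaryLearning is used here as a stand-in, and the tiny CNN layout is an assumption rather than the paper's architecture.

```python
# Sketch: learn an over-complete patch dictionary and use its atoms to
# initialize the first convolutional layer. MiniBatchDictionaryLearning is a
# stand-in for K-SVD; the CNN layout is an assumption, not the paper's.
import numpy as np
import torch
import torch.nn as nn
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.decomposition import MiniBatchDictionaryLearning

def learn_first_layer_filters(images, patch_size=5, n_atoms=16):
    patches = np.vstack([
        extract_patches_2d(img.astype(float), (patch_size, patch_size),
                           max_patches=50).reshape(-1, patch_size * patch_size)
        for img in images])
    patches -= patches.mean(axis=1, keepdims=True)        # zero-mean patches
    dico = MiniBatchDictionaryLearning(n_components=n_atoms).fit(patches)
    return dico.components_.reshape(n_atoms, 1, patch_size, patch_size)

def build_cnn(filters, n_classes=10):
    n_atoms, _, k, _ = filters.shape
    conv1 = nn.Conv2d(1, n_atoms, kernel_size=k, padding=k // 2)
    with torch.no_grad():                                  # copy dictionary atoms in
        conv1.weight.copy_(torch.tensor(filters, dtype=torch.float32))
    return nn.Sequential(conv1, nn.ReLU(), nn.AdaptiveAvgPool2d(4),
                         nn.Flatten(), nn.Linear(n_atoms * 16, n_classes))
```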


Journal ArticleDOI
TL;DR: Experiments on handwritten characters show that a proposed variant of the ensemble training algorithm, employing ensembles of HMMs, can lead to very promising performance, and that the use of a validation dataset makes it possible to reach better performance than batch learning.

25 citations
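
The TL;DR does not detail the ensemble training variant itself. Purely as a generic illustration, a bagged ensemble of per-class Gaussian HMMs (here built with hmmlearn) could look like the sketch below; the paper's actual training algorithm, HMM topology, and combination rule are assumed to differ from this setup.

```python
# Generic (assumed) bagged ensemble of per-class Gaussian HMMs; this does
# not reproduce the paper's ensemble training variant.
import numpy as np
from hmmlearn import hmm

def train_hmm_ensemble(sequences_by_class, n_members=5, n_states=6, seed=0):
    """sequences_by_class: dict mapping label -> list of (T, n_features) arrays."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for m in range(n_members):
        models = {}
        for label, seqs in sequences_by_class.items():
            sample = [seqs[i] for i in rng.integers(0, len(seqs), len(seqs))]
            X, lengths = np.vstack(sample), [len(s) for s in sample]
            model = hmm.GaussianHMM(n_components=n_states,
                                    covariance_type="diag",
                                    n_iter=20, random_state=seed + m)
            models[label] = model.fit(X, lengths)
        ensemble.append(models)
    return ensemble

def classify(ensemble, sequence):
    """sequence: (T, n_features) array; majority vote over ensemble members."""
    votes = {}
    for models in ensemble:
        best = max(models, key=lambda lbl: models[lbl].score(sequence))
        votes[best] = votes.get(best, 0) + 1
    return max(votes, key=votes.get)
```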


Proceedings ArticleDOI
26 Jul 2009
TL;DR: ECOC is applied to the CNN for segmentation-free OCR such that the CNN target outputs are designed according to code words of length N; the minimum Hamming distance of the code words is designed to be as large as possible given N.
Abstract: It is known that convolutional neural networks (CNNs) are efficient for optical character recognition (OCR) and many other visual classification tasks. This paper applies error-correcting output coding (ECOC) to the CNN for segmentation-free OCR such that: 1) the CNN target outputs are designed according to code words of length N; 2) the minimum Hamming distance of the code words is designed to be as large as possible given N. ECOC provides the CNN with the ability to reject or correct output errors to reduce character insertions and substitutions in the recognized text. Also, using code words instead of letter images as the CNN target outputs makes it possible to construct an OCR for a new language without designing the letter images as the target outputs. Experiments on the recognition of English letters, 10 digits, and some special characters show the effectiveness of ECOC in reducing insertions and substitutions.

25 citations
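
The encoding/decoding side of the ECOC scheme, independent of the CNN itself, can be sketched with NumPy: each class is assigned a binary code word, the network is trained to emit that word, and a prediction is decoded to the nearest code word or rejected when even the nearest one is too far away. The random code-word construction and the rejection threshold below are assumptions, not the paper's exact design.

```python
# Sketch of ECOC encoding/decoding for a network with N sigmoid outputs.
# The random codebook search and rejection threshold are assumptions.
import numpy as np

def build_codebook(n_classes, n_bits, seed=0, trials=2000):
    """Pick the random binary codebook with the largest minimum Hamming distance."""
    rng = np.random.default_rng(seed)
    best, best_dist = None, -1
    for _ in range(trials):
        cb = rng.integers(0, 2, size=(n_classes, n_bits))
        d = min(np.sum(cb[i] != cb[j])
                for i in range(n_classes) for j in range(i + 1, n_classes))
        if d > best_dist:
            best, best_dist = cb, d
    return best, best_dist

def decode(outputs, codebook, reject_distance):
    """outputs: sigmoid activations in [0, 1] of length n_bits."""
    bits = (np.asarray(outputs) > 0.5).astype(int)
    dists = np.sum(codebook != bits, axis=1)
    cls = int(np.argmin(dists))
    return (None if dists[cls] > reject_distance else cls), int(dists[cls])
```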


Book ChapterDOI
29 Aug 2009
TL;DR: In this paper, a new approach is proposed for the segmentation and recognition of off-line unconstrained Arabic handwritten numerals that cannot be segmented by connected component analysis.
Abstract: In this paper, we propose a new approach to the segmentation and recognition of off-line unconstrained Arabic handwritten numerals that cannot be segmented by connected component analysis. In our approach, the touching numerals are automatically segmented when a set of parameters is chosen. Models with different sets of parameters are designed for the recognition of each numeral pair. Each image in each model is recognized as an isolated numeral. After normalizing and binarizing the images, gradient features are extracted and recognized using SVMs. Finally, a post-processing step is proposed based on the optimal combination of the recognition probabilities of each model. Experiments were conducted on the CENPARMI Arabic, Dari, and Urdu touching numeral pair databases [1,12].

16 citations
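
The post-processing step, which combines the recognition probabilities produced under different segmentation parameter sets, can be illustrated as below. The scoring rule (product of the two per-digit probabilities) and the interfaces are assumptions, not the paper's exact formulation.

```python
# Assumed illustration of the post-processing: each candidate segmentation of
# a touching pair yields two isolated-numeral images, each scored by the SVM;
# the split whose two probabilities combine best is kept. The product rule is
# an assumption, not necessarily the paper's combination.
def best_segmentation(candidate_splits, svm_probability):
    """candidate_splits: list of (left_img, right_img) produced by different
    parameter sets; svm_probability(img) -> (label, probability)."""
    best = None
    for left_img, right_img in candidate_splits:
        l_lab, l_p = svm_probability(left_img)
        r_lab, r_p = svm_probability(right_img)
        score = l_p * r_p                     # combined confidence of the pair
        if best is None or score > best[0]:
            best = (score, (l_lab, r_lab))
    return best[1], best[0]                   # (digit pair, combined score)
```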


Journal ArticleDOI
TL;DR: The significant contributions in this special issue on Handwriting Recognition of the journal Pattern Recognition are described; they were presented at the ICFHR 2008 conference held in Montreal from August 18 to 21, 2008.

15 citations


Proceedings ArticleDOI
26 Jul 2009
TL;DR: LDAM is designed to take into consideration the confidence values of the classifier outputs and the relations between them, and it is an improvement over traditional rejection measurements such as the First Rank Measurement (FRM) and the First Two Ranks Measurement (FTRM).
Abstract: This paper presents a Linear Discriminant Analysis based Measurement (LDAM) on the output from classifiers as a criterion to reject patterns that cannot be classified with high reliability. This is important in applications (such as the processing of financial documents) where errors can be very costly and therefore less tolerable than rejections. To implement the rejection, which can be considered a two-class problem of accepting or rejecting the classification result, Linear Discriminant Analysis (LDA) is used in a new approach to determine the rejection threshold. LDAM is designed to take into consideration the confidence values of the classifier outputs and the relations between them, and it is an improvement over traditional rejection measurements such as the First Rank Measurement (FRM) and the First Two Ranks Measurement (FTRM). Experiments were conducted on the CENPARMI Arabic Isolated Numerals Database. The results show that LDAM is more effective, achieving higher reliability while maintaining a high recognition rate.

15 citations
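
The general LDAM idea, applying LDA to classifier confidence values to separate acceptable from rejectable outputs, can be sketched with scikit-learn. Treating the top-two confidences as the feature vector and thresholding the LDA decision function is an assumed concrete reading of the measurement, not the paper's exact definition.

```python
# Sketch of an LDA-based rejection measurement: fit LDA on confidence features
# of correctly vs. incorrectly classified validation samples, then reject test
# patterns whose LDA score falls below a threshold. Using the top-two
# confidences as features is an assumption about LDAM's exact form.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def confidence_features(probabilities):
    """probabilities: (n_samples, n_classes) classifier confidence outputs."""
    top2 = np.sort(probabilities, axis=1)[:, -2:]
    return np.column_stack([top2[:, 1], top2[:, 1] - top2[:, 0]])

def fit_rejector(val_probs, val_correct):
    """val_correct: boolean array, True where the classifier was right."""
    lda = LinearDiscriminantAnalysis()
    lda.fit(confidence_features(val_probs), np.asarray(val_correct).astype(int))
    return lda

def accept(lda, test_probs, threshold=0.0):
    """Boolean mask: True = accept the classification, False = reject."""
    return lda.decision_function(confidence_features(test_probs)) > threshold
```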


Book ChapterDOI
29 Aug 2009
TL;DR: This paper aims to scale up the efficiency of online recognition systems for Arabic characters by integrating novel representation techniques into efficient classification methods using neural networks and support vector machines, and investigates the idea of incorporating two novel feature representations for online character data.
Abstract: Robust handwriting recognition of complex patterns of arbitrary scale, orientation, and location remains elusive to date, as reaching a good recognition rate is not trivial for most application developments in this field. Cursive scripts with complex character shapes, such as Arabic and Persian, make the recognition task even more challenging. This complexity requires sophisticated representations and learning methods, as well as comprehensive data samples. A direct approach to achieving better performance is to focus on designing more powerful building blocks of a handwriting recognition system: pattern representation and pattern classification. In this paper, we aim to scale up the efficiency of online recognition systems for Arabic characters by integrating novel representation techniques into efficient classification methods. We investigate the idea of incorporating two novel feature representations for online character data. We advocate the usefulness and practicality of these features in classification methods using neural networks and support vector machines. The combinations of the proposed representations with related classifiers offer a module for recognition tasks that can deal with any two-dimensional online pattern. Our empirical results confirm the higher distinctiveness and robustness to character deformations obtained by the proposed representations compared to currently available techniques.

Book ChapterDOI
29 Aug 2009
TL;DR: An algorithm developed for document margin removal, based upon the detection of document corners from projection profiles, is presented; it was successfully applied to all document images in the authors' databases of French and Arabic documents.
Abstract: Document images obtained from scanners or photocopiers usually have a black margin which interferes with subsequent stages of page segmentation algorithms. Thus, the margins must be removed at the initial stage of a document processing application. This paper presents an algorithm which we have developed for document margin removal based upon the detection of document corners from projection profiles. The algorithm does not make any restrictive assumptions regarding the input document image to be processed. It neither requires all four margins to be present nor requires the corners to be right angles. In the case of tilted documents, it is able to detect and correct the skew. In our experiments, the algorithm was successfully applied to all document images in our databases of French and Arabic document images, which contain more than two hundred images with different types of layouts, noise, and intensity levels.
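
A simplified version of the projection-profile idea is to binarize the scan, compute row and column black-pixel profiles, and locate where the dark border ends on each side. The sketch below does only this cropping step with assumed thresholds; the published algorithm additionally detects document corners and corrects skew, which this sketch omits.

```python
# Simplified (assumed) margin removal via projection profiles: find where the
# dark scanner border ends on each side and crop. Corner detection and skew
# correction from the published algorithm are omitted here.
import numpy as np

def remove_margins(gray, black_threshold=64, border_fraction=0.6):
    """gray: 2-D grayscale image with dark scanner margins."""
    black = gray < black_threshold
    row_profile = black.mean(axis=1)   # fraction of black pixels per row
    col_profile = black.mean(axis=0)   # fraction of black pixels per column

    def content_span(profile):
        idx = np.where(profile < border_fraction)[0]
        return (idx[0], idx[-1]) if len(idx) else (0, len(profile) - 1)

    top, bottom = content_span(row_profile)
    left, right = content_span(col_profile)
    return gray[top:bottom + 1, left:right + 1]
```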