
Showing papers by "Ching Y. Suen" published in 2004


Proceedings ArticleDOI
26 Oct 2004
TL;DR: This paper combines complementary features based on foreground and background information in an HMM-based classifier to recognize handwritten isolated characters and numeral strings without character normalization.
Abstract: In this paper we combine complementary features based on foreground and background information in an HMM-based classifier to recognize handwritten isolated characters and numeral strings. A zoning scheme based on column and row models provides a way of dividing the character into zones without making the features size variant. This strategy allows us to avoid character normalization while still providing information from specific zones of the character. The experimental results on 10 digit classes, 52 character classes and 6 classes of numeral strings of different lengths have shown that the proposed features are highly discriminant.

40 citations


Journal ArticleDOI
TL;DR: This study shows that the topology has a stronger influence on increasing the performance of HMM-based classifiers than the number of states.

35 citations


01 Jan 2004
TL;DR: A word slant normalization method is modified in order to improve the results for handwritten numeral strings and contextual information regarding string slant and digit size variations within the string are used to train numeral HMMs.
Abstract: This work describes a way of enhancing handwritten numeral string recognition by considering slant normalization and contextual information to train an implicit segmentation-based system. A word slant normalization method is modified in order to improve the results for handwritten numeral strings. We assume that each connected component (CC) in the string has its own slant. The slant and contour length of each CC are used for obtaining the mean slant of the string. Both the original and modified methods are evaluated by means of some interesting analyses on the NIST SD19 database. These analyses show (a) the positive impact of slant correction on the number of overlapping numerals in strings, and (b) the difference in normalizing isolated numerals based on the slant estimated from their own images and the slant estimated from their original string images. Slant normalization and contextual information regarding string slant and digit size variations within the string are used to train numeral HMMs. Preliminary string recognition results, produced by a system under construction, are shown.
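
The abstract says only that the slant and contour length of each connected component are combined into a mean slant for the string. A minimal sketch of one way to do this, assuming a contour-length-weighted average (the weighting, the function names and the shear convention are illustrative assumptions, not the formula from the paper):

```python
import math


def mean_string_slant(components):
    """Contour-length-weighted mean of per-component slant angles (degrees).

    `components` is a list of (slant_degrees, contour_length) pairs, one per
    connected component. The weighting scheme is an assumption made for
    illustration; the paper may combine the two quantities differently.
    """
    total_length = sum(length for _, length in components)
    if total_length == 0:
        raise ValueError("no contour pixels to weight by")
    return sum(slant * length for slant, length in components) / total_length


def deslant_x(x, y, slant_degrees):
    """Horizontal shear that makes strokes leaning by `slant_degrees` upright.

    `y` is measured upward from the baseline, so pixels on the baseline stay
    where they are.
    """
    return x - y * math.tan(math.radians(slant_degrees))
```

With this weighting, long components dominate the estimate and small ones contribute little to the string slant.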

27 citations


Proceedings ArticleDOI
26 Oct 2004
TL;DR: A new method of segmenting unconstrained handwritten numeral strings is proposed, based on extracting foreground and background features, which can provide a list of good segmentation hypotheses for segmentation-based recognition systems.
Abstract: A new method of segmenting unconstrained handwritten numeral strings is proposed. It is based on extracting foreground and background features. To find foreground features, an algorithm based on skeleton tracing is introduced for the first time. The skeleton of each connected component is traversed in clockwise and anti-clockwise directions, and the intersection points visited in each traversal are mapped onto the outer contour to form foreground feature points. To find background features, another new algorithm is proposed. Considering vertical projections of top and bottom profiles, two background skeletons are found. After processing these two background skeletons, background feature points are extracted. Background and foreground feature points are assigned together to construct candidate segmentation paths. Finally, each segmentation path is evaluated based on the properties of its left and right connected components. Our method can provide a list of good segmentation hypotheses for segmentation-based recognition systems. The NIST SD19 database (handwritten numeral strings) is used for evaluating the method, and experiments show a very promising result.

22 citations


Proceedings ArticleDOI
26 Oct 2004
TL;DR: A general generative/discriminative hybrid that uses HMMs to map the variable length time-series data into a fixed p-dimensional vector that can be easily classified using any discriminative model is proposed.
Abstract: Classification of time-series data using discriminative models such as SVMs is very hard due to the variable length of this type of data. On the other hand generative models such as HMMs have become the standard tool for modeling time-series data due to their efficiency. This paper proposes a general generative/discriminative hybrid that uses HMMs to map the variable length time-series data into a fixed p-dimensional vector that can be easily classified using any discriminative model. The hybrid system was tested on the MNIST database for unconstrained handwritten numerals and has achieved an improvement of 1.23% (on the test set) over traditional 2D discrete HMMs.
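
The mapping from a variable-length sequence to a fixed-length vector can be sketched as follows, assuming one discrete HMM per class and one log-likelihood per model, so that p equals the number of models. That choice of p, and every name below, is an assumption for illustration; the abstract does not say how the vector is built.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | discrete HMM).

    start[s]    : initial probability of state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][o]  : probability of emitting symbol o in state s
    """
    if not obs:
        raise ValueError("empty observation sequence")
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob


def score_vector(obs, models):
    """Map a sequence of any length to a vector with one entry per HMM."""
    return [log_likelihood(obs, *model) for model in models]
```

The resulting vector has the same length for every input, so it can be handed to any fixed-input classifier such as an SVM.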

21 citations


01 Jan 2004
TL;DR: Sophisticated hybrid schemes of the homogeneous and heterogeneous classifiers for cursive word recognition are presented, based on the idea that classifiers with more different methodologies and different features can better complement each other.
Abstract: Sophisticated hybrid schemes of the homogeneous and heterogeneous classifiers for cursive word recognition are presented. Two homogeneous MLPs (multi-layer perceptrons) are combined into a new single powerful classifier at the architectural level, and HMM (hidden Markov model) is added to the new classifier as a heterogeneous one at the output level. This is based on the idea that classifiers with more different methodologies and different features can better complement each other. The presented scheme achieves a recognition rate of 92.7% for English legal words of a CENPARMI database, a performance which is better than several previous hybrid schemes reported in the literature.

20 citations


Proceedings ArticleDOI
26 Oct 2004
TL;DR: A spiral recognition methodology is presented that enables the system to increase its recognition power (both the recognition rate and the number of recognized characters) during the training iterations and has a high performance in the recognition of unconstrained handwritten Chinese legal amounts.
Abstract: This paper presents the spiral recognition methodology with its application in unconstrained handwritten Chinese legal amount recognition in a practical environment of a CheckReader™. This paper first describes the failed application of neural network - hidden Markov model hybrid recognizer on Chinese bank check legal amount recognition, and explains the reasons for the failure: the neural network - hidden Markov model hybrid recognizer cannot handle the complexity in the training for Chinese legal amounts. Then a spiral recognition methodology is presented. This methodology enables the system to increase its recognition power (both the recognition rate and the number of recognized characters) during the training iterations. Some experiments were done to show that the spiral recognition methodology has a high performance in the recognition of unconstrained handwritten Chinese legal amounts. The recognition rate at the character level is 93.5%, and the recognition rate at the legal amount level is 60%. Combined with the recognition of courtesy amount, the overall error rate is less than 1%.

16 citations


Book ChapterDOI
TL;DR: This paper presents a method to recognize the various defect patterns of a cold mill strip using a binary decision tree constructed by genetic algorithm, and the final recognizer is implemented by a neural network trained by standard patterns at each node.
Abstract: This paper presents a method to recognize the various defect patterns of a cold mill strip using a binary decision tree constructed by genetic algorithm(GA). In this paper, GA was used to select a subset of the suitable features at each node in the binary decision tree. The feature subset with maximum fitness is chosen and the patterns are divided into two classes using a linear decision function. In this way, the classifier using the binary decision tree can be constructed automatically, and the final recognizer is implemented by a neural network trained by standard patterns at each node. Experimental results are given to demonstrate the usefulness of the proposed scheme.

16 citations


Book ChapterDOI
TL;DR: To improve recognition rate, mutually beneficial features such as directional features, crossing point features and mesh features are selected, and three new hybrid feature sets are created, which hold the local and global characteristics of input numeral images.
Abstract: Off-line handwritten numeral recognition is a very difficult task. It is hard to achieve high recognition results using a single set of features and a single classifier, since handwritten numerals contain many pattern variations which mostly depend upon individual writing styles. In this paper, we propose a recognition system using hybrid features and a combined classifier. To improve recognition rate, we select mutually beneficial features such as directional features, crossing point features and mesh features, and create three new hybrid feature sets from them. These feature sets hold the local and global characteristics of input numeral images. We also implement a combined classifier from three neural network classifiers to achieve a high recognition rate, using fuzzy integral for multiple network fusion. In order to verify the performance of the proposed recognition system, experiments with the unconstrained handwritten numeral database of Concordia University, Canada were performed, producing a recognition rate of 97.85%.

15 citations


01 Jan 2004
TL;DR: A generic system to automatically extract and clean handwritten items from business forms, based on a model template generated automatically from a blank form, shows promising results.
Abstract: A generic system is proposed to automatically extract and clean handwritten items from business forms. Handwritten data usually touch or cross preprinted form frames and texts. Having assumed that the item-of-interest can be located roughly by existing form registration methods, we focus only on the extraction and cleaning of the filled-in items. The proposed system includes training and cleaning phases. In the training phase, a model template is generated automatically from a blank form. Features such as the position and stroke width of the preprinted entities (including form frames and instructions) are extracted. In the cleaning phase, the system registers the template to the input form by landmark alignment. The form frames are removed and the handwritings are restored by morphological operations. When the handwritings are found touching or crossing preprinted texts, morphological operations based on statistical features are used to clean them. Both subjective and objective evaluations show promising results of the proposed system.

15 citations


Journal ArticleDOI
TL;DR: A novel method based on multi-modal discriminant analysis that can reduce artificial neural network (ANN) training complexity and make the ANN classifier more reliable is proposed to reduce feature dimensionality.
Abstract: A novel method based on multi-modal discriminant analysis is proposed to reduce feature dimensionality. First, each class is divided into several clusters by the k-means algorithm. The optimal discriminant analysis is implemented by multi-modal mapping. Our method utilizes only those training samples on and near the effective decision boundary to generate a between-class scatter matrix, which requires less CPU time than other nonparametric discriminant analysis (NDA) approaches [Fukunaga and Mantock in IEEE Trans PAMI 5(6):671-677, 1983; Bressan and Vitria in Pattern Recognit Lett 24(5):2473-2749, 2003]. In addition, no prior assumptions about class and cluster densities are needed. In order to achieve a high verification performance of confusing handwritten numeral pairs, a hybrid feature extraction scheme is developed, which consists of a set of gradient-based wavelet features and a set of geometric features. Our proposed dimensionality reduction algorithm is used to congregate features, and it outperforms the principal component analysis (PCA) and other NDA approaches. Experiments proved that our proposed method could achieve a high feature compression performance without sacrificing its discriminant ability for classification. As a result, this new method can reduce artificial neural network (ANN) training complexity and make the ANN classifier more reliable.

Proceedings ArticleDOI
17 May 2004
TL;DR: A general generative-discriminative framework that uses HMMs to map the variable length sequential data into a fixed size P-dimensional vector (likelihood score) that can be easily classified using any discriminative model is proposed.
Abstract: Classification of sequential data using discriminative models such as support vector machines is very hard due to the variable length of this type of data. On the other hand, generative models such as HMMs have become the standard tool for representing sequential data due to their efficiency. This paper proposes a general generative-discriminative framework that uses HMMs to map the variable length sequential data into a fixed size P-dimensional vector (likelihood score) that can be easily classified using any discriminative model. The preliminary experiments of the framework on the MNIST database for handwritten digits have achieved a better recognition rate of 98.02% than that of standard HMMs (94.19%).

Proceedings ArticleDOI
23 Aug 2004
TL;DR: This work presents a novel research investigation on legal amount recognition of unconstrained cursive handwritten Chinese character in the environment of A2iA CheckReader - a commercial bank check recognition system.
Abstract: This work presents a novel research investigation on legal amount recognition of unconstrained cursive handwritten Chinese character in the environment of A2iA CheckReader - a commercial bank check recognition system. The following problems and their solutions are described: character set of Chinese legal amounts, preprocessing (slant detection and correction), segmentation, feature extraction, grammar, automatic annotation of Chinese characters before and during training, and neural network/hidden Markov model training and recognition. The system is trained with 47.8 thousand real bank checks, and validated with 12 thousand real bank checks. The recognition rate at the character level is 93.5%, and the recognition rate at the legal amount level is 60%. This is the first successful commercial product in this domain.

Proceedings ArticleDOI
26 Oct 2004
TL;DR: A novel hybrid feature extraction method is proposed for the verification of handwritten numerals that could make the ANN classifier more reliable and easier to converge.
Abstract: A novel hybrid feature extraction method is proposed for the verification of handwritten numerals. The hybrid features consist of one set of two-dimensional complex wavelet transform (2D-CWT) coefficients and one set of geometrical features. The 2D-CWT not only keeps the wavelet transform's properties of multiresolution decomposition analysis and perfect reconstruction, but also adds new merits: its magnitudes are insensitive to small image shifts, and it offers multiple directional selectivity, both of which are useful for handwritten numeral feature extraction. Experiments demonstrated that the features extracted by our proposed method could make the ANN classifier more reliable and easier to converge. A high verification performance has been observed in the series of experiments on handwritten numeral pairs and clusters.

Proceedings ArticleDOI
26 Oct 2004
TL;DR: A non-heuristic fast decoding algorithm which is based on hidden Markov model representation of characters, which enables the reuse of character likelihoods to decode all words in the lexicon, avoiding repeated computation of state sequences.
Abstract: To support large vocabulary handwriting recognition in standard computer platforms, a fast algorithm for hidden Markov model alignment is necessary. To address this problem, we propose a non-heuristic fast decoding algorithm which is based on hidden Markov model representation of characters. The decoding algorithm breaks up the computation of word likelihoods into two levels: state level and character level. Given an observation sequence, the two level decoding enables the reuse of character likelihoods to decode all words in the lexicon, avoiding repeated computation of state sequences. In an 80,000-word recognition task, the proposed decoding algorithm is about 15 times faster than a conventional Viterbi algorithm, while maintaining the same recognition accuracy.
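
The reuse of character likelihoods across the lexicon can be illustrated with a toy two-level decoder. This is a sketch under stated assumptions, not the paper's algorithm: `char_score(ch, i, j)` is a hypothetical stand-in for the state-level pass that scores character `ch` against frames `[i, j)`, and the word level is a plain dynamic program over character boundaries.

```python
def decode_lexicon(n_frames, lexicon, char_score):
    """Score every word in `lexicon`, evaluating each character segment once.

    Returns (best_word, scores, n_evals), where n_evals is the number of
    distinct (character, start, end) segments that were actually scored.
    """
    cache = {}

    def cached_score(ch, i, j):
        key = (ch, i, j)
        if key not in cache:
            cache[key] = char_score(ch, i, j)  # state-level work happens here
        return cache[key]

    def word_score(word):
        neg_inf = float("-inf")
        # best[j]: best log score of the characters seen so far on frames [0, j)
        best = [neg_inf] * (n_frames + 1)
        best[0] = 0.0
        for ch in word:
            nxt = [neg_inf] * (n_frames + 1)
            for i in range(n_frames):
                if best[i] == neg_inf:
                    continue
                for j in range(i + 1, n_frames + 1):
                    cand = best[i] + cached_score(ch, i, j)
                    if cand > nxt[j]:
                        nxt[j] = cand
            best = nxt
        return best[n_frames]

    scores = {word: word_score(word) for word in lexicon}
    return max(scores, key=scores.get), scores, len(cache)


# Toy usage: frames are symbols, and a character is penalised by one for
# every frame in its segment that does not match it.
frames = "aabb"


def toy_char_score(ch, i, j):
    return -float(sum(1 for f in frames[i:j] if f != ch))


best, scores, n_evals = decode_lexicon(len(frames), ["ab", "ba", "aa"], toy_char_score)
```

In the toy run the three words share their segment scores, so fewer segment evaluations are needed than if each word were decoded independently.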

Proceedings ArticleDOI
23 Aug 2004
TL;DR: This work uses slant projection and contour analysis to find segmentation points in touching italic characters, and a shortest-path approach to accurately locate the cut path at each candidate segmentation point, improving the recognition rate on italic fonts.
Abstract: Segmentation is an essential part of a recognition system. It is difficult to handle touching characters, especially for italic fonts. We present a method to achieve the accurate segmentation of touching italic characters. It is free of slant correction, so extra noises will not be introduced. We use slant projection and contour analysis to find the segmentation points. Then the shortest path approach is adopted for accurately locating the cut path of each candidate segmentation point. Based on dynamic programming, we can find the best segmentation result from those cut paths. By this method, we can improve the recognition rate on italic fonts.
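
A shortest-path cut from a candidate segmentation point can be sketched with a simple dynamic program over a binary image. The cost model (one unit per foreground pixel crossed) and the three allowed moves per row are assumptions chosen for illustration; the abstract does not specify either.

```python
def cut_path(image, start_col):
    """Cheapest top-to-bottom path starting at (row 0, start_col).

    `image` is a list of rows of 0 (background) and 1 (ink). Each step moves
    one row down and at most one column sideways; the cost of a path is the
    number of ink pixels it crosses. Returns (cost, [column for each row]).
    """
    n_rows, n_cols = len(image), len(image[0])
    inf = float("inf")
    cost = [[inf] * n_cols for _ in range(n_rows)]
    back = [[-1] * n_cols for _ in range(n_rows)]
    cost[0][start_col] = image[0][start_col]
    for r in range(1, n_rows):
        for c in range(n_cols):
            for prev in (c - 1, c, c + 1):
                if 0 <= prev < n_cols and cost[r - 1][prev] + image[r][c] < cost[r][c]:
                    cost[r][c] = cost[r - 1][prev] + image[r][c]
                    back[r][c] = prev
    # Pick the cheapest end column, then follow the back-pointers upward.
    end = min(range(n_cols), key=lambda c: cost[n_rows - 1][c])
    path = [end]
    for r in range(n_rows - 1, 0, -1):
        path.append(back[r][path[-1]])
    path.reverse()
    return cost[n_rows - 1][end], path
```

Because the path may drift sideways as it descends, it can follow a slanted gap between italic characters without deslanting the image first.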

Patent
26 Feb 2004
TL;DR: In this article, a character recognition apparatus has space storage to store Eigen spaces made from a plurality of rotated character images, and loci storage is used to store loci drawn for projection points obtained by projecting the plurality of character images in corresponding eigen spaces.
Abstract: A character recognition apparatus has space storage to store Eigen spaces made from a plurality of rotated character images, loci storage to store loci drawn for projection points obtained by projecting the plurality of rotated character images in corresponding Eigen spaces; an input unit to input the recognition target images; a distance calculation unit to obtain distances between projection points obtained by projecting the recognition target images in Eigen space and respective loci for the plurality of character types, and a candidate selection unit to select candidates for images for recognition target characters from the plurality of character types based on the distance.

Journal ArticleDOI
TL;DR: This paper presents a new method of detecting ridges and ravines by using local min and max operations that uses erosion and dilation properties of fuzzy logic operations and requires no information of ridge or ravine direction.

Journal ArticleDOI
TL;DR: A hierarchical classifier and several typographical features have been devised for the system, and their effectiveness is proven by an experiment with a database of 100 sets of 264 font categories.
Abstract: Previous research efforts on optical font recognition have mostly limited applications since they deal with only a few types of font attributes and estimate them from a line or block of text. This paper proposes a word-level optical font recognition system for printed Korean and English documents. At the word level, it has the advantage of obtaining more detailed font attributes, including the following: script (Korean and English), font style (regular, bold, italic, and underlined), typeface (Myung-jo and Gothic), point size (10, 12, 14 pts), and word length (2, 3, 4, 5 for Korean, and 4 to 10 for English). A hierarchical classifier and several typographical features have been devised for the system, and their effectiveness is proven by an experiment with a database of 100 sets of 264 font categories.

BookDOI
17 May 2004
TL;DR: GA and K-means algorithm were used to select a subset of the suitable features at each node in the binary decision tree and the feature subset with maximum fitness is chosen and the patterns are divided into two classes using a linear decision function.
Abstract: This paper proposes a method to recognize the various defect patterns of a cold mill strip using a binary decision tree. In classifying complex patterns with high similarity like these defect patterns, the selection of an optimal feature set and an appropriate recognizer is a prerequisite to a high recognition rate. In this paper, GA and the K-means algorithm were used to select a subset of the suitable features at each node in the binary decision tree. The feature subset with maximum fitness is chosen and the patterns are divided into two classes using a linear decision function. This process is repeated at each node until all the patterns are classified into individual classes. In this way, the classifier using the binary decision tree can be constructed automatically, and the final recognizer is implemented by a neural network trained by standard patterns at each node. Experimental results are given to demonstrate the usefulness of the proposed scheme.

Proceedings ArticleDOI
17 May 2004
TL;DR: A novel method of multi-modal nonlinear feature reduction is proposed for the recognition of handwritten numerals that can reduce ANN training complexity and make the ANN classifier more reliable.
Abstract: A novel method of multi-modal nonlinear feature reduction is proposed for the recognition of handwritten numerals. In order to find an effective decision boundary, each class is divided into several clusters. Then the k-NN sorting algorithm is applied to each cluster to get the training data along the effective decision boundary. Optimal discriminant analysis is implemented by multimodal nonlinear mapping to generate a between-class scatter matrix, which requires less CPU time than other nonparametric approaches. Experiments demonstrated that our proposed method could achieve a high feature reduction without sacrificing much discriminant ability. As a result, this new method can reduce ANN training complexity and make the ANN classifier more reliable. Its feature dimensionality reduction outperforms the PCA and mono-modal nonparametric analysis.

BookDOI
17 May 2004
TL;DR: A new method of combining three different types of distance measures based on 4-class clustering is proposed to reduce the errors generated by each measure, and applying it achieved further improvement in word separation.
Abstract: This paper presents an efficient method of separating words in handwritten legal amounts on bank cheques based on the spatial gaps between connected components. Currently, all typical existing gap measures suffer from poor performance due to the inherent problem of underestimation and overestimation. In order to decrease such burden, a modified version of each of those existing measures is explored. Also, a new method of combining three different types of distance measures based on 4-class clustering is proposed to reduce the errors generated by each measure. In experiments on a real bank cheque database, the modified distance measures show about a 3% better separation rate than their original counterparts. In addition, by applying the combining method, further improvement in word separation was achieved.
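
As a point of reference for what a gap measure is, the sketch below computes the bounding-box gap between neighbouring connected components and splits words with a fixed threshold. Both the measure and the threshold rule are simplifying assumptions; the paper's modified measures and its 4-class clustering combination are not reproduced here.

```python
def bounding_box_gaps(boxes):
    """Horizontal gaps between neighbouring components.

    `boxes` is a list of (x_min, x_max) extents, one per connected component.
    Overlapping components give a gap of 0.
    """
    ordered = sorted(boxes)
    gaps = []
    right_edge = ordered[0][1]
    for x_min, x_max in ordered[1:]:
        gaps.append(max(0, x_min - right_edge))
        right_edge = max(right_edge, x_max)
    return gaps


def split_words(boxes, threshold):
    """Group components into words: a gap above `threshold` starts a new word."""
    ordered = sorted(boxes)
    words = [[ordered[0]]]
    for box, gap in zip(ordered[1:], bounding_box_gaps(ordered)):
        if gap > threshold:
            words.append([box])
        else:
            words[-1].append(box)
    return words
```

A bounding-box measure like this tends to underestimate gaps when slanted strokes overhang their neighbours, which is the kind of error the abstract attributes to existing measures.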

Proceedings ArticleDOI
23 Aug 2004
TL;DR: In this article, a robust recognition method for rotated character images is presented, where an eigen sub-space for each category using the covariance matrix calculated from a sufficient number of rotated patterns averaged by several fonts is constructed.
Abstract: This work presents a new robust recognition method for rotated character images. We first construct an eigen sub-space for each category using the covariance matrix calculated from a sufficient number of rotated patterns averaged by several fonts. Next, we can obtain a locus by projecting their rotated characters onto the eigen subspace and interpolating between their projected points. An unknown character is also projected onto the eigen sub-space of each category. Then, verification is carried out by calculating the distance between the projected point of the unknown character and the locus. In our experiment, we obtained quite good results for three fonts of 26 capital letters of the English alphabet.
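
The verification step, the distance from a projected point to a category's locus, can be sketched as a point-to-polyline distance. The sketch assumes linear interpolation between the projected points and a closed locus (a full rotation returns to its starting point); both are assumptions, and the eigen sub-space projection itself is taken as already done.

```python
import math


def point_to_segment(p, a, b):
    """Euclidean distance from point p to the line segment a-b (any dimension)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(x * x for x in ab)
    if denom == 0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        t = max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [ai + t * x for ai, x in zip(a, ab)]
    return math.dist(p, closest)


def distance_to_locus(p, locus, closed=True):
    """Distance from p to a locus given as an ordered list of projected points."""
    points = list(locus)
    if closed:
        points.append(points[0])  # join the last rotation back to the first
    return min(point_to_segment(p, a, b) for a, b in zip(points, points[1:]))


def nearest_category(p, loci):
    """Pick the category whose locus passes closest to the projected point.

    `loci` maps a category label to that category's locus. For simplicity the
    same point p is compared with every locus; in the paper each category has
    its own eigen sub-space, so p would be projected separately for each one.
    """
    return min(loci, key=lambda label: distance_to_locus(p, loci[label]))
```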

21 Jun 2004
TL;DR: This paper proposes a technique for applying the LDA algorithm to discrete feature sets, together with a zone-weighting strategy that incorporates information about the writing style of the sample being processed.
Abstract: This article describes several methods for improving the discriminative power of sets of discrete features. Taking advantage of the possibilities offered by Markov modeling of characters, we propose a technique that allows the LDA algorithm to be applied to our data. Different methods of segmenting the zone of interest during feature extraction are presented. Finally, a weighting strategy for these different zones is defined. It makes it possible to integrate information about the writing style of the sample being processed. The experimental results obtained show the value of the different strategies.