
Showing papers by "Ching Y. Suen" published in 1997


Proceedings ArticleDOI
18 Aug 1997
TL;DR: A hidden Markov model (HMM) based word recognition engine being developed to be integrated with the CENPARMI bank cheque processing system is described and preliminary results are compared with the previous global feature recognition scheme.
Abstract: We describe a hidden Markov model (HMM) based word recognition engine being developed to be integrated with the CENPARMI bank cheque processing system. The various modules are described in detail, and preliminary results are compared with our previous global feature recognition scheme. The engine is tested on words from a database of over 4,500 cheques of 1,400 writers.

51 citations


Journal ArticleDOI
TL;DR: A novel approach to extract data from check images is proposed based on the determination of baselines of checks, a priori information about the positions of data on checks, and a layout-driven item extraction method that is effective and performs well.
Abstract: A novel approach to extract data from check images is proposed based on the determination of baselines of checks, a priori information about the positions of data on checks, and a layout-driven item extraction method. Several techniques and algorithms have been developed in this approach including check image preprocessing, the extraction and identification of baselines, the extraction of the strokes of handwritten legal amounts, courtesy amounts and date, and the separation of strokes connected to baselines. A complete working system has been developed. The results of both testing experiments and on-line applications show that this approach is effective and the proposed techniques and algorithms perform well.

35 citations


Journal ArticleDOI
TL;DR: This paper describes a generic document segmentation and geometric relation labeling method with applications to Chinese document analysis that begins with a hierarchy of partitioned image layers where inhomogeneous higher-level regions are recursively partitioned into lower-level rectangular subregions.

23 citations


Proceedings ArticleDOI
18 Aug 1997
TL;DR: A model for reading cursive scripts is presented whose architecture is inspired by a reading model and based on perceptual concepts; the authors are now validating the model on a larger database.
Abstract: Presents a model for reading cursive scripts which has an architecture inspired by a reading model and which is based on perceptual concepts. We limit the scope of our study to the off-line recognition of isolated cursive words. First of all, we justify why we chose McClelland & Rumelhart's (1981) reading model as the inspiration for our system. A brief résumé of the method's behavior is presented and the main originalities of our model are underlined. After this, we focus on the new updates added to the original system: a new baseline extraction module, a new feature extraction module and a new generation, validation and hypothesis insertion process. After implementation of our method, new results have been obtained on real images from a training set of 184 images, and a testing set of 100 images, and are discussed. We are concentrating now on validating the model using a larger database.

21 citations


Proceedings ArticleDOI
18 Aug 1997
TL;DR: A new feature based on DDD (directional distance distribution) information is proposed, which regards the input pattern array as being circular and contains very rich information by encoding in one representation both the white/black distribution and the directional distance distribution.
Abstract: The performance of a character recognition system depends heavily on what features are being used. Though many kinds of features have been developed and their test performances on a standard database have been reported, there is still room to improve the recognition rate by developing an improved feature. The authors propose a new feature based on DDD (directional distance distribution) information. This new concept regards the input pattern array as being circular. It also contains very rich information by encoding in one representation both the white/black distribution and the directional distance distribution. A test performed on the CENPARMI handwritten numeral database showed a promising result of 97.3% recognition with a neural network classifier using the DDD feature.

17 citations
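The DDD abstract above says only that the pattern array is treated as circular and that one representation encodes both the white/black distribution and directional distances. The following is a minimal sketch of one plausible reading, not the authors' implementation. The names `ddd_pixel` and `ddd_feature`, the eight-direction ray casting, the 0 value for "nothing found", and the whole-image averaging are all assumptions made for illustration.

```python
# Hypothetical sketch of a directional-distance feature with wraparound.
# Pixels: 1 = black, 0 = white. All names and conventions are assumptions.

# Eight ray directions as (row step, column step): E, NE, N, NW, W, SW, S, SE.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def ddd_pixel(img, r, c):
    """Return (w_set, b_set) for one pixel.

    A black pixel stores its distances to the nearest white pixel in w_set;
    a white pixel stores its distances to the nearest black pixel in b_set.
    The unused set is all zeros, so the pair also records the pixel colour.
    """
    rows, cols = len(img), len(img[0])
    colour = img[r][c]
    dists = []
    for dr, dc in DIRECTIONS:
        rr, cc, found = r, c, 0  # 0 means no opposite-colour pixel on this ray
        for step in range(1, rows * cols + 1):
            # Modulo arithmetic makes the array circular in both axes.
            rr, cc = (rr + dr) % rows, (cc + dc) % cols
            if img[rr][cc] != colour:
                found = step
                break
        dists.append(found)
    zeros = [0] * 8
    return (dists, zeros) if colour == 1 else (zeros, dists)


def ddd_feature(img):
    """Average the 16 per-pixel values over the whole image (assumed pooling)."""
    rows, cols = len(img), len(img[0])
    totals = [0.0] * 16
    for r in range(rows):
        for c in range(cols):
            w_set, b_set = ddd_pixel(img, r, c)
            for i, value in enumerate(w_set + b_set):
                totals[i] += value
    return [t / (rows * cols) for t in totals]


img = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print(ddd_pixel(img, 1, 1))
print(ddd_pixel(img, 0, 0))
```

A practical system would probably pool over sub-regions of the image rather than the whole array to keep spatial information; that choice is left open here because the abstract does not state it.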


Proceedings ArticleDOI
18 Aug 1997
TL;DR: An innovative verification module is applied which drastically improves the recognition rate and a practical automatic form reading system TOCR V1.0 was developed based on the algorithms.
Abstract: The paper describes a high performance offline system for recognizing hand printed numerals. An innovative verification module is applied which drastically improves the recognition rate. The approaches used in the modules are described. The importance of the verification module is analysed in detail. A practical automatic form reading system TOCR V1.0 was developed based on the algorithms. The system was put into practical use in several provinces of China for statistical analysis of Revenue China. Test results are given based on: 1) data collected when the system was used in China, as well as 2) the CENPARMI database.

13 citations


Proceedings ArticleDOI
18 Aug 1997
TL;DR: The authors have extended existing methods to identify the language of an on-line document after the characters have been coded using 10 character classes based on visual characteristics and exploit word bigrams and trigrams in both a linear combination of score values and an expert systems approach.
Abstract: The authors have extended existing methods to identify the language of an on-line document after the characters have been coded using 10 character classes based on visual characteristics. In particular, they exploit word bigrams and trigrams in both a linear combination of score values and an expert systems approach. Knowledge about each language was acquired from a large number of on-line texts. Using a small set of rules, the expert system outperforms the linear combination in accuracy and shows more stability when parameter settings are varied.

12 citations


Book ChapterDOI
TL;DR: An elegant method for evaluating the discriminant power of features in the framework of an HMM-based word recognition system that employs statistical indicators, entropy and perplexity, to quantify the capability of each feature to discriminate between classes without resorting to the result of the recognition phase.
Abstract: This paper describes an elegant method for evaluating the discriminant power of features in the framework of an HMM-based word recognition system. This method employs statistical indicators, entropy and perplexity, to quantify the capability of each feature to discriminate between classes without resorting to the result of the recognition phase. The HMMs and the Viterbi algorithm are used as powerful tools to automatically deduce the probabilities required to compute the above mentioned quantities.

9 citations
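The entropy and perplexity indicators named in the abstract above can be illustrated with their standard Shannon definitions. This is a minimal sketch, assuming the quantities are computed over a class distribution given an observed feature value; the abstract does not give the exact formulation, and the function names and example probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution (zero terms skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def perplexity(probs):
    """Perplexity is 2 raised to the entropy: the effective number of classes."""
    return 2 ** entropy(probs)


# Hypothetical class distributions given two different feature observations.
uninformative = [0.25, 0.25, 0.25, 0.25]  # feature says nothing about the class
discriminant = [0.97, 0.01, 0.01, 0.01]   # feature points strongly to one class

print(entropy(uninformative), perplexity(uninformative))
print(entropy(discriminant), perplexity(discriminant))
```

Under this reading, a feature with lower entropy and perplexity leaves fewer plausible classes and is therefore more discriminant, which can be judged without running the recognition phase.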


Proceedings ArticleDOI
18 Aug 1997
TL;DR: A technique using each currency unit as a key word to locate/extract the legal amount in bank cheques using a Chinese cheque processing system under development at the Centre for Pattern Recognition and Machine Intelligence.
Abstract: This paper describes a Chinese cheque processing system currently under development at the Centre for Pattern Recognition and Machine Intelligence (CENPARMI). The information on Chinese bank cheques is not the same as that on alphanumeric bank cheques. The legal amount in a Chinese bank cheque is the Chinese character text associated with each currency unit. This paper discusses a technique using each currency unit as a key word to locate/extract the legal amount in bank cheques. In the analysis and recognition process, the system tries to locate the smallest currency unit in the image and identifies it first. Then, the system tries to locate the image strings associated with each currency unit. Each image string is separated and recognized. Next, a set of rules and context are applied to recognize the characters. In order to choose the correct one, the recognized character string is accepted only if it satisfies all the conditions governed by rules.

4 citations




Book ChapterDOI
TL;DR: The ideas of splitting patterns into 4 or 6 parts, integration for recognition, and selection of crucial combinations are presented, along with a new and simple algorithm to perform the crucial combinations.
Abstract: Crucial features are important for the identification of patterns. Great efforts have been devoted to discovering the most distinctive features. This paper presents the ideas of splitting patterns into 4 or 6 parts, integration for recognition, and selection of crucial combinations. The crucial combinations are explored with particular reference to 26 handprinted letters of the English alphabet. This paper proposes a new and simple algorithm to perform the crucial combinations. Also, the largest confusion regions and algorithms for finding them are provided. The most useful crucial combinations and the largest confusion combinations are listed in this paper.