Showing papers on "Intelligent word recognition" published in 2001

Open Access

Journal Article•DOI•

An Arabic optical character recognition system using recognition-based segmentation

Anthony Cheung¹, Mohammed Bennamoun¹, Neil W. Bergmann¹•Institutions (1)

01 Feb 2001-Pattern Recognition

TL;DR: An Arabic OCR system is proposed, which uses a recognition-based segmentation technique to overcome the classical segmentation problems and shows a 90% recognition accuracy with a 20 chars/s recognition rate.

124 citations

Proceedings Article•DOI•

An OCR system for Telugu

Atul Negi¹, Chakravarthy Bhagvati¹, B. Krishna¹•Institutions (1)

University of Hyderabad¹

10 Sep 2001

TL;DR: This work presents an efficient and practical approach to Telugu OCR which limits the number of templates to be recognized to just 370, avoiding issues of classifier design for thousands of shapes or very complex glyph segmentation.

Abstract: Telugu is the language spoken by more than 100 million people of South India. Telugu has a complex orthography with a large number of distinct character shapes (estimated to be of the order of 10,000) composed of simple and compound characters formed from 16 vowels (called achchus) and 36 consonants (called hallus). We present an efficient and practical approach to Telugu OCR which limits the number of templates to be recognized to just 370, avoiding issues of classifier design for thousands of shapes or very complex glyph segmentation. A compositional approach using connected components and fringe distance template matching was tested to give a raw OCR accuracy of about 92%. Several experiments across varying fonts and resolutions showed the approach to be satisfactory.

122 citations

Proceedings Article•DOI•

Separating handwritten material from machine printed text using hidden Markov models

J.K. Guo¹, M.Y. Ma²•Institutions (2)

Princeton University¹, Panasonic²

10 Sep 2001

TL;DR: An algorithm that is based on the theory of hidden Markov models (HMMs) to distinguish between machine-printed and handwritten materials is presented, which has been shown to be promising in the authors' experiments.

Abstract: In this paper, we address the problem of separating handwritten annotations from machine-printed text within a document. We present an algorithm that is based on the theory of hidden Markov models (HMMs) to distinguish between machine-printed and handwritten materials. No OCR results are required prior to or during the process, and the classification is performed at the word level. Handwritten annotations are not limited to marginal areas, as the approach can deal with document images having handwritten annotations overlaid on machine-printed text and it has been shown to be promising in our experiments. Experimental results show that the proposed method can achieve 72.19% recall for fully extracted handwritten words and 90.37% for partially extracted words. The precision of extracting handwritten words has reached 92.86%.

107 citations

Journal Article•DOI•

Model-based stroke extraction and matching for handwritten Chinese character recognition

Cheng-Lin Liu¹, In-Jung Kim², Jin H. Kim²•Institutions (2)

Hitachi¹, KAIST²

01 Dec 2001-Pattern Recognition

TL;DR: This method is able to obtain reliable stroke correspondence and enable structural interpretation; some structural post-processing operations are applied to improve the stroke correspondence.

107 citations

Proceedings Article•DOI•

Automatic recognition of printed Oriya script

Bidyut B. Chaudhuri¹, Umapada Pal¹, Mandar Mitra¹•Institutions (1)

Indian Statistical Institute¹

10 Sep 2001

TL;DR: The paper deals with an optical character recognition system for printed Oriya, a popular Indian script, that achieves 96.3% character level accuracy on average.

Abstract: The paper deals with an optical character recognition system for printed Oriya, a popular Indian script. The development of OCR for this script is difficult because a large number of characters have to be recognized. In the proposed system, the digitized document image is first passed through preprocessing modules like skew correction, line segmentation, zone detection, word and character segmentation, etc. These modules have been developed by combining some conventional techniques with some newly proposed ones. Next, individual characters are recognized using a combination of stroke and run-number based features, along with features obtained from the concept of a water reservoir. The feature detection methods are simple and robust. A prototype of the system has been tested on a variety of printed Oriya material, and currently achieves 96.3% character level accuracy on average.

105 citations

Proceedings Article•DOI•

On the influence of vocabulary size and language models in unconstrained handwritten text recognition

U.-V. Marti¹, Horst Bunke¹•Institutions (1)

University of Bern¹

01 Sep 2001

TL;DR: The difficult problem of segmenting a line of text into its individual words can be overcome and a statistical language model is integrated into the hidden Markov model framework to enhance the recognition capabilities of the system.

Abstract: In this paper we present a system for unconstrained handwritten text recognition. The system consists of three components: preprocessing, feature extraction and recognition. In the preprocessing phase, a page of handwritten text is divided into its lines and the writing is normalized by means of skew and slant correction, positioning and scaling. From a normalized text line image, features are extracted using a sliding window technique. From each position of the window nine geometrical features are computed. The core of the system, the recognizer, is based on hidden Markov models. For each individual character, a model is provided. The character models are concatenated to words using a vocabulary. Moreover, the word models are concatenated to models that represent full lines of text. Thus the difficult problem of segmenting a line of text into its individual words can be overcome. To enhance the recognition capabilities of the system, a statistical language model is integrated into the hidden Markov model framework. To preselect useful language models and compare them, perplexity is used. Both perplexity as originally proposed and normalized perplexity are considered.
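
As a rough illustration of how perplexity can be used to compare language models, the sketch below computes the standard perplexity of a bigram model on a word sequence. The toy corpus, the add-one smoothing, and the function names are assumptions made for illustration and are not taken from the paper; the normalized perplexity mentioned in the abstract is not reproduced here.

```python
import math
from collections import Counter

def train_bigram(tokens, vocab):
    """Return an add-one-smoothed bigram probability function P(word | prev)."""
    unigrams = Counter(tokens[:-1])
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))
    v = len(vocab)

    def prob(prev, word):
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + v)

    return prob

def perplexity(prob, tokens):
    """Perplexity = exp(-(1/N) * sum of log P(w_i | w_{i-1})) over N bigrams."""
    log_sum = sum(math.log(prob(p, w)) for p, w in zip(tokens[:-1], tokens[1:]))
    n = len(tokens) - 1
    return math.exp(-log_sum / n)

# Hypothetical training text; a real system would use a large corpus.
train = "the cat sat on the mat".split()
model = train_bigram(train, set(train))

seen = perplexity(model, "the cat sat".split())    # word order seen in training
unseen = perplexity(model, "mat the on".split())   # word order never seen
```

A lower perplexity means the model finds the word sequence less surprising, so `seen` comes out smaller than `unseen` on this toy data.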

48 citations

Journal Article•DOI•

Fusion of multiple handwritten word recognition techniques

Brijesh Verma¹, Paul D. Gader², Wen-Tsong Chen²•Institutions (2)

Griffith University¹, University of Missouri²

01 Jul 2001-Pattern Recognition Letters

TL;DR: Three techniques with two different conventional segmentation algorithms in conjunction with backpropagation and radial basis function neural networks have been used in this research to create a novel Borda count for fusion based on ranks and confidence values.
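
The summary above does not spell out the paper's fusion rule, so the following is only a sketch of a classical Borda count over ranked word lists, plus a confidence-weighted variant. The weighting scheme, the function name `borda_fusion`, and the toy recognizer outputs are assumptions for illustration, not the authors' formulation.

```python
from collections import defaultdict

def borda_fusion(ranked_lists, weighted=False):
    """Fuse per-recognizer ranked (word, confidence) lists with a Borda count.

    Each word earns (n - rank) points from a list of n candidates, rank 0
    being the best. With weighted=True the points are scaled by the
    recognizer's confidence, which is an illustrative assumption.
    """
    scores = defaultdict(float)
    for candidates in ranked_lists:
        n = len(candidates)
        for rank, (word, confidence) in enumerate(candidates):
            points = n - rank
            scores[word] += points * confidence if weighted else points
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical outputs of three word recognizers over the same lexicon.
outputs = [
    [("boston", 0.90), ("buffalo", 0.05), ("austin", 0.05)],
    [("buffalo", 0.40), ("boston", 0.35), ("austin", 0.25)],
    [("boston", 0.60), ("austin", 0.30), ("buffalo", 0.10)],
]
plain = borda_fusion(outputs)
weighted = borda_fusion(outputs, weighted=True)
```

In the plain variant only the rank positions matter; the weighted variant lets a recognizer that is very sure of its top choice contribute more than one that is nearly undecided.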

46 citations

Proceedings Article•DOI•

A shape based post processor for Gurmukhi OCR

Gurpreet Singh Lehal¹, Chandan Singh², R. Lehal²•Institutions (2)

Thapar University¹, Punjabi University²

01 Sep 2001

TL;DR: A shape based post processing system for an OCR of Gurmukhi script has been developed based on the size and shape of a word and an improvement of 3% in recognition rate has been reported on machine printed images using the post processing techniques.

Abstract: A shape based post processing system for an OCR of Gurmukhi script has been developed. Based on the size and shape of a word, the Punjabi corpora have been split into different partitions. The statistical information of Punjabi language syllable combination, corpora look up and holistic recognition of most commonly occurring words have been combined to design the post processor. An improvement of 3% in recognition rate, from 94.35% to 97.34%, has been reported on machine printed images using the post processing techniques.

44 citations

Proceedings Article•DOI•

An offline cursive handwritten word recognition system

Yong Haur Tay¹, P.-M. Lallican, Marzuki Khalid, Christian Viard-Gaudin, S. Knerr•Institutions (1)

Cairo University¹

19 Aug 2001

TL;DR: This paper describes an offline cursive handwritten word recognition system that combines hidden Markov models (HMM) and neural networks (NN) and presents the preprocessing and the recognition process as well as the training procedure for the NN-HMM hybrid system.

Abstract: This paper describes an offline cursive handwritten word recognition system that combines hidden Markov models (HMM) and neural networks (NN). Using a fast left-right slicing method, we generate a segmentation graph that describes all possible ways to segment a word into letters. The NN computes the observation probabilities for each letter hypothesis in the segmentation graph. Then, the HMM computes the likelihood for each word in the lexicon by summing the probabilities over all possible paths through the graph. We present the preprocessing and the recognition process as well as the training procedure for the NN-HMM hybrid system. Another recognition system based on discrete HMM is also presented for performance comparison. The latter is also used for bootstrapping the NN-HMM hybrid system. Recognition performances of the two recognition systems using two image databases of French isolated words are presented. This paper is one of the first publications using the IRONOFF database, and thus can be used as a reference for future work on this database.
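
The idea of scoring a lexicon word by summing over all paths through a segmentation graph can be sketched with a small forward recursion. Everything concrete below is an assumption for illustration: the slice indexing, the `max_span` limit, the toy scores standing in for NN outputs, and the three-word lexicon. The paper's actual HMM topology and normalization are not reproduced.

```python
def word_likelihood(word, n_slices, obs_prob, max_span=3):
    """Sum, over every path through the segmentation graph, the product of
    letter probabilities along that path (forward recursion).

    alpha[k][j] is the total probability of explaining slices [0, j) with
    the first k letters of `word`.
    """
    alpha = [[0.0] * (n_slices + 1) for _ in range(len(word) + 1)]
    alpha[0][0] = 1.0
    for k, letter in enumerate(word, start=1):
        for j in range(1, n_slices + 1):
            for i in range(max(0, j - max_span), j):
                alpha[k][j] += alpha[k - 1][i] * obs_prob(i, j, letter)
    return alpha[len(word)][n_slices]

# Hypothetical NN outputs: score of `letter` for the segment covering slices [i, j).
toy_scores = {
    (0, 1, "l"): 0.8, (0, 2, "l"): 0.1, (0, 2, "u"): 0.3,
    (1, 2, "e"): 0.6, (1, 3, "e"): 0.3,
    (2, 3, "e"): 0.2, (2, 3, "s"): 0.9, (1, 3, "s"): 0.1,
}

def obs_prob(i, j, letter):
    # Unlisted hypotheses get probability zero in this toy example.
    return toy_scores.get((i, j, letter), 0.0)

lexicon = ["les", "le", "us"]
likelihoods = {w: word_likelihood(w, 3, obs_prob) for w in lexicon}
best = max(likelihoods, key=likelihoods.get)
```

Here "le" is reachable through two different segmentations of the three slices, and both contribute to its score, which is the point of summing over paths rather than committing to one segmentation.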

41 citations

Proceedings Article•DOI•

Collection and analysis of on-line handwritten Japanese character patterns

K. Matsumoto¹, T. Fukushima², Masaki Nakagawa²•Institutions (2)

University of Tokyo¹, Tokyo University of Agriculture and Technology²

10 Sep 2001

TL;DR: This paper describes the second collection of online handwritten character patterns and their analysis, named Nakayosi, covering 4,438 categories mainly in the context of sentences and analyzed stroke number and order variations.

Abstract: This paper describes our second collection of online handwritten character patterns and their analysis. 163 writers each presented about 10,000 character patterns, covering 4,438 categories mainly in the context of sentences. Together with our first collection, the Kuchibue database containing 12,000 patterns from each of 120 writers, we have now collected about 3 million patterns. For this second collection of online patterns, named Nakayosi, we analyzed stroke number and order variations.

39 citations

Proceedings Article•DOI•

An hybrid MLP-SVM handwritten digit recognizer

A. Bellili, M. Gilloux¹, Patrick Gallinari•Institutions (1)

La Poste¹

10 Sep 2001

TL;DR: The hybrid MLP-SVM recognizer achieves a recognition rate of 98.01%, for real mail zip code digits recognition task, a performance better than several classifiers reported in recent researches.

Abstract: This paper presents an original hybrid MLP-SVM method for unconstrained handwritten digits recognition. Specialized support vector machines (SVMs) are introduced to improve significantly the multilayer perceptron (MLP) performances in local areas around the separation surfaces between each pair of digit classes, in the input pattern space. This hybrid architecture is based on the idea that the correct digit class almost systematically belongs to the two maximum MLP outputs and that some pairs of digit classes constitute the majority of MLP substitutions (errors). Specialized local SVMs are introduced to detect the correct class among these two classification hypotheses. The hybrid MLP-SVM recognizer achieves a recognition rate of 98.01%, for real mail zip code digits recognition task, a performance better than several classifiers reported in recent researches.
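
The decision logic described in this abstract (take the two highest MLP outputs, then let a pair-specific SVM choose between them) can be sketched as follows. The function names, the dictionary of specialists, and the toy models are hypothetical stand-ins; no trained MLP or SVM is involved, and the fallback to the top MLP class when no specialist exists is an assumption.

```python
def hybrid_decision(pattern, mlp, pair_svms):
    """Choose a digit class from MLP scores, refined by a pairwise specialist.

    mlp: callable returning one score per class for `pattern`.
    pair_svms: dict mapping a sorted class pair (a, b) to a callable whose
               positive output votes for a and non-positive output for b.
    """
    scores = mlp(pattern)
    # Step 1: the two classes with the highest MLP outputs.
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    top1, top2 = ranked[0], ranked[1]
    pair = (min(top1, top2), max(top1, top2))
    # Step 2: defer to a specialist only for pairs that have one (assumption).
    svm = pair_svms.get(pair)
    if svm is None:
        return top1
    return pair[0] if svm(pattern) > 0 else pair[1]

# Hypothetical stand-ins for trained models.
def toy_mlp(pattern):
    # Pretend the MLP hesitates between 4 and 9 for every input.
    scores = [0.01] * 10
    scores[4], scores[9] = 0.48, 0.45
    return scores

def toy_svm_4_vs_9(pattern):
    # Positive votes for 4, negative for 9; a closed top loop suggests 9.
    return -1.0 if pattern["closed_top_loop"] else 1.0

specialists = {(4, 9): toy_svm_4_vs_9}
nine = hybrid_decision({"closed_top_loop": True}, toy_mlp, specialists)
four = hybrid_decision({"closed_top_loop": False}, toy_mlp, specialists)
fallback = hybrid_decision({"closed_top_loop": True}, toy_mlp, {})
```

The specialist can overturn the MLP's first choice (`nine`), confirm it (`four`), or be absent, in which case the MLP's top class is returned unchanged (`fallback`).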

Proceedings Article•DOI•

Synthetic data for Arabic OCR system development

Volker Märgner¹, Mario Pechwitz¹•Institutions (1)

Braunschweig University of Technology¹

01 Sep 2001

TL;DR: A system for the automatic generation of synthetic databases for the development or evaluation of Arabic word or text recognition systems (Arabic OCR) is presented and special problems caused by specific features of Arabic, such as printing from right to left, many diacritical points, variation in the height of characters, and changes in the relative position to the writing line are suggested.

Abstract: A system for the automatic generation of synthetic databases for the development or evaluation of Arabic word or text recognition systems (Arabic OCR) is presented. The proposed system works without any scanning of printed paper. Firstly Arabic text has to be typeset using a standard typesetting system. Secondly a noise-free bitmap of the document and the corresponding ground truth (GT) is automatically generated. Finally, an image distortion can be superimposed to the character or word image to simulate the expected real world noise of the intended application. All necessary modules are presented together with some examples. Special problems caused by specific features of Arabic, such as printing from right to left, many diacritical points, variation in the height of characters, and changes in the relative position to the writing line, are suggested. The synthetic data set was used to train and test a recognition system based on hidden Markov model (HMM), which was originally developed for German cursive script, for Arabic printed words. Recognition results with different synthetic data sets are presented.

Patent•DOI•

Speech recognition system, speech recognition server, speech recognition client, their control method, and computer readable memory

Akihiro Kushida¹, Tetsuo Kosaka¹•Institutions (1)

Canon Inc.¹

27 Nov 2001-Journal of the Acoustical Society of America

TL;DR: In this article, a dictionary management unit looks up an identifier table to determine a recognition dictionary corresponding to the dictionary management information received from a client from a plurality of kinds of recognition dictionaries.

Abstract: A user dictionary, which is formed by storing pronunciations and notations of target recognition words designated by the user in correspondence with each other, input speech recognition data, and dictionary management data used to determine the recognition field of a recognition dictionary used in recognition of the speech recognition data are sent to a server via a communication module. In the server, a dictionary management unit looks up an identifier table to determine a recognition dictionary corresponding to the dictionary management information received from a client from a plurality of kinds of recognition dictionaries. A speech recognition module recognizes the speech recognition data using at least the determined recognition dictionary. The recognition result is sent to the client via a communication module.

Proceedings Article•DOI•

Probabilistic model for segmentation based word recognition with lexicon

Sergey Tulyakov¹, Venu Govindaraju²•Institutions (2)

State University of New York System¹, University at Buffalo²

01 Sep 2001

TL;DR: The construction of a model for off-line word recognizers based on over-segmentation of the input image and recognition of segment combinations as characters in a given lexicon word is described.

Abstract: We describe the construction of a model for off-line word recognizers based on over-segmentation of the input image and recognition of segment combinations as characters in a given lexicon word. One such recognizer, the Word Model Recognizer (WMR), is used extensively. Based on the proposed model it was possible to improve the performance of WMR.

Patent•

Word recognition method and storage medium that stores word recognition program

Tomoyuki Hamamura¹•Institutions (1)

Toshiba¹

26 Jan 2001

TL;DR: In this paper, a recognition process is executed for each character of an input character string corresponding to a word to be recognized, and the probability that the resulting character-recognition feature appears is determined, conditioned on each character of each word in a word dictionary that stores candidates for the word to be recognized.

Abstract: A recognition process is executed for each character of an input character string corresponding to a word to be recognized, and a probability is determined that the feature appears, which is obtained as a result of character recognition using, as a condition, each character of each word in a word dictionary having stored therein candidates of the word to be recognized, and this probability is divided by a probability that the feature obtained as a result of character recognition appears. Each division result obtained for each character of each word in the word dictionary is multiplied for all the characters, and all the multiplication results obtained for each word in the word dictionary are added. Then, the multiplication result obtained for each word in the word dictionary is divided by the addition result, and based on this result, the recognition result of the particular word is obtained.
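
Read as arithmetic, the abstract describes a per-word product of ratios P(feature | character) / P(feature), followed by normalization over the dictionary. A minimal sketch of that calculation is given below. The three-letter alphabet, the confusion probabilities, the uniform feature prior, and the restriction to words whose length matches the input are all assumptions for illustration and are not stated in the patent.

```python
from math import prod

def word_posteriors(features, dictionary, p_feature_given_char, p_feature):
    """Score each dictionary word by the product over character positions of
    P(feature | character) / P(feature), then normalize over all words."""
    ratios = {}
    for word in dictionary:
        if len(word) != len(features):
            continue  # assumption: this sketch only compares equal-length words
        ratios[word] = prod(
            p_feature_given_char(r, c) / p_feature(r)
            for r, c in zip(features, word)
        )
    total = sum(ratios.values())
    if total == 0:
        return {}
    return {word: ratio / total for word, ratio in ratios.items()}

# Hypothetical character recognizer over the alphabet {a, b, c}:
# it reports the true character 80% of the time and each other one 10%.
def p_feature_given_char(result, char):
    return 0.8 if result == char else 0.1

def p_feature(result):
    return 1 / 3  # assumed uniform prior over recognition results

posteriors = word_posteriors("cab", ["cab", "cac", "bab"],
                             p_feature_given_char, p_feature)
```

On this toy input the word that matches the recognition results at every position receives most of the probability mass, and the scores sum to one across the dictionary.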

Proceedings Article•DOI•

Single-character type identification

Yefeng Zheng¹, Changsong Liu¹, Xiaoqing Ding¹•Institutions (1)

Tsinghua University¹

18 Dec 2001

TL;DR: This paper addresses the problem of character type identification independent of its content, including handwritten/printed Chinese character identification and printed Chinese/English character identification, based on only one character, by exploiting some effective features of OCR technologies.

Abstract: Different character recognition problems have their own specific characteristics. State-of-the-art OCR technologies use different recognition approaches, each most effective for a particular type of character. How to identify the character type automatically and then apply a specific recognition engine has not attracted enough attention among researchers. Most of the limited research is based on the whole document image, a block of text or a text line. This paper addresses the problem of character type identification independent of its content, including handwritten/printed Chinese character identification, and printed Chinese/English character identification, based on only one character. Exploiting some effective features, such as run-length histogram features and stroke density histogram features, we have obtained very promising results. The correct identification rate is higher than 98% in our experiments.

Proceedings Article•DOI•

Handwritten month word recognition on Brazilian bank cheques

M. E. Morita, A. El Yacoubi, Robert Sabourin, Flávio Bortolozzi, Ching Y. Suen

10 Sep 2001

TL;DR: An off-line system under development to process unconstrained handwritten dates on Brazilian bank cheques in an omni-writer context and shows improvements on previous work on isolated month word recognition using hidden Markov models (HMM).

Abstract: This paper describes an off-line system under development to process unconstrained handwritten dates on Brazilian bank cheques in an omni-writer context. We show here some improvements on our previous work on isolated month word recognition using hidden Markov models (HMM). After preprocessing, a word image is explicitly segmented into characters or pseudo-characters and represented by two feature sequences of equal length, which are combined using HMM. The word models are generated from the concatenation of appropriate character models. In addition to the small date database, we also make use of the legal amount database to increase the frequency of characters in the training and the validation sets. Although this study deals with a limited lexicon, the many similarities among the word classes can affect the performance of the recognition. Experiments show an increase in the average recognition rate from 84% to 91%. Finally, we present our perspectives of future work.

Journal Article•DOI•

A rotation invariant printed Chinese character recognition system

Tai-Ning Yang¹, Sheng-De Wang¹•Institutions (1)

National Taiwan University¹

01 Feb 2001-Pattern Recognition Letters

TL;DR: The proposed system has a three-stage structure designed mainly to reduce the time complexity in the recognition process and aims to recognize the complete set of frequently used 13 053 printed Chinese characters with arbitrary orientations.

Proceedings Article•DOI•

Handwritten Chinese character segmentation using a two-stage approach

Shuyan Zhao, Zheru Chi¹, Pengfei Shi², Qing Wang¹•Institutions (2)

Hong Kong Polytechnic University¹, Shanghai Jiao Tong University²

01 Sep 2001

TL;DR: In this paper, a two-stage approach is addressed to segment unconstrained handwritten Chinese character strings by using fuzzy decision rules learned from examples to evaluate the segmentation paths.

Abstract: Correct segmentation of handwritten Chinese characters is crucial to the successful recognition. However, because of the many difficulties involved, little work has been done in this area. In this paper, a two-stage approach is addressed to segment unconstrained handwritten Chinese character strings. A string is first coarsely segmented according to the background skeleton and vertical projection after a proper image preprocessing. At the fine segmentation stage that follows, the strokes that may contain segmentation points are first identified. The feature points are then extracted from candidate strokes and taken as segmentation point candidates through each of which a segmentation path may be formed. Geometric features are extracted and fuzzy decision rules learned from examples are used to evaluate the segmentation paths. By using this two-stage segmentation approach, we can achieve both good performance and efficiency in segmenting unconstrained handwritten Chinese characters.

Proceedings Article•DOI•

Speeding up on-line recognition of handwritten characters by pruning the prototype set

V. Vuori¹, Jorma Laaksonen¹, Erkki Oja¹, Jari Kangas²•Institutions (2)

Helsinki University of Technology¹, Nokia²

01 Sep 2001

TL;DR: This work describes a prototype-based online handwritten character recognition system and a two-phase recognition scheme aimed to speed up the recognition.

Abstract: This work describes a prototype-based online handwritten character recognition system and a two-phase recognition scheme aimed to speed up the recognition. In the first phase, the prototype set is pruned and ordered on the basis of preclassification performed with heavily down-sampled characters and prototypes. In the second phase, the final classification is performed without down-sampling by using the reduced set of prototypes. Two down-sampling methods, a linear and nonlinear one, have been analyzed to see their properties regarding the recognition time and accuracy.
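
The two-phase scheme can be sketched as a coarse ranking on down-sampled data followed by full-resolution matching against a shortlist. The point-to-point squared distance, the `step` and `keep` parameters, and the toy stroke prototypes below are assumptions for illustration; the abstract does not state the system's actual dissimilarity measure, and only linear down-sampling is shown.

```python
def downsample(points, step):
    """Linear down-sampling: keep every `step`-th point of the stroke."""
    return points[::step]

def distance(a, b):
    """Sum of squared point-to-point distances over the shorter sequence.
    A simple stand-in for the system's real dissimilarity measure."""
    return sum((xa - xb) ** 2 + (ya - yb) ** 2
               for (xa, ya), (xb, yb) in zip(a, b))

def two_phase_classify(sample, prototypes, step=4, keep=2):
    """prototypes: list of (label, points) pairs."""
    # Phase 1: rank prototypes on heavily down-sampled data, keep the best few.
    coarse = downsample(sample, step)
    ranked = sorted(prototypes,
                    key=lambda p: distance(coarse, downsample(p[1], step)))
    shortlist = ranked[:keep]
    # Phase 2: full-resolution matching against the reduced set only.
    label, _ = min(shortlist, key=lambda p: distance(sample, p[1]))
    return label

# Hypothetical single-stroke prototypes with eight points each.
prototypes = [
    ("-", [(i, 0) for i in range(8)]),
    ("|", [(0, i) for i in range(8)]),
    ("/", [(i, i) for i in range(8)]),
]
sample = [(0.5, i) for i in range(8)]  # a slightly shifted vertical stroke
label = two_phase_classify(sample, prototypes)
```

With `step=4` the first phase compares only two points per character, so most prototypes are discarded cheaply before the more expensive full-resolution comparison runs.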

Journal Article•DOI•

A fuzzy approach to 2D-shape recognition

B. Lazzerini, Francesco Marcelloni¹•Institutions (1)

University of Pisa¹

01 Feb 2001-IEEE Transactions on Fuzzy Systems

TL;DR: Two significant applications of the fuzzy classification and recognition of 2D shapes, such as handwritten characters, image contours, etc., are described, namely, recognition of olfactory signals and recognition of isolated, handwritten characters.

Abstract: This paper describes a method for fuzzy classification and recognition of 2D shapes, such as handwritten characters, image contours, etc. A fuzzy model is derived for each considered shape from a fuzzy description of a set of instances of this shape. A fuzzy description of a shape instance, in its turn, exploits appropriate fuzzy partitions of the two dimensions of the shape. These fuzzy partitions allow us to identify, and automatically associate an importance degree with the relevant shape zones for classification and recognition purposes. Two significant applications of the method are described, namely, recognition of olfactory signals and recognition of isolated, handwritten characters. In the former case, results are shown concerning the recognition of three different types of waste waters, collected in three different dilutions. In the latter case, results are shown concerning the application of the method to a NIST database, containing the segmented handprinted characters of 500 writers.

Proceedings Article•DOI•

A radical approach to handwritten Chinese character recognition using active handwriting models

Daming Shi¹, Steve R. Gunn¹, Robert I. Damper¹•Institutions (1)

University of Southampton¹

01 Dec 2001

TL;DR: The AHM is used within a radical approach to handwritten Chinese characters recognition, which converts the complex pattern recognition problem to recognizing a small set of primitive structures (radicals) and achieves superior performance.

Abstract: This paper applies active handwriting models (AHM) to handwritten Chinese character recognition. Exploiting active shape models (ASM), the AHM can capture the handwriting variation from character skeletons. The AHM has the following characteristics: principal component analysis is applied to capture variations caused by handwriting, an energy functional on the basis of chamfer distance transform is introduced as a criterion to fit the model to a target character skeleton, and the dynamic tunneling algorithm (DTA) is incorporated with gradient descent to search for shape parameters. The AHM is used within a radical approach to handwritten Chinese characters recognition, which converts the complex pattern recognition problem to recognizing a small set of primitive structures (radicals). Our initial experiments are conducted on 98 radicals covering 1400 loosely-constrained Chinese character categories written by 200 different writers. The correct matching rate is 94.2% on these 2.8×10⁵ characters. Comparison with existing radical approaches shows that our method achieves superior performance.

Journal Article•DOI•

A discrete contextual stochastic model for the off-line recognition of handwritten Chinese characters

Yan Xiong¹, Qiang Huo², Chorkin Chan²•Institutions (2)

Hewlett-Packard¹, University of Hong Kong²

01 Jul 2001-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A discrete contextual stochastic model for complex and variant patterns like handwritten Chinese characters for character recognition is studied and a formulation for discriminative training of CS model parameters is introduced and its practical usage investigated.

Abstract: We study a discrete contextual stochastic (CS) model for complex and variant patterns like handwritten Chinese characters. Three fundamental problems of using CS models for character recognition are discussed, and several practical techniques for solving these problems are investigated. A formulation for discriminative training of CS model parameters is also introduced and its practical usage investigated. To illustrate the characteristics of the various algorithms, comparative experiments are performed on a recognition task with a vocabulary consisting of 50 pairs of highly similar handwritten Chinese characters. The experimental results confirm the effectiveness of the discriminative training for improving recognition performance.

Journal Article•DOI•

Recognizing Thai handwritten characters and words for human-computer interaction

Chomtip Pornpanomchai¹, Dentcho N. Batanov¹, Nicholas J. Dimmitt¹•Institutions (1)

Asian Institute of Technology¹

01 Sep 2001-International Journal of Human-computer Studies / International Journal of Man-machine Studies

TL;DR: This paper proposes a non-keyboard computer interaction by using a write-pen or mouse to write Thai handwritten characters and words, using a feature-based, fuzzy logic and object-oriented approach (FBFLOOA) to recognize on-line handwritten Thai characters and words.

Abstract: Normally, people use a keyboard to interact with a computer. This type of interaction has two main problems: typing speed and typing error. This paper proposes a non-keyboard computer interaction by using a write-pen or mouse to write Thai handwritten characters and words, using a feature-based, fuzzy logic and object-oriented approach (FBFLOOA) to recognize on-line handwritten Thai characters and words. The feature-based concept is used to extract handwritten character features, the fuzzy logic set is used to identify uncertain handwritten character shapes and the object-oriented approach is used to analyse, design and implement a handwritten character and word recognition program. Two phases of Thai handwritten character and word recognition are proposed. The first phase uses only the FBFLOOA to recognize a handwritten character and the second phase uses FBFLOOA combined with a Thai dictionary file to seek a correct answer for a rejected recognition character. The first phase experimental results show a recognition accuracy of 89.24%, 9.20% misrecognition and 1.56% rejection. The second phase precision results are 97.82%, 0.62% misrecognition and 1.56% rejection. Both phases have an average recognition speed of 6.72 s per character. The FBFLOOA-executed program size is 189 KB and the Thai dictionary file is 853 KB, which makes FBFLOOA available for notebooks, mobile phones, calculators and pocket computers.

Proceedings Article•DOI•

An analytical handwritten word recognition system with word-level discriminant training

Yong Haur Tay¹, P.-M. Lallican, Marzuki Khalid, S. Knerr, Christian Viard-Gaudin•Institutions (1)

Universiti Teknologi Malaysia¹

10 Sep 2001

TL;DR: An analytical handwritten word recognition system combining neural networks (NN) and hidden Markov models (HMM) and a fast left-right slicing method that describes all possible ways to segment a word into characters is described.

Abstract: We describe an analytical handwritten word recognition system combining neural networks (NN) and hidden Markov models (HMM). Using a fast left-right slicing method, we generate a segmentation graph that describes all possible ways to segment a word into characters. The NN computes the observation probabilities for each character hypothesis in the segmentation graph. Then, using concatenated character HMMs, a likelihood is computed for each word in the lexicon by multiplying the observation probabilities over the best path through the graph. The role of the NN is to recognize characters and to reject non-characters. We present our approach to globally train the word recognizer using isolated word images. Using a maximum mutual information (MMI) cost function at the word level, the discriminant training updates the parameters of the NN within a global optimization process based on gradient descent. The recognizer is bootstrapped from a baseline recognition system, which is based on character level training. The recognition performance of the globally trained system is compared to the baseline system.

Patent•

Method of using empirical substitution data in speech recognition

Matthew W. Hartley¹, James R. Lewis¹•Institutions (1)

IBM¹

31 May 2001

TL;DR: A method of speech recognition that receives at least one spoken word, performs speech recognition to determine a recognition result, and compares the spoken word to that result to determine whether the result is an incorrectly recognized word.

Abstract: A method of speech recognition can include receiving at least one spoken word and performing speech recognition to determine a recognition result. The spoken word can be compared to the recognition result to determine if the recognition result is an incorrectly recognized word. The spoken word can be identified as an alternate word candidate for the incorrectly recognized word.
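
The compare-and-record step in this abstract can be sketched as follows. This is an illustration only, not the patented method: the class `SubstitutionTable`, its methods, and the use of occurrence counts to rank alternates are assumptions, and the sketch assumes the actually spoken word is known (for example from a user correction).

```python
from collections import Counter, defaultdict

class SubstitutionTable:
    """Hypothetical store of observed misrecognitions."""

    def __init__(self):
        # recognized word -> Counter of words that were actually spoken
        self._counts = defaultdict(Counter)

    def record(self, spoken, recognized):
        """Compare the spoken word with the recognition result.

        Returns True and stores the spoken word as an alternate candidate
        when the result was an incorrectly recognized word.
        """
        if spoken == recognized:
            return False
        self._counts[recognized][spoken] += 1
        return True

    def alternates(self, recognized, n=3):
        """Most frequently observed alternates for a recognized word."""
        counter = self._counts.get(recognized)
        if counter is None:
            return []
        return [word for word, _ in counter.most_common(n)]

table = SubstitutionTable()
table.record("write", "right")
table.record("write", "right")
table.record("rite", "right")
print(table.alternates("right"))  # prints ['write', 'rite']
```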

Proceedings Article•DOI•

A segmentation method for touching Japanese handwritten characters based on connecting condition of lines

T. Yamaguchi¹, Tomohiro Yoshikawa, Tsuyoshi Shinogi, Shinji Tsuruoka, M. Teramoto•Institutions (1)

Mie University¹

10 Sep 2001

TL;DR: A new segmentation method based on connecting condition of lines at a touching point is proposed, and the efficiency of this method for touching Japanese handwritten characters is evaluated.

Abstract: In unconstrained Japanese handwritten character strings, there are many touching characters. Segmentation of touching characters is required as preprocessing for isolated character recognition, but conventional segmentation methods cannot segment complicated touching characters. In this paper we propose a new segmentation method based on the connecting condition of lines at a touching point, and evaluate the efficiency of this method for touching Japanese handwritten characters. This method could segment complicated touching characters with fewer unnecessary segmentations.

Book Chapter•DOI•

Active Handwritten Character Recognition Using Genetic Programming

Ankur Teredesai¹, J. Park¹, Venu Govindaraju¹•Institutions (1)

University at Buffalo¹

18 Apr 2001

TL;DR: This paper proposes an implementation with dynamism in pre-processing and classification of handwritten digit images, aiming at better accuracy and processing time per image, and compares it with passive and active handwritten digit classification schemes based on other pattern recognition techniques.

Abstract: This paper is intended to demonstrate the effective use of genetic programming in handwritten character recognition. When the resources utilized by the classifier increase incrementally and depend on the complexity of the classification task, we term such a classifier active. The design and implementation of active classifiers based on genetic programming principles becomes very simple and efficient. Genetic programming has helped optimize the handwritten character recognition problem in terms of feature set selection. We propose an implementation with dynamism in pre-processing and classification of handwritten digit images. This paradigm will supplement existing methods by providing better performance in terms of accuracy and processing time per image for classification. Different levels of informative detail can be present in image data and our proposed paradigm helps highlight these information-rich zones. We compare our performance with passive and active handwritten digit classification schemes that are based on other pattern recognition techniques.

Journal Article•DOI•

A novel invariant mapping applied to hand-written arabic character recognition

Nawwaf Kharma¹, Rabab K. Ward•Institutions (1)

Concordia University¹

01 Nov 2001-Pattern Recognition

TL;DR: An application of a novel mapping, one that is intended for use in on-line hand-written character recognition, that produces the same output pattern regardless of the orientation, position, and size of the input pattern is described.

Proceedings Article•DOI•

Alignment of free layout color texts for character recognition

H. Hase¹, Masaaki Yoneda¹, T. Shinokawa², Ching Y. Suen³•Institutions (3)

University of Toyama¹, Toyama National College of Maritime Technology², Concordia University³

01 Sep 2001

TL;DR: A realignment algorithm for irregular character strings on color documents is proposed that realigns all the characters in a text horizontally, then tests them with an ordinary character recognition method.

Abstract: A realignment algorithm for irregular character strings on color documents is proposed. Color documents often contain poorly aligned texts such as inclined or curved texts sometimes with distortion. In order to recognize them, we classify these texts into five types. After determining the type, we realign all the characters in a text horizontally, then test them with an ordinary character recognition method. Lastly, we show some experimental results for texts extracted from real color documents and discuss some causes of misrecognition.
