
Showing papers in "International Journal on Document Analysis and Recognition in 2013"


Journal ArticleDOI
Nicholas R. Howe1
TL;DR: An automatic technique for setting parameters in a manner that tunes them to the individual image, yielding a final binarization algorithm that can cut total error by one-third with respect to the baseline version is described.
Abstract: Document analysis systems often begin with binarization as a first processing stage. Although numerous techniques for binarization have been proposed, the results produced can vary in quality and often prove sensitive to the settings of one or more control parameters. This paper examines a promising approach to binarization based upon simple principles, and shows that its success depends most significantly upon the values of two key parameters. It further describes an automatic technique for setting these parameters in a manner that tunes them to the individual image, yielding a final binarization algorithm that can cut total error by one-third with respect to the baseline version. The results of this method advance the state of the art on recent benchmarks.

185 citations
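
The abstract above does not detail the tuning procedure, so the following is a rough illustration only: a minimal Python sketch of one generic way to tune two binarization parameters per image, using a stability heuristic that prefers the setting whose output changes least under small parameter perturbations. The binarizer, parameter names, and values are all hypothetical, not the paper's algorithm.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def binarize(img, t, c):
        # Stand-in two-parameter binarizer: local mean threshold with
        # window size t and offset c (both hypothetical parameters).
        local_mean = uniform_filter(img.astype(float), size=t)
        return img < local_mean - c

    def tune(img, ts=(15, 31, 63), cs=(2, 5, 10)):
        # Prefer the (t, c) whose output disagrees least with the
        # outputs at neighboring parameter values (stability).
        best, best_score = None, np.inf
        for i, t in enumerate(ts):
            for j, c in enumerate(cs):
                b = binarize(img, t, c)
                neighbors = [(ts[max(i - 1, 0)], c), (ts[min(i + 1, len(ts) - 1)], c),
                             (t, cs[max(j - 1, 0)]), (t, cs[min(j + 1, len(cs) - 1)])]
                score = sum(np.mean(b ^ binarize(img, t2, c2)) for t2, c2 in neighbors)
                if score < best_score:
                    best, best_score = (t, c), score
        return best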


Journal ArticleDOI
TL;DR: This survey, based on an extensive review of the literature, is the first to focus on online Arabic handwriting recognition and to provide recognition rates and descriptions of the databases used by the discussed approaches.
Abstract: Research on handwriting recognition has received great attention, since it is considered a technological revolution in man-machine interfaces, especially as handwriting persists as the most widely used means of communicating and recording information in day-to-day life. The challenging nature of handwriting recognition and segmentation has attracted researchers from both academic and industrial circles. The bulk of this research deals with Latin and Chinese scripts; interest in Arabic script came years later, so the state of the art is less advanced. This survey describes the nature of handwritten Arabic and the basic concepts behind the recognition process. An overview of the state of the art in online Arabic handwriting recognition is presented, based on an extensive review of the literature, covering the background of the field, a discussion of the methods, and future research directions. It is the first survey to focus on online Arabic handwriting recognition and to provide recognition rates and descriptions of the databases used by the discussed approaches.

94 citations


Journal ArticleDOI
TL;DR: Experiments performed with real-world SV data comprising random, simple, and skilled forgeries indicate that the proposed approach provides a high level of performance when extended shadow code and directional probability density function features are extracted at multiple scales.
Abstract: Some of the fundamental problems faced in the design of signature verification (SV) systems include the potentially large number of input features and users, the limited number of reference signatures for training, the high intra-personal variability among signatures, and the lack of forgeries as counterexamples. In this paper, a new approach for feature selection is proposed for writer-independent (WI) off-line SV. First, one or more preexisting techniques are employed to extract features at different scales. Multiple feature extraction increases the diversity of information produced from signature images, making it possible to produce signature representations that mitigate intra-personal variability. Dichotomy transformation is then applied in the resulting feature space to allow for WI classification. This alleviates the challenges of designing off-line SV systems with a limited number of reference signatures from a large number of users. Finally, boosting feature selection is used to design low-cost classifiers that automatically select relevant features while training. This global WI feature selection approach makes it possible to explore and select from large feature sets based on knowledge of a population of users. Experiments performed with real-world SV data comprising random, simple, and skilled forgeries indicate that the proposed approach provides a high level of performance when extended shadow code and directional probability density function features are extracted at multiple scales. Comparing simulation results to those of off-line SV systems found in the literature confirms the viability of the new approach, even when few reference signatures are available. Moreover, it provides an efficient framework for designing a wide range of biometric systems from limited samples with few or no counterexamples, but where new training samples emerge during operations.

90 citations
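
The dichotomy transformation mentioned above converts a multi-writer problem into a single two-class problem: pairs of feature vectors are mapped into a distance space and labeled by whether the two samples come from the same writer. A minimal sketch under that reading (array names and shapes are illustrative):

    import numpy as np

    def dichotomy_transform(features, writer_ids):
        # Map each pair of samples (u, v) to |u - v|; label 1 if both
        # samples come from the same writer (within class), else 0.
        X, y = [], []
        n = len(features)
        for i in range(n):
            for j in range(i + 1, n):
                X.append(np.abs(features[i] - features[j]))
                y.append(int(writer_ids[i] == writer_ids[j]))
        return np.array(X), np.array(y)

A single binary classifier trained on (X, y), for example boosted decision stumps that also perform the feature selection described above, then serves every writer, which is what makes the approach writer-independent.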


Journal ArticleDOI
TL;DR: A description of the characteristics of the Arabic script is presented, with an overview of OCR systems and a comprehensive review focused on off-line printed Arabic character segmentation techniques.
Abstract: Arabic character segmentation is a necessary step in Arabic Optical Character Recognition (OCR). The cursive nature of Arabic script poses challenging problems in Arabic character recognition; moreover, incorrectly segmented characters cause misclassifications, which in turn may lead to wrong results. Off-line Arabic character segmentation is therefore a difficult research problem, and relatively little research has been done in this area in the past few decades. This is due both to the cursive nature of Arabic writing in printed and handwritten forms and to the scarcity of Arabic databases and dictionaries. Most of the character recognition methods used for Arabic characters are adapted from methods used on handwritten Latin and Chinese characters; other methods, however, have been developed specifically for Arabic character segmentation. This survey presents a description of the characteristics of the Arabic script, with an overview of OCR systems and a comprehensive review focused on off-line printed Arabic character segmentation techniques.

87 citations


Journal ArticleDOI
TL;DR: A fast, incremental parsing algorithm is developed, motivated by the two-dimensional structure of written mathematics, and a correction mechanism is developed that allows users to navigate parse results and choose the correct interpretation in case of recognition errors or ambiguity.
Abstract: We present a new approach for parsing two-dimensional input using relational grammars and fuzzy sets. A fast, incremental parsing algorithm is developed, motivated by the two-dimensional structure of written mathematics. The approach reports all identifiable parses of the input. The parses are represented as a fuzzy set, in which the membership grade of a parse measures the similarity between it and the handwritten input. To identify and report parses efficiently, we adapt and apply existing techniques such as rectangular partitions and shared parse forests, and introduce new ideas such as relational classes and interchangeability. We also present a correction mechanism that allows users to navigate parse results and choose the correct interpretation in case of recognition errors or ambiguity. Such corrections are incorporated into subsequent incremental recognition results. Finally, we include two empirical evaluations of our recognizer. One uses a novel user-oriented correction count metric, while the other replicates the CROHME 2011 math recognition contest. Both evaluations demonstrate the effectiveness of our proposed approach.

86 citations


Journal ArticleDOI
TL;DR: This paper presents a general road vectorization approach that exploits common geometric properties of roads in maps for processing heterogeneous raster maps while requiring minimal user intervention.
Abstract: Raster maps are easily accessible and contain rich road information; however, converting the road information to vector format is challenging because of varying image quality, overlapping features, and a typical lack of metadata (e.g., map geocoordinates). Previous road vectorization approaches for raster maps typically handle a specific map series and require significant user effort. In this paper, we present a general road vectorization approach that exploits common geometric properties of roads in maps for processing heterogeneous raster maps while requiring minimal user intervention. In our experiments, we compared our approach to a widely used commercial product using 40 raster maps from 11 sources. We showed that, overall, our approach generated high-quality, low-redundancy results with considerably less user input than competing approaches.

47 citations
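
One common building block of road vectorization, hinted at by the abstract above, is reducing road areas to centerlines before tracing vector segments. A minimal sketch of that single step only (not the paper's pipeline), assuming a binary road-pixel mask is already available:

    import numpy as np
    from skimage.morphology import skeletonize

    def centerline_points(road_mask):
        # Thin road regions to one-pixel-wide centerlines, then return
        # the centerline pixels as (x, y) points ready for line tracing.
        skel = skeletonize(road_mask.astype(bool))
        ys, xs = np.nonzero(skel)
        return list(zip(xs.tolist(), ys.tolist()))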


Journal ArticleDOI
TL;DR: Experimental results with varying TAS(S) representation parameters on two publicly available signature databases show that the selected features, combined with the chosen elastic distance measure, improve the equal error rate of the online signature verification task.
Abstract: Online signature verification has been intensively investigated in several directions, such as the selected feature(s), similarity estimation, and classification method. Local feature approaches combined with elastic distance metrics have shown the most successful performance so far. The Turning Angle Sequence (TAS) feature has not been extensively explored for signature verification, while the fusion of TASs at different scales, the Turning Angle Scale Space (TASS), is a new approach in this field. In this paper, we study the TAS and TASS representations of signatures and their application to online signature verification. In the matching stage, a variation of the longest common sub-sequence matching technique is employed. Experimental results with varying TAS(S) representation parameters on two publicly available signature databases, SVC2004 and SUSIG, show that the selected features, combined with the chosen elastic distance measure, improve the equal error rate of the online signature verification task.

47 citations
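
The turning angle sequence and the elastic matching can both be stated compactly. Below is a minimal sketch: a TAS computed from a sampled pen trajectory (ignoring angle wrap-around for brevity) and the standard longest-common-sub-sequence similarity with tolerance eps. The paper uses a variation of this matching, and the parameter values here are arbitrary.

    import numpy as np

    def turning_angles(points):
        # Angle of each stroke segment relative to the previous segment.
        p = np.asarray(points, dtype=float)
        d = np.diff(p, axis=0)
        return np.diff(np.arctan2(d[:, 1], d[:, 0]))

    def lcss(a, b, eps=0.3):
        # Longest common sub-sequence length between two angle
        # sequences; two angles match when within eps radians.
        n, m = len(a), len(b)
        L = np.zeros((n + 1, m + 1), dtype=int)
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                if abs(a[i - 1] - b[j - 1]) <= eps:
                    L[i, j] = L[i - 1, j - 1] + 1
                else:
                    L[i, j] = max(L[i - 1, j], L[i, j - 1])
        return L[n, m]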


Journal ArticleDOI
TL;DR: A novel segmentation-free Arabic handwriting recognition system based on hidden Markov models (HMMs) that builds character HMMs, learns word HMMs using embedded training, and outperforms all the other Arabic handwriting recognition systems reported in the literature.
Abstract: In this paper, we present a novel segmentation-free Arabic handwriting recognition system based on hidden Markov models (HMMs). Two main contributions are introduced: a new technique for dividing the image into nonuniform horizontal segments to extract the features, and a new technique for solving the problems of character skew by fusing multiple HMMs. Moreover, two enhancements are introduced: the pre-processing method and feature extraction using concavity space. The proposed system first pre-processes the input image by setting the stroke thickness of the input word to three pixels and fixing the spacing between the different parts of the word. The input image is divided into a constant number of nonuniform horizontal segments depending on the distribution of the foreground pixels. A set of robust features representing the gradient of the foreground pixels is extracted using sliding windows. The input image is also decomposed into several images representing the vertical, horizontal, left-diagonal, and right-diagonal edges, and a set of robust features representing the densities of the foreground pixels in the various edge images is extracted using sliding windows. The proposed system builds character HMMs and learns word HMMs using embedded training. Besides the vertical sliding window, two slanted sliding windows are used to extract the features. Three different HMMs are used: one for the vertical sliding window and two for the slanted windows. A fusion scheme is used to combine the three HMMs. The proposed system is very promising and outperforms all the other Arabic handwriting recognition systems reported in the literature.

45 citations
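
The sliding-window framing used by HMM-based word recognizers like the one above is easy to sketch. The fragment below extracts one feature vector per window position; it uses uniform horizontal segments and plain density features, whereas the paper uses nonuniform segments and gradient/concavity features, so this shows the framing only.

    import numpy as np

    def sliding_window_features(word_img, win=3, step=1, nsegments=8):
        # word_img: binary word image (foreground = 1). One feature
        # vector per frame: foreground density of each horizontal segment.
        h, w = word_img.shape
        frames = []
        for x in range(0, w - win + 1, step):
            window = word_img[:, x:x + win]
            segments = np.array_split(window, nsegments, axis=0)
            frames.append([seg.mean() for seg in segments])
        return np.array(frames)  # frame sequence for HMM training/decoding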


Journal ArticleDOI
TL;DR: An in-depth evaluation of the proposed methods shows their usefulness in the context of document security, with an area under the ROC curve (AUC) of 0.89; the automatic nature of the approach allows the methods to be used in high-volume environments.
Abstract: In this paper, an approach for forgery detection using text-line information is presented. In questioned document examination, text-line rotation and alignment can be important clues for detecting tampered documents. Measuring and detecting such mis-rotations and mis-alignments is a cumbersome task. Therefore, an automated approach for verifying documents based on these two text-line features is proposed in this paper. An in-depth evaluation of the proposed methods shows their usefulness in the context of document security, with an area under the ROC curve (AUC) of 0.89. The automatic nature of the approach allows the presented methods to be used in high-volume environments.

40 citations
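
A minimal sketch of measuring text-line rotation, one of the two clues named above (a simple least-squares fit; the paper's actual measurement and decision procedure may differ):

    import numpy as np

    def textline_rotation_deg(component_centroids):
        # Fit a line through the centroids of a text line's connected
        # components and return its slope as an angle in degrees.
        pts = np.asarray(component_centroids, dtype=float)
        slope, _intercept = np.polyfit(pts[:, 0], pts[:, 1], 1)
        return np.degrees(np.arctan(slope))

A document can then be flagged when one line's rotation (or its left/right alignment offset) deviates strongly from the document's other lines.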


Journal ArticleDOI
TL;DR: It is concluded that combining different segmentation algorithms may be an appropriate strategy for improving the correct segmentation rate.
Abstract: In this work, algorithms for segmenting handwritten digits based on different concepts are compared by evaluating them under the same conditions of implementation. A robust experimental protocol based on a large synthetic database is used to assess each algorithm in terms of correct segmentation and computational time. Results on a real database are also presented. In addition to the overall performance of each algorithm, we show the performance for different types of connections, which provides an interesting categorization of each algorithm. Another contribution of this work concerns the complementarity of the algorithms. We have observed that each method is able to segment samples that cannot be segmented by any other method, and does so independently of its individual performance. Based on this observation, we conclude that combining different segmentation algorithms may be an appropriate strategy for improving the correct segmentation rate.

36 citations


Journal ArticleDOI
TL;DR: Experimental results on a set of machine-printed documents annotated by multiple writers in an office/collaborative environment show that the proposed segmentation of handwritten text and machine-printed text from annotated documents is robust and provides good text separation performance.
Abstract: The convenience of search, both on personal computer hard disks and on the web, is still limited mainly to machine-printed text documents and images because of the poor accuracy of handwriting recognizers. The focus of the research in this paper is the segmentation of handwritten text and machine-printed text from annotated documents, sometimes referred to as the task of "ink separation", to advance the state of the art in searching hand-annotated documents. We propose a method with two main steps: patch-level separation and pixel-level separation. In the patch-level separation step, the entire document is modeled as a Markov Random Field (MRF). Three classes (machine-printed text, handwritten text, and overlapped text) are initially identified using G-means-based classification, followed by an MRF-based relabeling procedure. An MRF-based classification approach using pixel-level features is then applied to separate overlapped text into machine-printed and handwritten text, forming the second step of the method. Experimental results on a set of machine-printed documents annotated by multiple writers in an office/collaborative environment show that our method is robust and provides good text separation performance.
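
MRF-based relabeling over a grid of patch labels, as in the first step above, is often optimized with iterated conditional modes (ICM). A minimal sketch under that assumption (the abstract does not state the optimizer; `unary` holds per-patch classifier costs, and beta weighs neighborhood agreement):

    import numpy as np

    def icm_relabel(labels, unary, beta=0.5, iters=5):
        # labels: (h, w) int grid over patches; unary: (h, w, k) costs
        # for k classes, e.g., printed / handwritten / overlapped.
        h, w, k = unary.shape
        for _ in range(iters):
            for y in range(h):
                for x in range(w):
                    costs = unary[y, x].astype(float).copy()
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w:
                            # Penalize disagreeing with each neighbor's label.
                            costs += beta * (np.arange(k) != labels[ny, nx])
                    labels[y, x] = int(np.argmin(costs))
        return labels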

Journal ArticleDOI
TL;DR: This article presents IESK-arDB, a new multi-purpose off-line Arabic handwritten database, and proposes a multi-phase segmentation approach that starts by detecting and resolving sub-word overlaps, then hypothesizes a large number of segmentation points that are later reduced by a set of heuristic rules.
Abstract: Even though much research has been conducted toward solving the problem of unconstrained handwriting recognition, an effective solution remains a serious challenge. In this article, we address two issues related to Arabic handwriting recognition. First, we present IESK-arDB, a new multi-purpose off-line Arabic handwritten database. It is publicly available and contains more than 4,000 word images, each equipped with a binary version, a thinned version, and ground truth information stored in a separate XML file. Additionally, it contains around 6,000 character images segmented from the database. A letter frequency analysis showed that the database exhibits letter frequencies similar to those of large corpora of digital text, which demonstrates the database's usefulness. Second, we propose a multi-phase segmentation approach that starts by detecting and resolving sub-word overlaps, then hypothesizes a large number of segmentation points that are later reduced by a set of heuristic rules. The proposed approach has been successfully tested on IESK-arDB. The results were very promising, indicating the efficiency of the suggested approach.

Journal ArticleDOI
TL;DR: A new approach for handwritten digit recognition that uses a small number of patterns for the training phase and introduces a novel reliability parameter to tackle the problem of PSO being trapped in local minima.
Abstract: The problem of handwritten digit recognition has long been an open problem in the field of pattern classification and is of great importance in industry. The heart of the problem lies in the ability to design an efficient algorithm that can recognize digits written and submitted by users via tablets, scanners, and other digital devices. From an engineering point of view, it is desirable to achieve good performance within limited resources. To this end, we have developed a new approach for handwritten digit recognition that uses a small number of patterns for the training phase. To improve the overall classification performance, the literature suggests combining the decisions of multiple classifiers rather than using the output of the best classifier in the ensemble; thus, in this new approach, an ensemble of classifiers is used for the recognition of handwritten digits. The classifiers used in the proposed system are based on the singular value decomposition (SVD) algorithm. The experimental results and the literature show that the SVD algorithm is well suited to sparse data such as handwritten digits. The decisions obtained by the SVD classifiers are combined by a novel combination rule which we name reliable multi-phase particle swarm optimization. We call the method "reliable" because we introduce a novel reliability parameter that is applied to tackle the problem of PSO being trapped in local minima. In comparison with previous methods, one of the significant advantages of the proposed method is that it is not sensitive to the size of the training set: it uses just 15 % of the dataset for training, whereas other methods usually use 60-75 % of the whole dataset. To evaluate the proposed method, we tested our algorithm on a Farsi/Arabic handwritten digit dataset. What makes the recognition of handwritten Farsi/Arabic digits more challenging is that some of the digits can legitimately be written in different shapes. Therefore, 6,000 hard samples (600 samples per class) were chosen by the K-nearest neighbor algorithm from the HODA dataset, a standard Farsi/Arabic digit dataset. Experimental results show that the proposed method is fast, accurate, and robust against the local minima of PSO. Finally, the proposed method is compared with state-of-the-art methods and with ensemble classifiers based on MLP, RBF, and ANFIS with various combination rules.
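
SVD-based digit classification has a classic form: represent each class by the leading left singular vectors of its training matrix and assign a test image to the class whose subspace leaves the smallest residual. A minimal sketch of that baseline (the paper's reliable multi-phase PSO combination of several such classifiers is not reproduced here):

    import numpy as np

    def train_svd_bases(class_images, k=10):
        # One orthonormal basis per class: top-k left singular vectors
        # of the matrix whose columns are vectorized training images.
        bases = {}
        for label, imgs in class_images.items():
            A = np.stack([im.ravel() for im in imgs], axis=1).astype(float)
            U, _, _ = np.linalg.svd(A, full_matrices=False)
            bases[label] = U[:, :k]
        return bases

    def classify(img, bases):
        # Pick the class whose subspace reconstructs the image best.
        v = img.ravel().astype(float)
        return min(bases, key=lambda c: np.linalg.norm(v - bases[c] @ (bases[c].T @ v)))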

Journal ArticleDOI
TL;DR: A novel curled text-line segmentation algorithm based on active contours (snakes) that segments text-lines by estimating pairs of x-line and baseline, achieving improved accuracy on the DFKI-I (CBDAR 2007 dewarping contest) dataset.
Abstract: Camera-captured, warped document images usually contain curled text-lines because of distortions caused by the camera's perspective view and page curl. Warped document images can be transformed into planar document images, improving optical character recognition accuracy and human readability, using monocular dewarping techniques. Curled text-line segmentation is a crucial initial step for most monocular dewarping techniques, yet existing segmentation approaches are sensitive to geometric and perspective distortions. In this paper, we introduce a novel curled text-line segmentation algorithm by adapting active contours (snakes). Our algorithm performs text-line segmentation by estimating pairs of x-line and baseline: it estimates a local pair of x-line and baseline on each connected component by jointly tracing the top and bottom points of neighboring connected components, and finally each group of overlapping pairs is considered a segmented text-line. Our algorithm achieves curled text-line segmentation accuracy above 95 % on the DFKI-I (CBDAR 2007 dewarping contest) dataset, which is significantly better than previously reported results on this dataset.

Journal ArticleDOI
TL;DR: The importance, requirements, and challenges of a patent image retrieval system are introduced, with an overview of the algorithms developed for the retrieval and analysis of CAD and technical drawings, diagrams, data flow diagrams, circuit diagrams, data charts, flowcharts, plots, and symbol recognition.
Abstract: To verify the originality of an invention in a patent, the graphical description available in the form of patent drawings often plays a critical role. This paper introduces the importance, requirements, and challenges of a patent image retrieval system. We present a brief account of the work done in the patent image domain and in related areas, beginning with a review of work dealing specifically with the retrieval and analysis of images in the patent domain. Although the literature dealing specifically with patent images is small, a significant amount of work done in related areas is useful and applicable to the patent image area. From a methodological point of view, we present an overview of the algorithms developed for the retrieval and analysis of CAD and technical drawings, diagrams, data flow diagrams, circuit diagrams, data charts, flowcharts, plots, and symbol recognition.

Journal ArticleDOI
TL;DR: An efficient system that automatically generates prototypes for each word in a lexicon using multiple appearances of each letter, producing large databases whose quality is at least as good as that of manually generated ones.
Abstract: Developing and maintaining large comprehensive databases for script recognition that include different shapes for each word in the lexicon is expensive and difficult. In this paper, we present an efficient system that automatically generates prototypes for each word in a lexicon using multiple appearances of each letter. Large sets of different shapes are created for each letter in each position. These sets are then used to generate valid shapes for each word-part. The number of valid permutations for each word is large and prohibits practical training and searching for various tasks, such as script recognition and word spotting. We apply dimensionality reduction and clustering techniques to maintain a compact representation of these databases without affecting their ability to represent the wide variety of handwriting styles. In addition, a database for off-line script recognition is generated from the on-line strokes using a standard dilation technique, while making special efforts to resemble the pen's path. We also examined and used several layout techniques for producing words from the generated word-parts. Our experimental results show that the proposed system can automatically generate large databases whose quality is at least as good as that of manually generated ones.

Journal ArticleDOI
TL;DR: A distance transform based technique for clutter detection and removal in complex document images that removes irregular and independent unwanted clutter while preserving the text content.
Abstract: The paper presents a clutter detection and removal algorithm for complex document images. This distance transform based technique aims to remove irregular and independent unwanted clutter while preserving the text content. The novelty of this approach lies in its approximation of the clutter-content boundary when the clutter is attached to the content in irregular ways. As an intermediate step, a residual image is created, which forms the basis for clutter detection and removal. Clutter detection and removal are independent of the clutter's position, size, shape, and connectivity with the text. The method is tested on a collection of highly degraded and noisy, machine-printed and handwritten Arabic and English documents, and results show pixel-level accuracies of 99.18 % for clutter detection and 98.67 % for clutter removal. The approach is also extended to documents having a mix of clutter and salt-and-pepper noise.
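
The distance transform makes "thick clutter vs. thin text" separable: inside thin strokes no pixel lies far from the background, while clutter blobs contain deep interior pixels. A minimal sketch of that core idea only (the threshold is illustrative, and the paper's residual-image and boundary-approximation steps are omitted):

    import numpy as np
    from scipy.ndimage import distance_transform_edt, label

    def remove_clutter(binary, max_stroke_halfwidth=5):
        # binary: foreground = True/1. Pixels deeper than a stroke's
        # half-width mark clutter cores; drop components containing them.
        mask = binary.astype(bool)
        depth = distance_transform_edt(mask)       # distance to background
        seeds = depth > max_stroke_halfwidth       # clutter interiors
        lbl, _ = label(mask)
        clutter_ids = np.unique(lbl[seeds])        # components with a core
        return mask & ~np.isin(lbl, clutter_ids)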

Journal ArticleDOI
TL;DR: This paper proposes applying content-based image retrieval methods to detect both exact and similar copies based on two kinds of regions of interest (ROIs), generic ROIs and face ROIs, and demonstrates the high performance of the proposed method for detecting printed partial copies.
Abstract: Manga, a kind of Japanese comic book, is an important genre in the realm of image publications requiring copyright protection. To copy manga, illegal users generally focus on certain interesting parts from which to make partial copies to use in their own drawings. With respect to their sources, copies of manga can be divided into two types: (1) exact copies, which duplicate specific contents of manga, such as scanned manga publications (printed copies) and traced outlines of manga (hand-drawn copies), and (2) similar partial copies, which infringe the copyright of manga characters based on their features. In this paper, we propose applying content-based image retrieval methods to detect both exact and similar copies based on two kinds of regions of interest (ROIs): generic ROIs and face ROIs. The method is able not only to locate partial copies in images with complex backgrounds, but also to report the corresponding copied parts of copyrighted manga pages for exact copy detection and the copied manga characters for similar copy detection. The experimental results demonstrate the high performance of the proposed method for detecting printed partial copies. In addition, 85 % of hand-drawn and 77 % of similar partial copies were detected with relatively high precision using a database containing more than 10,000 manga pages.

Journal ArticleDOI
TL;DR: An adaptive water flow model for the binarization of degraded document images that controls the rainfall process so that the water fills up to half of each valley's depth; because it classifies blobs instead of pixels, it preserves stroke connectivity.
Abstract: In this paper, we present an adaptive water flow model for the binarization of degraded document images. We regard an image surface as a three-dimensional terrain and pour water on it. The water finds the valleys and fills them. Our algorithm controls the rainfall process, pouring the water in such a way that it fills up to half of each valley's depth. After stopping the rainfall, each wet region represents one character or a noisy component. To segment each character, we label the wet regions and regard them as blobs; since some of the blobs are noisy components, we use a multilayer perceptron to label each blob as either text or non-text. Since our algorithm classifies blobs instead of pixels, it preserves stroke connectivity. In our experiments, the proposed binarization algorithm demonstrated superior performance against six well-known algorithms on three sets of degraded document images. Its main advantage is on document images with uneven illumination.
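
The water flow idea can be approximated with standard morphology: fill the valleys of the intensity surface by grayscale reconstruction, then keep the pixels submerged below half of their valley's depth. The sketch below is a loose analogy only, not the paper's rainfall simulation, and the MLP blob classifier is not reproduced:

    import numpy as np
    from skimage.morphology import reconstruction
    from scipy.ndimage import label

    def wet_regions(gray):
        # Fill all valleys of the intensity surface.
        seed = gray.astype(float).copy()
        seed[1:-1, 1:-1] = gray.max()
        filled = reconstruction(seed, gray.astype(float), method='erosion')
        depth = filled - gray                      # water depth if fully filled
        valleys, n = label(depth > 0)
        wet = np.zeros(depth.shape, dtype=bool)
        for i in range(1, n + 1):
            valley = valleys == i
            # Keep only the water below half of this valley's depth.
            wet |= valley & (depth >= 0.5 * depth[valley].max())
        return wet   # candidate character blobs, to be classified text/non-text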

Journal ArticleDOI
TL;DR: The problem of see-through cancelation in digital images of double-sided documents is addressed and it is shown that a nonlinear convolutional data model proposed elsewhere for moderate show-through can also be effective on strong back-to-front interferences.
Abstract: The problem of see-through cancelation in digital images of double-sided documents is addressed. We show that a nonlinear convolutional data model proposed elsewhere for moderate show-through can also be effective on strong back-to-front interferences, provided that the recto and verso pure patterns are estimated jointly. To this end, we propose a restoration algorithm that does not need any classification of the pixels. The see-through PSFs are estimated off-line, and an iterative procedure is then employed for a joint estimation of the pure patterns. This simple and fast algorithm can be used on both grayscale and color images and has proved to be very effective in real-world cases. The experimental results we report in this paper demonstrate that our algorithm outperforms the ones based on linear models with no need to tune free parameters and remains computationally inexpensive despite the nonlinear model and the iterative solution adopted. Strategies to overcome some of the residual difficulties are also envisaged.

Journal ArticleDOI
TL;DR: The exhaustive experimental evaluation of the proposed framework on a collection of documents belonging to Devanagari, Bengali and English scripts has yielded encouraging results.
Abstract: In this paper, we propose a novel feature representation for binary patterns that exploits object shape information. An initial evaluation of the representation is performed on Bengali and Gujarati script character classification. The extension of the representation to word images is presented subsequently. The proposed feature representation, in combination with distance-based hashing, is applied to define a novel word image-based document image indexing and retrieval framework. The concept of hierarchical hashing is utilized to reduce the retrieval time complexity. In addition, with the objective of reducing the size of the hashing data structure, the concept of multi-probe hashing is extended to binary mapping functions. An exhaustive experimental evaluation of the proposed framework on a collection of documents in the Devanagari, Bengali, and English scripts has yielded encouraging results.
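
Multi-probe hashing over binary mapping functions can be sketched directly: a bucket key is a bit vector, and at query time nearby buckets (keys differing in a few bits) are probed as well, trading extra probes for fewer hash tables. A minimal illustration using random hyperplane bits (the paper's binary mapping functions are derived from its shape features, so this is an analogy, not its method):

    import numpy as np
    from itertools import combinations

    def binary_hash(x, hyperplanes):
        # One bit per hyperplane: the sign of the projection of x.
        return tuple(int(b) for b in (hyperplanes @ x > 0))

    def multiprobe_keys(key, max_flips=1):
        # The query's own bucket plus all buckets whose keys differ
        # in up to max_flips bits.
        keys = [key]
        for r in range(1, max_flips + 1):
            for idx in combinations(range(len(key)), r):
                k = list(key)
                for i in idx:
                    k[i] ^= 1
                keys.append(tuple(k))
        return keys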

Journal ArticleDOI
TL;DR: Results demonstrate that the proposed fast self-generation voting method outperforms the state-of-the-art methods and is useful for practical applications.
Abstract: In this paper, a fast self-generation voting method is proposed to further improve performance in handwritten Chinese character recognition. In this method, a set of samples is first generated by the proposed fast self-generation method; these samples are then classified by the baseline classifier, and the final recognition result is determined by voting over these classification results. Two techniques, a normalization-cooperated feature extraction strategy and an approximated line density, are used to speed up the self-generation method. We evaluate the proposed method on the CASIA and CASIA-HWDB1.1 databases. High recognition rates of 98.84 % on the CASIA database and 91.17 % on the CASIA-HWDB1.1 database are obtained. These results demonstrate that the proposed method outperforms state-of-the-art methods and is useful for practical applications.
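
The generate-classify-vote loop above is straightforward to sketch. Here the generated samples are random small shears of the test image (the paper derives its variants differently, and much faster, via normalization-cooperated feature extraction); `classify` stands for any baseline character classifier:

    import numpy as np
    from scipy.ndimage import affine_transform
    from collections import Counter

    def self_generate(img, n=8, max_shear=0.15, seed=0):
        # The test image plus n sheared variants of it.
        rng = np.random.default_rng(seed)
        variants = [img]
        for _ in range(n):
            s = rng.uniform(-max_shear, max_shear)
            variants.append(affine_transform(img, np.array([[1.0, s], [0.0, 1.0]])))
        return variants

    def vote(classify, img):
        # Classify every generated sample; return the majority label.
        labels = [classify(v) for v in self_generate(img)]
        return Counter(labels).most_common(1)[0][0]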

Journal ArticleDOI
TL;DR: This study comprehensively evaluates state-of-the-art statistical methods in HHR, implementing fifteen character normalization methods, five feature extraction methods, and four classification methods and evaluating their performance on two public handwritten Hangul databases.
Abstract: Although structural approaches have shown better performance than statistical ones in handwritten Hangul recognition (HHR), they have not been widely used in practical applications because of their vulnerability to image degradation and high computational complexity. Statistical approaches have not received much attention in HHR because their early trials were not promising enough. The past decade has seen significant improvements in statistical methods for handwritten character recognition, including handwritten Chinese character recognition. Nevertheless, without a systematic evaluation of the effects of statistical methods in HHR, they cannot draw enough attention given that discouraging early experience. In this study, we comprehensively evaluate state-of-the-art statistical methods in HHR. Specifically, we implemented fifteen character normalization methods, five feature extraction methods, and four classification methods and evaluated their performance on two public handwritten Hangul databases. On the SERI database, statistical methods achieved a best performance of 93.71 % accuracy, which is higher than the best result achieved by structural recognizers. On the PE92 database, which has a small number of samples per class, statistical methods gave slightly lower performance than the best structural recognizer.

Journal ArticleDOI
TL;DR: In tests with 383 text lines from the HIT-MW database, the proposed method achieved character-level recognition rates of 71.37 % without any language model and 80.15 % with a bi-gram language model.
Abstract: This paper presents a new Bayesian-based method for unconstrained off-line handwritten Chinese text line recognition. In this method, a sample of a real character or non-character in realistic handwritten text lines is jointly recognized by a traditional isolated character recognizer and a character verifier, which requires only a moderate number of handwritten text lines for training. To improve its ability to distinguish between real characters and non-characters, the isolated character recognizer is negatively trained using a linear discriminant analysis (LDA)-based strategy, which employs the outputs of a traditional MQDF classifier and the LDA transform to re-compute the posterior probability of isolated character recognition. In tests with 383 text lines from the HIT-MW database, the proposed method achieved character-level recognition rates of 71.37 % without any language model and 80.15 % with a bi-gram language model. These promising results show the effectiveness of the proposed method for unconstrained off-line handwritten Chinese text line recognition.

Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed visual word density-based nonlinear normalization method for handwritten Chinese character recognition outperforms state-of-the-art methods and can be applied to some other image classification problems.
Abstract: In handwritten Chinese character recognition, the performance of a system depends largely on the character normalization method. In this paper, a visual word density-based nonlinear normalization method is proposed for handwritten Chinese character recognition. The underlying rationale is that the density at each image pixel should be determined by the visual word around that pixel. A visual vocabulary is used to map each visual word to a density value, and the mapping vocabulary is learned to maximize the ratio of the between-class variation to the within-class variation. Feature extraction is involved in the optimization stage, so the proposed normalization method is beneficial for the subsequent feature extraction. Furthermore, the proposed method can be applied to some other image classification problems; scene character recognition is tried in this paper. Experimental results on one constrained handwriting database (CASIA) and one unconstrained handwriting database (CASIA-HWDB1.1) demonstrate that the proposed method outperforms state-of-the-art methods. Experiments on the scene character databases Chars74k and ICDAR03-CH show that the proposed method is promising for such image classification problems.
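
For contrast with the learned densities above, the classic density-equalizing nonlinear normalization that such methods build on can be sketched in a few lines: marginal stroke densities define cumulative coordinate remappings that spread dense regions apart. Here the density is just the marginal pixel count, not a visual word density, so this is the baseline rather than the paper's method:

    import numpy as np

    def nonlinear_normalize(img, out_size=64):
        # img: binary character image. Remap coordinates so cumulative
        # stroke density becomes uniform along each axis.
        fg = img.astype(float)
        dx = fg.sum(axis=0) + 1e-6
        dy = fg.sum(axis=1) + 1e-6
        cx = np.cumsum(dx) / dx.sum()   # new x coordinate in [0, 1]
        cy = np.cumsum(dy) / dy.sum()
        out = np.zeros((out_size, out_size))
        for y, x in zip(*np.nonzero(img)):
            out[int(cy[y] * (out_size - 1)), int(cx[x] * (out_size - 1))] = 1
        return out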

Journal ArticleDOI
Nicholas R. Howe1
TL;DR: [Extraction residue: the auto-generated summary duplicated fragments of the results table below and is omitted.]
Abstract: [This entry's abstract was extracted as a flattened results table rather than prose. It is reconstructed below: one row per benchmark image, with five score columns (%) for five algorithm variants whose column labels were not recovered.]

    Image      Scores (%)
    H2009-T1   96.4  96.6  96.6  96.1  96.6
    H2009-T2   93.3  93.4  92.5  93.1  93.1
    H2009-1    92.7  95.9  95.4  95.8  95.8
    H2009-2    90.1  96.4  88.8  96.2  95.7
    H2009-3    94.7  95.6  92.0  94.8  95.1
    H2009-4    92.1  94.7  93.9  94.6  94.5
    H2009-5    86.5  92.7  91.9  92.3  92.6
    H2010-1    93.7  96.2  95.0  95.4  96.0
    H2010-2    89.0  96.1  95.4  95.7  95.4
    H2010-3    94.4  94.7  93.3  94.7  94.7
    H2010-4    92.9  94.4  93.5  93.8  93.9
    H2010-5    92.9  96.5  96.1  96.5  96.4
    H2010-6    90.9  91.2  90.4  90.9  90.9
    H2010-7    95.0  95.2  94.8  95.0  95.1
    H2010-8    92.6  93.7  92.2  93.5  93.4
    H2010-9    92.6  93.8  92.9  92.1  93.5
    H2010-10   88.8  92.7  90.8  92.5  87.2
    P2009-T1   88.5  97.4  96.7  97.3  97.2
    P2009-T2   98.0  98.5  98.1  98.4  98.5
    P2009-1    93.9  94.3  90.4  94.0  94.2
    P2009-2    96.8  96.9  96.8  96.8  96.9
    P2009-3    97.6  98.4  98.2  98.2  98.3
    P2009-4    93.5  93.8  92.8  92.9  92.8
    P2009-5    88.6  91.5  84.7  91.2  84.3
    Hand       92.3  94.7  93.3  94.3  94.1
    Print      93.9  95.8  93.9  95.6  94.6
    All        92.7  95.0  93.5  94.7  94.3