Author

Lambert Schomaker

Bio: Lambert Schomaker is an academic researcher from the University of Groningen. The author has contributed to research on the topics Handwriting and Handwriting recognition. The author has an h-index of 41 and has co-authored 224 publications receiving 6094 citations. Previous affiliations of Lambert Schomaker include Radboud University Nijmegen and the Nijmegen Institute for Cognition and Information.


Papers
Journal ArticleDOI
TL;DR: New and very effective techniques for automatic writer identification and verification that use probability distribution functions (PDFs) extracted from the handwriting images to characterize writer individuality are developed.
Abstract: The identification of a person on the basis of scanned images of handwriting is a useful biometric modality with application in forensic and historic document analysis and constitutes an exemplary study area within the research field of behavioral biometrics. We developed new and very effective techniques for automatic writer identification and verification that use probability distribution functions (PDFs) extracted from the handwriting images to characterize writer individuality. A defining property of our methods is that they are designed to be independent of the textual content of the handwritten samples. Our methods operate at two levels of analysis: the texture level and the character-shape (allograph) level. At the texture level, we use contour-based joint directional PDFs that encode orientation and curvature information to give an intimate characterization of individual handwriting style. In our analysis at the allograph level, the writer is considered to be characterized by a stochastic pattern generator of ink-trace fragments, or graphemes. The PDF of these simple shapes in a given handwriting sample is characteristic of the writer and is computed using a common shape codebook obtained by grapheme clustering. Combining multiple features (directional, grapheme, and run-length PDFs) yields increased writer identification and verification performance. The proposed methods are applicable to free-style handwriting (both cursive and isolated) and have practical feasibility, under the assumption that a few text lines of handwritten material are available in order to obtain reliable probability estimates.

468 citations
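
The allograph-level idea in the abstract above (a per-writer PDF over a shared grapheme codebook) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the codeword indices are toy data, the function names are hypothetical, and the chi-square distance with nearest-neighbor matching is one plausible way to compare PDFs, not something the abstract specifies.

```python
from collections import Counter


def codebook_pdf(codeword_ids, codebook_size):
    """Normalized histogram of codebook entries found in one handwriting sample."""
    counts = Counter(codeword_ids)
    total = len(codeword_ids)
    return [counts.get(k, 0) / total for k in range(codebook_size)]


def chi_square(p, q):
    """Chi-square distance between two discrete PDFs (assumed metric)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def identify(query_pdf, enrolled):
    """Return the enrolled writer whose PDF is nearest to the query PDF."""
    return min(enrolled, key=lambda w: chi_square(query_pdf, enrolled[w]))


# Toy data: each list holds the codebook index assigned to each grapheme
# extracted from a sample (hypothetical values, codebook of size 4).
enrolled = {
    "writer_a": codebook_pdf([0, 0, 1, 2, 0, 1], 4),
    "writer_b": codebook_pdf([3, 3, 2, 3, 1, 3], 4),
}
query = codebook_pdf([0, 1, 0, 0, 2, 2], 4)
best_match = identify(query, enrolled)
print(best_match)
```

Because the PDF counts shapes rather than words, the comparison does not depend on what text was written, which matches the text-independence property the abstract emphasizes.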

Proceedings ArticleDOI
09 Oct 1994
TL;DR: The status of the UNIPEN project of data exchange and recognizer benchmarks, started two years ago, is reported; the project aims to propose and implement solutions to the growing need for handwriting samples for online handwriting recognizers used by pen-based computers.
Abstract: We report the status of the UNIPEN project of data exchange and recognizer benchmarks started two years ago at the initiative of the International Association of Pattern Recognition (Technical Committee 11). The purpose of the project is to propose and implement solutions to the growing need for handwriting samples for online handwriting recognizers used by pen-based computers. Researchers from several companies and universities have agreed on a data format, a platform of data exchange and a protocol for recognizer benchmarks. The online handwriting data of concern may include handprint and cursive from various alphabets (including Latin and Chinese), signatures and pen gestures. These data will be compiled and distributed by the Linguistic Data Consortium. The benchmarks will be arbitrated by the US National Institute of Standards and Technology. We give a brief introduction to the UNIPEN format. We explain the protocol of data exchange and benchmarks.

437 citations

Journal ArticleDOI
TL;DR: The proposed automatic approach bridges the gap between image-statistics approaches on one end and manually measured allograph features of individual characters on the other end, and revealed a high sensitivity of the CO³ PDF for identifying individual writers on the basis of a single sentence of uppercase characters.
Abstract: In this paper, a new technique for offline writer identification is presented, using connected-component contours (COCOCOs or CO³s) in uppercase handwritten samples. In our model, the writer is considered to be characterized by a stochastic pattern generator, producing a family of connected components for the uppercase character set. Using a codebook of CO³s from an independent training set of 100 writers, the probability-density function (PDF) of CO³s was computed for an independent test set containing 150 unseen writers. Results revealed a high sensitivity of the CO³ PDF for identifying individual writers on the basis of a single sentence of uppercase characters. The proposed automatic approach bridges the gap between image-statistics approaches on one end and manually measured allograph features of individual characters on the other end. Combining the CO³ PDF with an independent edge-based orientation and curvature PDF yielded very high correct identification rates.

265 citations

Proceedings ArticleDOI
23 Aug 2004
TL;DR: A system that reads the text encountered in natural scenes, with the aim of assisting visually impaired persons, is proposed, and several character-extraction methods based on connected components are evaluated.
Abstract: We propose a system that reads the text encountered in natural scenes with the aim of providing assistance to visually impaired persons. This paper describes the system design and evaluates several character-extraction methods. Automatic text recognition from natural images is receiving growing attention because of potential applications in image retrieval, robotics and intelligent transport systems. Camera-based document analysis becomes a real possibility with the increasing resolution and availability of digital cameras. However, in the case of a blind person, finding the text region is the first important problem that must be addressed, because it cannot be assumed that the acquired image contains only characters. At first, our system tries to find areas of the image with small characters. Then it zooms into the found areas to retake the higher-resolution images necessary for character recognition. In the present paper, we propose four character-extraction methods based on connected components. We tested the effectiveness of our methods on the ICDAR 2003 Robust Reading Competition data. The performance of the different methods depends on character size. In the data, bigger characters are more prevalent and the most effective extraction method proves to be the sequence: Sobel edge detection, Otsu binarization, connected-component extraction and rule-based connected-component filtering.

242 citations
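
The last three stages of the extraction sequence named in the abstract above (Otsu binarization, connected-component extraction, rule-based filtering) can be sketched on a toy grayscale grid. This is only an illustration under stated assumptions: the Sobel edge step is omitted for brevity, the image values are made up, the components use 4-connectivity, and the single filtering rule (`min_area`) is hypothetical, since the abstract does not list the rules the authors used.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes between-class variance."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def connected_components(mask):
    """Group foreground pixels into 4-connected components via flood fill."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            stack, comp = [(y, x)], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                comp.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            comps.append(comp)
    return comps


# Toy image: dark background, one 2x2 bright blob, one isolated bright pixel.
image = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 200, 10, 10, 10],
    [10, 200, 200, 10, 210, 10],
    [10, 10, 10, 10, 10, 10],
]
threshold = otsu_threshold([v for row in image for v in row])
mask = [[v > threshold for v in row] for row in image]
components = connected_components(mask)
min_area = 2  # hypothetical rule: drop components too small to be characters
kept = [c for c in components if len(c) >= min_area]
print(threshold, len(components), len(kept))
```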

Proceedings ArticleDOI
06 Aug 2002
TL;DR: This article discusses and tests several well-known voting methods from politics and economics on classifier combination in order to see if an alternative to the simple plurality vote exists, and finds that, assuming a number of prerequisites, better methods are available that are comparatively simple and fast.
Abstract: In pattern recognition, there is a growing use of multiple classifier combinations with the goal of increasing recognition performance. In many cases, plurality voting is a part of the combination process. In this article, we discuss and test several well-known voting methods from politics and economics on classifier combination in order to see if an alternative to the simple plurality vote exists. We found that, assuming a number of prerequisites, better methods are available that are comparatively simple and fast.

200 citations
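
To make the contrast with plurality voting concrete, the sketch below compares it with the Borda count on ranked classifier outputs. The Borda count is used here only as one well-known voting method from the political-science literature; the abstract does not say which methods were tested, and the rankings and function names are hypothetical.

```python
from collections import Counter


def plurality(rankings):
    """Each classifier casts one vote for its top-ranked class."""
    votes = Counter(r[0] for r in rankings)
    return votes.most_common(1)[0][0]


def borda(rankings):
    """Each classifier gives n-1 points to its first choice, n-2 to its second, and so on."""
    scores = Counter()
    for r in rankings:
        n = len(r)
        for pos, label in enumerate(r):
            scores[label] += n - 1 - pos
    return scores.most_common(1)[0][0]


# Hypothetical ranked outputs of four classifiers over classes a, b, c.
rankings = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["b", "c", "a"],
    ["c", "b", "a"],
]
plurality_winner = plurality(rankings)
borda_winner = borda(rankings)
print(plurality_winner, borda_winner)
```

In this toy case the two rules disagree: plurality looks only at first choices, while the Borda count rewards a class that every classifier ranks reasonably high. Using such a rule requires classifiers that output a ranking rather than a single label, which is the kind of prerequisite the abstract alludes to.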


Cited by
01 Jan 1990
TL;DR: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented.
Abstract: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.

2,933 citations

Journal ArticleDOI
TL;DR: The nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms are described.
Abstract: Handwriting has continued to persist as a means of communication and recording information in day-to-day life even with the introduction of new technologies. Given its ubiquity in human transactions, machine recognition of handwriting has practical significance, as in reading handwritten notes in a PDA, postal addresses on envelopes, amounts in bank checks, handwritten fields in forms, etc. This overview describes the nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms. Both the online case (which pertains to the availability of trajectory data during writing) and the off-line case (which pertains to scanned images) are considered. Algorithms for preprocessing, character and word recognition, and performance with practical systems are indicated. Other fields of application, like signature verification, writer authentication, and handwriting learning tools, are also considered.

2,653 citations

Journal ArticleDOI
TL;DR: The FERET evaluation procedure is an independently administered test of face-recognition algorithms to allow a direct comparison between different algorithms and to assess the state of the art in face recognition.

2,494 citations

Reference EntryDOI
15 Oct 2004

2,118 citations