Author

Charles C. Tappert

Bio: Charles C. Tappert is an academic researcher from Pace University. The author has contributed to research on keystroke logging and biometrics, has an h-index of 29, and has co-authored 193 publications receiving 4393 citations. Previous affiliations of Charles C. Tappert include the United States Military Academy and IBM.


Papers
Journal ArticleDOI
TL;DR: The state of the art of online handwriting recognition during a period of renewed activity in the field is described, based on an extensive review of the literature, including journal articles, conference proceedings, and patents.
Abstract: This survey describes the state of the art of online handwriting recognition during a period of renewed activity in the field. It is based on an extensive review of the literature, including journal articles, conference proceedings, and patents. Online versus offline recognition, digitizer technology, and handwriting properties and recognition problems are discussed. Shape recognition algorithms, preprocessing and postprocessing techniques, experimental systems, and commercial products are examined.

922 citations

Journal Article
TL;DR: This work has collected 76 binary similarity and distance measures used over the last century and reveals their correlations through the hierarchical clustering technique.
Abstract: The binary feature vector is one of the most common representations of patterns, and similarity and distance measures play a critical role in many problems such as clustering and classification. Ever since Jaccard proposed a similarity measure to classify ecological species in 1901, numerous binary similarity and distance measures have been proposed in various fields. Applying appropriate measures results in more accurate data analysis. Nevertheless, few comprehensive surveys on binary measures have been conducted. Hence we collected 76 binary similarity and distance measures used over the last century and reveal their correlations through the hierarchical clustering technique.

799 citations
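
As a small illustration of what a binary similarity measure computes, the sketch below implements the Jaccard measure named in the abstract above, plus the simple matching coefficient for contrast. It is not taken from the paper: the function names, the (a, b, c, d) count notation, and the convention of returning 1.0 for two all-zero vectors are assumptions made for this example.

```python
def binary_counts(x, y):
    """Count agreement patterns between two equal-length 0/1 vectors.

    Returns (a, b, c, d): positions that are 1/1, 1/0, 0/1 and 0/0.
    The (a, b, c, d) naming is an assumption of this sketch.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have equal length")
    if len(x) == 0:
        raise ValueError("vectors must be non-empty")
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)
    return a, b, c, d


def jaccard_similarity(x, y):
    """Jaccard similarity: shared 1s over positions where either vector is 1."""
    a, b, c, _ = binary_counts(x, y)
    denom = a + b + c
    # Assumed convention: two all-zero vectors are treated as identical.
    return a / denom if denom else 1.0


def simple_matching(x, y):
    """Simple matching coefficient: unlike Jaccard, 0/0 agreements count."""
    a, b, c, d = binary_counts(x, y)
    return (a + d) / (a + b + c + d)


if __name__ == "__main__":
    u = [1, 1, 0, 0, 1]
    v = [1, 0, 0, 1, 1]
    print(jaccard_similarity(u, v))
    print(simple_matching(u, v))
```

The two measures differ only in whether shared absences (0/0 positions) count as agreement, which is one of the distinctions that separates families of binary measures.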

Journal ArticleDOI
Charles C. Tappert
TL;DR: A major advantage of this procedure is that it combines letter segmentation and recognition in one operation by, in essence, evaluating recognition at all possible segmentations, thus avoiding the usual segmentation-then-recognition philosophy.
Abstract: Dynamic programming has been found useful for performing nonlinear time warping for matching patterns in automatic speech recognition. Here, this technique is applied to the problem of recognizing cursive script. The parameters used in the matching are derived from time sequences of x-y coordinate data of words handwritten on an electronic tablet. Chosen for their properties of invariance with respect to size and translation of the writing, these parameters are found particularly suitable for the elastic matching technique. A salient feature of the recognition system is the establishment, in a training procedure, of prototypes by each writer using the system. In this manner, the system is tailored to the user. Processing is performed on a word-by-word basis after the writing is separated into words. Using prototypes for each letter, the matching procedure allows any letter to follow any letter and finds the letter sequence which best fits the unknown word. A major advantage of this procedure is that it combines letter segmentation and recognition in one operation by, in essence, evaluating recognition at all possible segmentations, thus avoiding the usual segmentation-then-recognition philosophy. Results on cursive writing are presented where the alphabet is restricted to the lower-case letters. Letter recognition accuracy is over 95 percent for each of three writers.

188 citations
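
The nonlinear time warping mentioned in the abstract above can be illustrated with a generic dynamic-programming alignment. This is a minimal sketch of textbook dynamic time warping on one-dimensional sequences, not the paper's system: the absolute-difference local cost, the scalar inputs, and the name `dtw_distance` are assumptions, and the combined letter segmentation and recognition step described in the abstract is not reproduced here.

```python
def dtw_distance(seq_a, seq_b):
    """Return the minimum cumulative alignment cost between two sequences.

    Each point of one sequence may be matched to one or more consecutive
    points of the other, so sequences written at different speeds can
    still align. Local cost is the absolute difference (an assumption).
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i points of seq_a
    # with the first j points of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_b point j is matched again
                cost[i][j - 1],      # seq_a point i is matched again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # The second sequence repeats a sample, as if written more slowly.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(dtw_distance([1, 3], [1, 2, 3]))
```

In a prototype-based recognizer, a distance of this kind would be computed between the unknown input and each stored prototype, with the lowest-cost prototype taken as the match.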

Proceedings ArticleDOI
Mary Villani, Charles C. Tappert, Giang Ngo, J. Simone, H. St. Fort, Sung-Hyuk Cha
17 Jun 2006
TL;DR: Results indicate that the keystroke biometric can accurately identify an individual who sends inappropriate email (free text) if sufficient enrollment samples are available and if the same type of keyboard is used to produce the enrollment and questioned samples.
Abstract: A long-text-input keystroke biometric system was developed for applications such as identifying perpetrators of inappropriate e-mail or fraudulent Internet activity. A Java applet collected raw keystroke data over the Internet, appropriate long-text-input features were extracted, and a pattern classifier made identification decisions. Experiments were conducted on a total of 118 subjects using two input modes - copy and free-text input - and two keyboard types - desktop and laptop keyboards. Results indicate that the keystroke biometric can accurately identify an individual who sends inappropriate email (free text) if sufficient enrollment samples are available and if the same type of keyboard is used to produce the enrollment and questioned samples. For laptop keyboards we obtained 99.5% accuracy on 36 users, which decreased to 97.9% on a larger population of 47 users. For desktop keyboards we obtained 98.3% accuracy on 36 users, which decreased to 93.3% on a larger population of 93 users. Accuracy decreases significantly when subjects used different keyboard types or different input modes for enrollment and testing.

121 citations

Journal ArticleDOI
01 Jan 2009
TL;DR: A method to encode and decode a decision tree to and from a chromosome where genetic operators such as mutation and crossover can be applied to improve on the finding of compact, near-optimal decision trees is presented.
Abstract: Tree-based classifiers are important in pattern recognition and have been well studied. Although the problem of finding an optimal decision tree has received attention, it is a hard optimization problem. Here we propose utilizing a genetic algorithm to improve on the finding of compact, near-optimal decision trees. We present a method to encode and decode a decision tree to and from a chromosome where genetic operators such as mutation and crossover can be applied. Theoretical properties of decision trees, encoded chromosomes, and fitness functions are presented.

118 citations


Cited by
Journal ArticleDOI
01 Jan 1998
TL;DR: This article reviews methods for handwritten character recognition, shows that convolutional neural networks outperform the other techniques on a standard handwritten digit recognition task, and introduces graph transformer networks (GTN), which allow multimodule document recognition systems to be trained globally with gradient-based methods.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations

Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Jan 2002

9,314 citations

Journal ArticleDOI
TL;DR: The nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms are described.
Abstract: Handwriting has continued to persist as a means of communication and recording information in day-to-day life even with the introduction of new technologies. Given its ubiquity in human transactions, machine recognition of handwriting has practical significance, as in reading handwritten notes in a PDA, postal addresses on envelopes, amounts in bank checks, handwritten fields in forms, etc. This overview describes the nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms. Both the online case (which pertains to the availability of trajectory data during writing) and the off-line case (which pertains to scanned images) are considered. Algorithms for preprocessing, character and word recognition, and performance with practical systems are indicated. Other fields of application, such as signature verification, writer authentication, and handwriting learning tools, are also considered.

2,653 citations