Author

Richard P. Lippmann

Bio: Richard P. Lippmann is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research in topics: Artificial neural network & Intrusion detection system. The author has an h-index of 43 and has co-authored 92 publications receiving 21,619 citations.


Papers
Proceedings ArticleDOI
01 Apr 1987
TL;DR: A two-stage isolated word speech recognition system that uses a Hidden Markov Model (HMM) recognizer in the first stage and a discriminant analysis system in the second stage, reducing the overall error rate by more than a factor of two.
Abstract: This paper describes a two-stage isolated word speech recognition system that uses a Hidden Markov Model (HMM) recognizer in the first stage and a discriminant analysis system in the second stage. During recognition, when the first-stage recognizer is unable to clearly differentiate between acoustically similar words such as "go" and "no", the second-stage discriminator is used. The second-stage system focuses on those parts of the unknown token which are most effective at discriminating between the confused words. The system was tested on a 35-word, 10,710-token stress-speech isolated-word database created at Lincoln Laboratory. Adding the second-stage discriminating system produced the best results to date on this database, reducing the overall error rate by more than a factor of two.
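
A minimal sketch of the two-stage decision logic described above (not the authors' implementation); hmm_log_likelihoods and discriminate are hypothetical helpers standing in for the first-stage HMM recognizer and the second-stage discriminant classifier:

import numpy as np

def two_stage_recognize(utterance, words, hmm_log_likelihoods, discriminate, margin=2.0):
    # First stage: per-word HMM log likelihoods for the unknown token.
    scores = hmm_log_likelihoods(utterance)          # shape: (len(words),)
    order = np.argsort(scores)
    best, runner_up = order[-1], order[-2]
    # If the HMM stage separates the top two candidates clearly, accept its answer.
    if scores[best] - scores[runner_up] >= margin:
        return words[best]
    # Otherwise defer to the second stage, which examines only the parts of the
    # token that best discriminate this confusable pair (e.g. "go" vs. "no").
    return discriminate(utterance, words[best], words[runner_up])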

27 citations

Proceedings ArticleDOI
27 Apr 1993
TL;DR: Two approaches to integrating neural network and hidden Markov model (HMM) algorithms into one hybrid wordspotter are being explored.
Abstract: Two approaches to integrating neural network and hidden Markov model (HMM) algorithms into one hybrid wordspotter are being explored. One approach uses neural network secondary testing to analyze putative hits produced by a high-performance HMM wordspotter. This has provided consistent but small reductions in the number of false alarms required to obtain a given detection rate. In one set of experiments using the NIST Road Rally database, secondary testing reduced the false alarm rate by an average of 16.4%. A second approach uses radial basis function (RBF) neural networks to produce local match scores for a Viterbi decoder. Network weights and RBF centers are trained at the word level to produce a high score for correct keyword hits and a low score for false alarms generated by nonkeyword speech. Preliminary experiments with this approach are exploring a constructive method which adds RBF centers to model nonkeyword near-misses and a cost function which attempts to directly maximize average detection accuracy over a specified range of false alarm rates.
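
A minimal sketch of the first approach above, neural-network secondary testing of putative hits (illustrative only, not the authors' system); the PutativeHit fields and the score_hit scoring function are assumptions:

from dataclasses import dataclass

@dataclass
class PutativeHit:
    keyword: str
    start: float        # segment start time (s) reported by the HMM wordspotter
    end: float          # segment end time (s)
    hmm_score: float    # first-stage HMM score

def secondary_filter(hits, score_hit, threshold=0.5):
    # Keep only putative hits that the second-stage network also accepts;
    # rejected hits are treated as false alarms.
    return [hit for hit in hits if score_hit(hit) >= threshold]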

25 citations

Proceedings Article
30 Nov 1992
TL;DR: A new boundary hunting radial basis function (BH-RBF) classifier which allocates RBF centers constructively near class boundaries is described, which provides a lower error rate with fewer centers than are required by more conventional RBF, Gaussian mixture, or MLP classifiers.
Abstract: A new boundary hunting radial basis function (BH-RBF) classifier which allocates RBF centers constructively near class boundaries is described. This classifier creates complex decision boundaries only in regions where confusions occur and corresponding RBF outputs are similar. A predicted square error measure is used to determine how many centers to add and to determine when to stop adding centers. Two experiments are presented which demonstrate the advantages of the BH-RBF classifier. One uses artificial data with two classes and two input features where each class contains four clusters but only one cluster is near a decision region boundary. The other uses a large seismic database with seven classes and 14 input features. In both experiments the BH-RBF classifier provides a lower error rate with fewer centers than are required by more conventional RBF, Gaussian mixture, or MLP classifiers.
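
A minimal sketch of the boundary-hunting idea (not the paper's exact algorithm): centers are added where the two largest class outputs are nearly tied, and growth stops when the squared-error measure stops improving. fit_rbf and predict_rbf are hypothetical helpers that train and evaluate an RBF network with a fixed center set:

import numpy as np

def grow_boundary_centers(X, y, fit_rbf, predict_rbf, n_add=5, tol=1e-3, max_centers=100):
    # Start from a few randomly chosen training points as centers.
    centers = X[np.random.choice(len(X), n_add, replace=False)]
    best_err = np.inf
    while len(centers) < max_centers:
        model = fit_rbf(X, y, centers)
        outputs = predict_rbf(model, X)                     # shape (n_samples, n_classes)
        err = np.mean((outputs - np.eye(outputs.shape[1])[y]) ** 2)
        if best_err - err < tol:                            # no useful improvement: stop adding
            break
        best_err = err
        # "Boundary hunting": choose points whose top two class outputs are
        # most similar, i.e. points lying near a decision-region boundary.
        top2 = np.sort(outputs, axis=1)[:, -2:]
        ambiguity = top2[:, 1] - top2[:, 0]                 # small gap = ambiguous point
        centers = np.vstack([centers, X[np.argsort(ambiguity)[:n_add]]])
    return centers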

25 citations

Book ChapterDOI
01 Jan 1994
TL;DR: Analytic results are presented which demonstrate that many neural network classifiers can accurately estimate posterior probabilities and that these neural network classifiers can sometimes provide lower error rates than PDF classifiers using the same number of trainable parameters.
Abstract: Researchers in the fields of neural networks, statistics, machine learning, and artificial intelligence have followed three basic approaches to developing new pattern classifiers. Probability Density Function (PDF) classifiers include Gaussian and Gaussian Mixture classifiers which estimate distributions or densities of input features separately for each class. Posterior probability classifiers include multilayer perceptron neural networks with sigmoid nonlinearities and radial basis function networks. These classifiers estimate minimum-error Bayesian a posteriori probabilities (hereafter referred to as posterior probabilities) simultaneously for all classes. Boundary forming classifiers include hard-limiting single-layer perceptrons, hypersphere classifiers, and nearest neighbor classifiers. These classifiers have binary indicator outputs which form decision regions that specify the class of any input pattern. Posterior probability and boundary-forming classifiers are trained using discriminant training. All training data is used simultaneously to estimate Bayesian posterior probabilities or minimize overall classification error rates. PDF classifiers are trained using maximum likelihood approaches which individually model class distributions without regard to overall classification performance. Analytic results are presented which demonstrate that many neural network classifiers can accurately estimate posterior probabilities and that these neural network classifiers can sometimes provide lower error rates than PDF classifiers using the same number of trainable parameters. Experiments also demonstrate how interpretation of network outputs as posterior probabilities makes it possible to estimate the confidence of a classification decision, compensate for differences in class prior probabilities between test and training data, and combine outputs of multiple classifiers over time for speech recognition.
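
Two of the uses listed at the end of the abstract follow directly from treating network outputs as posterior probabilities; a minimal illustrative sketch (not the chapter's code), assuming the outputs already approximate p(class | input) under the training priors:

import numpy as np

def adjust_for_priors(posteriors, train_priors, test_priors):
    # Compensate for a prior mismatch: rescale each class posterior by the
    # ratio of test to training prior, then renormalize.
    scaled = np.asarray(posteriors) * (np.asarray(test_priors) / np.asarray(train_priors))
    return scaled / scaled.sum()

def decision_with_confidence(posteriors):
    # The chosen class and a confidence score equal to its estimated posterior.
    k = int(np.argmax(posteriors))
    return k, float(posteriors[k])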

25 citations

Journal ArticleDOI
TL;DR: The implementation of a hidden Markov model state decoding system, a component for a wordspotting speech recognition system, and the mapping of the discrete-time state decoding algorithm into the continuous domain are described.
Abstract: We describe the implementation of a hidden Markov model state decoding system, a component for a wordspotting speech recognition system. The key specification for this state decoder design is microwatt power dissipation: this requirement led to a continuous-time, analog circuit implementation. We describe the tradeoffs inherent in the choice of an analog design and explain the mapping of the discrete-time state decoding algorithm into the continuous domain. We characterize the operation of a ten-word (81-state) state decoder test chip.
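
For reference, a minimal sketch of the discrete-time Viterbi state-decoding recursion that the chip maps into the continuous-time analog domain (illustrative, not the hardware design); log_pi, log_A, and log_b are the initial, transition, and per-frame observation log probabilities:

import numpy as np

def viterbi_states(log_pi, log_A, log_b):
    T, N = log_b.shape
    delta = np.full((T, N), -np.inf)          # best path log score ending in each state
    psi = np.zeros((T, N), dtype=int)         # backpointers
    delta[0] = log_pi + log_b[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A            # (from-state, to-state)
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(N)] + log_b[t]
    # Backtrace the most likely state sequence.
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta[-1]))
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path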

21 citations


Cited by
Journal ArticleDOI
Lawrence R. Rabiner
01 Feb 1989
TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.
Abstract: This tutorial provides an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and gives practical details on methods of implementation of the theory along with a description of selected applications of the theory to distinct problems in speech recognition. Results from a number of original sources are combined to provide a single source of acquiring the background required to pursue further this area of research. The author first reviews the theory of discrete Markov chains and shows how the concept of hidden states, where the observation is a probabilistic function of the state, can be used effectively. The theory is illustrated with two simple examples, namely coin tossing and the classic balls-in-urns system. Three fundamental problems of HMMs are noted and several practical techniques for solving these problems are given. The various types of HMMs that have been studied, including ergodic as well as left-right models, are described.
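
The first of the three fundamental problems, evaluating P(O | model), is solved by the forward algorithm; a minimal sketch (scaling omitted for brevity), with pi, A, and B the initial, transition, and discrete emission probabilities:

import numpy as np

def forward_probability(pi, A, B, obs):
    # pi: (N,), A: (N, N), B: (N, M), obs: sequence of observation symbol indices.
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]      # induction step of the forward recursion
    return alpha.sum()                     # P(O | lambda)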

21,819 citations

Book
01 Jan 1995
TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Abstract: From the Publisher: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition. After introducing the basic concepts, the book examines techniques for modelling probability density functions and the properties and merits of the multi-layer perceptron and radial basis function network models. Also covered are various forms of error functions, principal algorithms for error function minimization, learning and generalization in neural networks, and Bayesian techniques and their applications. Designed as a text, with over 100 exercises, this fully up-to-date work will benefit anyone involved in the fields of neural computation and pattern recognition.

19,056 citations

Book ChapterDOI
TL;DR: The chapter discusses two important directions of research to improve learning algorithms: the dynamic node generation, which is used by the cascade correlation algorithm; and designing learning algorithms where the choice of parameters is not an issue.
Abstract: Publisher Summary: This chapter provides an account of different neural network architectures for pattern recognition. A neural network consists of several simple processing elements called neurons. Each neuron is connected to some other neurons and possibly to the input nodes. Neural networks provide a simple computing paradigm to perform complex recognition tasks in real time. The chapter categorizes neural networks into three types: single-layer networks, multilayer feedforward networks, and feedback networks. It discusses gradient descent and the relaxation method as the two underlying mathematical themes for deriving learning algorithms. Much research activity is centered on learning algorithms because of their fundamental importance in neural networks. The chapter discusses two important directions of research to improve learning algorithms: dynamic node generation, which is used by the cascade correlation algorithm; and designing learning algorithms where the choice of parameters is not an issue. It closes with a discussion of performance and implementation issues.
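
A minimal sketch of the gradient-descent weight update underlying most of the learning algorithms the chapter surveys (illustrative; grad_E stands for any function returning the gradient of the training error with respect to the weights):

import numpy as np

def gradient_descent_step(weights, grad_E, learning_rate=0.01):
    # Move the weights a small step against the error gradient.
    return np.asarray(weights) - learning_rate * grad_E(weights)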

13,033 citations

Journal ArticleDOI
TL;DR: It is demonstrated that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube.
Abstract: In this paper we demonstrate that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube; only mild conditions are imposed on the univariate function. Our results settle an open question about representability in the class of single hidden layer neural networks. In particular, we show that arbitrary decision regions can be arbitrarily well approximated by continuous feedforward neural networks with only a single internal, hidden layer and any continuous sigmoidal nonlinearity. The paper discusses approximation properties of other possible types of nonlinearities that might be implemented by artificial neural networks.
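
Restated in symbols (a restatement of the result in the abstract, with the notation made explicit): for any continuous f on the unit hypercube and any \varepsilon > 0 there exist parameters \alpha_j, w_j, \theta_j such that

G(x) = \sum_{j=1}^{N} \alpha_j \, \sigma\!\left(w_j^{\top} x + \theta_j\right), \qquad \sup_{x \in [0,1]^n} \left| G(x) - f(x) \right| < \varepsilon,

where \sigma is any continuous sigmoidal nonlinearity.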

12,286 citations

Journal ArticleDOI
TL;DR: It is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution.
Abstract: In the first part of this paper, a regular recurrent neural network (RNN) is extended to a bidirectional recurrent neural network (BRNN). The BRNN can be trained without the limitation of using input information just up to a preset future frame. This is accomplished by training it simultaneously in positive and negative time direction. Structure and training procedure of the proposed network are explained. In regression and classification experiments on artificial data, the proposed structure gives better results than other approaches. For real data, classification experiments for phonemes from the TIMIT database show the same tendency. In the second part of this paper, it is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution. For this part, experiments on real data are reported.
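
A minimal sketch of the bidirectional structure (illustrative, not the paper's exact network): one recurrence runs forward in time, one backward, and the output at each frame combines both hidden states, so a prediction at frame t can draw on the entire input sequence:

import numpy as np

def brnn_forward(X, Wf, Uf, Wb, Ub, V):
    # X: (T, d) input frames; Wf, Wb: (h, d); Uf, Ub: (h, h); V: (n_out, 2h).
    T, h = len(X), Wf.shape[0]
    hf = np.zeros((T, h))                          # forward-in-time hidden states
    hb = np.zeros((T, h))                          # backward-in-time hidden states
    for t in range(T):
        prev = hf[t - 1] if t > 0 else np.zeros(h)
        hf[t] = np.tanh(Wf @ X[t] + Uf @ prev)
    for t in range(T - 1, -1, -1):
        nxt = hb[t + 1] if t < T - 1 else np.zeros(h)
        hb[t] = np.tanh(Wb @ X[t] + Ub @ nxt)
    # Each output sees both past (hf) and future (hb) context.
    return np.concatenate([hf, hb], axis=1) @ V.T  # (T, n_out)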

7,290 citations