
Showing papers by "Richard P. Lippmann" published in 1994



Book ChapterDOI
01 Jan 1994
TL;DR: Analytic results are presented which demonstrate that many neural network classifiers can accurately estimate posterior probabilities and that these neural network classifiers can sometimes provide lower error rates than PDF classifiers using the same number of trainable parameters.
Abstract: Researchers in the fields of neural networks, statistics, machine learning, and artificial intelligence have followed three basic approaches to developing new pattern classifiers. Probability Density Function (PDF) classifiers include Gaussian and Gaussian Mixture classifiers which estimate distributions or densities of input features separately for each class. Posterior probability classifiers include multilayer perceptron neural networks with sigmoid nonlinearities and radial basis function networks. These classifiers estimate minimum-error Bayesian a posteriori probabilities (hereafter referred to as posterior probabilities) simultaneously for all classes. Boundary-forming classifiers include hard-limiting single-layer perceptrons, hypersphere classifiers, and nearest neighbor classifiers. These classifiers have binary indicator outputs which form decision regions that specify the class of any input pattern. Posterior probability and boundary-forming classifiers are trained using discriminant training. All training data is used simultaneously to estimate Bayesian posterior probabilities or minimize overall classification error rates. PDF classifiers are trained using maximum likelihood approaches which individually model class distributions without regard to overall classification performance. Analytic results are presented which demonstrate that many neural network classifiers can accurately estimate posterior probabilities and that these neural network classifiers can sometimes provide lower error rates than PDF classifiers using the same number of trainable parameters. Experiments also demonstrate how interpretation of network outputs as posterior probabilities makes it possible to estimate the confidence of a classification decision, compensate for differences in class prior probabilities between test and training data, and combine outputs of multiple classifiers over time for speech recognition.
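The prior-compensation idea mentioned at the end of the abstract is mechanical enough to illustrate. Below is a minimal sketch, not from the paper, with invented function and variable names: if a trained network's outputs approximate the training-set posteriors p_train(class | x), dividing out the training priors, multiplying in the deployment priors, and renormalizing yields posteriors matched to the new class frequencies.

```python
import numpy as np

def adjust_priors(posteriors, train_priors, test_priors):
    """Rescale network outputs that estimate p_train(class | x) to
    reflect different class priors at test time.

    Divides out the training priors, multiplies in the test priors,
    and renormalizes so the adjusted outputs again sum to one.
    """
    unnormalized = posteriors * (test_priors / train_priors)
    return unnormalized / unnormalized.sum(axis=-1, keepdims=True)

# Example: a 3-class network output for one input pattern.
posteriors = np.array([0.7, 0.2, 0.1])    # estimated p_train(c | x)
train_priors = np.array([1/3, 1/3, 1/3])  # balanced training set
test_priors = np.array([0.1, 0.1, 0.8])   # skewed deployment priors
print(adjust_priors(posteriors, train_priors, test_priors))
```

Because the training priors cancel out of the ratio, the same trained network can be reused under shifted deployment priors without retraining.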

25 citations


Proceedings Article
01 Jan 1994
TL;DR: Experiments demonstrated that sigmoid multilayer perceptron (MLP) networks provide slightly better risk prediction than conventional logistic regression when used to predict the risk of death, stroke, and renal failure on 1257 patients who underwent coronary artery bypass operations at the Lahey Clinic.
Abstract: Experiments demonstrated that sigmoid multilayer perceptron (MLP) networks provide slightly better risk prediction than conventional logistic regression when used to predict the risk of death, stroke, and renal failure on 1257 patients who underwent coronary artery bypass operations at the Lahey Clinic. MLP networks with no hidden layer and networks with one hidden layer were trained using stochastic gradient descent with early stopping. MLP networks and logistic regression used the same input features and were evaluated using bootstrap sampling with 50 replications. ROC areas for predicting mortality using preoperative input features were 70.5% for logistic regression and 76.0% for MLP networks. Regularization provided by early stopping was an important component of improved performance. A simplified approach to generating confidence intervals for MLP risk predictions using an auxiliary "confidence MLP" was developed. The confidence MLP is trained to reproduce confidence intervals that were generated during training using the outputs of 50 MLP networks trained with different bootstrap samples.
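The two-stage confidence procedure can be sketched compactly. The following is an illustrative reconstruction rather than the paper's implementation: it uses scikit-learn networks and synthetic stand-in data (the actual preoperative features are not given here), trains 50 MLPs on bootstrap replications, takes per-patient percentile intervals of the predicted risks, and fits one auxiliary "confidence MLP" to reproduce those intervals so only a single extra network is needed at run time.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier, MLPRegressor
from sklearn.utils import resample

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 1257 cases, 10 preoperative features,
# binary mortality outcome.
X = rng.normal(size=(1257, 10))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1257) > 1.5).astype(int)

# 1. Train an ensemble of MLPs on bootstrap replications (50, as in the paper).
B = 50
risk = np.zeros((B, len(X)))
for b in range(B):
    Xb, yb = resample(X, y, random_state=b)
    net = MLPClassifier(hidden_layer_sizes=(5,), early_stopping=True,
                        max_iter=500, random_state=b).fit(Xb, yb)
    risk[b] = net.predict_proba(X)[:, 1]

# 2. Per-patient confidence interval from the bootstrap distribution.
ci = np.percentile(risk, [2.5, 97.5], axis=0).T   # shape (n_patients, 2)

# 3. "Confidence MLP": one auxiliary network trained to reproduce the
#    interval endpoints, replacing the 50-network ensemble at run time.
conf_net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=1000,
                        random_state=0).fit(X, ci)
print(conf_net.predict(X[:3]))   # approximate [lower, upper] risk bounds
```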

17 citations


Proceedings ArticleDOI
19 Apr 1994
TL;DR: A new approach to wordspotter training is presented which directly maximizes the figure of merit (FOM) defined as the average detection rate over a specified range of false alarm rates.
Abstract: A new approach to wordspotter training is presented which directly maximizes the figure of merit (FOM), defined as the average detection rate over a specified range of false alarm rates. This systematic approach to discriminant training for wordspotters eliminates the necessity of ad hoc thresholds and tuning. It improves the FOM of wordspotters tested using cross-validation on the credit-card speech corpus training conversations by 4 to 5 percentage points to roughly 70%. This improved performance requires little extra complexity during wordspotting and only two extra passes through the training data during training. The FOM gradient is computed analytically for each putative hit, back-propagated through HMM word models using the Viterbi alignment, and used to adjust RBF hidden node centers and state-weights associated with every node in HMM keyword models.
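The FOM itself is straightforward to compute from a list of scored putative hits. Below is a minimal sketch; it simplifies by counting raw false alarms rather than false alarms per keyword per hour, and the function name is invented for illustration.

```python
import numpy as np

def figure_of_merit(scores, is_hit, n_true, max_false_alarms=10):
    """Average detection rate as the score threshold sweeps from strict
    to lenient, evaluated at 1..max_false_alarms false alarms.

    scores : putative-hit scores from the wordspotter
    is_hit : True where the putative hit matches a real keyword
    n_true : total number of true keyword occurrences
    """
    order = np.argsort(scores)[::-1]        # most confident hits first
    hits = np.asarray(is_hit)[order]
    detection_rates, detected, false_alarms = [], 0, 0
    for h in hits:
        if h:
            detected += 1
        else:
            false_alarms += 1
            if false_alarms > max_false_alarms:
                break
            detection_rates.append(detected / n_true)
    # If the hit list ran out early, the detection rate stays flat.
    while len(detection_rates) < max_false_alarms:
        detection_rates.append(detected / n_true)
    return float(np.mean(detection_rates))

# Example: 5 putative hits, 4 true keyword occurrences in the test speech.
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
is_hit = [True, False, True, True, False]
print(figure_of_merit(scores, is_hit, n_true=4, max_false_alarms=2))  # 0.5
```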

12 citations


Proceedings Article
01 Jan 1994
TL;DR: This paper explores simple linear spectral warping to enlarge a 48-talker training database used for word spotting; the resulting increase in average detection rate was similar to that obtained by doubling the amount of training data.
Abstract: Speech recognizers provide good performance for most users, but the error rate often increases dramatically for a small percentage of talkers who are "different" from those talkers used for training. One expensive solution to this problem is to gather more training data in an attempt to sample these outlier users. A second solution, explored in this paper, is to artificially enlarge the number of training talkers by transforming the speech of existing training talkers. This approach is similar to enlarging the training set for OCR digit recognition by warping the training digit images, but is more difficult because continuous speech has a much larger number of dimensions (e.g., linguistic, phonetic, style, temporal, spectral) that differ across talkers. We explored the use of simple linear spectral warping to enlarge a 48-talker training database used for word spotting. The average detection rate overall was increased by 2.9 percentage points (from 68.3% to 71.2%) for male speakers and 2.5 percentage points (from 64.8% to 67.3%) for female speakers. This increase is small but similar to that obtained by doubling the amount of training data.
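The abstract does not give the exact warping transform, so the following is only a plausible sketch of the idea: a linear warp of the frequency axis of a (frames x bins) spectrogram via interpolation, with invented names and warp-factor values, used to synthesize extra "talkers" from one training talker's data.

```python
import numpy as np

def warp_spectrum(spectrogram, alpha):
    """Linearly warp the frequency axis of a (frames x bins) spectrogram
    by factor alpha, resampling each frame with linear interpolation.

    alpha > 1 stretches the spectrum (shifts energy toward higher bins,
    roughly imitating a shorter vocal tract); alpha < 1 compresses it.
    """
    n_bins = spectrogram.shape[1]
    bins = np.arange(n_bins)
    # Each output bin is sampled from position bin/alpha in the original.
    source = np.clip(bins / alpha, 0, n_bins - 1)
    return np.stack([np.interp(source, bins, frame) for frame in spectrogram])

# Example: create two synthetic "new talkers" from one training spectrogram.
spec = np.abs(np.random.default_rng(0).normal(size=(100, 64)))
augmented = [warp_spectrum(spec, a) for a in (0.9, 1.1)]
```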

8 citations