
Showing papers on "Word error rate published in 1990"


Journal ArticleDOI
TL;DR: A review is given of different ways of estimating the error rate of a prediction rule based on a statistical model, and of how cross-validation can be used to obtain an adjusted predictor with a smaller error rate.
Abstract: A review is given of different ways of estimating the error rate of a prediction rule based on a statistical model. A distinction is drawn between apparent, optimum and actual error rates. Moreover, it is shown how cross-validation can be used to obtain an adjusted predictor with a smaller error rate. A detailed discussion is given for ordinary least squares, logistic regression and Cox regression in survival analysis. Finally, the split-sample approach is discussed and demonstrated on two data sets.

528 citations
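The apparent-versus-actual distinction above can be illustrated with a small sketch: a toy nearest-class-mean classifier on synthetic one-dimensional data, comparing the resubstitution (apparent) error rate with a leave-one-out cross-validated estimate of the actual error rate. The data and classifier are illustrative assumptions, not the paper's examples.

```python
import random

def nearest_mean_classifier(train):
    """Fit a nearest-class-mean rule on (x, label) pairs; return a predict function."""
    sums, counts = {}, {}
    for x, y in train:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    means = {y: sums[y] / counts[y] for y in sums}
    return lambda x: min(means, key=lambda y: abs(x - means[y]))

def error_rate(predict, data):
    return sum(predict(x) != y for x, y in data) / len(data)

def loo_cv_error(data):
    """Leave-one-out cross-validation: hold out each point, train on the rest."""
    errs = 0
    for i in range(len(data)):
        held_x, held_y = data[i]
        predict = nearest_mean_classifier(data[:i] + data[i + 1:])
        errs += predict(held_x) != held_y
    return errs / len(data)

random.seed(0)
data = [(random.gauss(0.0, 1.0), 'a') for _ in range(30)] + \
       [(random.gauss(1.5, 1.0), 'b') for _ in range(30)]
apparent = error_rate(nearest_mean_classifier(data), data)
cv = loo_cv_error(data)
# The apparent (resubstitution) rate is typically optimistic relative to the
# cross-validated estimate of the actual error rate.
print(f"apparent={apparent:.3f}  loo_cv={cv:.3f}")
```

The gap between the two numbers is the optimism that the paper's adjustment methods aim to correct.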


Journal ArticleDOI
TL;DR: The modifications made to a connected word speech recognition algorithm based on hidden Markov models which allow it to recognize words from a predefined vocabulary list spoken in an unconstrained fashion are described.
Abstract: The modifications made to a connected word speech recognition algorithm based on hidden Markov models (HMMs) which allow it to recognize words from a predefined vocabulary list spoken in an unconstrained fashion are described. The novelty of this approach is that statistical models of both the actual vocabulary word and the extraneous speech and background are created. An HMM-based connected word recognition system is then used to find the best sequence of background, extraneous speech, and vocabulary word models for matching the actual input. Word recognition accuracies of 99.3% on purely isolated speech (i.e., only vocabulary items and background noise were present) and 95.1% when the vocabulary word was embedded in unconstrained extraneous speech were obtained for the five-word vocabulary using the proposed recognition algorithm.

472 citations
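The recognizer described above strings together background and vocabulary-word models and picks the best-scoring sequence by Viterbi decoding. A minimal sketch of that decoding step, using a toy two-state model ('bg' for background/extraneous speech, 'kw' for a keyword) with invented probabilities and symbolic "acoustic frames" rather than real features:

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Most likely state sequence for an observation sequence (standard Viterbi)."""
    trellis = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []
    for sym in obs[1:]:
        prev = trellis[-1]
        col, ptr = {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] + log_trans[p][s])
            col[s] = prev[best] + log_trans[best][s] + log_emit[s][sym]
            ptr[s] = best
        trellis.append(col)
        back.append(ptr)
    state = max(states, key=lambda s: trellis[-1][s])
    path = [state]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

LOG = math.log
states = ['bg', 'kw']          # background/extraneous-speech model vs keyword model
log_start = {'bg': LOG(0.8), 'kw': LOG(0.2)}
log_trans = {'bg': {'bg': LOG(0.7), 'kw': LOG(0.3)},
             'kw': {'bg': LOG(0.3), 'kw': LOG(0.7)}}
# toy acoustic symbols: 'n' = noise-like frame, 'v' = voiced keyword-like frame
log_emit = {'bg': {'n': LOG(0.9), 'v': LOG(0.1)},
            'kw': {'n': LOG(0.2), 'v': LOG(0.8)}}

path = viterbi("nnvvvnn", states, log_start, log_trans, log_emit)
print(path)
```

The decoded path segments the input into background and keyword regions, which is the essence of spotting a vocabulary word embedded in extraneous speech.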


Book ChapterDOI
TL;DR: Two new context-dependent phonetic units are introduced: function-word-dependent phone models, which focus on the most difficult subvocabulary; and generalized triphones, which combine similar triphones on the basis of an information-theoretic measure.
Abstract: Context-dependent phone models are applied to speaker-independent continuous speech recognition and shown to be effective in this domain. Several previously proposed context-dependent models are evaluated, and two new context-dependent phonetic units are introduced: function-word-dependent phone models, which focus on the most difficult subvocabulary; and generalized triphones, which combine similar triphones on the basis of an information-theoretic measure. The subword clustering procedure used for generalized triphones can find the optimal number of models, given a fixed amount of training data. It is shown that context-dependent modeling reduces the error rate by as much as 60%.

228 citations


Journal ArticleDOI
TL;DR: In this article, a fuzzy model of reliability analysis is presented, based on the operations of dependence and of fuzziness contained in qualitative expressions, together with the evaluation of failure possibility and error possibility.

165 citations


Proceedings Article
29 Jul 1990
TL;DR: The generalized mutual information statistic is derived, the parsing algorithm is described, and results and sample output from the parser are presented.
Abstract: The purpose of this paper is to characterize a constituent boundary parsing algorithm, using an information-theoretic measure called generalized mutual information, which serves as an alternative to traditional grammar-based parsing methods. This method is based on the hypothesis that constituent boundaries can be extracted from a given sentence (or word sequence) by analyzing the mutual information values of the part of speech n-grams within the sentence. This hypothesis is supported by the performance of an implementation of this parsing algorithm which determines a recursive unlabeled bracketing of unrestricted English text with a relatively low error rate. This paper derives the generalized mutual information statistic, describes the parsing algorithm, and presents results and sample output from the parser.

142 citations
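The core idea above (constituent boundaries fall where adjacent parts of speech are weakly associated) can be sketched with plain pointwise mutual information over tag bigrams. The paper's generalized statistic is simplified here, and the tagged corpus is a toy assumption:

```python
import math
from collections import Counter

def pmi_table(corpus_tags):
    """Estimate pointwise mutual information of adjacent part-of-speech tags."""
    uni = Counter(t for sent in corpus_tags for t in sent)
    bi = Counter(p for sent in corpus_tags for p in zip(sent, sent[1:]))
    n_uni, n_bi = sum(uni.values()), sum(bi.values())
    return {(a, b): math.log((c / n_bi) / ((uni[a] / n_uni) * (uni[b] / n_uni)))
            for (a, b), c in bi.items()}

def weakest_junction(sentence, pmi):
    """Index where a constituent boundary is most plausible: the junction whose
    adjacent tags have the lowest mutual information."""
    scores = [pmi[(a, b)] for a, b in zip(sentence, sentence[1:])]
    return scores.index(min(scores)) + 1

# Toy tagged corpus: DET NOUN sticks together; VERB is followed by varied tags.
corpus = ([["DET", "NOUN", "VERB", "PREP", "DET", "NOUN"]] * 10 +
          [["DET", "NOUN", "VERB"]] * 10 +
          [["PRON", "VERB", "DET", "NOUN"]] * 10)
pmi = pmi_table(corpus)
sent = ["DET", "NOUN", "VERB", "DET", "NOUN"]
print(weakest_junction(sent, pmi))
```

On this toy data the weakest junction falls between the verb and the object noun phrase, i.e. a plausible constituent boundary; the paper extends this to n-grams and recursive bracketing.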


Journal ArticleDOI
01 Aug 1990
TL;DR: Results indicate that the perceptron-based decision feedback equaliser provides better bit error rate performance than the least mean square decision feedback equaliser, especially in high noise conditions; bit error rate performance also degrades less owing to decision errors and is less sensitive to gain variation.
Abstract: The paper describes a new approach for a decision feedback equaliser using the multilayer perceptron structure for equalisation in digital communications systems. Results indicate that the perceptron-based decision feedback equaliser provides better bit error rate performance than the least mean square decision feedback equaliser, especially in high noise conditions; bit error rate performance also degrades less owing to decision errors and is less sensitive to gain variation.

130 citations
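For context, a sketch of the conventional LMS decision feedback equaliser that the perceptron equaliser is compared against: feed-forward taps over received samples plus feedback taps over past decisions, adapted by least mean squares. The channel model, step size, and tap counts here are invented for illustration; the perceptron variant replaces this linear combiner with a multilayer network.

```python
import random

def lms_dfe(received, n_ff=3, n_fb=2, mu=0.05, train_syms=None):
    """Adaptive decision-feedback equalizer: a feed-forward filter over received
    samples plus a feedback filter over past decisions, adapted by LMS."""
    ff = [0.0] * n_ff          # feed-forward taps
    fb = [0.0] * n_fb          # feedback taps (cancel postcursor ISI)
    past = [0.0] * n_fb        # previous symbol decisions
    decisions = []
    for k in range(len(received)):
        x = [received[k - i] if k - i >= 0 else 0.0 for i in range(n_ff)]
        y = sum(w * v for w, v in zip(ff, x)) - sum(w * v for w, v in zip(fb, past))
        d = 1.0 if y >= 0 else -1.0
        # use known training symbols first, then switch to decision-directed mode
        target = train_syms[k] if train_syms and k < len(train_syms) else d
        e = target - y
        ff = [w + mu * e * v for w, v in zip(ff, x)]
        fb = [w - mu * e * v for w, v in zip(fb, past)]
        past = [target] + past[:-1]
        decisions.append(d)
    return decisions

random.seed(1)
symbols = [random.choice([-1.0, 1.0]) for _ in range(400)]
# toy dispersive channel: r[k] = s[k] + 0.4*s[k-1], plus mild Gaussian noise
received = [symbols[k] + 0.4 * (symbols[k - 1] if k else 0.0)
            + random.gauss(0, 0.05) for k in range(len(symbols))]
out = lms_dfe(received, train_syms=symbols[:200])
ber = sum(d != s for d, s in zip(out[200:], symbols[200:])) / 200
print(f"bit error rate after training: {ber:.3f}")
```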


Journal ArticleDOI
TL;DR: The author's method requires considerably less computer time to obtain results comparable to those of the other methods, and it has a low degree of programming difficulty.
Abstract: A computationally simple approach for reliability-redundancy optimization problems is proposed. It is compared by means of a simulation study with the other two existing approaches: (1) the LMBB method, which incorporates the Lagrange multiplier technique in conjunction with the Kuhn-Tucker condition and the branch-and-bound method, and (2) sequential search techniques in combination with heuristic redundancy allocation methods, including an extension of combinations of four heuristics and two search techniques. Using 100 sets of randomly generated test problems with nonlinear constraints for both series systems and a complex system, the authors measured and evaluated the performance of these approaches in terms of optimality rate, error rate, and execution time. In general, the author's method requires considerably less computer time to obtain results comparable to those of the other methods, and it has a low degree of programming difficulty.

98 citations


PatentDOI
TL;DR: In this paper, a speaker independent recognition of small vocabularies, spoken over the long distance telephone network, is achieved using two types of models, one for defined vocabulary words (e.g., collect, calling-card, person, third number and operator), and one type for extraneous input which ranges from non-speech sounds to groups of non-vocabulary words.
Abstract: Speaker independent recognition of small vocabularies, spoken over the long distance telephone network, is achieved using two types of models, one type for defined vocabulary words (e.g., collect, calling-card, person, third-number and operator), and one type for extraneous input which ranges from non-speech sounds to groups of non-vocabulary words (e.g., "I want to make a collect call please"). For this type of key word spotting, modifications are made to a connected word speech recognition algorithm based on state-transitional (hidden Markov) models which allow it to recognize words from a pre-defined vocabulary list spoken in an unconstrained fashion. Statistical models of both the actual vocabulary words and the extraneous speech and background noises are created. A syntax-driven connected word recognition system is then used to find the best sequence of extraneous input and vocabulary word models for matching the actual input speech.

92 citations


Proceedings Article
01 Oct 1990
TL;DR: The results suggest that genetic algorithms are becoming practical for pattern classification problems as faster serial and parallel computers are developed.
Abstract: Genetic algorithms were used to select and create features and to select reference exemplar patterns for machine vision and speech pattern classification tasks. For a complex speech recognition task, genetic algorithms required no more computation time than traditional approaches to feature selection but reduced the number of input features required by a factor of five (from 153 to 33 features). On a difficult artificial machine-vision task, genetic algorithms were able to create new features (polynomial functions of the original features) which reduced classification error rates from 19% to almost 0%. Neural net and k nearest neighbor (KNN) classifiers were unable to provide such low error rates using only the original features. Genetic algorithms were also used to reduce the number of reference exemplar patterns for a KNN classifier. On a 338 training pattern vowel-recognition problem with 10 classes, genetic algorithms reduced the number of stored exemplars from 338 to 43 without significantly increasing classification error rate. In all applications, genetic algorithms were easy to apply and found good solutions in many fewer trials than would be required by exhaustive search. Run times were long, but not unreasonable. These results suggest that genetic algorithms are becoming practical for pattern classification problems as faster serial and parallel computers are developed.

82 citations
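A minimal sketch of the exemplar-selection idea above: a genetic algorithm over bitmasks, where each gene keeps or drops one training pattern for a 1-nearest-neighbour classifier, and fitness trades accuracy against the number of stored exemplars. The data, penalty weight, and GA settings are toy assumptions, not the paper's.

```python
import random

def nn_error(exemplars, data):
    """Error rate of a 1-nearest-neighbour rule using the given exemplars."""
    errs = 0
    for x, y in data:
        _, label = min(exemplars, key=lambda e: abs(x - e[0]))
        errs += label != y
    return errs / len(data)

def ga_select_exemplars(data, pop_size=30, gens=40, seed=2):
    """Genetic algorithm over bitmasks: each gene says whether a training
    pattern is kept as a reference exemplar for the 1-NN classifier."""
    rng = random.Random(seed)
    n = len(data)

    def fitness(mask):
        chosen = [d for d, bit in zip(data, mask) if bit]
        if not chosen:
            return -1.0
        # reward accuracy, lightly penalize the number of stored exemplars
        return 1.0 - nn_error(chosen, data) - 0.2 * sum(mask) / n

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # truncation selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]          # one-point crossover
            for i in range(n):
                if rng.random() < 0.02:        # per-gene mutation
                    child[i] ^= 1
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return [d for d, bit in zip(data, best) if bit]

rng = random.Random(0)
data = [(rng.gauss(0.0, 0.5), 'a') for _ in range(25)] + \
       [(rng.gauss(2.0, 0.5), 'b') for _ in range(25)]
kept = ga_select_exemplars(data)
print(len(kept), nn_error(kept, data))
```

The size penalty plays the role of the paper's pressure toward fewer stored exemplars without significantly increasing the classification error rate.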


Patent
19 Mar 1990
TL;DR: In this article, the framing bit errors of a received digital communications signal are monitored and recorded and an audible alarm is sounded when the error rate exceeds a predetermined threshold value in a plurality of calculation modes.
Abstract: The framing bit errors of a received digital communications signal are monitored and recorded. The framing bit error rate is determined and an audible alarm is sounded when the error rate exceeds a predetermined threshold value in a plurality of calculation modes. The framing bit error rate and the total framing bit errors detected over a predetermined fixed time period are also displayed. A link to a remote network monitor can be implemented for monitoring and displaying framing bit error rate at a remote location.

64 citations
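A sketch of the monitoring idea: count framing-bit errors over a sliding window and raise an alarm flag when the windowed rate crosses a threshold. The window size and threshold values are invented for illustration; the patent does not specify them.

```python
from collections import deque

class FramingErrorMonitor:
    """Track framing-bit errors over a sliding window and raise an alarm flag
    when the windowed error rate exceeds a threshold."""
    def __init__(self, window=1000, threshold=1e-2):
        self.window = deque(maxlen=window)   # recent error indicators
        self.threshold = threshold
        self.total_errors = 0                # running total for display

    def record(self, bit_in_error):
        self.window.append(1 if bit_in_error else 0)
        self.total_errors += bit_in_error
        return self.error_rate() > self.threshold   # True => sound the alarm

    def error_rate(self):
        return sum(self.window) / len(self.window)

mon = FramingErrorMonitor(window=100, threshold=0.05)
alarms = [mon.record(i % 10 == 0) for i in range(100)]   # sustained 10% errors
print(mon.error_rate(), alarms[-1])
```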


Journal ArticleDOI
TL;DR: A general theory of software reliability that proposes that software failure rates are the product of the software average error size, apparent error density, and workload is developed and models of these factors that are consistent with the assumptions of classical software-reliability models are developed.
Abstract: A general theory of software reliability that proposes that software failure rates are the product of the software average error size, apparent error density, and workload is developed. Models of these factors that are consistent with the assumptions of classical software-reliability models are developed. The linear, geometric and Rayleigh models are special cases of the general theory. Linear reliability models result from assumptions that the average size of remaining errors and workload are constant and that its apparent error density equals its real error density. Geometric reliability models differ from linear models in assuming that the average-error size decreases geometrically as errors are corrected, whereas the Rayleigh model assumes that the average size of remaining errors increases linearly with time. The theory shows that the abstract proportionality constants of classical models are composed of more fundamental and more intuitively meaningful factors, namely, the initial values of average size of remaining errors, real error density, workload, and error content. It is shown how the assumed behavior of the reliability primitives of software (average-error size, error density, and workload) is modeled to accommodate diverse reliability factors.

Proceedings Article
01 Oct 1990
TL;DR: The results suggest that the selection of a classifier for a particular task should be guided not so much by small differences in error rate, but by practical considerations concerning memory usage, computational resources, ease of implementation, and restrictions on training and classification times.
Abstract: Seven different pattern classifiers were implemented on a serial computer and compared using artificial and speech recognition tasks. Two neural network (radial basis function and high order polynomial GMDH network) and five conventional classifiers (Gaussian mixture, linear tree, K nearest neighbor, KD-tree, and condensed K nearest neighbor) were evaluated. Classifiers were chosen to be representative of different approaches to pattern classification and to complement and extend those evaluated in a previous study (Lee and Lippmann, 1989). This and the previous study both demonstrate that classification error rates can be equivalent across different classifiers when they are powerful enough to form minimum error decision regions, when they are properly tuned, and when sufficient training data is available. Practical characteristics such as training time, classification time, and memory requirements, however, can differ by orders of magnitude. These results suggest that the selection of a classifier for a particular task should be guided not so much by small differences in error rate, but by practical considerations concerning memory usage, computational resources, ease of implementation, and restrictions on training and classification times.

Journal ArticleDOI
TL;DR: Comparisons to real failure/repair information obtained from field engineers show that, in about 85% of the cases, the error symptoms recognized by this approach correspond to real problems.
Abstract: A methodology is proposed for recognizing the symptoms of persistent problems in large systems. The system error rate is used to identify the error states among which relationships may exist. Statistical techniques are used to validate and quantify the strength of the relationship among these error states. As input, the approach takes the raw error logs containing a single entry for each error that is detected as an isolated event. As output, it produces a list of symptoms that characterize persistent errors. Thus, given a failure, it is determined whether the failure is an intermittent manifestation of a common fault or whether it is an isolated (transient) incident. The technique is shown to work on two CYBER systems and on an IBM 3081 multiprocessor system. Comparisons to real failure/repair information obtained from field engineers show that, in about 85% of the cases, the error symptoms recognized by this approach correspond to real problems. The remaining 15% of the cases, although not directly supported by field data, are confirmed as being valid problems.

Proceedings ArticleDOI
17 Jun 1990
TL;DR: EAR, an English alphabet recognizer that performs speaker-independent recognition of letters spoken in isolation, has high level of performance and is attributed to accurate and explicit phonetic segmentation, the use of speech knowledge to select features that measure the important linguistic information, and the ability of the neural classifier to model the variability of the data.
Abstract: A description is presented of EAR, an English alphabet recognizer that performs speaker-independent recognition of letters spoken in isolation. During recognition, (a) signal processing routines transform the digitized speech into useful representations, (b) rules are applied to the representations to locate segment boundaries, (c) feature measurements are computed on the speech segments, and (d) a neural network uses the feature measurements to classify the letter. The system was trained on one token of each letter from 120 speakers. Performance was 95% when tested on a new set of 30 speakers. Performance was 96% when tested on a second token of each letter from the original 120 speakers. The recognition accuracy is 6% higher than that of previously reported systems. The high level of performance is attributed to accurate and explicit phonetic segmentation, the use of speech knowledge to select features that measure the important linguistic information, and the ability of the neural classifier to model the variability of the data.

Journal ArticleDOI
TL;DR: An analysis is made of the impact of various design decisions on the error detection capability of the fiber distributed data interface (FDDI), a 100-Mb/s fiber-optic LAN standard being developed by ANSI, and the frame error rate, token loss rate, and undetected error rate are quantified.
Abstract: An analysis is made of the impact of various design decisions on the error detection capability of the fiber distributed data interface (FDDI), a 100-Mb/s fiber-optic LAN standard being developed by the American National Standards Institute (ANSI). In particular, the frame error rate, token loss rate, and undetected error rate are quantified. Several characteristics of the 32-b frame check sequence (FCS) polynomial, which is also used in IEEE 802 LAN protocols, are discussed. The standard uses a nonreturn to zero invert on ones (NRZI) signal encoding and a 4-b to 5-b (4b/5b) symbol encoding in the physical layer. Due to the combination of NRZI and 4b/5b encoding, many noise events are detected by code (or symbol) violations. A large percentage of errors are detected by FCS violations. The errors that escape these three violations remain undetected. The probability of undetected errors due to creation of false starting delimiters, false ending delimiters, or merging of two frames is analyzed. It is shown that every noise event results in two code bit errors, which in turn may result in up to four data bit errors. The FCS can detect up to two noise events. Creation of a false starting delimiter or ending delimiter on a symbol boundary also requires two noise events. This assumes enhanced frame validity criteria. The author justifies the enhancements by quantifying their effect.
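The FCS mechanism itself can be sketched with Python's binascii.crc32, which implements the same 32-bit polynomial used by the IEEE 802 protocols discussed here. The frame contents below are invented; the point is only that a noise event's two bit errors are caught by an FCS check:

```python
import binascii

def append_fcs(frame: bytes) -> bytes:
    """Append a 32-bit frame check sequence (the IEEE 802 CRC-32 polynomial)."""
    return frame + binascii.crc32(frame).to_bytes(4, "big")

def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC over the data portion and compare with the received FCS."""
    data, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return binascii.crc32(data).to_bytes(4, "big") == fcs

sent = append_fcs(b"FDDI frame payload")
corrupted = bytearray(sent)
corrupted[3] ^= 0b00000101          # flip two bits in one byte (one noise event)
print(fcs_ok(sent), fcs_ok(bytes(corrupted)))
```

CRC-32 detects any two-bit error pattern in frames far longer than this one, consistent with the paper's observation that the FCS can detect up to two noise events.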

Proceedings ArticleDOI
20 Aug 1990
TL;DR: The proposed NETgram is a neural network for word category prediction that requires fewer parameters than the statistical model and performs effectively for unknown data, i.e., the NETgram interpolates sparse training data.
Abstract: Word category prediction is used to implement an accurate word recognition system. Traditional statistical approaches require considerable training data to estimate the probabilities of word sequences, and many parameters to memorize probabilities. To solve this problem, NETgram, a neural network for word category prediction, is proposed. Training results show that the performance of the NETgram is comparable to that of the statistical model although the NETgram requires fewer parameters than the statistical model. Also, the NETgram performs effectively for unknown data, i.e., the NETgram interpolates sparse training data. Results of analyzing the hidden layer show that the word categories are classified into linguistically significant groups. The results of applying the NETgram to HMM English word recognition show that the NETgram improves the word recognition rate from 81.0% to 86.9%.

Journal ArticleDOI
TL;DR: Several coding techniques are combined to create a simple error control system capable of providing very low end-to-end symbol error rates over a digital mobile phone link.
Abstract: Several coding techniques are combined to create a simple error control system capable of providing very low end-to-end symbol error rates over a digital mobile phone link. Interleaving is used to spread out the impact of deep fades, allowing for the use of shorter block codes with correspondingly simple encoders and decoders. The degree of error control provided by these block codes is then substantially improved through the adoption of a type-1 hybrid-ARQ protocol. The performance of this system is explored in detail for implementation based on Reed-Solomon codes.
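The interleaving step can be sketched directly: symbols are written row-wise into a block and transmitted column-wise, so a deep fade that wipes out consecutive transmitted symbols lands as isolated erasures in different codewords, which short block codes can then correct. The block dimensions and fade position below are illustrative assumptions.

```python
def interleave(symbols, rows, cols):
    """Write row-wise into a rows x cols block, read out column-wise."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]

def deinterleave(symbols, rows, cols):
    """Inverse of interleave: write column-wise, read out row-wise."""
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]

data = list(range(12))                 # e.g. three 4-symbol codewords, row-wise
tx = interleave(data, 3, 4)
# a deep fade wipes out three consecutive transmitted symbols
rx = ['X' if 4 <= i < 7 else s for i, s in enumerate(tx)]
recovered = deinterleave(rx, 3, 4)
codewords = [recovered[i:i + 4] for i in range(0, 12, 4)]
print(codewords)   # the burst is spread out: at most one erasure per codeword
```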

PatentDOI
TL;DR: In this article, a speech coding apparatus coupled to a transmission channel includes m (m is an integer greater than 1) coders, m decoders and m or (m-1) error-correcting coders.
Abstract: A speech coding apparatus coupled to a transmission channel includes m (m is an integer greater than 1) coders, m decoders and m or (m-1) error correcting coders. The apparatus also includes an evaluation unit which evaluates a quality of each of reproduced speech signals from the input speech signal and the reproduced speech signals and which outputs an evaluated quality of each of the reproduced speech signals. The quality of each of the reproduced speech signals is evaluated in a state having no transmission error. A decision unit identifies one of the m coders which provides the reproduced speech signal having a smallest distortion on the basis of the evaluated quality of each of the reproduced speech signals, a current error rate of the transmission channel and error correcting abilities of the error correcting coders, and generates a coder identification number representative of a selected one of the m coders. An output part outputs a multiplexed transmission signal including the coded speech signal generated by the one of the m coders identified by the decision unit and the error correcting code generated by a corresponding one of the m error correcting coders.

Journal ArticleDOI
TL;DR: Development of context-dependent allophonic hidden Markov models (HMMs) implemented in a 75 000-word speaker-dependent Gaussian-HMM recognizer are reported, showing that when a large amount of data is used to train context- dependent HMMs, the word recognition error rate is reduced by 33%, compared with the context-independent HMMs.

Proceedings ArticleDOI
03 Apr 1990
TL;DR: The use of an acoustic subword unit (ASWU)-based speech recognition system for the recognition of isolated words is discussed and it is shown that the use of a modified k-means algorithm on the likelihoods derived through the Viterbi algorithm provides the best deterministic-type of word lexicon.
Abstract: The use of an acoustic subword unit (ASWU)-based speech recognition system for the recognition of isolated words is discussed. Some methods are proposed for generating the deterministic and the statistical types of word lexicon. It is shown that the use of a modified k-means algorithm on the likelihoods derived through the Viterbi algorithm provides the best deterministic-type of word lexicon. However, the ASWU-based speech recognizer leads to better performance with the statistical type of word lexicon than with the deterministic type. Improving the design of the word lexicon makes it possible to narrow the gap in the recognition performances of the whole word unit (WWU)-based and the ASWU-based speech recognizers considerably. Further improvements are expected by designing the word lexicon better.

Proceedings ArticleDOI
03 Apr 1990
TL;DR: The use of vocabulary-independent (VI) models to improve the usability of speech recognizers is described and initial results show that with more training data and more detailed modeling, the error rate of VI models can be reduced substantially.
Abstract: The use of vocabulary-independent (VI) models to improve the usability of speech recognizers is described. Initial results using generalized triphones as VI models show that with more training data and more detailed modeling, the error rate of VI models can be reduced substantially. For example, the error rates for VI models with 5000, 10000, and 15000 training sentences are 23.9%, 15.2%, and 13.3%, respectively. Moreover, if task-specific training data are available, one can interpolate them with VI models. This task adaptation can reduce the error rate by 18% over task-specific models.

Proceedings ArticleDOI
03 Apr 1990
TL;DR: A novel speech recognition system based on this theory in which the acoustic-to-phonetic mapping is accomplished by means of a particular form of hidden Markov model and is independent of lexical and syntactic constraint is described.
Abstract: A widely accepted linguistic theory holds that speech recognition in humans proceeds from an intermediate representation of the acoustic signal in terms of a small number of phonetic symbols. A novel speech recognition system based on this theory in which the acoustic-to-phonetic mapping is accomplished by means of a particular form of hidden Markov model and is independent of lexical and syntactic constraint is described. Word recognition is then treated as a classical string-to-string editing problem which is solved with a two-level dynamic programming algorithm that accounts for lexical and syntactic structure. The system was tested on speaker-independent recognition of fluent speech from the 991-word DARPA resource management task, on which 76.6% word accuracy was achieved. In informal tests it was observed that the phonetic transcription can be resynthesized to provide a 100-bit/s vocoder with word intelligibility rates of approximately 75%.
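The "classical string-to-string editing problem" invoked above is the same dynamic program that defines word error rate: Levenshtein distance over word sequences (substitutions, insertions, deletions), normalized by reference length. A standard sketch, with invented example sentences:

```python
def word_error_rate(reference, hypothesis):
    """Levenshtein distance over words divided by the reference length --
    the usual word error rate definition."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

# one deletion ("all") and one substitution ("pacific" -> "atlantic"): WER = 2/6
print(word_error_rate("show all ships in the pacific",
                      "show ships in the atlantic"))
```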

Proceedings ArticleDOI
24 Jun 1990
TL;DR: The results suggest that intra-word and inter-word units should be modeled independently, even when they appear in the same context, and that the spectral vectors, corresponding to the same speech unit, behave differently statistically, depending on whether they are at word boundaries or within a word.
Abstract: We report on some recent improvements to an HMM-based, continuous speech recognition system which is being developed at AT&T Bell Laboratories. These advances, which include the incorporation of inter-word, context-dependent units and an improved feature analysis, lead to a recognition system which achieves better than 95% word accuracy for speaker independent recognition of the 1000-word, DARPA resource management task using the standard word-pair grammar (with a perplexity of about 60). It will be shown that the incorporation of inter-word units into training results in better acoustic models of word juncture coarticulation and gives a 20% reduction in error rate. The effect of an improved set of spectral and log energy features is to further reduce word error rate by about 30%. We also found that the spectral vectors, corresponding to the same speech unit, behave differently statistically, depending on whether they are at word boundaries or within a word. The results suggest that intra-word and inter-word units should be modeled independently, even when they appear in the same context. Using a set of sub-word units which included variants for intra-word and inter-word, context-dependent phones, an additional decrease of about 10% in word error rate resulted.

Patent
Akihiro Shikakura
06 Jun 1990
TL;DR: An error detection and correction device in which a word train including a number of error correction codes each constructed from a plurality of words is input, the error correction code being of the type that two error words can be corrected by each error correction code, as discussed by the authors.
Abstract: An error detection and correction device in which a word train including a number of error correction codes each constructed from a plurality of words is input, the error correction code being of the type that two error words can be corrected by each error correction code. Error words are corrected by using the error correction code within the word train. Mode setting information associated with an error rate of the word train is generated, and an error correction means is controlled in a first or a second error correction mode depending on the mode setting. The error correction means corrects one or two error words for each error correction code in the first error correction mode and corrects only one error word for each error correction code in the second error correction mode.

Proceedings ArticleDOI
03 Apr 1990
TL;DR: Different speaker adaptation methods for speech recognition systems adapting automatically to new and unknown speakers in a short training phase are discussed, and the results show that in both systems speaker-adaptive error rates are close to speaker-dependent error rates.
Abstract: Different speaker adaptation methods for speech recognition systems adapting automatically to new and unknown speakers in a short training phase are discussed. The adaptation techniques aim at transformations of feature vectors, optimized with respect to some constraints. Two different adaptation strategies are discussed. The first one is based on least mean-squared-error optimization. The second method is a codebook-driven feature transformation. Both adaptation techniques are incorporated into two different recognition systems: dynamic time warping (DTW) and hidden Markov modeling (HMM). The results show that in both systems speaker-adaptive error rates are close to speaker-dependent error rates. In the best case the mean error rate of four test speakers decreases by a factor of six compared to the interspeaker error rate without adaptation. A hardware realization of the speaker-adaptive HMM-recognizer is described.

Journal ArticleDOI
TL;DR: In this paper, an analysis technique for complex SEP response has been developed that allows system analysis for the data in order to increase confidence in their thoroughness, and the resulting data are directly applicable to existing orbital error rate codes.
Abstract: Single event phenomena (SEP) responses of complex analog and digital integrated circuits (ICs) cannot be characterized using common (i.e. memory) techniques. An analysis technique for complex SEP response has been developed that allows system analysis for the data in order to increase confidence in their thoroughness. The resulting data are directly applicable to existing orbital error rate codes. The general analysis technique is the main thrust of this study. For a circuit with a complex SEP response, guidelines are established which allow a complete solution to the linear system by establishing relevant error categories, the raw error variables, and a method of error separation. When the data are presented in the framework of a complete linear system, the required independent analysis for system use can proceed with high confidence in the SEP data. The technique is used to analyze the SEP response of an analog-to-digital converter.

Proceedings ArticleDOI
03 Apr 1990
TL;DR: A spectral-estimation algorithm designed to improve the noise robustness of speech-recognition systems is presented and evaluated and is tailored for filter-bank-based systems, where the estimation seeks to minimize the distortion as measured by the recognizer's distance metric.
Abstract: A spectral-estimation algorithm designed to improve the noise robustness of speech-recognition systems is presented and evaluated. The algorithm is tailored for filter-bank-based systems, where the estimation seeks to minimize the distortion as measured by the recognizer's distance metric. This minimization is achieved by modeling the speech distribution as consisting of clusters; the energies at different frequency channels are assumed to be uncorrelated within each cluster. The algorithm was tested with a continuous-speech, speaker-independent hidden Markov model (HMM) recognition system using the NIST Resource Management Task speech database. When trained on a clean speech database and tested with additive white Gaussian noise, the recognition accuracy with the new algorithm is comparable to that under the ideal condition of training and testing at constant SNR. When trained on clean speech and tested with a desktop microphone in a noisy environment, the error rate is only slightly higher than that with a close-talking microphone.

Journal ArticleDOI
TL;DR: This article used context-dependent microsegmental hidden Markov models (HMMs) for six stop phonemes in word lists consisting of CVC words, which reduced the error rate by 35% compared with the result obtained by using one HMM for each stop phoneme.
Abstract: The motivation of this study is the poor performance of speech recognizers on the stop consonants. To overcome this weakness, word initial and word final stop consonants are modeled at a subphonemic (microsegmental) level. Each stop consonant is segmented into a few relatively stationary microsegments: silence, voice bar, burst, and aspiration. Microsegments of certain phonemically different stops are trained together due to their similar spectral properties. Microsegmental models of burst and aspiration are conditioned on the adjacent vowel category: front versus nonfront vowels. The resulting context‐dependent microsegmental hidden Markov models (HMMs) for six stops possess the desired properties for a compromise between modeling accuracy and modeling robustness. They allow the recognizer to focus discrimination onto those regions of a stop that serve to distinguish it from other stops. Use of these models in recognition experiments for word lists consisting of CVC words reduces the error rate by 35% compared with the result obtained by using one HMM for each stop phoneme.

Patent
28 Aug 1990
TL;DR: In this paper, a spelling check apparatus checks a spelling of a word entered from a keyboard, and also displays a word having a spelling similar to the spelling of the input word.
Abstract: A spelling check apparatus checks the spelling of a word entered from a keyboard, and also displays words whose spelling is similar to that of the input word. The input word is modified by way of plural preselected methods, and when a modified word coincides with a word stored in the word dictionary memory employed in this spelling check apparatus, the modified word is regarded as a similar word. A similar word is displayed each time the modification process for the input word by one preselected method has been accomplished.
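The modify-and-look-up scheme can be sketched with four common modification methods (deletion, transposition, substitution, insertion). The specific methods and the tiny dictionary below are assumptions for illustration; the patent does not enumerate its preselected methods.

```python
def similar_words(word, dictionary):
    """Apply preselected modification methods to the input word; any modified
    form found in the dictionary is offered as a similar word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    candidates = []
    # method 1: delete one character        ("worrd" -> "word")
    candidates += [word[:i] + word[i+1:] for i in range(len(word))]
    # method 2: transpose adjacent characters ("wrod" -> "word")
    candidates += [word[:i] + word[i+1] + word[i] + word[i+2:]
                   for i in range(len(word) - 1)]
    # method 3: substitute one character    ("wird" -> "word")
    candidates += [word[:i] + c + word[i+1:]
                   for i in range(len(word)) for c in letters]
    # method 4: insert one character        ("wod" -> "word")
    candidates += [word[:i] + c + word[i:]
                   for i in range(len(word) + 1) for c in letters]
    seen, out = set(), []
    for cand in candidates:
        if cand in dictionary and cand != word and cand not in seen:
            seen.add(cand)
            out.append(cand)
    return out

dictionary = {"word", "ward", "wood", "sword"}
print(similar_words("wrod", dictionary))
```

Candidates are generated method by method, mirroring the patent's notion of displaying similar words as each preselected modification method completes.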

Proceedings ArticleDOI
24 Jun 1990
TL;DR: Recent efforts to further improve the performance of the Sphinx system for speaker-independent continuous speech recognition are reported, with incorporation of additional dynamic features, semi-continuous hidden Markov models, and speaker clustering.
Abstract: The paper reports recent efforts to further improve the performance of the Sphinx system for speaker-independent continuous speech recognition. The recognition error rate is significantly reduced with incorporation of additional dynamic features, semi-continuous hidden Markov models, and speaker clustering. For the June 1990 (RM2) evaluation test set, the error rates of our current system are 4.3% and 19.9% for the word-pair grammar and no grammar, respectively.