Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.


Papers
Proceedings Article
23 May 1989
TL;DR: In this article, a description of the voice activity detector (VAD) standardized by CEPT for use in the Pan-European digital cellular mobile telephone service is given, and performance tests carried out to validate the design are described.
Abstract: A description is given of the voice activity detector (VAD) standardized by CEPT for use in the Pan-European digital cellular mobile telephone service. The speech-coding algorithm chosen is a 13-kb/s speech coder, using a technique in which speech is produced at the decoder by passing a substitute for the residual through long-term and short-term predictor filters. The difficulties of detecting speech in a noisy environment are discussed, and the performance tests carried out to validate the design are described. The tests show that clipping levels are very low but that low levels of speech activity are recorded in conversations. The VAD has low complexity (because it uses the results of analysis performed in the speech coder) and is failsafe in difficult conditions.

179 citations

Patent
09 Aug 2000
TL;DR: In this article, the authors present methods and systems for testing speech recognition systems using a text-to-speech (T2S) device, in which the speech recognition device to be tested is directly monitored in accordance with a T2S device.
Abstract: Methods and systems for testing speech recognition systems are disclosed in which the speech recognition device to be tested is directly monitored in accordance with a text-to-speech device. The collection of reference texts to be used by the speech recognition device is provided by a text-to-speech device, preferably, in one embodiment, implemented within the same computer system. In such an embodiment, a digital audio file stored within a storage area of a computer system is generated from a reference text using a text-to-speech device. The digital audio file is later read using a speech recognition device to generate a decoded (or recognized) text representative of the reference text. The reference text and the decoded text are compared in an alignment operation, and an error report representative of the recognition rate of the speech recognition device is finally generated.

178 citations
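
The comparison step described in the abstract above (aligning a reference text with a decoded text and reporting a recognition rate) can be illustrated with a small sketch. It assumes a word-level Levenshtein alignment, which is a common choice but is not stated in the patent abstract; the function names `align_words` and `error_report` are hypothetical.

```python
def align_words(reference, decoded):
    """Word-level Levenshtein alignment (an assumed alignment method).

    Returns counts of hits, substitutions, deletions and insertions.
    """
    ref, hyp = reference.split(), decoded.split()
    rows, cols = len(ref) + 1, len(hyp) + 1
    # dist[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # hit or substitution
                             dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1)         # insertion
    # Walk back through the table to classify each aligned position.
    counts = {"hits": 0, "substitutions": 0, "deletions": 0, "insertions": 0}
    i, j = len(ref), len(hyp)
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and ref[i - 1] == hyp[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (0 if same else 1):
            counts["hits" if same else "substitutions"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletions"] += 1
            i -= 1
        else:
            counts["insertions"] += 1
            j -= 1
    return counts


def error_report(reference, decoded):
    """Summarise the alignment as a simple error report (hypothetical format)."""
    counts = align_words(reference, decoded)
    n_ref = counts["hits"] + counts["substitutions"] + counts["deletions"]
    errors = counts["substitutions"] + counts["deletions"] + counts["insertions"]
    return {
        **counts,
        "word_error_rate": errors / n_ref if n_ref else 0.0,
        "recognition_rate": counts["hits"] / n_ref if n_ref else 0.0,
    }


report = error_report("the cat sat on the mat", "the cat sit on mat")
print(report)
```

In this sketch the reference text plays the role of the text fed to the text-to-speech device, and the decoded text is whatever the recognizer under test returns for the generated audio.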

Patent
Yasunaga Miyazawa, Mitsuhiro Inazumi, Hiroshi Hasegawa, Isao Edatsune, Osamu Urano
TL;DR: In this article, a speaker specific and non-speaker specific method and apparatus is provided for enabling speech-based remote control and for recognizing the speech of an unspecified speaker at extremely high recognition rates regardless of the speaker's age, sex, or individual speech mannerisms.
Abstract: Bifurcated speaker specific and non-speaker specific method and apparatus is provided for enabling speech-based remote control and for recognizing the speech of an unspecified speaker at extremely high recognition rates regardless of the speaker's age, sex, or individual speech mannerisms. A device main unit is provided with a speech recognition processor for recognizing speech and taking an appropriate action, and with a user terminal containing specific speaker capture and/or preprocessing capabilities. The user terminal exchanges data with the speech recognition processor using radio transmission. The user terminal may be provided with a conversion rule generator that compares the speech of a user with previously compiled standard speech feature data and, based on this comparison result, generates a conversion rule for converting the speaker's speech feature parameters to corresponding standard speaker's feature information. The speech recognition processor, in turn, may reference the conversion rule developed in the user terminal and perform speech recognition based on the input speech feature parameters that have been converted above.

178 citations

Patent
01 Oct 2007
TL;DR: In this paper, a post-recognition processor coupled to an interface compares recognized speech data generated by the speech recognition engine to contextual information retained in a memory, generates modified recognized speech data, and transmits it to a parsing component.
Abstract: A system that improves speech recognition includes an interface linked to a speech recognition engine. A post-recognition processor coupled to the interface compares recognized speech data generated by the speech recognition engine to contextual information retained in a memory, generates modified recognized speech data, and transmits the modified recognized speech data to a parsing component.

177 citations

Patent
TL;DR: A speech coding apparatus compares the closeness of the feature value of a feature vector signal of an utterance to the parameter values of prototype vector signals to obtain prototype match scores for the feature vector signal and each prototype vector signal.
Abstract: A speech coding apparatus compares the closeness of the feature value of a feature vector signal of an utterance to the parameter values of prototype vector signals to obtain prototype match scores for the feature vector signal and each prototype vector signal. The speech coding apparatus stores a plurality of speech transition models representing speech transitions. At least one speech transition is represented by a plurality of different models. Each speech transition model has a plurality of model outputs, each comprising a prototype match score for a prototype vector signal. Each model output has an output probability. A model match score for a first feature vector signal and each speech transition model comprises the output probability for at least one prototype match score for the first feature vector signal and a prototype vector signal. A speech transition match score for the first feature vector signal and each speech transition comprises the best model match score for the first feature vector signal and all speech transition models representing the speech transition. The identification value of each speech transition and the speech transition match score for the first feature vector signal and each speech transition are output as a coded utterance representation signal of the first feature vector signal.

176 citations
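
The scoring chain in the abstract above (prototype match, then model match score, then the best model per speech transition) can be sketched roughly as follows. This is a simplification under stated assumptions, not the patented method: the prototype match is reduced to picking the nearest prototype by Euclidean distance, each model is a plain mapping from prototype id to output probability, and all names (`closest_prototype`, `transition_match_scores`, the sample data) are hypothetical.

```python
import math


def closest_prototype(feature, prototypes):
    """Return the id of the prototype vector nearest to the feature vector.

    Assumption: closeness is Euclidean distance; the abstract does not say.
    """
    return min(prototypes, key=lambda pid: math.dist(feature, prototypes[pid]))


def transition_match_scores(feature, prototypes, models):
    """Score every speech transition for one feature vector.

    `models` maps a transition id to a list of models; each model maps a
    prototype id to an output probability. The model match score is taken
    to be the output probability of the nearest prototype, and the
    transition match score is the best score among that transition's models.
    """
    best = closest_prototype(feature, prototypes)
    return {
        transition: max(model.get(best, 0.0) for model in model_list)
        for transition, model_list in models.items()
    }


# Hypothetical data: two prototypes, transition "t1" represented by two models.
prototypes = {"p1": (0.0, 0.0), "p2": (1.0, 1.0)}
models = {
    "t1": [{"p1": 0.2, "p2": 0.8}, {"p1": 0.6, "p2": 0.4}],
    "t2": [{"p1": 0.1, "p2": 0.9}],
}
scores = transition_match_scores((0.1, 0.2), prototypes, models)
print(scores)
```

The resulting mapping of transition ids to scores corresponds loosely to the coded utterance representation for one feature vector that the abstract describes.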


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
86% related
Decoding methods
65.7K papers, 900K citations
84% related
Fading
55.4K papers, 1M citations
80% related
Feature vector
48.8K papers, 954.4K citations
80% related
Feature extraction
111.8K papers, 2.1M citations
80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    38
2022    84
2021    70
2020    62
2019    77
2018    108