Topic

Speech coding

About: Speech coding is a research topic. To date, 14,245 publications have been published on this topic, receiving 271,964 citations.

Papers

Patent•

Method for time aligning audio signals using characterizations based on auditory events

Brett G. Crockett¹, Michael J. Smithers¹•Institutions (1)

Dolby Laboratories¹

25 Feb 2002

TL;DR: A method for time-aligning audio signals, where one signal has been derived from the other or both have been derived from another signal, derives reduced-information characterizations of the signals based on auditory scene analysis and uses them to compute the time offset.

Abstract: A method for time aligning audio signals, wherein one signal has been derived from the other or both have been derived from another signal, comprises deriving reduced-information characterizations of the audio signals based on auditory scene analysis. The time offset of one characterization with respect to the other is calculated, and the temporal relationship of the audio signals is modified in response to that offset so that the audio signals are coincident with each other. These principles may also be applied to a method for time aligning a video signal and an audio signal that will be subjected to differential time offsets.
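The offset calculation described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each characterization is a list with a 1 in every block where an auditory event begins, and that the offset is taken as the lag maximizing a plain cross-correlation. The function name `estimate_offset`, the `max_lag` parameter, and the sample data are hypothetical and are not taken from the patent.

```python
def estimate_offset(char_a, char_b, max_lag):
    """Return the lag (in blocks) at which char_b best matches char_a.

    A positive result means char_b is delayed relative to char_a.
    Assumption: similarity is measured by plain cross-correlation.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0
        for i, a in enumerate(char_a):
            j = i + lag
            if 0 <= j < len(char_b):
                score += a * char_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical characterizations: 1 marks a block where an auditory event starts.
events_a = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
events_b = [0, 0, 0] + events_a[:-3]  # the same pattern, delayed by 3 blocks
offset = estimate_offset(events_a, events_b, max_lag=5)
```

Once the offset is known, one signal would be shifted by the corresponding number of samples to make the two coincident. How the patent itself computes characterizations and offsets may differ from this sketch.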

122 citations

Patent•

Method for coding an audio signal

Jürgen Herre, Uwe Gbur, Andreas Ehret, Martin Dietz, Bodo Teichmann, Oliver Kunz, Karlheinz Brandenburg, Heinz Gerhaeuser

13 Mar 1998

TL;DR: A method for coding or decoding an audio signal that combines the advantages of TNS processing and noise substitution: a time-discrete audio signal is first transformed into the frequency domain to obtain spectral values, which are then predicted over frequency and noise-substituted where they have noise properties.

Abstract: The invention relates to a method for coding or decoding an audio signal that combines the advantages of temporal noise shaping (TNS) processing and noise substitution. A time-discrete audio signal is first transformed into the frequency domain to obtain spectral values of the temporal audio signal. A prediction of the spectral values over frequency is then made to obtain spectral residual values. Areas of the spectrum containing spectral values with noise properties are detected. The spectral residual values in these noise areas are noise-substituted, and data relating to the noise areas and the noise substitution are incorporated into the side information of the coded audio signal.

122 citations

Patent•DOI•

Reassembling speech sentence fragments using associated phonetic property

Peter Buth¹, Simona Grothues¹, Amir Iman¹, Wolfgang Theimer¹•Institutions (1)

Nokia¹

28 Jun 2001 - Journal of the Acoustical Society of America

TL;DR: A series of original sentences for messages is segmented and stored as audio files with search criteria; the audio files of the segments found to best preserve the natural speech rhythm are combined and output for reproduction.

Abstract: A method of composing messages for speech output that improves the reproduction quality of the output. A series of original sentences for messages is segmented and stored as audio files with search criteria. The length, position, and transition values of each segment can be recorded and stored. A sentence to be reproduced is transmitted in a format corresponding to that of the search criteria. It is then determined whether the sentence can be fully reproduced by one segment or by a succession of stored segments. The segments found in each case are examined, using their stored entries, for how well they match one another in speech rhythm. The audio files of the segments that the examination found to best preserve the natural speech rhythm are combined and output for reproduction.

122 citations

Patent•

Transmitting data on the phase of speech

Raymond Steele¹, Wai C. Wong¹, Costas Xydeas¹•Institutions (1)

Bell Labs¹

05 Aug 1982

TL;DR: A means for simultaneous transmission of data and speech with only a minimal expansion of the bandwidth of the speech signal: a Fourier transform is performed on the speech signal and a predetermined number of phase components are replaced with data (d(n)) in an appropriate form.

Abstract: The present invention relates to a means for achieving simultaneous transmission of data and speech with only a minimal expansion of the bandwidth of the speech signal. A Fourier transform (14) is performed on the speech signal and a predetermined number of phase components are replaced with data (d(n)) in an appropriate form. The number of phase components replaced with data is determined by approximately classifying the speech (16) as either "silence", no data inserted; "unvoiced" speech, M phase components convey data; and "voiced" speech, J phase components convey data; where J is less than M, and M is not greater than the number of phase components in the message band of the speech signal. An inverse Fourier transform (22) is subsequently performed on the combined data and speech signal. The combined message signal (G(t)) will comprise approximately the same bandwidth as the original speech signal, by virtue of the frequency domain insertion of the data into the speech. At the receiver the signal is inspected and a classifier (38) determines if data is embedded in the received signal. If data is deemed embedded, a Fourier transformation is performed, the data carrying phase components are inspected, and the data signal regenerated in an appropriate form. The phase components used for the conveyance of data are replaced by random phase components, and the inverse Fourier transformation performed. Median filtering is employed to mitigate the effects of end-of-block distortion and yield the recovered speech signal.
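The core step of this abstract, replacing selected phase components with data while keeping their magnitudes, can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented scheme: each bit is mapped to a phase of +π/2 or -π/2 (the abstract only says "an appropriate form"), a naive DFT is used on a short block, and the silence/unvoiced/voiced classification, random-phase replacement, and median filtering are all omitted. The names `embed_bits`, `extract_bits`, and `first_bin` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a short block."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def embed_bits(block, bits, first_bin=1):
    """Replace the phase of consecutive bins with data, keeping magnitudes.

    Assumption: bit 1 -> phase +pi/2, bit 0 -> phase -pi/2.
    The bins used must lie strictly between 0 and len(block) / 2.
    """
    spec = dft(block)
    n = len(spec)
    for i, bit in enumerate(bits):
        k = first_bin + i
        phase = math.pi / 2 if bit else -math.pi / 2
        spec[k] = cmath.rect(abs(spec[k]), phase)
        spec[n - k] = spec[k].conjugate()  # mirror bin keeps the time signal real
    return [v.real for v in idft(spec)]


def extract_bits(block, count, first_bin=1):
    """Read the bits back from the sign of the phase in the data bins."""
    spec = dft(block)
    return [1 if cmath.phase(spec[first_bin + i]) > 0 else 0 for i in range(count)]


# Hypothetical test block: four cosines so that bins 1..4 carry energy.
N = 16
speech_like = [sum(math.cos(2 * math.pi * k * t / N + 0.3 * k) for k in range(1, 5))
               for t in range(N)]
payload = [1, 0, 1, 1]
marked = embed_bits(speech_like, payload)
recovered = extract_bits(marked, len(payload))
```

Because only phases change, the magnitude spectrum of `marked` matches that of `speech_like`, which is consistent with the abstract's statement that the combined signal occupies approximately the same bandwidth as the original speech.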

121 citations

Proceedings Article•DOI•

Proposal and evaluation of models for the glottal source waveform

Hidehiko Fujisaki¹, M. Ljungqvist¹•Institutions (1)

University of Tokyo¹

07 Apr 1986

TL;DR: The paper describes a method for simultaneously estimating the glottal source and vocal tract parameters, and its results indicate the importance of detailed modeling of the period of glottal closure for accurate analysis.

Abstract: Speech analysis for high quality speech synthesis or high accuracy speech recognition requires realistic models not only for the vocal tract but also for the voice source. In the present paper, we investigate models for the glottal volume velocity waveform. Previously proposed models are reviewed and classified according to their level of elaboration in expressing the glottal characteristics. A new model is then proposed which possesses all the important features of previously proposed models. A method is also described for simultaneously estimating the glottal source and vocal tract parameters. Using this method, evaluation of glottal model parameters is carried out on real speech by varying the number of parameters in the proposed model. The results indicate the importance of detailed modeling of the period of glottal closure for accurate analysis.

121 citations

Network Information

Performance

Metrics

Papers: 14,368

Citations: 279,843

No. of papers in the topic in previous years
Year	Papers
2023	38
2022	84
2021	70
2020	62
2019	77
2018	108
