Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have appeared within this topic, receiving 271,964 citations.


Papers
Proceedings Article
05 Jun 2000
TL;DR: This work presents the results of combining the line spectral frequencies (LSFs) and zero crossing-based features for frame-level narrowband speech/music discrimination and shows the good discriminating power of these features.
Abstract: Automatic discrimination of speech and music is an important tool in many multimedia applications. Previous work has focused on using long-term features such as differential parameters, variances and time-averages of spectral parameters. These classifiers use features estimated over windows of 0.5-5 seconds, and are relatively complex. We present our results of combining the line spectral frequencies (LSFs) and zero crossing-based features for frame-level narrowband speech/music discrimination. Our classification results for different types of music and speech show the good discriminating power of these features. Our classification algorithms operate using only a frame delay of 20 ms, making them suitable for real-time multimedia applications.

229 citations
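
As a rough illustration of the frame-level processing described in the abstract above, the following sketch computes a zero-crossing rate for each 20 ms frame. It is a minimal example under stated assumptions, not the authors' method: the 8 kHz sample rate (a common narrowband choice), the function names, and the test tones are all hypothetical, and the LSF features and the classifier itself are not shown.

```python
import math


def frame_zero_crossing_rates(samples, sample_rate=8000, frame_ms=20):
    """Return the fraction of adjacent-sample sign changes in each full frame.

    sample_rate=8000 is an assumed narrowband rate; frame_ms=20 follows the
    20 ms frame mentioned in the abstract.
    """
    frame_len = sample_rate * frame_ms // 1000  # 160 samples at 8 kHz
    rates = []
    # Step through non-overlapping frames, dropping any trailing partial frame.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        rates.append(crossings / (frame_len - 1))
    return rates


def tone(freq_hz, n, sample_rate=8000):
    """Hypothetical test signal: a sine tone with a small phase offset."""
    return [
        math.sin(2 * math.pi * freq_hz * i / sample_rate + 0.1)
        for i in range(n)
    ]


# Two frames (40 ms) of a low tone and of a high tone.
low = frame_zero_crossing_rates(tone(100, 320))
high = frame_zero_crossing_rates(tone(1000, 320))
print(low, high)
```

A higher-frequency signal changes sign more often, so its per-frame rate is larger. A real discriminator would feed such per-frame values, together with other features, into a classifier.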

Patent
Steve Tischer
10 Dec 2001
TL;DR: A method and system of customizing voice translation of a text to speech includes digitally recording speech samples of a known speaker, correlating each of the speech samples with a standardized audio representation, and organizing the recorded speech samples and correlated audio representations into a collection.
Abstract: A method and system of customizing voice translation of a text to speech includes digitally recording speech samples of a known speaker, correlating each of the speech samples with a standardized audio representation, and organizing the recorded speech samples and correlated audio representations into a collection. The collection of speech samples correlated with audio representations is saved as a single voice file and stored in a device capable of translating the text to speech. The voice file is applied to a translation of text to speech so that the translated speech is customized according to the applied voice file.

229 citations

Patent
23 Sep 1996
TL;DR: A system and method for dynamically adapting the user bit rate of a time division multiple access (TDMA) cellular telecommunication system to achieve optimum voice quality over a broad range of radio channel conditions are disclosed.
Abstract: A system and method for dynamically adapting the user bit rate of a time division multiple access (TDMA) cellular telecommunication system to achieve optimum voice quality over a broad range of radio channel conditions are disclosed. The system continuously monitors radio channel quality on both the uplink and the downlink, and dynamically adapts the system's combination of speech coding (21), channel coding (22), modulation (23), and number of assignable time slots per call (27) to optimize voice quality for the measured conditions. Various combinations of the system's speech coding, channel coding, modulation, and assignable time slots are identified as combination types (1-5) and corresponding cost functions are defined. By identifying and selecting the cost function with the lowest cost for the measured radio channel conditions, the system provides the maximum voice quality achievable within the limits of the system design.

229 citations
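
The selection step described in the abstract above (evaluate a cost function per combination type and pick the lowest) can be sketched as follows. This is only an illustrative sketch: the channel-quality metric (a carrier-to-interference ratio in dB), the thresholds, the base costs, and the shape of the cost functions are all assumptions made for the example and do not come from the patent.

```python
from typing import Callable, Dict


def make_cost(threshold_db: float, base_cost: float) -> Callable[[float], float]:
    """Build a hypothetical cost function for one combination type.

    The cost is a fixed base cost plus a quadratic penalty for operating
    below an assumed channel-quality threshold. Lower cost is better.
    """
    def cost(c_over_i_db: float) -> float:
        shortfall = max(0.0, threshold_db - c_over_i_db)
        return base_cost + shortfall ** 2
    return cost


# Hypothetical cost functions for combination types 1-5. Type 1 stands for
# a combination that needs a good channel; type 5 for the most robust one.
COST_FUNCTIONS: Dict[int, Callable[[float], float]] = {
    1: make_cost(threshold_db=25.0, base_cost=1.0),
    2: make_cost(threshold_db=20.0, base_cost=2.0),
    3: make_cost(threshold_db=15.0, base_cost=3.0),
    4: make_cost(threshold_db=10.0, base_cost=4.0),
    5: make_cost(threshold_db=5.0, base_cost=5.0),
}


def select_combination(c_over_i_db: float) -> int:
    """Return the combination type whose cost is lowest for the measurement."""
    return min(COST_FUNCTIONS, key=lambda t: COST_FUNCTIONS[t](c_over_i_db))


for measured in (30.0, 12.0, 3.0):
    print(measured, select_combination(measured))
```

With these made-up numbers, a good channel selects type 1 and a poor channel selects the more robust types. A real system would derive its cost functions from measured voice quality on both the uplink and the downlink.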

Book
01 Jan 2001
TL;DR: This book introduces real-time digital signal processing and its implementation on the TMS320C55x digital signal processor, covering FIR and IIR filter design, frequency analysis, adaptive filtering, speech and audio processing, channel coding, and an introduction to digital image processing.
Abstract: Preface. Chapter 1. Introduction to Real-Time Digital Signal Processing. Chapter 2. Introduction to TMS320C55x Digital Signal Processor. Chapter 3. DSP Fundamentals and Implementation Considerations. Chapter 4. Design and Implementation of FIR Filters. Chapter 5. Design and Implementation of IIR Filters. Chapter 6. Frequency Analysis and Fast Fourier Transform. Chapter 7. Adaptive Filtering. Chapter 8. Digital Signal Generators. Chapter 9. Dual-Tone Multi-Frequency Detection. Chapter 10. Adaptive Echo Cancellation. Chapter 11. Speech Coding Techniques. Chapter 12. Speech Enhancement Techniques. Chapter 13. Audio Signal Processing. Chapter 14. Channel Coding Techniques. Chapter 15. Introduction to Digital Image Processing. Appendix A: Some Useful Formulas and Definitions. A.1 Trigonometric Identities. A.2 Geometric Series. A.3 Complex Variables. A.4 Units of Power. References. Appendix B: Software Organization and List of Experiments. Index.

228 citations

Proceedings Article
01 Feb 1997
TL;DR: The theoretic framework and applications of automatic audio content analysis, including analysis of amplitude, frequency and pitch, and simulations of human audio perception, are described.
Abstract: This paper describes the theoretic framework and applications of automatic audio content analysis. Research in multimedia content analysis has so far concentrated on the video domain. We demonstrate the strength of automatic audio content analysis. We explain the algorithms we use, including analysis of amplitude, frequency and pitch, and simulations of human audio perception. These algorithms serve us as tools for further audio content analysis. We use these tools in applications like the segmentation of audio data streams into logical units for further processing, the analysis of music, as well as the recognition of sounds indicative of violence like shots, explosions and cries.

227 citations


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
86% related
Decoding methods
65.7K papers, 900K citations
84% related
Fading
55.4K papers, 1M citations
80% related
Feature vector
48.8K papers, 954.4K citations
80% related
Feature extraction
111.8K papers, 2.1M citations
80% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  38
2022  84
2021  70
2020  62
2019  77
2018  108