The adaptive multirate wideband speech codec (AMR-WB)

01 Jan 2002-Vol. 10, Iss: 8, pp 620-636

TL;DR: The adaptive multirate wideband (AMR-WB) speech codec selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services is described.

Abstract: This paper describes the adaptive multirate wideband (AMR-WB) speech codec selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services. The AMR-WB speech codec algorithm was selected in December 2000 and the corresponding specifications were approved in March 2001. The AMR-WB codec was also selected by the International Telecommunication Union-Telecommunication Sector (ITU-T) in July 2001 in the standardization activity for wideband speech coding around 16 kb/s and was approved in January 2002 as Recommendation G.722.2. The adoption of AMR-WB by ITU-T is of significant importance since for the first time the same codec is adopted for wireless as well as wireline services. AMR-WB uses an extended audio bandwidth from 50 Hz to 7 kHz and gives superior speech quality and voice naturalness compared to existing second- and third-generation mobile communication systems. The wideband speech service provided by the AMR-WB codec will give mobile communication speech quality that also substantially exceeds (narrowband) wireline quality. The paper details AMR-WB standardization history, algorithmic description including novel techniques for efficient ACELP wideband speech coding and subjective quality performance of the codec.

Citations

Monograph•DOI•

Near-Capacity Multi-Functional MIMO Systems

Lajos Hanzo, Osamah Alamri, Mohammed El-Hajjar, Nan Wu

22 May 2009

206 citations

Book•

Near-Capacity Multi-Functional MIMO Systems: Sphere-Packing, Iterative Detection and Cooperation

Lajos Hanzo, Osamah Alamri, Mohammed El-Hajjar, Nan Wu

22 Jun 2009

TL;DR: The book reviews the wireless channel and the concept of diversity, coherent versus differential turbo detection of sphere-packing-aided single-user MIMO systems, and a universal approach to differential space-time block codes.

Abstract: About the Authors. Other Wiley IEEE Press Books on Related Topics. Preface. Acknowledgments.
1 Problem Formulation, Objectives and Benefits. 1.1 The Wireless Channel and the Concept of Diversity. 1.2 Diversity and Multiplexing Trade-offs in Multi-functional MIMO Systems. 1.3 Coherent versus Non-coherent Detection for STBCs Using Co-located and Cooperative Antenna Elements. 1.4 Historical Perspective and State-of-the-Art Contributions. 1.5 Iterative Detection Schemes and their Convergence Analysis. 1.6 Outline and Novel Aspects of the Monograph.
Part I Coherent Versus Differential Turbo Detection of Sphere-packing-aided Single-user MIMO Systems. List of Symbols in Part I.
2 Space-Time Block Code Design using Sphere Packing. 2.1 Introduction. 2.2 Design Criteria for Space-Time Signals. 2.3 Design Criteria for Time-correlated Fading Channels. 2.4 Orthogonal Space-Time Code Design using SP. 2.5 STBC-SP Performance. 2.6 Chapter Conclusions. 2.7 Chapter Summary.
3 Turbo Detection of Channel-coded STBC-SP Schemes. 3.1 Introduction. 3.2 System Overview. 3.3 Iterative Demapping. 3.4 Binary EXIT Chart Analysis. 3.5 Performance of Turbo-detected Bit-based STBC-SP Schemes. 3.6 Chapter Conclusions. 3.7 Chapter Summary.
4 Turbo Detection of Channel-coded DSTBC-SP Schemes. 4.1 Introduction. 4.2 Differential STBC using SP Modulation. 4.3 Bit-based RSC-coded Turbo-detected DSTBC-SP Scheme. 4.4 Chapter Conclusions. 4.5 Chapter Summary.
5 Three-stage Turbo-detected STBC-SP Schemes. 5.1 Introduction. 5.2 System Overview. 5.3 EXIT Chart Analysis. 5.4 Maximum Achievable Bandwidth Efficiency. 5.5 Performance of Three-stage Turbo-detected STBC-SP Schemes. 5.6 Chapter Conclusions. 5.7 Chapter Summary.
6 Symbol-based Channel-coded STBC-SP Schemes. 6.1 Introduction. 6.2 System Overview. 6.3 Symbol-based Iterative Decoding. 6.4 Non-binary EXIT Chart Analysis. 6.5 Performance of Bit-based and Symbol-based LDPC-coded STBC-SP Schemes. 6.6 Chapter Conclusions. 6.7 Chapter Summary.
Part II Coherent Versus Differential Turbo Detection of Single-user and Cooperative MIMOs. List of Symbols in Part II.
7 Linear Dispersion Codes: An EXIT Chart Perspective. 7.1 Introduction and Outline. 7.2 Linear Dispersion Codes. 7.3 Link Between STBCs and LDCs. 7.4 EXIT-chart-based Design of LDCs. 7.5 EXIT-chart-based Design of IR-PLDCs. 7.6 Conclusion.
8 Differential Space-Time Block Codes: A Universal Approach. 8.1 Introduction and Outline. 8.2 System Model. 8.3 DOSTBCs. 8.4 DLDCs. 8.5 RSC-coded Precoder-aided DOSTBCs. 8.6 IRCC-coded Precoder-aided DLDCs. 8.7 Conclusion.
9 Cooperative Space-Time Block Codes. 9.1 Introduction and Outline. 9.2 Twin-layer CLDCs. 9.3 IRCC-coded Precoder-aided CLDCs. 9.4 Conclusion.
Part III Differential Turbo Detection of Multi-functional MIMO-aided Multi-user and Cooperative Systems. List of Symbols in Part III.
10 Differential Space-Time Spreading. 10.1 Introduction. 10.2 DPSK. 10.3 DSTS Design Using Two Transmit Antennas. 10.4 DSTS Design Using Four Transmit Antennas. 10.5 Chapter Conclusions. 10.6 Chapter Summary.
11 Iterative Detection of Channel-coded DSTS Schemes. 11.1 Introduction. 11.2 Iterative Detection of RSC-coded DSTS Schemes. 11.3 Iterative Detection of RSC-coded and Unity-rate Precoded Four-antenna-aided DSTS-SP System. 11.4 Chapter Conclusions. 11.5 Chapter Summary.
12 Adaptive DSTS-assisted Iteratively Detected SP Modulation. 12.1 Introduction. 12.2 System Overview. 12.3 Adaptive DSTS-assisted SP Modulation. 12.4 VSF-based Adaptive Rate DSTS. 12.5 Variable-code-rate Iteratively Detected DSTS-SP System. 12.6 Results and Discussion. 12.7 Chapter Conclusion and Summary.
13 Layered Steered Space-Time Codes. 13.1 Introduction. 13.2 LSSTCs. 13.3 Capacity of LSSTCs. 13.4 Iterative Detection and EXIT Chart Analysis. 13.5 Results and Discussion. 13.6 Chapter Conclusions. 13.7 Chapter Summary.
14 DL LSSTS-aided Generalized MC DS-CDMA. 14.1 Introduction. 14.2 LSSTS-aided Generalized MC DS-CDMA. 14.3 Increasing the Number of Users by Employing TD and FD Spreading. 14.4 Iterative Detection and EXIT Chart Analysis. 14.5 Results and Discussion. 14.6 Chapter Conclusions. 14.7 Chapter Summary.
15 Distributed Turbo Coding. 15.1 Introduction. 15.2 Background of Cooperative Communications. 15.3 DTC. 15.4 Results and Discussion. 15.5 Chapter Conclusions. 15.6 Chapter Summary.
16 Conclusions and Future Research. 16.1 Summary and Conclusions. 16.2 Future Research Ideas. 16.3 Closing Remarks.
A Gray Mapping and AGM Schemes for SP Modulation of Size L = 16. B EXIT Charts of Various Bit-based Turbo-detected STBC-SP Schemes. C EXIT Charts of Various Bit-based Turbo-detected DSTBC-SP Schemes. D LDCs' / for QPSK Modulation. E DLDCs' / for 2PAM Modulation. F CLDCs' / 1 and / 2 for BPSK Modulation. G Weighting Coefficient Vectors e and a. H Gray Mapping and AGM Schemes for SP Modulation of Size L = 16.
Glossary. Bibliography. Index. Author Index.

204 citations

Journal Article•DOI•

ASVspoof: The Automatic Speaker Verification Spoofing and Countermeasures Challenge

Zhizheng Wu¹, Junichi Yamagishi¹, Tomi Kinnunen², Cemal Hanilci², Mohammed Sahidullah², Aleksandr Sizov², Nicholas Evans³, Massimiliano Todisco³

University of Edinburgh¹, University of Eastern Finland², Institut Eurécom³

17 Feb 2017-IEEE Journal of Selected Topics in Signal Processing

TL;DR: A review of postevaluation studies conducted using the same dataset illustrates the rapid progress stemming from ASVspoof and outlines the need for further investigation.

Abstract: Concerns regarding the vulnerability of automatic speaker verification (ASV) technology against spoofing can undermine confidence in its reliability and form a barrier to exploitation. The absence of competitive evaluations and the lack of common datasets has hampered progress in developing effective spoofing countermeasures. This paper describes the ASV Spoofing and Countermeasures (ASVspoof) initiative, which aims to fill this void. Through the provision of a common dataset, protocols, and metrics, ASVspoof promotes a sound research methodology and fosters technological progress. This paper also describes the ASVspoof 2015 dataset, evaluation, and results with detailed analyses. A review of postevaluation studies conducted using the same dataset illustrates the rapid progress stemming from ASVspoof and outlines the need for further investigation. Priority future research directions are presented in the scope of the next ASVspoof evaluation planned for 2017.

177 citations

Cites methods from "The adaptive multirate wideband spe..."

...Ten different parameters, including the mean LP energy and LTP error [62], were used to construct feature parametrisations, which were scored using a logistic classifier....

Proceedings Article•DOI•

Unified speech and audio coding scheme for high quality at low bitrates

Max Neuendorf, Philippe Gournay¹, Markus Multrus, Jeremie Lecomte, B. Bessette¹, R. Geiger, Stefan Bayer, Guillaume Fuchs, Johannes Hilpert, Nikolaus Rettelbach, Redwan Salami, Gerald Schuller, Roch Lefebvre¹, Bernhard Grill

Université de Sherbrooke¹

19 Apr 2009

TL;DR: This new codec forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding, which results in a codec that exhibits consistently high quality for speech, music and mixed audio content.

Abstract: Traditionally, speech coding and audio coding were separate worlds. Based on different technical approaches and different assumptions about the source signal, neither of the two coding schemes could efficiently represent both speech and music at low bitrates. This paper presents a unified speech and audio codec, which efficiently combines techniques from both worlds. This results in a codec that exhibits consistently high quality for speech, music and mixed audio content. The paper gives an overview of the codec architecture and presents results of formal listening tests comparing this new codec with HE-AAC(v2) and AMR-WB+. This new codec forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.

108 citations

Proceedings Article•DOI•

A harmonic bandwidth extension method for audio codecs

Frederik Nagel¹, Sascha Disch²

Fraunhofer Society¹, Leibniz University of Hanover²

19 Apr 2009

TL;DR: This paper exposes the origin of the roughness and proposes a bandwidth extension method, which does not introduce roughness into the reconstructed audio signal, and demonstrates the advantage of the proposed method compared to a standard bandwidth extension.

Abstract: Today's efficient audio codecs for low bitrate application scenarios often rely on parametric coding of the upper frequency band portion of a signal while the lower frequency band portion of the same is conveyed by a waveform preserving coding method. At the decoder, the upper frequency signal is approximated from the lower frequency data using the upper frequency band parameters. However, commonly used methods of bandwidth extension almost inevitably suffer from a sensation of unpleasant roughness, which is especially present for tonal music items. In this paper we expose the origin of the roughness and propose a bandwidth extension method, which does not introduce roughness into the reconstructed audio signal. A listening test demonstrates the advantage of the proposed method compared to a standard bandwidth extension.

106 citations

Cites background from "The adaptive multirate wideband spe..."

...Alternatively, higher frequencies can be generated by either non-linear processing [5] or upsampling [6]....

References

Journal Article•DOI•

Design and description of CS-ACELP: a toll quality 8 kb/s speech coder

R. Salami¹, Claude Laflamme, J.-P. Adoul, A. Kataoka, S. Hayashi, Takehiro Moriya, C. Lamblin, D. Massaloux, S. Proust, P. Kroon, Y. Shoham

Université de Sherbrooke¹

01 Mar 1998-IEEE Transactions on Speech and Audio Processing

TL;DR: The coder structure is described in detail, the reasons behind certain design choices are discussed, and a summary of subjective test results based on a real-time implementation of the 16-b fixed-point version is presented.

Abstract: This paper describes the 8 kb/s speech coding algorithm G.729 which has been standardized by ITU-T. The algorithm is based on a conjugate-structure algebraic CELP (CS-ACELP) coding technique and uses 10 ms speech frames. The codec delivers toll-quality speech (equivalent to 32 kb/s ADPCM) for most operating conditions. This paper describes the coder structure in detail and discusses the reasons behind certain design choices. A 16-b fixed-point version has been developed as part of Recommendation G.729 and a summary of the subjective test results based on a real-time implementation of this version is presented.
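
The frame arithmetic implied by this abstract can be checked with a short sketch. It only uses the figures stated above (8 kb/s, 10 ms frames, 32 kb/s ADPCM as the quality reference); the function name `bits_per_frame` is a hypothetical helper introduced for illustration and is not part of any codec API.

```python
def bits_per_frame(bit_rate_bps: int, frame_ms: int) -> int:
    """Number of bits available to encode one frame at a constant bit rate."""
    # Integer arithmetic avoids floating-point rounding:
    # bits = (bits per second) * (milliseconds) / 1000.
    return bit_rate_bps * frame_ms // 1000


# G.729 as described in the abstract: 8 kb/s with 10 ms speech frames.
g729_bits = bits_per_frame(8_000, 10)
# 32 kb/s ADPCM measured over the same 10 ms span, for comparison.
adpcm_bits = bits_per_frame(32_000, 10)

print(g729_bits, adpcm_bits, adpcm_bits // g729_bits)
```

Under these figures the coder has 80 bits per frame, a quarter of what 32 kb/s ADPCM spends over the same 10 ms.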

236 citations

"The adaptive multirate wideband spe..." refers methods in this paper

...The prediction and codebook search are similar to those used in the GSM EFR codec [3] or G.729 [20]....

Journal Article•DOI•

A toll quality 8 kb/s speech codec for the personal communications system (PCS)

R. Salami¹, Claude Laflamme¹, J.-P. Adoul¹, D. Massaloux

Université de Sherbrooke¹

01 Aug 1994-IEEE Transactions on Vehicular Technology

TL;DR: A toll-quality speech codec at 8 kb/s suitable for the future personal communications system is presented; it can support a frame erasure rate of up to 3%, although the resulting degradation is still worse than the ITU-T requirements.

Abstract: A toll quality speech codec at 8 kb/s suitable for the future personal communications system is presented. The codec is currently under standardization by the ITU-T (successor of CCITT) where the codec terms of reference were mainly determined considering PCS application. The encoding algorithm is based on algebraic code-excited linear prediction (ACELP) and has a speech frame of 10 ms. Efficient pitch and codebook search strategies, along with efficient quantization procedures, have been developed to achieve toll quality encoded speech with a complexity implementable on current fixed-point DSP chips. Formal subjective listening tests, performed by ITU-T SG 12, showed that the codec quality is equivalent to that of G.726 ADPCM at 32 kb/s in error-free conditions and it outperforms G.726 under error conditions. The codec performs adequately under tandeming conditions, and can support a frame erasure rate up to 3% with a degradation in its performance that is still worse than the ITU-T requirements, and this is one subject of study for the next phase. The algorithm has been implemented on a single fixed-point DSP for the ITU-T subjective test, and required about 29 MIPS. An optimized version, however, requires 24 MIPS without any speech quality degradation.

110 citations

"The adaptive multirate wideband spe..." refers background in this paper

...The results are summarized in the 3GPP Technical Report in [13]....

Proceedings Article•DOI•

Immittance spectral pairs (ISP) for speech encoding

Yuval Bistritz¹, S. Peller¹

Tel Aviv University¹

27 Apr 1993

TL;DR: In quantization experiments ISP has been found to compare favorably with LSP, and a study of interframe differentiation coding for ISP and LSP demonstrates the respective performances of the two sets.

Abstract: Immittance spectral pairs (ISPs) form a new set of parameters for representing the linear predictive coding (LPC) filter. For a filter of order n, ISP consists of a gain and n-1 frequency parameters, instead of n frequency parameters as is the case for line spectrum pairs (LSPs). In regarding LPC as a pseudo-model for the vocal tract, ISP can represent the immittance at the glottis without imposing, like LSP, artificial boundary conditions. In quantization experiments ISP has been found to compare favorably with LSP. A study of interframe differentiation coding for ISP and LSP demonstrates the respective performances of the two sets.

69 citations

"The adaptive multirate wideband spe..." refers methods in this paper

...Four interpolated ISP vectors (corresponding to 4 subframes) are computed and then converted to the LP filter coefficient domain, which is used for synthesizing speech in the subframe....
...The ISP distance measure between the ISP’s in the present frame and the past frame is given by the relation where is the order of the LP filter....
...[18] Y. Bistritz and S. Peller, “Immittance Spectral Pairs (ISP) for speech encoding,” in IEEE Int....
...The received indices of ISP quantization are used to reconstruct the quantized ISP vector....
...The set of LP parameters is converted to immittance spectrum pairs (ISP) [18] and vector quantized using split-multistage vector quantization with 46 bits....
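
The per-subframe interpolation mentioned in the excerpts above can be sketched as a weighted blend of the previous and current frames' quantized ISP vectors. This is only an illustration under stated assumptions: the excerpts do not give the interpolation weights, so `PREV_WEIGHTS` below holds hypothetical placeholder values, and `interpolate_isp` is a made-up helper name rather than a function from any reference implementation. The conversion from ISPs to LP filter coefficients is not shown.

```python
from typing import List, Sequence

# Hypothetical weights applied to the PREVIOUS frame's ISP vector, one per
# subframe. The codec's actual weights are not stated in the excerpts.
PREV_WEIGHTS = (0.75, 0.5, 0.25, 0.0)


def interpolate_isp(
    prev_isp: Sequence[float],
    cur_isp: Sequence[float],
    prev_weights: Sequence[float] = PREV_WEIGHTS,
) -> List[List[float]]:
    """Return one interpolated ISP vector per subframe."""
    if len(prev_isp) != len(cur_isp):
        raise ValueError("ISP vectors must have the same length")
    # Each subframe blends the two frames; the weights sum to 1 per subframe.
    return [
        [w * p + (1.0 - w) * c for p, c in zip(prev_isp, cur_isp)]
        for w in prev_weights
    ]


# Toy two-element vectors keep the arithmetic easy to follow.
subframes = interpolate_isp([0.0, 0.0], [1.0, 2.0])
for index, vector in enumerate(subframes, start=1):
    print(index, vector)
```

With these placeholder weights the last subframe uses the current frame's vector unchanged, while earlier subframes lean progressively more on the previous frame.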

Proceedings Article•DOI•

Concepts and solutions for link adaptation and inband signaling for the GSM AMR speech coding standard

S. Bruhn¹, Peter Blöcher², Karl Hellwig², J. Sjoberg²

Ericsson Radio Systems¹, Ericsson²

16 May 1999

TL;DR: Various approaches for link adaptation with respect to varying radio channel conditions are described and the method of inband signaling that is standardized is discussed and motivated.

Abstract: The European Telecommunications Standards Institute (ETSI) has just defined an adaptive multi rate (AMR) speech codec standard for the GSM system with a multitude of source and channel coding rates. The standard aims to provide robust high quality speech together with the flexibility to deliver radio network capacity enhancements by means of low bit-rate operation. The codec rates are dynamically selected with respect to the rapidly changing radio conditions and to local capacity requirements. This paper describes various approaches for link adaptation with respect to varying radio channel conditions and puts a focus on the solution in the AMR standard. Moreover the method of inband signaling that is standardized is discussed and motivated.
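
The idea of selecting the codec rate from the measured radio conditions can be illustrated with a small threshold-based sketch. Everything numeric here is an assumption made for illustration: the abstract does not give thresholds, the hysteresis margin is an added design choice to avoid rapid toggling, and only a subset of codec rates is used. The names `MODES_KBPS`, `UP_THRESHOLDS_DB`, `HYSTERESIS_DB` and `next_mode` are hypothetical and do not come from the standard.

```python
from typing import List

# Subset of codec rates in kb/s, ordered from most robust to highest quality.
MODES_KBPS = [4.75, 5.9, 7.4, 12.2]
# Hypothetical channel-quality thresholds (dB) between adjacent modes.
UP_THRESHOLDS_DB = [4.0, 7.0, 11.0]
# Hypothetical extra margin required before switching to a higher rate.
HYSTERESIS_DB = 2.0


def next_mode(current: int, quality_db: float) -> int:
    """Move at most one mode up or down based on the channel quality estimate."""
    can_go_up = current < len(MODES_KBPS) - 1
    if can_go_up and quality_db >= UP_THRESHOLDS_DB[current] + HYSTERESIS_DB:
        return current + 1
    if current > 0 and quality_db < UP_THRESHOLDS_DB[current - 1]:
        return current - 1
    return current


def track_modes(qualities_db: List[float], start: int = 0) -> List[float]:
    """Apply next_mode to a sequence of quality estimates and return the rates."""
    mode = start
    rates = []
    for quality in qualities_db:
        mode = next_mode(mode, quality)
        rates.append(MODES_KBPS[mode])
    return rates


rates = track_modes([5.0, 6.5, 9.5, 14.0, 10.0, 3.0])
print(rates)
```

In this toy run the rate climbs as the channel improves and steps back down one mode at a time when it degrades, which is the qualitative behaviour the abstract describes.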

59 citations

Proceedings Article•

Low-delay code-excited linear-predictive coding of wideband speech at 32 kbps.

Yair Shoham, Erik Ordentlich

01 Jan 1990

TL;DR: In this paper, the authors report on the use of the code-excited linear-predictive (CELP) algorithm for 32 kb/s low-delay (LD) coding of wideband speech.

Abstract: The authors report on the use of the code-excited linear-predictive (CELP) algorithm for 32 kb/s low-delay (LD-CELP) coding of wideband speech. The main problem associated with wideband coding, namely, spectral noise weighting, is discussed. The authors propose an enhanced noise weighting technique and demonstrate its efficiency via subjective listening tests. In these tests, involving 20 listeners and 8 test sentences, the average rating for the proposed 32 kb/s LD-CELP was essentially equal to that of the 64 kb/s standard (G.722) CCITT wideband coder.

43 citations