Showing papers on "Speech coding" published in 1990

Journal Article•DOI•

Digital Coding of Waveforms: Principles and Applications to Speech and Video

Nuggehally Sampath Jayant, P. Noll

01 Nov 1990-Signal Processing

869 citations

Proceedings Article•DOI•

Vector sum excited linear prediction (VSELP) speech coding at 8 kbps

I.A. Gerson¹, M.A. Jasiuk¹•Institutions (1)

Motorola¹

03 Apr 1990

TL;DR: The vector sum excited linear prediction speech coder is presented, and it utilizes a codebook with a structure that allows for a very efficient search procedure.

Abstract: The vector sum excited linear prediction speech coder is presented. It utilizes a codebook with a structure that allows for a very efficient search procedure. Other advantages of the VSELP codebook structure are discussed, and a detailed description of an 8-kb/s VSELP coder is given. This coder was selected by the Telecommunications Industry Association (TIA) as the standard for use in North American digital cellular telephone systems. The coder uses two VSELP excitation codebooks, a gain quantizer which is robust to channel errors, and a novel adaptive pre/postfilter arrangement.

288 citations
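
As background for the VSELP entry above: in a vector-sum codebook, each excitation codevector is formed as a sum of a small set of basis vectors, each weighted by +1 or -1 according to one bit of the codeword. The abstract does not spell this out, so the following is only a minimal sketch under that assumption. The basis values, the sign convention, and the names `vselp_codevector` and `basis` are hypothetical.

```python
def vselp_codevector(codeword, basis):
    """Build one codevector as a signed sum of basis vectors.

    Bit m of `codeword` selects the sign of basis vector m:
    1 -> +1, 0 -> -1 (a sign convention assumed for this sketch).
    """
    length = len(basis[0])
    out = [0.0] * length
    for m, vec in enumerate(basis):
        sign = 1.0 if (codeword >> m) & 1 else -1.0
        for k in range(length):
            out[k] += sign * vec[k]
    return out


# Toy basis: M = 3 basis vectors of 4 samples each (illustrative values only).
basis = [
    [1.0, 0.0, 0.5, -0.5],
    [0.0, 1.0, -0.5, 0.5],
    [0.25, 0.25, 1.0, 0.0],
]

# The full codebook has 2**M entries but is never stored, only generated.
codebook = [vselp_codevector(i, basis) for i in range(2 ** len(basis))]
```

Complementing every bit of a codeword negates the codevector, so in a structure like this only half of the 2**M vectors are distinct up to sign, which is one reason the search can be made efficient.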

Journal Article•DOI•

Efficient vector quantization of LPC parameters at 24 bits/frame

Kuldip K. Paliwal¹, B. Atal¹•Institutions (1)

Bell Labs¹

01 May 1990-Journal of the Acoustical Society of America

TL;DR: A split vector quantization approach is used to overcome the complexity problem of LPC vector quantization: the LPC vector is divided into two parts and each part is vector‐quantized separately.

Abstract: Linear prediction coding (LPC) parameters are widely used in various speech processing applications for representing the spectral envelope information of speech. For low‐bit‐rate speech coding applications, it is important to quantize these parameters accurately using as few bits as possible without sacrificing the speech quality. Though vector quantizers are more efficient than scalar quantizers, their use for fine quantization of LPC information (using 24–26 bits/frame) is impeded by their prohibitively high complexity. In this paper, a split vector quantization approach is used to overcome the complexity problem. Here, the LPC vector is divided into two parts and each part is vector‐quantized separately. The splitting of the LPC vector is studied in the following three domains: (1) line spectral‐pair frequency (LSF), (2) arc‐sine reflection coefficient, and (3) log area ratio. Splitting in the LSF domain is found to be the best. Using the localized spectral properties of the LSF parameters, a weigh...

211 citations
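
The split vector quantization idea in the entry above can be sketched as follows: the parameter vector is cut into two parts, and each part is matched against its own small codebook. This is a toy illustration, not the authors' quantizer. It uses a plain squared-error distance (the abstract's weighted distance is truncated in the listing and is not reproduced), and the codebooks, the split point, and all function names are hypothetical.

```python
def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    def dist(cw):
        return sum((a - b) ** 2 for a, b in zip(cw, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def split_vq_encode(vec, split, cb_low, cb_high):
    """Quantize the two parts of vec independently and return both indices."""
    return nearest(cb_low, vec[:split]), nearest(cb_high, vec[split:])


def split_vq_decode(indices, cb_low, cb_high):
    """Rebuild the full vector by concatenating the two selected codewords."""
    i, j = indices
    return list(cb_low[i]) + list(cb_high[j])


# Toy codebooks with two entries each (real codebooks would be trained).
cb_low = [[0.1, 0.2], [0.3, 0.5]]
cb_high = [[1.0, 1.5], [2.0, 2.5]]

indices = split_vq_encode([0.28, 0.45, 1.1, 1.4], 2, cb_low, cb_high)
decoded = split_vq_decode(indices, cb_low, cb_high)
```

Searching two codebooks of size 2**a and 2**b costs on the order of 2**a + 2**b distance evaluations, whereas a single codebook with the same total number of bits would need 2**(a + b).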

Proceedings Article•DOI•

Pitch predictors with high temporal resolution

P. Kroon¹, Bishnu S. Atal¹•Institutions (1)

Bell Labs¹

03 Apr 1990

TL;DR: A first-order pitch predictor is described whose delay is specified as an integer number of samples plus a fraction of a sample at the current sampling rate, which has a better performance than conventional multiple coefficient predictors and leads to more efficient coding of the predictor parameters.

Abstract: A first-order pitch predictor is described whose delay is specified as an integer number of samples plus a fraction of a sample at the current sampling rate. This realization has a better performance than conventional multiple coefficient predictors and leads to more efficient coding of the predictor parameters. Also discussed is the application of noninteger delay pitch predictors to low-bit-rate speech coding.

208 citations
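
A first-order predictor with a noninteger delay, as described in the entry above, needs the signal value at a position between two samples. The sketch below obtains it by linear interpolation, which is a simplification chosen here for brevity and not necessarily the interpolation used by the authors. The function names and the single `gain` coefficient are assumptions for illustration.

```python
def delayed_sample(x, n, delay):
    """Approximate x at time (n - delay), where delay = integer + fraction."""
    m = int(delay)      # integer part of the delay, in samples
    f = delay - m       # fractional part, 0 <= f < 1
    if n - m - 1 < 0:
        raise ValueError("not enough past samples for this delay")
    # The target time lies between samples n-m-1 and n-m.
    return (1.0 - f) * x[n - m] + f * x[n - m - 1]


def predict(x, n, delay, gain):
    """First-order pitch prediction: one gain applied to one delayed sample."""
    return gain * delayed_sample(x, n, delay)


# On a ramp x[k] = k, the value 2.5 samples before n = 10 is 7.5.
ramp = [float(k) for k in range(20)]
value = delayed_sample(ramp, 10, 2.5)
```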

Patent•DOI•

Speech coding/decoding method having an excitation signal

Kazunori Ozawa¹•Institutions (1)

NEC¹

20 Jul 1990-Journal of the Acoustical Society of America

TL;DR: A speech coding method is presented in which spectrum parameters representing a spectrum envelope and a pitch parameter representing a pitch are obtained from an input discrete speech signal, and a frame interval is divided into subintervals in accordance with the pitch parameter.

Abstract: A speech coding method in which a spectrum parameter representing a spectrum envelope and a pitch parameter representing a pitch are obtained from an input discrete speech signal. A frame interval is divided into subintervals in accordance with the pitch parameter. A sound source signal in one of the subintervals is obtained by obtaining a multipulse with respect to a difference signal obtained by performing prediction on the basis of a past sound source signal. Correction information for correcting at least one of the amplitude and the phase of the sound source signal is obtained and output in other pitch intervals in the frame.

183 citations

Patent•DOI•

Speech coding system utilizing a recursive computation technique for improvement in processing speed

Masami Akamine¹, Yuji Okuda¹, Kimio Miseki¹•Institutions (1)

Toshiba¹

16 Oct 1990-Journal of the Acoustical Society of America

TL;DR: A speech coding system is described which recursively executes a filter-applied "Toeplitz characteristic" by causing a drive signal (i.e., an excitation signal) to be converted into a "Toeplitz matrix" when detecting a pitch period in which distortion of the input vector and the vector subsequent to the application of filter-applied computation to the drive signal vector in the pitch forecast called either "closed loop" or "compatible code book" is minimized.

Abstract: This invention provides a novel speech coding system which recursively executes a filter-applied "Toeplitz characteristic" by causing a drive signal (i.e., an excitation signal) to be converted into a "Toeplitz matrix" when detecting a pitch period in which distortion of the input vector and the vector subsequent to the application of filter-applied computation to the drive signal vector in the pitch forecast called either "closed loop" or "compatible code book" is minimized. The vector quantization method substantially making up the speech coding system of the invention is characteristically used by the system.

181 citations

Proceedings Article•DOI•

Pitch estimation and voicing detection based on a sinusoidal speech model

R.J. McAulay¹, Thomas F. Quatieri¹•Institutions (1)

Massachusetts Institute of Technology¹

03 Apr 1990

TL;DR: A pitch estimation criterion is derived that is inherently unambiguous, uses pitch-adaptive resolution, uses small-signal suppression to provide enhanced discrimination, and uses amplitude compression to eliminate the effects of pitch-formant interaction.

Abstract: A technique for estimating the pitch of a speech waveform is developed. It fits a harmonic set of sine waves to the input data using a mean-squared-error (MSE) criterion. By exploiting a sinusoidal model for the input speech waveform, a pitch estimation criterion is derived that is inherently unambiguous, uses pitch-adaptive resolution, uses small-signal suppression to provide enhanced discrimination, and uses amplitude compression to eliminate the effects of pitch-formant interaction. The normalized minimum mean squared error proves to be a powerful discriminant for estimating the likelihood that a given frame of speech is voiced.

145 citations

Journal Article•DOI•

Fast methods for the CELP speech coding algorithm

Willem Bastiaan Kleijn¹, Daniel John Krasinski², Richard Harry Ketchum²•Institutions (2)

Bell Labs¹, AT&T²

01 Aug 1990-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: Special fast procedures for the code excited linear predictive coding (CELP) algorithm have been developed to make implementation on modest hardware possible and their storage requirement and numerical accuracy are discussed.

Abstract: Special fast procedures for the code excited linear predictive coding (CELP) algorithm have been developed to make implementation on modest hardware possible. The advantages, as well as the disadvantages, of the various fast procedures are discussed. A general formalism for the algorithm is developed, followed by the discussion of the individual procedures which are grouped according to their features. Along with the computational complexity of each procedure, its storage requirement and numerical accuracy are discussed. A large number of the fast procedures are designed to search through a particular type of codebook (most of the codebooks are stochastic in character, while a few are deterministic). Other fast procedures can be used for arbitrary codebooks and are thus also applicable to trained codebooks. Some of the fast procedures designed for stochastic codebooks can also be used for the computation of the closed pitch loop parameters, which can be interpreted as a search through a time-dependent codebook.

112 citations

Patent•DOI•

Low-bit-rate speech coder using LPC data reduction processing

Yu J. Liu, Joseph Harvey Rothweiler

16 Aug 1990-Journal of the Acoustical Society of America

TL;DR: A speech coder employs vector quantization of LPC parameters, interpolation, and trellis coding for improved speech coding at low bit rates (400 bps).

Abstract: A speech coder employs vector quantization of LPC parameters, interpolation, and trellis coding for improved speech coding at low bit rates (400 bps). The speech coder has an LPC analysis module for converting input speech to LPC parameters, an LSP conversion module for converting LPC parameters into line spectrum frequencies (LSP) data, and a vector quantization and interpolation (VQ/I) module for encoding the LSP data into vector indexes for transmission by applying LPC spectral amplitude as weighting coefficients to the LSP data. The VQ/I module outputs one vector index for every two LPC frames in order to reduce the transmission bit rate, and the omitted frames are interpolated on the receiving end. A decoder correspondingly decodes incoming indexes to LPC parameters and synthesizes them into output speech. Trellis coders with an adaptive tracking function encode the pitch and gain parameters of the LPC frames. A universal codebook stores codewords according to a plurality of accents. The speech coder automatically identifies a speaker's accent and selects the corresponding vocabulary of codewords in order to more intelligibly encode and decode the speaker's speech.

101 citations
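
The entry above transmits one vector index for every two LPC frames and reconstructs the omitted frames at the receiver by interpolation. The abstract does not state the interpolation rule, so the sketch below simply assumes the midpoint of the two neighbouring decoded vectors. The function name and the sample values are hypothetical.

```python
def interpolate_omitted_frames(sent_frames):
    """Insert one interpolated frame between each pair of transmitted frames.

    Each frame is a list of decoded parameters (for example LSP values).
    The midpoint rule used here is an assumption made for this sketch.
    """
    out = []
    for prev, nxt in zip(sent_frames, sent_frames[1:]):
        out.append(list(prev))
        out.append([(a + b) / 2.0 for a, b in zip(prev, nxt)])
    if sent_frames:
        out.append(list(sent_frames[-1]))
    return out


# Two transmitted frames become three frames at the receiver.
frames = interpolate_omitted_frames([[0.25, 0.5], [0.75, 1.0]])
```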

Journal Article•DOI•

An introduction to speech and speaker recognition

Richard D. Peacocke¹, Daryl H. Graf¹•Institutions (1)

Bell-Northern Research¹

01 Aug 1990-IEEE Computer

TL;DR: In this article, five approaches that can be used to control and simplify the speech recognition task are examined: isolated words, speaker-dependent systems, limited vocabulary size, a tightly constrained grammar, and quiet and controlled environmental conditions.

Abstract: Five approaches that can be used to control and simplify the speech recognition task are examined. They entail the use of isolated words, speaker-dependent systems, limited vocabulary size, a tightly constrained grammar, and quiet and controlled environmental conditions. The five components of a speech recognition system are described: a speech capture device, a digital signal processing module, preprocessed signal storage, reference speech patterns, and a pattern-matching algorithm. Current speech recognition systems are reviewed and categorized. Speaker recognition approaches and systems are also discussed.

87 citations

Patent•DOI•

Speech selective automatic gain control

Danny Thomas Pinckley¹•Institutions (1)

Motorola¹

07 Dec 1990-Journal of the Acoustical Society of America

TL;DR: An automatic gain control circuit uses a speech recognizer to obtain smooth automatic gain control, so that AGC is not used until it is required (i.e., when speech is present).

Abstract: An automatic gain control circuit uses a speech recognizer to obtain smooth automatic gain control. An analog audio input signal is converted to a digital signal by an analog-to-digital converter and delayed by a delay circuit. A frame power (or alternatively, rectified peak amplitude) detector determines the power of each frame (or alternatively, the rectified peak amplitude) of the audio input signal, after it is applied to the A/D converter. A linear-to-log converter converts those values to a logarithmic form (for gain control over a broad range of values). A detected speech smoothing circuit smooths the variation in the values determined by the frame power (or peak amplitude) detector. A summer subtracts the output of the detected speech smoothing means from a fixed reference level, and thus obtains an error signal from the desired reference. A gain smoothing circuit smooths the resulting error signal (which is the logarithmically-shaped gain signal). A logarithm-to-linear converter converts the logarithmic gain signal to a linear form; and a multiplier multiplies the input signal by this smoothed gain. In accordance with the invention, a speech recognizer determines whether the audio input signal represents speech. An output of the speech recognizer is used to enable the detected speech smoothing circuit and the gain smoothing means when the audio input signal represents speech. Thus AGC is not used until it is required (i.e., when speech is present).
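
The processing chain in the abstract above (frame power, log conversion, level smoothing, subtraction from a reference, gain smoothing, log-to-linear conversion, multiplication) can be sketched in a few lines. This is a simplified illustration, not the patented circuit: the delay circuit is omitted, both smoothers are assumed to be one-pole filters with the same coefficient `alpha`, and the speech decision is passed in as a flag rather than computed. The class name `GatedAgc` and its parameters are hypothetical.

```python
import math


class GatedAgc:
    """Log-domain AGC whose smoothers only update on frames flagged as speech."""

    def __init__(self, ref_db=-20.0, alpha=0.2):
        self.ref_db = ref_db      # fixed reference level in dB
        self.alpha = alpha        # one-pole smoothing coefficient (assumed)
        self.level_db = ref_db    # smoothed detected level
        self.gain_db = 0.0        # smoothed gain in the log domain

    def process(self, frame, is_speech):
        if is_speech:
            # Frame power detector followed by linear-to-log conversion.
            power = sum(s * s for s in frame) / len(frame)
            level = 10.0 * math.log10(power + 1e-12)
            # Detected speech smoothing.
            self.level_db += self.alpha * (level - self.level_db)
            # Error relative to the reference, then gain smoothing.
            error = self.ref_db - self.level_db
            self.gain_db += self.alpha * (error - self.gain_db)
        # Log-to-linear conversion and multiplication with the input.
        g = 10.0 ** (self.gain_db / 20.0)
        return [g * s for s in frame]
```

With a constant input level, the output power settles at `ref_db`; frames flagged as non-speech are scaled by the last gain but leave both smoothers untouched.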

Patent•DOI•

Dynamic codebook for efficient speech coding based on algebraic codes

Adoul Jean-Pierre¹, Claude Laflamme¹•Institutions (1)

Université de Sherbrooke¹

06 Nov 1990-Journal of the Acoustical Society of America

TL;DR: In this article, the search complexity in finding the best codeword is greatly reduced by bringing the search back to the algebraic code domain, thereby allowing the sparsity of the codebook to speed up the necessary computations.

Abstract: A method of encoding a speech signal is presented. This method improves the excitation codebook and search procedure of the conventional Code Excited Linear Prediction (CELP) speech encoders. Use is made of a dynamic codebook (201, 202) based on the combination of two modules: a sparse algebraic code generator (201) associated to a filter (202) having a transfer function varying in time. The generator (102) is a structured codebook with codewords having very few non zero components. The filter (202) shapes the spectral characteristics whereby the resulting excitation codebook (201, 202) exhibits favorable perceptual properties. The search complexity in finding the best codeword is greatly reduced by bringing the search back to the algebraic code domain thereby allowing the sparsity of the algebraic code to speed up the necessary computations.

Proceedings Article•DOI•

High-quality 16 kb/s speech coding with a one-way delay less than 2 ms

Juin-Hwey Chen¹•Institutions (1)

Bell Labs¹

03 Apr 1990

TL;DR: A high-quality 16-kb/s speech coder which has a one-way coding delay of less than 2 ms is presented and formal subjective tests indicate that this coder produces high-quality speech comparable to that of the CCITT G.721 32-kb/s ADPCM standard.

Abstract: A high-quality 16-kb/s speech coder which has a one-way coding delay of less than 2 ms is presented. The coder is basically a backward-adaptive version of the code-excited linear prediction (CELP) coder. The low coding delay is achieved by using backward-adaptive predictor and gain and by using an excitation vector size as small as five samples. The pitch predictor in conventional CELP coders is eliminated, and the linear predictive coding (LPC) predictor order is increased from 10 to 50. The excitation gain is updated by a tenth-order adaptive logarithmic gain predictor. This log-gain predictor and the LPC predictor are updated by performing LPC analysis on previous log-gain and coded speech, respectively. The excitation codebook is closed-loop optimized and the codebook index is Gray-coded to improve the robustness against channel errors. Formal subjective tests indicate that this coder produces high-quality speech comparable to that of the CCITT G.721 32-kb/s ADPCM standard.
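
Two details of the abstract above lend themselves to a small worked example: the number of bits available per five-sample excitation vector, and Gray coding of the codebook index. The sketch assumes an 8 kHz sampling rate, which the abstract does not state, and uses the standard binary-reflected Gray code; the function names are hypothetical.

```python
def bits_per_vector(bit_rate_bps, sample_rate_hz, vector_len):
    """Bits available for each excitation vector at a given bit rate."""
    return bit_rate_bps * vector_len // sample_rate_hz


def to_gray(index):
    """Binary-reflected Gray code of a non-negative integer."""
    return index ^ (index >> 1)


def from_gray(code):
    """Inverse of to_gray."""
    index = 0
    while code:
        index ^= code
        code >>= 1
    return index


# 16 kb/s with 5-sample vectors, assuming 8000 samples per second.
bits = bits_per_vector(16000, 8000, 5)
```

Gray coding makes consecutive indices differ in exactly one bit; if the codebook is ordered so that neighbouring indices hold similar vectors, a single bit error then tends to select a nearby codevector.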

Journal Article•DOI•

Predictive trellis coded quantization of speech

Michael W. Marcellin¹, Thomas R. Fischer², Jerry D. Gibson²•Institutions (2)

University of Arizona¹, Texas A&M University²

01 Jan 1990-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: The performance of predictive TCQ (PTCQ) is compared to that of other waveform coders, and the effects of channel errors on PTCQ performance are discussed.

Abstract: Trellis-coded quantization (TCQ) is incorporated into a predictive coding structure for encoding sampled speech. The modest complexity of the resulting structure is seen to be a direct consequence of the TCQ formulation. Simulation results are presented for systems using fixed-prediction/fixed-residual encoding, fixed-prediction/adaptive-residual encoding, and adaptive-prediction/adaptive-residual encoding. The performance of predictive TCQ (PTCQ) is compared to that of other waveform coders, and the effects of channel errors on PTCQ performance are discussed. For a fully adaptive 16-kb/s speech coding system, segmental signal-to-noise ratios in the range of 19.1-21.9 dB are obtained for a variety of speakers and test sentences. Reconstructed speech obtained from this system is of excellent communication quality.
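
The segmental signal-to-noise ratio quoted in the abstract above is commonly computed as the average of per-frame SNR values in dB. The sketch below follows that common definition; the frame length and the handling of silent or error-free frames (they are skipped) are assumptions, since the abstract does not give the authors' exact procedure.

```python
import math


def segmental_snr_db(reference, decoded, frame_len):
    """Average the per-frame SNR (in dB) over all complete frames."""
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(r * r for r in ref)
        error_energy = sum((r - d) ** 2 for r, d in zip(ref, dec))
        # Skip frames where the ratio is undefined (assumed convention).
        if signal_energy > 0.0 and error_energy > 0.0:
            snrs.append(10.0 * math.log10(signal_energy / error_energy))
    if not snrs:
        raise ValueError("no frame with nonzero signal and error energy")
    return sum(snrs) / len(snrs)
```

Because the average is taken over dB values, quiet frames count as much as loud ones, which is why this measure differs from a single SNR computed over the whole utterance.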

Proceedings Article•DOI•

Improved pitch prediction with fractional delays in CELP coding

Jorge S. Marques, Isabel Trancoso, José Tribolet, Luís B. Almeida

03 Apr 1990

TL;DR: In this paper, a scheme for long-term prediction in CELP (code-excited linear predictive) coding using fractional delay prediction was discussed, which permits a more accurate representation of voiced speech and achieves an improvement of synthetic quality for female speakers.

Abstract: A scheme is discussed for long-term prediction in CELP (code-excited linear predictive) coding using fractional delay prediction. This technique permits a more accurate representation of voiced speech and achieves an improvement of synthetic quality for female speakers. The higher complexity of this type of predictor relative to the classical one is its major disadvantage. Suboptimal schemes in which the search for the fractional pitch delay is restricted to a neighborhood of an integer pitch estimate can be envisaged to decrease the computational load.

Journal Article•DOI•

A subband coding, BCH coding, and 16-QAM system for mobile radio speech communications

Lajos Hanzo¹, Raymond Steele¹, P M Fortune¹•Institutions (1)

University of Southampton¹

01 Nov 1990-IEEE Transactions on Vehicular Technology

TL;DR: A combined subband speech coding (SBC), Bose-Chaudhuri-Hocquenghem (BCH) error-correction coding, and 16-level quadrature amplitude modulation (16-QAM) scheme with switched diversity and speech postenhancement is proposed.

Abstract: A combined subband speech coding (SBC), Bose-Chaudhuri-Hocquenghem (BCH) error-correction coding, and 16-level quadrature amplitude modulation (16-QAM) scheme with switched diversity and speech postenhancement is proposed. The system's performance is dramatically improved by deploying some degree of fade tracking capability over fading channels. Further quality enhancement accrues by using appropriate mapping between the SBC speech codec and the Gray coded QAM words. Various BCH codes are utilized to adequately match the error-correcting power to the perceptual importance of the SBC bits. One of the proposed systems operates at 7 kBd and yields good communications-quality speech for channel signal-to-noise ratios (SNRs) in excess of 20 dB and encounters a maximum overall system delay of 55.125 ms. A more complex arrangement uses second-order switched diversity to reduce the channel SNR required to around 16 dB and the transmission rate to 5 kBd when the vehicular speed is 30 mph while the system delay is unchanged at 55.125 ms.

Patent•

Video signal coding apparatus, coding method used in the video signal coding apparatus and video signal coding transmission system having the video signal coding apparatus

Kiyoshi Sakai¹, Takashi Itoh¹, Kiichi Matsuda¹•Institutions (1)

Fujitsu¹

05 Nov 1990

TL;DR: In this article, a signal coding apparatus coupled to a receiver having a receiver buffer and a decoder, includes a coding unit for coding a signal and outputting information generated in a frame unit, the information being a coded signal.

Abstract: A signal coding apparatus, which is coupled, via a transmission path, to a receiver having a receiver buffer and a decoder, includes a coding unit for coding a signal and outputting information generated in a frame unit, the information being a coded signal. The apparatus also includes a transmitter buffer for temporarily storing the information, and a controller for controlling an amount of the information on the basis of a storage capacity of the receiver buffer and an amount of the information which is contained in a frame per a unit time. There is also provided a method used in the above coding apparatus, and a signal coding transmission system employing the signal coding apparatus.

Proceedings Article•DOI•

Efficient signal coding with hierarchical lapped transforms

H.S. Malvar

03 Apr 1990

TL;DR: The hierarchical lapped transform has a much lower computational complexity than a tree-structured QMF filter bank, and so with HLTs a much larger number of bands can be used in practice.

Abstract: The hierarchical lapped transform (HLT) is defined. It is based on the modulated lapped transform (MLT). The HLT has a much lower computational complexity than a tree-structured QMF filter bank, and so with HLTs a much larger number of bands can be used in practice. The coding gain of HLTs is close to that of a full-length MLT, for the same number of bands, and therefore there is no significant loss of coding efficiency. With HLTs transient signals can be better reconstructed than with nonhierarchical transforms, as demonstrated by speech and image coding examples. In image coding applications, the HLT can also be used for progressive transmission.

Patent•DOI•

Speech coding apparatus

Fumio Amano¹, Tomohiko Taniguchi¹, Yoshinori Tanaka¹, Yasuji Ota¹, Shigeyuki Unagami¹•Institutions (1)

Fujitsu¹

11 Apr 1990-Journal of the Acoustical Society of America

TL;DR: In this paper, a speech coding apparatus which selects an optimum code from a code book is presented, the optimum code giving the minimum magnitude of error signal between the input signal and the reproduced signal obtained by a filter calculation using a linear prediction parameter from a linear predictive analysis unit with respect to the codes of the code data, wherein use is made, as the codes, of a code formed by thinning to 1/M (M being an integer of two or more) the plurality of sampling values constituting the codes.

Abstract: A speech coding apparatus which selects an optimum code from a code book (21), the optimum code giving the minimum magnitude of error signal between the input signal and the reproduced signal obtained by a filter calculation using a linear prediction parameter from a linear predictive analysis unit (10) with respect to the codes of the code data, wherein use is made, as the codes, of a code formed by thinning to 1/M (M being an integer of two or more) the plurality of sampling values constituting the codes. To compensate for the deterioration of the quality of the reproduced signal caused by thinning the sampling values in this way, an additional linear predictive analysis unit (20) is further introduced and use made of an amended linear prediction parameter instead of the linear prediction parameter.

Patent•DOI•

Speech coding apparatus using multimode coding

Tomohiko Taniguchi¹, Yoshinori Tanaka¹, Akira Sasama¹, Yasuji Ohta¹, Fumio Amano¹, Shigeyuki Unagami¹•Institutions (1)

Fujitsu¹

11 Sep 1990-Journal of the Acoustical Society of America

TL;DR: In this article, a speech coding apparatus coupled to a transmission channel includes m (m is an integer greater than 1) coders, m decoders and m or (m-1) error-correcting coders.

Abstract: A speech coding apparatus coupled to a transmission channel includes m (m is an integer greater than 1) coders, m decoders and m or (m-1) error correcting coders. The apparatus also includes an evaluation unit which evaluates a quality of each of reproduced speech signals from the input speech signal and the reproduced speech signals and which outputs an evaluated quality of each of the reproduced speech signals. The quality of each of the reproduced speech signals is evaluated in a state having no transmission error. A decision unit identifies one of the m coders which provides the reproduced speech signal having a smallest distortion on the basis of the evaluated quality of each of the reproduced speech signals, a current error rate of the transmission channel and error correcting abilities of the error correcting coders, and generates a coder identification number representative of a selected one of the m coders. An output part outputs a multiplexed transmission signal including the coded speech signal generated by the one of the m coders identified by the decision unit and the error correcting code generated by a corresponding one of the m error correcting coders.

Proceedings Article•DOI•

Acoustic and perceptual studies of Lombard speech: application to isolated-words automatic speech recognition

J. Junqua¹, Y. Anglade¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

03 Apr 1990

TL;DR: It is observed that recognition scores cannot necessarily be improved by reducing the variability for one specific parameter, and recognition scores are not directly related to the increase of the vocal effort and cannot be predicted from speech variability.

Abstract: The purpose of this study was (1) to determine what are the acoustic-phonetic differences between speech produced in quiet and speech produced in noise (Lombard speech) and (2) to evaluate the influence of these differences on human listeners and automatic speech recognizers. The acoustical analyses, done at the phonetic level on about 40 parameters, showed significant differences in variability for male and female speakers. In addition to replicating previous studies, the authors investigated more parameters, and examined the influence of the Lombard effect on female speakers. Perceptual experiments, run for foreign listeners, exhibited a decrease of the intelligibility for some confusable subsets of the vocabulary studied. The findings are correlated with the performance of a DTW-based recognizer, and it is observed that recognition scores cannot necessarily be improved by reducing the variability for one specific parameter. It is also found that recognition scores are not directly related to the increase of the vocal effort and cannot be predicted from speech variability.

Proceedings Article•

Low-delay code-excited linear-predictive coding of wideband speech at 32 kbps.

Yair Shoham, Erik Ordentlich

01 Jan 1990

TL;DR: In this paper, the authors report on the use of the codebook-excited linear-predictive (CELP) algorithm for 32 kb/s low-delay (LD) coding of wideband speech.

Abstract: The authors report on the use of the codebook-excited linear-predictive (CELP) algorithm for 32 kb/s low-delay (LD-CELP) coding of wideband speech. The main problem associated with wideband coding, namely, spectral noise weighting, is discussed. The authors propose an enhanced noise weighting technique and demonstrate its efficiency via subjective listening tests. In these tests, involving 20 listeners and 8 test sentences, the average rating for the proposed 32 kb/s LD-CELP was essentially equal to that of the 64 kb/s standard (G.722) CCITT wideband coder.

Journal Article•DOI•

High-quality coding of telephone speech and wideband audio

Nuggehally Sampath Jayant¹•Institutions (1)

Bell Labs¹

01 Jan 1990-IEEE Communications Magazine

TL;DR: Digital speech technology is reviewed, with the emphasis on applications demanding high-quality reproduction of the speech signal, which include the important subclass of wideband speech.

Abstract: Digital speech technology is reviewed, with the emphasis on applications demanding high-quality reproduction of the speech signal. Examples of such applications are network telephony, ISDN terminals for audio teleconferencing, and systems for the storage of audio signals, which include the important subclass of wideband speech. Depending on the application, the bandwidth of input speech can vary from about 3 kHz to nearly 20 kHz. Coding for digital telephony at 4 and 8 kb/s, network quality coding at 16 kb/s, and coding for audio at 7 and 20 kHz are examined. Future directions in the field are discussed with respect to anticipated technology applications and the algorithms needed to support these technologies.

Proceedings Article•DOI•

Digital cellular systems for North America

C.-E.W. Sundberg¹, N. Seshadri¹•Institutions (1)

Bell Labs¹

02 Dec 1990

TL;DR: The North American system is compared to the pan-European digital GSM (Groupe Spécial Mobile) system, and techniques that may be used to further improve the system capacity of future digital cellular systems beyond the current standard are discussed.

Abstract: Standards for a new cellular mobile radio system for North America are currently being defined. The system will use digital transmission in contrast to the present analog system. Capacity is increased by means of three techniques. These are: sending three digital voice channels in one current analog FM channel (maintaining spectral compatibility), increased trunking efficiency, and exploiting improved frequency reuse offered by robust digital transmission techniques. The main elements of the system, such as multiple access digital modulation, speech coding, channel coding, and equalization, are briefly discussed. The North American system is compared to the pan-European digital GSM (Groupe Spécial Mobile) system, and techniques that may be used to further improve the system capacity of future digital cellular systems beyond the current standard are discussed.

Proceedings Article•DOI•

Transform coding of audio signals at 64 kbit/s

Y. Mahieux¹, J.P. Petit¹•Institutions (1)

CNET¹

02 Dec 1990

TL;DR: The coding of high-quality sound at 64 kb/s is of interest for applications such as ISDN, and the algorithm described allows the reduction to such a bit rate while maintaining the original quality.

Abstract: The coding of high-quality sound at 64 kb/s is of interest for applications such as ISDN. The algorithm described allows the reduction to such a bit rate while maintaining the original quality. It is based on transform coding, and uses a time-domain aliasing cancellation (TDAC) transformation. Perceptual properties and the interblock redundancy of the spectrum are involved when coding the transform coefficients. The complexity of the algorithm allows its real-time implementation on a single floating-point digital signal processor, such as the ATT DSP 32C. The performance and subjective results of the coding system are discussed.

Proceedings Article•DOI•

4.8 kbit/s delayed decision CELP coder using tree coding

Kazunori Mano, Takehiro Moriya

03 Apr 1990

TL;DR: The proposed coding method significantly increases the quality of the 4.8-kb/s CELP coder at the cost of an additional 5-ms coding delay, and the optimum combined parameter sequences are selected to minimize global quantization distortion over the coding frame.

Abstract: A 4.8-kb/s delayed decision code excited linear prediction (CELP) coder that uses tree coding is described. In conventional CELP coding, short-term and long-term prediction parameters as well as excitation parameters are sequentially determined. In the proposed delayed decision CELP coding, a tree coding method is utilized. The long-term prediction and excitation parameter candidates obtained in each subframe are listed as a tree and the optimum combined parameter sequences are selected to minimize global quantization distortion over the coding frame. The proposed coding method significantly increases the quality of the 4.8-kb/s CELP coder at the cost of an additional 5-ms coding delay.

Proceedings Article•DOI•

A real-time implementation of the improved MBE speech coder

Michael S. Brandstein¹, P.A. Monta¹, John C. Hardwick¹, J.S. Lim¹•Institutions (1)

Massachusetts Institute of Technology¹

03 Apr 1990

TL;DR: A real-time, single digital signal processing (DSP) chip implementation of a 2.4, 4.8, and 8.0-kb/s improved multiband excitation (IMBE) vocoder is presented, and it is shown to generate high-quality speech under both clean and noisy conditions.

Abstract: A real-time, single digital signal processing (DSP) chip implementation of a 2.4-, 4.8-, and 8.0-kb/s improved multiband excitation (IMBE) vocoder is presented. The IMBE vocoder is based on the MBE speech model, and it is shown to generate high-quality speech under both clean and noisy conditions. In addition, the IMBE vocoder is well suited for real-time implementation since it does not require excessive computation or storage. Full-duplex operation is demonstrated using a single AT&T WE DSP 32. Aspects of the hardware architecture, algorithm implementation, and system performance are addressed.

Proceedings Article•DOI•

The ISO audio coding standard

H.G. Musmann

02 Dec 1990

TL;DR: Concepts for improvement of the coding algorithms are discussed which might be the basis for future ISO activities aiming at a bit rate of only 2 × 64 kb/s for a stereo sound signal.

Abstract: An ISO audio coding standard is being developed that will provide an audio quality comparable to that of a compact disc using a reduced bit rate of about 2 × 128 kb/s for a stereo sound signal instead of 2 × 706 kb/s. Four coding algorithms have been considered in order to develop the audio coding standard. Two of these coding algorithms have been tested and are outlined. The ASPEC algorithm uses a modified discrete cosine transform with overlapping blocks and dynamic windowing in order to map the input samples into frequency coefficients. The MUSICAM algorithm uses a subband analysis filter bank with 32 equally spaced subbands to map the input samples into frequency coefficients. Concepts for improvement of the coding algorithms are discussed which might be the basis for future ISO activities aiming at a bit rate of only 2 × 64 kb/s for a stereo sound signal.
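The bit rates quoted in this abstract can be checked with a short calculation. The sketch below assumes the standard compact disc parameters of 44.1 kHz sampling and 16 bits per sample per channel, which the abstract does not state; the function and variable names are illustrative.

```python
# Assumed compact disc PCM parameters (not given in the abstract).
CD_SAMPLE_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16


def pcm_rate_kbps(sample_rate_hz, bits_per_sample):
    """Uncompressed PCM bit rate of one channel, in kb/s."""
    return sample_rate_hz * bits_per_sample / 1000


def compression_ratio(source_kbps, target_kbps):
    """How many times smaller the coded stream is than the source."""
    return source_kbps / target_kbps


# Per-channel CD rate: 705.6 kb/s, which the abstract rounds to 706.
cd_rate_kbps = pcm_rate_kbps(CD_SAMPLE_RATE_HZ, CD_BITS_PER_SAMPLE)

# Per-channel reduction at the two target rates mentioned in the abstract.
ratio_at_128 = compression_ratio(cd_rate_kbps, 128.0)
ratio_at_64 = compression_ratio(cd_rate_kbps, 64.0)
```

Under these assumptions the 128 kb/s target corresponds to a reduction of roughly 5.5 to 1 per channel, and the 64 kb/s goal to roughly 11 to 1.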

Journal Article•DOI•

Design and performance of an analysis-by-synthesis class of predictive speech coders

Richard Rose¹, Thomas P. Barnwell²•Institutions (2)

Massachusetts Institute of Technology¹, Georgia Institute of Technology²

01 Sep 1990-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A new coder, named the self-excited vocoder, is discussed because of its good performance with low complexity, and because of the insight this coder gives to analysis-by-synthesis coders in general.

Abstract: The performance of a broad class of analysis-by-synthesis linear predictive speech coders is quantified experimentally. The class of coders includes a number of well-known techniques as well as a very large number of speech coders which have not been named or studied. A general formulation for deriving the parametric representation used in all of the coders in the class is presented. A new coder, named the self-excited vocoder, is discussed because of its good performance with low complexity, and because of the insight this coder gives to analysis-by-synthesis coders in general. The results of a study comparing the performances of different members of this class are presented. The study takes the form of a series of formal subjective and objective speech quality tests performed on selected coders. The results of this study lead to some interesting and important observations concerning the controlling parameters for analysis-by-synthesis speech coders.

Proceedings Article•DOI•

High-quality audio transform coding at 128 kbits/s

Grant Allen Davidson¹, Louis Dunn Fielder¹, M. Antill¹•Institutions (1)

Dolby Laboratories¹

03 Apr 1990

TL;DR: An approach to wideband digital audio compression of CD-quality signals at data rates of 128 kb/s per channel and below is presented, a form of adaptive transform coding that features a nonuniform frequency division and coding scheme to exploit known characteristics of human perception.

Abstract: An approach to wideband digital audio compression of CD-quality signals at data rates of 128 kb/s per channel and below is presented. A form of adaptive transform coding, this technique features a nonuniform frequency division and coding scheme to exploit known characteristics of human perception. The algorithm has low computational complexity and can be adapted for use at other bit rates. A windowed overlap-add process is used with the forward/inverse transforms, which have been efficiently implemented using FFTs. Transform coefficients are converted into a subband block-companded format consisting of exponent words and associated mantissas, which are then coded with an adaptive quantizer. A real-time, single-chip programmable digital signal processing (DSP) implementation encodes 48-kHz-sampled stereo audio signals at a variety of bit rates. At 128 kb/s, the coder's subjective performance is appropriate for highest-quality 15-kHz professional audio applications.
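The idea of a block-companded format, with one exponent word shared by a block of coefficients and a mantissa per coefficient, can be sketched as plain block floating point. This is a generic illustration and not the format used in the paper: the mantissa width, the exponent range, the rounding rule, and all names are assumptions, and the adaptive quantizer mentioned in the abstract is not modelled.

```python
def block_compand(coeffs, mantissa_bits=8, max_exponent=15):
    """Encode a block of coefficients in (-1, 1) as (shared exponent, integer mantissas)."""
    peak = max(abs(c) for c in coeffs)
    # The exponent counts how many doublings keep the block peak below 1.0.
    exponent = 0
    while exponent < max_exponent and peak * 2 ** (exponent + 1) < 1.0:
        exponent += 1
    scale = 2 ** (mantissa_bits - 1)
    mantissas = []
    for c in coeffs:
        m = round(c * 2 ** exponent * scale)
        # Clamp to the signed range of a mantissa_bits-wide word.
        mantissas.append(max(-scale, min(scale - 1, m)))
    return exponent, mantissas


def block_expand(exponent, mantissas, mantissa_bits=8):
    """Invert block_compand up to the mantissa rounding error."""
    scale = 2 ** (mantissa_bits - 1)
    return [m / scale / 2 ** exponent for m in mantissas]


# A hypothetical block of four small transform coefficients.
block = [0.3, -0.12, 0.05, 0.0]
exponent, mantissas = block_compand(block)
decoded = block_expand(exponent, mantissas)
```

Because the exponent is shared, quiet blocks are scaled up before rounding, so the absolute rounding error shrinks with the block peak instead of staying fixed.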
