
Showing papers by "Andreas Spanias published in 1994"


Journal ArticleDOI
01 Oct 1994
TL;DR: The objective of this paper is to provide a tutorial overview of speech coding methodologies with emphasis on those algorithms that are part of the recent low-rate standards for cellular communications.
Abstract: The past decade has witnessed substantial progress towards the application of low-rate speech coders to civilian and military communications as well as computer-related voice applications. Central to this progress has been the development of new speech coders capable of producing high-quality speech at low data rates. Most of these coders incorporate mechanisms to: represent the spectral properties of speech, provide for speech waveform matching, and "optimize" the coder's performance for the human ear. A number of these coders have already been adopted in national and international cellular telephony standards. The objective of this paper is to provide a tutorial overview of speech coding methodologies with emphasis on those algorithms that are part of the recent low-rate standards for cellular communications. Although the emphasis is on the new low-rate coders, we attempt to provide a comprehensive survey by covering some of the traditional methodologies as well. We feel that this approach will not only point out key references but will also provide valuable background to the beginner. The paper starts with a historical perspective and continues with a brief discussion on the speech properties and performance measures. We then proceed with descriptions of waveform coders, sinusoidal transform coders, linear predictive vocoders, and analysis-by-synthesis linear predictive coders. Finally, we present concluding remarks followed by a discussion of opportunities for future research.
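The linear predictive coders surveyed in the paper share a common front end: short-term LPC analysis of each speech frame. As a generic illustration only (not any specific coder or standard covered in the paper), the Levinson-Durbin recursion that solves the LPC normal equations might be sketched as follows; the function name and frame handling are hypothetical:

```python
import numpy as np

def lpc(frame, order):
    """Levinson-Durbin recursion: solve the LPC normal equations for one
    analysis frame from its (biased) sample autocorrelation.
    Returns (a, E): a[0] = 1.0, and E is the residual prediction error power."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order] / n
    a = np.zeros(order + 1)
    a[0] = 1.0
    E = r[0]
    for i in range(1, order + 1):
        # reflection coefficient from the current prediction residual
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / E
        a_prev = a.copy()
        a[i] = k
        for j in range(1, i):            # order-update of the predictor
            a[j] = a_prev[j] + k * a_prev[i - j]
        E *= (1.0 - k * k)               # error power shrinks each order
    return a, E
```

Running this on a frame drawn from a known autoregressive process recovers the process coefficients, which is the sense in which LPC "represents the spectral properties of speech."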

461 citations


Journal ArticleDOI
TL;DR: Methods to improve the convergence speed and reduce the computational complexity of a constrained frequency-domain algorithm that uses a time-varying step size are proposed.
Abstract: This paper is concerned with problems evolving around the implementation of a frequency-domain adaptive noise canceller on a fixed-point signal processor. In particular, we propose methods to improve the convergence speed and reduce the computational complexity of a constrained frequency-domain algorithm that uses a time-varying step size. In addition, we study the effects of finite word length and fixed-point arithmetic. Improvements are realized by adopting a new data reusing scheme and by applying running and pruned FFTs. Results are given using synthetic data, as well as data from noise cancellation experiments.
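As a rough illustration of the algorithm class discussed (not the authors' exact method, and without the fixed-point, data-reusing, or pruned-FFT refinements), a constrained overlap-save frequency-domain canceller with a power-normalized, time-varying step size could look like this sketch; the function name and parameter values are hypothetical:

```python
import numpy as np

def fdaf_cancel(x, d, M=64, mu=0.25, lam=0.5):
    """Constrained (overlap-save) frequency-domain adaptive noise canceller.
    x: reference noise input, d: primary input (signal + correlated noise).
    Returns the error signal, i.e. the enhanced output."""
    W = np.zeros(2 * M, dtype=complex)      # frequency-domain weight vector
    P = np.ones(2 * M)                      # per-bin input power estimate
    e_out = np.zeros(len(d))
    for k in range(M, len(d) - M + 1, M):
        X = np.fft.fft(x[k - M:k + M])              # overlap-save input block
        y = np.real(np.fft.ifft(X * W))[M:]         # last M output samples valid
        e = d[k:k + M] - y
        e_out[k:k + M] = e
        E = np.fft.fft(np.concatenate([np.zeros(M), e]))
        P = lam * P + (1.0 - lam) * np.abs(X) ** 2  # time-varying step size
        g = np.real(np.fft.ifft(np.conj(X) * E / P))
        g[M:] = 0.0                                  # gradient constraint
        W += mu * np.fft.fft(g)
    return e_out
```

The gradient constraint (zeroing the last M time-domain samples before transforming back) is what makes this the "constrained" variant; dropping it saves two FFTs per block at the cost of circular-convolution artifacts.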

9 citations


Proceedings ArticleDOI
N.G. Nair1, Andreas Spanias1
01 Jan 1994
TL;DR: This work presents a new approach that employs gradient projections in selected eigenvector sub-spaces to improve the convergence properties of LMS algorithms for colored inputs and introduces an efficient method to iteratively update an "eigen subspace" of the autocorrelation matrix.
Abstract: Although adaptive gradient algorithms are simple and relatively robust, they generally have poor performance in the absence of "rich" excitation. In particular, it is well known that the convergence speed of the LMS algorithm deteriorates when the condition number of the input autocorrelation matrix is large. This problem has been previously addressed using weighted RLS or normalized frequency-domain algorithms. We present a new approach that employs gradient projections in selected eigenvector sub-spaces to improve the convergence properties of LMS algorithms for colored inputs. We also introduce an efficient method to iteratively update an "eigen subspace" of the autocorrelation matrix. The proposed algorithm is more efficient, in terms of computational complexity, than the WRLS, and its convergence speed approaches that of the WRLS even for highly correlated inputs.
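The eigenvalue-spread problem the paper targets is easy to reproduce with plain LMS. The sketch below is hypothetical and is not the proposed subspace algorithm; it simply identifies an unknown filter with standard LMS, so that running it on white versus colored (e.g. AR(1)) input exposes the slow modes associated with small eigenvalues of the input autocorrelation matrix:

```python
import numpy as np

def lms_identify(x, h, mu, n_samples):
    """Plain LMS system identification: adapt w toward the unknown filter h
    driven by input x. Returns the per-sample squared error, whose decay
    rate depends on the eigenvalue spread of the input autocorrelation."""
    M = len(h)
    d = np.convolve(x, h)[:n_samples]   # desired signal: output of unknown filter
    w = np.zeros(M)
    err = np.zeros(n_samples)
    for n in range(M, n_samples):
        u = x[n - M + 1:n + 1][::-1]    # most recent M input samples
        e = d[n] - w @ u
        w += mu * e * u                 # stochastic-gradient weight update
        err[n] = e * e
    return err
```

With the same step size, the white-input run converges orders of magnitude faster than the AR(1) run, which is the behavior the eigen-subspace projections are designed to repair.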

6 citations


Proceedings ArticleDOI
30 May 1994
TL;DR: Context-dependent phoneme-based HMMs are used to capture the fine phonetic detail that is required to discriminate such a confusable vocabulary and reveal that context-dependent modeling gives about 9% improvement on speaker-independent performance over whole-word modeling, and an 18% improvement on the E-set.
Abstract: Alphabet recognition is known to be a difficult task due to the acoustic similarities among different letters, especially letters in the E-set. Recognition systems based on whole-word Hidden Markov Models (HMM) perform poorly on this task due to the inability of the models to capture fine phonetic details, especially details occurring within segments of short duration. Letters B and D, for example, differ mainly in the 10-20 msec segment prior to vowel onset. In this paper, we use context-dependent phoneme-based HMMs to capture the fine phonetic detail that is required to discriminate such a confusable vocabulary. Our results reveal that context-dependent modeling gives about 9% improvement on speaker-independent performance over whole-word modeling, and an 18% improvement on the E-set. Furthermore, using an improved spectral representation of the stop consonants in the E-set, an additional 6% improvement in the E-set can be achieved. Our best speaker-independent E-set performance over 15 speakers is 90.3%, with overall alphabet recognition of 94.1%.

2 citations


Proceedings ArticleDOI
31 Oct 1994
TL;DR: In this article, an analysis/synthesis technique based on harmonic sinusoidal modeling of speech is used to develop a new hidden Markov model (HMM) based speech enhancement algorithm.
Abstract: An analysis/synthesis technique based on harmonic sinusoidal modeling of speech is used to develop a new hidden Markov model (HMM) based speech enhancement algorithm. State sequence estimation is done using a standard HMM-based approach. State-based enhancement is carried out by assuming a harmonic model for speech, i.e., by representing each block of speech as a sum of sine waves in terms of a set of amplitudes, phases, and harmonically related frequencies. Given the maximum a-posteriori probability (MAP) state sequence, the amplitudes, phases, voicing, and fundamental frequency are estimated. Simulation results are presented, comparing the performance of the proposed algorithm to that of a standard HMM-based approach. The proposed method was found to reduce the structured residual noise normally associated with HMM-based algorithms.
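The harmonic model underlying the state-based enhancement can be sketched generically: each block of speech is a sum of sine waves at multiples of the fundamental frequency, and when the block spans an integer number of pitch periods the amplitudes can be recovered by correlating against each harmonic. The helper names below are hypothetical and this is not the authors' estimator, only an illustration of the representation:

```python
import numpy as np

def synth_frame(amps, phases, f0, fs, N):
    """Synthesize one frame as a sum of harmonically related sine waves:
    amps[k-1], phases[k-1] belong to harmonic k of fundamental f0 (Hz)."""
    n = np.arange(N)
    frame = np.zeros(N)
    for k, (A, phi) in enumerate(zip(amps, phases), start=1):
        frame += A * np.cos(2 * np.pi * k * f0 * n / fs + phi)
    return frame

def harmonic_amplitudes(frame, f0, fs, n_harm):
    """Recover the amplitude of each harmonic by correlating the frame
    against a complex exponential at that harmonic's frequency."""
    N = len(frame)
    n = np.arange(N)
    return np.array([
        2.0 / N * abs(np.sum(frame * np.exp(-2j * np.pi * k * f0 * n / fs)))
        for k in range(1, n_harm + 1)])
```

Round-tripping a synthetic frame through these two functions returns the original amplitudes exactly when the harmonics complete whole cycles within the frame.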

2 citations


Proceedings ArticleDOI
30 May 1994
TL;DR: A variety of architectures for implementing block adaptive filters in the time domain, based on a block implementation of the least mean squares (BLMS) algorithm, all of which have a significantly smaller sample period than frequency-domain implementations.
Abstract: In this paper we propose a variety of architectures for implementing block adaptive filters in the time-domain. These filters are based on a block implementation of the least mean squares (BLMS) algorithm. First, we present an architecture which directly maps the BLMS algorithm into an array of processors. Next, we describe an architecture where the weight vector is updated without explicitly computing the filter error. Third, we describe an architecture which exploits the redundant computations of overlapping windows. All the architectures have a significantly smaller sample period compared to frequency-domain implementations. Moreover, the sample periods can be reduced even further by applying relaxed look-ahead techniques.
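The BLMS algorithm that all of these architectures implement is simple to state: the weight vector is held fixed over each block of L output samples and updated once per block with the gradient accumulated over the block. A minimal time-domain sketch (hypothetical naming, not one of the proposed processor-array architectures):

```python
import numpy as np

def blms(x, d, M=8, L=8, mu=0.05):
    """Block LMS: compute L outputs with a frozen weight vector, accumulate
    the L instantaneous gradients, then apply a single averaged update.
    Returns the final weights and the filter output."""
    N = min(len(x), len(d))
    w = np.zeros(M)
    y = np.zeros(N)
    for k in range(M, N - L + 1, L):
        g = np.zeros(M)
        for n in range(k, k + L):
            u = x[n - M + 1:n + 1][::-1]  # most recent M input samples
            y[n] = w @ u
            g += (d[n] - y[n]) * u        # accumulate block gradient
        w += (mu / L) * g                 # one weight update per block
    return w, y
```

Because the L inner products within a block share the same weights and overlapping input windows, they can be computed in parallel or with redundancy eliminated, which is the property the paper's architectures exploit to shorten the sample period.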

1 citation