Author

Yutaka Kaneda

Bio: Yutaka Kaneda is an academic researcher from Tokyo Denki University. The author has contributed to research in topics: Impulse response & Signal. The author has an h-index of 21 and has co-authored 91 publications receiving 2629 citations. Previous affiliations of Yutaka Kaneda include Nippon Telegraph and Telephone & Spacelabs Healthcare.


Papers
Journal ArticleDOI
TL;DR: In this article, a novel method is proposed for realizing exact inverse filtering of acoustic impulse responses in a room, based on the principle called the multiple-input/output inverse theorem (MINT).
Abstract: A novel method is proposed for realizing exact inverse filtering of acoustic impulse responses in a room. This method is based on the principle called the multiple-input/output inverse theorem (MINT). The inverse is constructed from multiple finite-impulse response (FIR) filters (transversal filters) by adding some extra acoustic signal-transmission channels produced by multiple loudspeakers or microphones. The coefficients of these FIR filters can be computed by the well-known rules of matrix algebra. Inverse filtering in a sound field is investigated experimentally. It is shown that the proposed method is greatly superior to previous methods that use only one acoustic signal-transmission channel. The results prove the possibility of sound reproduction and sound reception without any distortion caused by reflected sounds.
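
The matrix-algebra step mentioned in the abstract can be illustrated with a minimal two-channel sketch. It assumes two short FIR channel responses of equal length with no common zeros; the function names (`mint_inverse`, `solve`, `convolve`) and the example coefficients are hypothetical and not taken from the paper. The filters `h1` and `h2` are chosen so that `g1*h1 + g2*h2` equals a unit impulse.

```python
def convolve(a, b):
    """Linear convolution of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def mint_inverse(g1, g2):
    """Return FIR filters h1, h2 with g1*h1 + g2*h2 = unit impulse.

    Assumes len(g1) == len(g2) and that g1, g2 share no common zeros,
    so the square convolution (Sylvester-type) matrix is invertible.
    """
    L = len(g1)
    n = L - 1            # taps per inverse filter
    size = 2 * n         # unknowns = output samples
    A = [[0.0] * size for _ in range(size)]
    for k in range(n):
        for i in range(L):
            A[i + k][k] = g1[i]        # columns acting on h1
            A[i + k][n + k] = g2[i]    # columns acting on h2
    target = [1.0] + [0.0] * (size - 1)
    h = solve(A, target)
    return h[:n], h[n:]


# Hypothetical channel responses chosen only for illustration.
g1 = [1.0, 0.5, 0.2]
g2 = [1.0, -0.4, 0.1]
h1, h2 = mint_inverse(g1, g2)
total = [a + b for a, b in zip(convolve(g1, h1), convolve(g2, h2))]
```

In this sketch `total` comes out as a unit impulse up to rounding error, which is the sense in which adding a second channel makes an exact FIR inverse possible; a single channel would generally leave a residual.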

734 citations

Journal ArticleDOI
TL;DR: The superiority of the AMNOR criterion over conventional LMS and constrained LMS criteria for reducing noise in speech signals was confirmed in subjective preference tests.
Abstract: This paper introduces a new adaptive microphone-array system for noise reduction (AMNOR system). It is first shown that there exists a tradeoff relationship between reducing the output noise power and reducing the frequency response degradation of a microphone-array to a desired signal. It is then shown that this tradeoff can be controlled by the introduction of a fictitious desired signal. A new optimization criterion is presented which minimizes the output noise power while maintaining the frequency response degradation below some pre-determined value (AMNOR criterion). AMNOR determines an optimal noise reduction filter based on this criterion by controlling the tradeoff utilizing the fictitious desired signal. Experiments on noise reduction processing were carried out in a room with a 0.4-s reverberation time. The superiority of the AMNOR criterion over conventional LMS and constrained LMS criteria for reducing noise in speech signals was confirmed in subjective preference tests. The AMNOR system improved the SNR by more than 15 dB in the 300-3200 Hz range.

278 citations

Journal ArticleDOI
TL;DR: A normalized least-mean-squares (NLMS) adaptive algorithm with double the convergence speed, at the same computational load, of the conventional NLMS for an acoustic echo canceller is proposed and its fast convergence is demonstrated.
Abstract: A normalized least-mean-squares (NLMS) adaptive algorithm with double the convergence speed, at the same computational load, of the conventional NLMS for an acoustic echo canceller is proposed. This algorithm, called the ES (exponentially weighted stepsize) algorithm, uses a different stepsize (feedback constant) for each weight of an adaptive transversal filter. These stepsizes are time-invariant and weighted proportionally to the expected variation of a room impulse response. The algorithm adjusts coefficients with large errors in large steps, and coefficients with small errors in small steps. A transition formula is derived for the mean-squared coefficient error of the algorithm. The mean stepsize determines the convergence condition, the convergence speed, and the final excess mean-squared error. Modified for a practical multiple DSP structure, the algorithm requires only the same amount of computation as the conventional NLMS. The algorithm is implemented in a commercial acoustic echo canceller, and its fast convergence is demonstrated.
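
The idea of a fixed, exponentially decaying stepsize per tap can be sketched as follows. This is a simplified illustration, not the paper's implementation: the function name `es_nlms`, the parameters `alpha0`, `gamma` and `eps`, and the synthetic echo path are all assumptions made for the example.

```python
import random


def es_nlms(x, d, num_taps, alpha0=1.0, gamma=0.9, eps=1e-8):
    """NLMS-style adaptation with a time-invariant stepsize per tap.

    Tap i uses stepsize alpha0 * gamma**i, so early taps (where a room
    impulse response is expected to vary most) adapt in larger steps.
    alpha0, gamma and eps are illustrative values, not from the paper.
    """
    steps = [alpha0 * gamma ** i for i in range(num_taps)]
    w = [0.0] * num_taps
    buf = [0.0] * num_taps          # most recent input sample first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        e = dn - sum(wi * bi for wi, bi in zip(w, buf))
        norm = sum(b * b for b in buf) + eps
        w = [wi + si * e * bi / norm for wi, si, bi in zip(w, steps, buf)]
    return w


# Synthetic, noise-free echo path with an exponentially decaying envelope.
h_true = [(-0.8) ** i for i in range(16)]
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(4000)]
d = [sum(h_true[i] * x[n - i] for i in range(len(h_true)) if n - i >= 0)
     for n in range(len(x))]

w = es_nlms(x, d, num_taps=16)
misalignment = (sum((a - b) ** 2 for a, b in zip(h_true, w))
                / sum(a * a for a in h_true))
```

Setting `gamma=1.0` reduces the sketch to ordinary NLMS with a single stepsize, which makes it easy to compare the two on the same data.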

148 citations

Journal ArticleDOI
TL;DR: A method of segregating desired speech from concurrent sounds received by two microphones that improved the signal-to-noise ratio by over 18 dB and clarified the effect of frequency resolution on the proposed method.
Abstract: We have developed a method of segregating desired speech from concurrent sounds received by two microphones. In this method, which we call SAFIA, signals received by two microphones are analyzed by discrete Fourier transformation. For each frequency component, differences in the amplitude and phase between channels are calculated. These differences are used to select frequency components of the signal that come from the desired direction and to reconstruct these components as the desired source signal. To clarify the effect of frequency resolution on the proposed method, we conducted three experiments. First, we analyzed the relationship between frequency resolution and the power spectrum’s cumulative distribution. We found that the speech-signal power was concentrated on specific frequency components when the frequency resolution was about 10 Hz. Second, we determined whether a given frequency resolution decreased the overlap between the frequency components of two speech signals. A 10-Hz frequency resolution minimized the overlap. Third, we analyzed the relationship between sound quality and frequency resolution through subjective tests. The best frequency resolution in terms of sound quality corresponded to the frequency resolutions that concentrated the speech signal power on specific frequency components and that minimized the degree of overlap. Finally, we demonstrated that this method improved the signal-to-noise ratio by over 18 dB.
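
The selection of frequency components by inter-channel differences can be illustrated with a toy sketch. It assumes a single analysis frame, uses only the amplitude difference (the phase difference described in the abstract is not used here), and stands in two pure tones for the two sources; the function names and mixing gains are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def select_by_level(mic1, mic2):
    """Keep the bins of mic1 that are stronger at mic1 than at mic2."""
    X1, X2 = dft(mic1), dft(mic2)
    kept = [a if abs(a) > abs(b) else 0.0 for a, b in zip(X1, X2)]
    return idft(kept)


# Toy scene: source A is louder at microphone 1, source B at microphone 2.
n = 64
s_a = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
s_b = [math.sin(2 * math.pi * 7 * t / n) for t in range(n)]
mic1 = [a + 0.5 * b for a, b in zip(s_a, s_b)]
mic2 = [0.5 * a + b for a, b in zip(s_a, s_b)]

recovered = select_by_level(mic1, mic2)
max_error = max(abs(r - a) for r, a in zip(recovered, s_a))
```

Because the two tones occupy different bins, the selection recovers source A almost exactly here. Real speech signals overlap in frequency, which is why the abstract studies how the frequency resolution affects that overlap.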

144 citations

Journal ArticleDOI
TL;DR: A new model for a room transfer function (RTF) by using common acoustical poles that correspond to resonance properties of a room is proposed, which requires far fewer variable parameters to represent RTF's than the conventional all-zero or pole/zero model.
Abstract: A new model for a room transfer function (RTF) by using common acoustical poles that correspond to resonance properties of a room is proposed. These poles are estimated as the common values of many RTF's corresponding to different source and receiver positions. Since there is one-to-one correspondence between poles and AR coefficients, these poles are calculated as common AR coefficients by two methods: (i) using the least squares method, assuming all the given multiple RTF's have the same AR coefficients and (ii) averaging each set of AR coefficients estimated from each RTF. The estimated poles agree well with the theoretical poles when estimated with the same order as the theoretical pole order. When estimated with a lower order than the theoretical pole order, the estimated poles correspond to the major resonance frequencies, which have high Q factors. Using the estimated common AR coefficients, the proposed method models the RTF's with different MA coefficients. This model is called the common-acoustical-pole and zero (CAPZ) model, and it requires far fewer variable parameters to represent RTF's than the conventional all-zero or pole/zero model. This model was used for an acoustic echo canceller at low frequencies, as one example. The acoustic echo canceller based on the proposed model requires half the variable parameters and converges 1.5 times faster than one based on the all-zero model, confirming the efficiency of the proposed model.
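
Method (i), least squares over several responses that are assumed to share the same AR coefficients, can be sketched for a toy second-order case. The synthetic responses, the pole radius and angle, and the function names are assumptions made for illustration; actual room responses would need a much higher order.

```python
import math


def pole_zero_response(b, a1, a2, length):
    """Impulse response of h[n] = b[n] + a1*h[n-1] + a2*h[n-2]."""
    h = []
    for n in range(length):
        v = b[n] if n < len(b) else 0.0
        if n >= 1:
            v += a1 * h[n - 1]
        if n >= 2:
            v += a2 * h[n - 2]
        h.append(v)
    return h


def common_ar2(responses, ma_order):
    """Least-squares estimate of AR coefficients shared by all responses.

    Only samples beyond the MA order are used, where each response obeys
    the pure recursion h[n] = a1*h[n-1] + a2*h[n-2].
    """
    r11 = r12 = r22 = p1 = p2 = 0.0
    for h in responses:
        for n in range(max(ma_order + 1, 2), len(h)):
            u1, u2 = h[n - 1], h[n - 2]
            r11 += u1 * u1
            r12 += u1 * u2
            r22 += u2 * u2
            p1 += h[n] * u1
            p2 += h[n] * u2
    det = r11 * r22 - r12 * r12      # 2x2 normal equations, Cramer's rule
    return (p1 * r22 - p2 * r12) / det, (p2 * r11 - p1 * r12) / det


# One resonance (a complex pole pair) shared by three synthetic responses.
radius, angle = 0.9, math.pi / 4
a1_true = 2 * radius * math.cos(angle)
a2_true = -radius ** 2
ma_sets = [[1.0, 0.5], [0.3, -0.7], [-0.6, 0.2]]   # differ per position
responses = [pole_zero_response(b, a1_true, a2_true, 40) for b in ma_sets]

a1_est, a2_est = common_ar2(responses, ma_order=1)
```

The three responses differ only in their MA part, mirroring the CAPZ idea that the poles are common to all source and receiver positions while the zeros vary.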

131 citations


Cited by
01 Jan 2012
TL;DR: The standardization of the IC model is discussed, and, given n independent copies of x, the aim is to find an estimate of an unmixing matrix Γ such that Γx has independent components.

2,296 citations

Journal ArticleDOI
TL;DR: The results demonstrate that there exist ideal binary time-frequency masks that can separate several speech signals from one mixture and show that these ideal masks can be approximated in the case where two anechoic mixtures are provided.
Abstract: Binary time-frequency masks are powerful tools for the separation of sources from a single mixture. Perfect demixing via binary time-frequency masks is possible provided the time-frequency representations of the sources do not overlap: a condition we call W-disjoint orthogonality. We introduce here the concept of approximate W-disjoint orthogonality and present experimental results demonstrating the level of approximate W-disjoint orthogonality of speech in mixtures of various orders. The results demonstrate that there exist ideal binary time-frequency masks that can separate several speech signals from one mixture. While determining these masks blindly from just one mixture is an open problem, we show that we can approximate the ideal masks in the case where two anechoic mixtures are provided. Motivated by the maximum likelihood mixing parameter estimators, we define a power weighted two-dimensional (2-D) histogram constructed from the ratio of the time-frequency representations of the mixtures that is shown to have one peak for each source with peak location corresponding to the relative attenuation and delay mixing parameters. The histogram is used to create time-frequency masks that partition one of the mixtures into the original sources. Experimental results on speech mixtures verify the technique. Example demixing results can be found online at http://alum.mit.edu/www/rickard/bss.html.

1,543 citations

Journal ArticleDOI
TL;DR: The importance of having a clear understanding of the principles behind both the acoustics and the electrical control in order to appreciate the advantages and limitations of active noise control is emphasized.
Abstract: Active noise control exploits the long wavelengths associated with low frequency sound. It works on the principle of destructive interference between the sound fields generated by the original primary sound source and that due to other secondary sources, acoustic outputs of which can be controlled. The acoustic objectives of different active noise control systems and the electrical control methodologies that are used to achieve these objectives are examined. The importance of having a clear understanding of the principles behind both the acoustics and the electrical control in order to appreciate the advantages and limitations of active noise control is emphasized. A brief discussion of the physical basis of active sound control that concentrates on three-dimensional sound fields is presented.

965 citations

Journal ArticleDOI
D.L. Duttweiler
TL;DR: On typical echo paths, the proportionate normalized least-mean-squares (PNLMS) adaptation algorithm converges significantly faster than the normalized least-mean-squares (NLMS) algorithm generally used in echo cancelers to date.
Abstract: On typical echo paths, the proportionate normalized least-mean-squares (PNLMS) adaptation algorithm converges significantly faster than the normalized least-mean-squares (NLMS) algorithm generally used in echo cancelers to date. In PNLMS adaptation, the adaptation gain at each tap position varies from position to position and is roughly proportional at each tap position to the absolute value of the current tap weight estimate. The total adaptation gain being distributed over the taps is carefully monitored and controlled so as to hold the adaptation quality (misadjustment noise) constant. PNLMS adaptation only entails a modest increase in computational complexity.
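
The proportionate gain described above can be sketched in a simplified form. This is an illustration only, not the algorithm exactly as published: the function name `pnlms`, the parameters `mu`, `rho`, `delta` and `eps`, and the normalization by the gain-weighted input power are assumptions chosen to keep the example short.

```python
import random


def pnlms(x, d, num_taps, mu=0.5, rho=0.01, delta=0.01, eps=1e-8):
    """Simplified proportionate NLMS.

    Each tap's gain is roughly proportional to the magnitude of its
    current weight estimate, with a floor so small taps keep adapting.
    All parameter values are illustrative assumptions.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps          # most recent input sample first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        e = dn - sum(wi * bi for wi, bi in zip(w, buf))
        floor = rho * max(delta, max(abs(wi) for wi in w))
        gains = [max(floor, abs(wi)) for wi in w]
        # Normalizing by the gain-weighted input power keeps the overall
        # adaptation gain controlled however it is spread over the taps.
        norm = sum(g * b * b for g, b in zip(gains, buf)) + eps
        w = [wi + mu * g * e * b / norm for wi, g, b in zip(w, gains, buf)]
    return w


# Sparse, noise-free synthetic echo path: two active taps out of sixteen.
h_true = [0.0] * 16
h_true[3] = 1.0
h_true[9] = -0.5
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
d = [sum(h_true[i] * x[n - i] for i in range(len(h_true)) if n - i >= 0)
     for n in range(len(x))]

w = pnlms(x, d, num_taps=16)
misalignment = (sum((a - b) ** 2 for a, b in zip(h_true, w))
                / sum(a * a for a in h_true))
```

With all weights at zero the gains are equal, so the first updates behave like ordinary NLMS; the proportionate behaviour appears once the large taps start to emerge.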

862 citations

Journal ArticleDOI
S. Biyiksiz
01 Mar 1985
TL;DR: This book by Elliott and Rao is a valuable contribution to the general areas of signal processing and communications and can be used for a graduate level course in perhaps two ways.
Abstract: There has been a great deal of material in the area of discrete-time transforms that has been published in recent years. This book does an excellent job of presenting important aspects of such material in a clear manner. The book has 11 chapters and a very useful appendix. Seven of these chapters are essentially devoted to the Fourier series/transform, discrete Fourier transform, fast Fourier transform (FFT), and applications of the FFT in the area of spectral estimation. Chapters 8 through 10 deal with many other discrete-time transforms and algorithms to compute them. Of these transforms, the Karhunen-Loève, the discrete cosine, and the Walsh-Hadamard transform are perhaps the most well-known. A lucid discussion of number theoretic transforms is presented in Chapter 11. This reviewer feels that the authors have done a fine job of compiling the pertinent material and presenting it in a concise and clear manner. There are a number of problems at the end of each chapter, an appreciable number of which are challenging. The authors have included a comprehensive set of references at the end of the book. In brief, this book is a valuable contribution to the general areas of signal processing and communications. It can be used for a graduate level course in perhaps two ways. One would be to cover the first seven chapters in great detail. The other would be to cover the whole book by focussing on different topics in a selective manner. This book by Elliott and Rao is extremely useful to researchers/engineers who are working in the areas of signal processing and communications. It is also an excellent reference book, and hence a valuable addition to one’s library.

843 citations