Showing papers on "Microphone array" published in 1998


Journal ArticleDOI
TL;DR: A theoretical analysis of noise reduction and dereverberation algorithms based on a microphone array combined with a Wiener postfilter shows that an appreciable reduction of acoustic echo and localized noise is obtained, which makes the whole system highly attractive for hands-free communication systems.
Abstract: In teleconferencing systems, the use of hands-free sound pick-up reduces speech quality. This is due to ambient noise, acoustic echo, and the reverberation produced by the acoustical environment. This paper sets out to present a theoretical analysis of noise reduction and dereverberation algorithms based on a microphone array combined with a Wiener postfilter. It is shown that the transfer function of the postfilter depends on the input signal-to-noise ratio (SNR) and on the noise reduction yielded by the array. The use of a directivity-controlled array instead of a conventional beamformer is proposed to improve the performance of the whole system. Examples in real room environments are provided, which confirm the theoretical results. It is observed that the postfilter gives a limited reduction of the reverberation. In contrast, an appreciable reduction of acoustic echo and localized noise is obtained, which makes the whole system highly attractive for hands-free communication systems.

276 citations


PatentDOI
TL;DR: An apparatus and method in a video conference system provide accurate determination of the position of a speaking participant by measuring the difference in arrival times of a sound originating from the speaking participant, using as few as four microphones in a 3-dimensional configuration.
Abstract: An apparatus and method in a video conference system provide accurate determination of the position of a speaking participant by measuring the difference in arrival times of a sound originating from the speaking participant, using as few as four microphones in a 3-dimensional configuration. In one embodiment, a set of simultaneous equations relating the position of the sound source to each microphone, and relating the distances of the microphones to one another, is solved off-line and programmed into a host computer. In one embodiment, the set of simultaneous equations provides multiple solutions and the median of such solutions is picked as the final position. In another embodiment, an average of the multiple solutions is provided as the final position.

233 citations
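
The arrival-time differences and the median step described in the patent abstract above can be illustrated with a minimal Python sketch. The microphone layout, the speed of sound, and the function names `tdoas` and `median_position` are assumptions made for illustration, and taking the median separately on each coordinate is one possible reading of "the median of such solutions". The sketch does not solve the simultaneous equations themselves.

```python
import math
from statistics import median

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

# Hypothetical non-coplanar layout of four microphones (metres).
MICS = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]


def tdoas(source, mics=MICS, c=SPEED_OF_SOUND):
    """Arrival-time differences (seconds) of mics[1:] relative to mics[0]."""
    times = [math.dist(source, m) / c for m in mics]
    return [t - times[0] for t in times[1:]]


def median_position(candidates):
    """Component-wise median of several candidate (x, y, z) solutions."""
    return tuple(median(axis) for axis in zip(*candidates))
```

A source at (0.5, 0.5, 0.5) is equidistant from all four assumed microphones, so every time difference is zero; moving the source changes the differences that a solver would have to invert.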


Proceedings ArticleDOI
02 Jun 1998
TL;DR: In this paper, the noise sources of landing commercial aircraft were examined with planar arrays consisting of 96 or 111 microphones mounted on an 8 m by 8 m plate under the glide path on the ground.
Abstract: The noise sources of landing commercial aircraft were examined with planar arrays consisting of 96 or 111 microphones mounted on an 8 m by 8 m plate under the glide path on the ground. It is shown that important airframe noise sources can be identified in spite of the presence of engine noise, i.e., landing-gear noise, flap side-edge noise, flap-gap noise, jet-flap interaction noise, slat-horn noise, slat-track noise. A surprising finding is a noise source near the wing tips of some aircraft which is tentatively called wake-vortex wing interaction noise. It is shown to be by far the strongest noise source (6 dB(A) louder than the engines) on a regional jet aircraft.

113 citations


Journal ArticleDOI
TL;DR: The use of multi-microphone systems to overcome some undesired effects caused by room acoustics and by coherent/incoherent noise (e.g. competitive talkers, computer fans).

91 citations


Journal ArticleDOI
TL;DR: This two-part article presents the full design of the Huge Microphone Array and its justifications, and discusses performance for a few important algorithms relative to the use of processing capability, response latency, and difficulty of programming.
Abstract: The Huge Microphone Array project began in February 1994 to design, construct, debug, and test a real-time 512-microphone array system and to develop algorithms for it. Analysis of known algorithms indicated that signal-processing performance of over 6 Gflops would be required, while the need for portability (fitting it into a small van) also set an upper limit to the power required. These trade-offs and many others have led to a unique design in both hardware and software. This two-part article presents the full design and its justifications. The authors also discuss performance for a few important algorithms relative to the use of processing capability, response latency, and difficulty of programming.

71 citations


Proceedings ArticleDOI
12 May 1998
TL;DR: This method estimates noise using a subtractive microphone array, subtracts it from the noisy speech signal using spectral subtraction (SS), and can reduce LPC log spectral envelope distortions.
Abstract: This paper proposes a method of noise reduction by paired microphones as a front-end processor for speech recognition systems. This method estimates noise using a subtractive microphone array and subtracts it from the noisy speech signal using spectral subtraction (SS). Since this method estimates noise analytically and frame by frame, the estimate does not depend on the acoustic properties of the noise. Therefore, this method can also reduce non-stationary noise, for example the sudden noise of a door closing, which cannot be reduced by other SS methods. The results of computer simulations and experiments in a real environment show that this method can reduce LPC log spectral envelope distortions.

61 citations
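
As a rough illustration of the spectral subtraction step named in the abstract above, the following Python sketch subtracts a per-frame noise magnitude estimate from a noisy magnitude spectrum. The function name `spectral_subtract` and the spectral floor parameter are assumptions for illustration. In the paper the noise estimate comes from the subtractive microphone array, which is not modelled here.

```python
def spectral_subtract(noisy_mag, noise_mag, floor=0.01):
    """Magnitude spectral subtraction for one frame.

    noisy_mag, noise_mag: per-bin magnitude spectra of equal length.
    floor: assumed fraction of the noisy magnitude kept as a lower bound,
           so that no bin goes negative after subtraction.
    """
    return [max(y - n, floor * y) for y, n in zip(noisy_mag, noise_mag)]
```

Because the subtraction is applied frame by frame, a noise estimate that changes from one frame to the next (for example, a sudden transient) is handled the same way as a stationary one.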


Proceedings ArticleDOI
12 May 1998
TL;DR: Evaluation through a real time signal-processing system demonstrates that noise reduction achieved by the RAMA is over 12 dB even in reverberant environments, and the AMC based on the SNR estimate causes less breathing noise than the conventional AMC.
Abstract: A robust adaptive microphone array (RAMA) using a new adaptation-mode control method (AMC) is proposed, and its hardware evaluation results are presented. The adaptation of the RAMA is controlled based on a signal-to-noise ratio (SNR) estimate using the output powers of the fixed beamformer and the adaptive blocking matrix. The RAMA is implemented on a multi-DSP real time signal-processing system with a C-compiler. Simulation results with real acoustic data show that the AMC based on the SNR estimate causes less breathing noise than the conventional AMC and that it obtains a 1.0-point higher score on a 5-point mean opinion score scale. Evaluation through a real time signal-processing system demonstrates that noise reduction achieved by the RAMA is over 12 dB even in reverberant environments.

58 citations


Patent
24 Apr 1998
TL;DR: A telephone system includes two or more cardioid microphones held together and directed outwardly from a central point; control circuitry combines and analyzes signals from the microphones and selects the signal from one of the microphones or from one of one or more predetermined combinations of microphone signals in order to track a speaker as the speaker moves about a room or as various speakers situated about the room speak and then fall silent.
Abstract: A telephone system includes two or more cardioid microphones held together and directed outwardly from a central point. Mixing circuitry and control circuitry combine and analyze signals from the microphones and select the signal from one of the microphones or from one of one or more predetermined combinations of microphone signals in order to track a speaker as the speaker moves about a room or as various speakers situated about the room speak and then fall silent. Visual indicators, in the form of light emitting diodes (LEDs), are evenly spaced around the perimeter of a circle concentric with the microphone array. Mixing circuitry produces ten combination signals, A+B, A+C, B+C, A+B+C, A−B, B−C, A−C, A−0.5(B+C), B−0.5(A+C), and C−0.5(B+A), with the “listening beam” formed by combinations, such as A−0.5(B+C), that involve the subtraction of signals, generally being more narrowly directed than beams formed by combinations, such as A+B, that involve only the addition of signals. An omnidirectional combination A+B+C is employed when active speakers are widely scattered throughout the room. Weighting factors are employed in a known manner to provide unity gain output. Control circuitry selects the signal from the microphone or from one of the predetermined microphone combinations, based generally on the energy level of the signal, and employs the selected signal as the output signal. The control circuitry also operates to limit dithering between microphones and, by analyzing the beam selection pattern, may switch to a broader coverage pattern, rather than switching between two narrower beams that each cover one of the speakers.

56 citations
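
The ten combination signals and the energy-based selection described in the patent abstract above can be sketched in Python as follows. The coefficient table mirrors the combinations listed in the abstract. The normalisation by the sum of absolute coefficients is only an assumed stand-in for the unspecified "weighting factors", and the function names are hypothetical. The dithering limit and the switch to broader coverage patterns are not modelled.

```python
# Coefficients (A, B, C) for the ten combinations named in the abstract.
COMBINATIONS = {
    "A+B": (1, 1, 0),
    "A+C": (1, 0, 1),
    "B+C": (0, 1, 1),
    "A+B+C": (1, 1, 1),
    "A-B": (1, -1, 0),
    "B-C": (0, 1, -1),
    "A-C": (1, 0, -1),
    "A-0.5(B+C)": (1, -0.5, -0.5),
    "B-0.5(A+C)": (-0.5, 1, -0.5),
    "C-0.5(B+A)": (-0.5, -0.5, 1),
}


def combine(frame_a, frame_b, frame_c, coeffs):
    """Weighted mix of three microphone frames (lists of samples)."""
    wa, wb, wc = coeffs
    norm = abs(wa) + abs(wb) + abs(wc)  # assumed gain normalisation
    return [(wa * a + wb * b + wc * c) / norm
            for a, b, c in zip(frame_a, frame_b, frame_c)]


def energy(frame):
    """Mean-square energy of one frame."""
    return sum(x * x for x in frame) / len(frame)


def select_beam(frame_a, frame_b, frame_c):
    """Return (name, signal) of the combination with the highest energy."""
    mixes = {name: combine(frame_a, frame_b, frame_c, w)
             for name, w in COMBINATIONS.items()}
    best = max(mixes, key=lambda name: energy(mixes[name]))
    return best, mixes[best]
```

With two microphones receiving the same waveform in opposite phase and the third silent, the additive mixes cancel and the sketch picks the subtractive combination A-B.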


Proceedings ArticleDOI
12 May 1998
TL;DR: A robust speech recognition system for videoconference applications is presented based on a microphone array that is able to know the position of the users and increase the signal-to-noise ratio (SNR) between the desired speaker signal and the interference from the other users.
Abstract: A robust speech recognition system for videoconference applications is presented based on a microphone array. By means of a microphone array, the speech recognition system is able to know the position of the users and increase the signal-to-noise ratio (SNR) between the desired speaker signal and the interference from the other users. The user positions are estimated by combining a direction of arrival (DOA) estimation method with a speaker identification system. The beamforming is performed by using the spatial references of the desired speaker and the interference locations. A minimum variance algorithm with spatial constraints working in the frequency domain is used to design the weights of the broadband microphone array. Results of the speech recognition system are reported in a simulated environment with several users asking questions to a geographic database.

52 citations


Proceedings ArticleDOI
12 May 1998
TL;DR: This paper addresses the limitations of current approaches to distant-talker speech acquisition and advocates the development of techniques which explicitly incorporate the nature of the speech signal into a multi-channel context.
Abstract: This paper addresses the limitations of current approaches to distant-talker speech acquisition and advocates the development of techniques which explicitly incorporate the nature of the speech signal (e.g. statistical non-stationarity, method of production, pitch, voicing, formant structure, and source radiator model) into a multi-channel context. The goal is to combine the advantages of spatial filtering achieved through beamforming with knowledge of the desired time-series attributes. The potential utility of such an approach is demonstrated through the application of a multi-channel version of the dual excitation speech model.

46 citations


Proceedings ArticleDOI
12 May 1998
TL;DR: A nonlinear coherence filtering and a linear Wiener filtering are combined in the wavelet transform domain to improve the performance of the Wiener filter based postfilter, especially during pauses.
Abstract: Wiener filter based postfiltering has shown its usefulness in microphone array speech enhancement systems. In our earlier work, we developed a postfilter in the wavelet domain where better performance has been obtained compared to the algorithms developed in the Fourier domain. Furthermore, considerable computational savings are provided thanks to the multi-resolution and multi-rate analysis. This contribution shows that the coherence function, calculated between the beamforming output signal and the reference microphone output signal using the wavelet transform, provides relevant and exploitable information for further noise suppression. Thus, a nonlinear coherence filtering and a linear Wiener filtering are combined in the wavelet transform domain to improve the performance of the Wiener filter based postfilter, especially during pauses. Evaluations of the new algorithm confirm that the speech quality is indeed improved with significantly reduced distortions. Finally, the results of the objective measures are presented.

Proceedings ArticleDOI
31 May 1998
TL;DR: Computer simulation results show that the proposed method can be used to estimate the DOA of the speech signal at a low signal-to-noise ratio and with strong interference and reverberation.
Abstract: In this paper, we address the problem of DOA estimation for speech signals with a uniformly spaced linear microphone array. Based on the speech signal spectrum, a frequency-selective wideband DOA method is proposed that requires no a priori knowledge of the source location. Resampling, spatial interpolation and a least-squares technique are employed in the proposed DOA estimation for the far-field and near-field cases. Computer simulation results show that the proposed method can be used to estimate the DOA of the speech signal at a low signal-to-noise ratio and with strong interference and reverberation.

Proceedings ArticleDOI
W. Tager
12 May 1998
TL;DR: It is shown that if the desired source is in the near-field and the other sources are in the far-field, then even a small array can be at the same time highly directive and comparatively robust.
Abstract: In some array applications, the source of interest is close to the array, so that we have to use a near-field model. Almost always the near-field is considered as an additional difficulty. We contradict this point of view and show that if the desired source is in the near-field and the other sources are in the far-field, then even a small array can be at the same time highly directive and comparatively robust. Instead of relying on small phase differences for low frequencies, we fully exploit the fact that the amplitude vector of the source of interest is different from that of any other source. The array geometry should be chosen to enhance this effect. Unlike far-field superdirectivity, we can steer the main lobe to arbitrary directions without prohibitive loss of performance. We applied our method to microphone array sound pick up for workstations. Simulation results and measurements of a real time implementation on a fixed point DSP are provided.

Journal ArticleDOI
TL;DR: An expression is derived for the extent of the near field for a given microphone array size and operating wavelength; it is based on the distortion of the near-field array response and approximates the distance at which the magnitude response error due to plane-wave beamforming is 1 dB.
Abstract: An expression is derived which can be used to determine the extent of the near field for a given microphone array size and operating wavelength. This expression is based on the distortion of the near-field array response, and approximates the distance at which the magnitude response error due to plane-wave beamforming is 1 dB.

Journal ArticleDOI
TL;DR: A noise source identification technique for industrial applications is proposed that uses a microphone array and beamforming algorithms to calculate both the directions and the distances of long-range noise sources.
Abstract: A noise source identification technique is proposed for industrial applications by using a microphone array and beamforming algorithms. Both the directions and the distances of long-range noise sources are calculated. The conventional method, the minimum variance (MV) method, and the multiple signal classification (MUSIC) method are the main beamforming algorithms employed in this study. The results of numerical simulations and field tests indicate the effectiveness of the acoustic beamformer in identifying noise sources in industrial environments.

PatentDOI
TL;DR: A control circuit, driven by an ambient noise estimate, controls the gain of a pair of VCAs coupled to two fixed array processors so that the summed output maximises the signal-to-noise ratio of a signal emanating from a source in an on-beam direction relative to the microphone array.
Abstract: An apparatus and method for processing sound, suitable for use in association with a hearing aid, cochlear implant prosthesis or the like. Coupled to an array of microphones (1) is a pair of fixed array processors (2,4), each having different characteristic signal-to-noise performances and internal noise parameters in different levels of ambient noise. Based upon an ambient noise estimate derived from noise floor detector (8), a control circuit (5) controls the gain of a pair of VCAs (7,9) coupled to the fixed array processors (2,4) in order to produce an output signal from summer (16) which maximises the signal-to-noise ratio of a signal emanating from a source in an on-beam direction relative to the microphone array (10).

Proceedings ArticleDOI
12 May 1998
TL;DR: Results for a tracking, real-time microphone-array as an input to an HMM-based connected alpha-digits speech recognizer show that for a talker in the very near field of the array, performance approaches that of a close-talking microphone input device.
Abstract: A major problem for speech recognition systems is relieving the talker of the need to use a close-talking, head-mounted or a deskstand microphone. A likely solution is the use of an array of microphones that can steer itself to the talker and can use a beamforming algorithm to overcome the reduced signal-to-noise ratio due to room acoustics. This paper reports results for a tracking, real-time microphone-array as an input to an HMM-based connected alpha-digits speech recognizer. For a talker in the very near field of the array (within a meter), performance approaches that of a close-talking microphone input device. The effects of both the noise reducing steered array and the use of a maximum a posteriori (MAP) training step are shown to be significant. Here, the array system and the recognizer are described, experiments are presented, and the implications of combining these two systems are discussed.

Proceedings ArticleDOI
12 May 1998
TL;DR: A new speech recognition algorithm which considers multiple talker direction hypotheses simultaneously and performs a Viterbi search in 3-dimensional trellis space composed of talker directions, input frames, and HMM states shows that the proposed algorithm works well even if the talker moves.
Abstract: A microphone array is a promising solution for realizing hands-free speech recognition in real environments. Accurate talker localization is very important for speech recognition using a microphone array. However, localization of a moving talker is difficult in noisy reverberant environments. Talker localization errors degrade the performance of speech recognition. To solve the problem, this paper proposes a new speech recognition algorithm which considers multiple talker direction hypotheses simultaneously. The proposed algorithm performs a Viterbi search in a 3-dimensional trellis space composed of talker directions, input frames, and HMM states. As a result, a locus of the talker and a phoneme sequence of the speech are obtained by finding an optimal path with the highest likelihood. To evaluate the performance of the proposed algorithm, speech recognition experiments are carried out on simulated data and real environment data. These results show that the proposed algorithm works well even if the talker moves.

Patent
TL;DR: A microphone array is arranged in such a manner that the distances between the microphones and the cut-off frequencies for the low-pass filters are mutually adjusted in relation to one another.
Abstract: A microphone array comprises multiple microphones arranged in an elongated element or housing, in which the individual microphones are arranged in pairs. The two microphones in each pair are disposed on opposite sides of a centerline of the microphone array, and the signals from the microphones are summed to form the output signal of the microphone array. The microphones on each side of the centerline are disposed with non-equidistant spacing between them, and low-pass filters are coupled between each microphone and a summation link, with the microphones of one and the same pair connected to low-pass filters having the same cut-off frequency. The cut-off frequency of the low-pass filters is different for each pair of microphones: it is lowest for the pair that lies furthest from the centerline, and higher the closer the pair lies to the centerline. The microphone array is arranged in such a manner that the distances between the microphones and the cut-off frequencies for the low-pass filters are mutually adjusted in relation to one another.

Journal ArticleDOI
TL;DR: In this article, a new method to locate high speed moving sound sources based on a bilinear time-frequency transformation is proposed, where a microphone array, used as a directional sensor, is optimized in order to focus on the linear part of the frequency modulation due to the Doppler effect.

Journal ArticleDOI
TL;DR: Frequency-invariant beam-forming is used to perform frequency domain averaging, thereby reducing the correlation between the desired signal and the interference, and is useful for hands-free speech acquisition using a microphone array.
Abstract: A new technique for broadband minimum variance beam-forming is presented that overcomes the signal cancellation problem of conventional adaptive beamformers in the presence of correlated interference. Specifically, frequency-invariant beam-forming is used to perform frequency domain averaging, thereby reducing the correlation between the desired signal and the interference. Such a technique is useful for hands-free speech acquisition using a microphone array, since correlated interference will be present due to room reflections.

Patent
16 Jul 1998
TL;DR: A noise measuring system comprises a microphone array and a directivity forming section for controlling the direction of directivity of the microphone array; the delay time of the delay section is set so that the noise isolation characteristic, which causes any mobile noise generator not targeted for noise measurement to fall outside the directional plane or line of the array, is balanced against the deterioration of the array's directivity.
Abstract: In a noise measuring system comprising a microphone array and a directivity forming section for controlling the direction of directivity of the microphone array, the directivity forming section includes a delay section and an adding section, and the optimal value of a delay time of the delay section is set in such a manner that a noise isolation characteristic, which causes any mobile noise generator not targeted for noise measurement to fall outside a directional plane or a directional line formed in the direction of directivity of the microphone array, and a deterioration characteristic of the directivity of the microphone array are balanced against each other.

Proceedings ArticleDOI
04 May 1998
TL;DR: In this paper, a microphone array system for use in handsfree mobile telephone equipment is presented based on a fast and efficient "on-site" and self-calibration scheme.
Abstract: Presents a microphone array system for use in handsfree mobile telephone equipment. The array is based on a fast and efficient "on-site" and "self-calibration" scheme. The interior car cabin noise and the far-end speech are each suppressed by approximately 17 dB, while the near-end speaker level is maintained. The near-end signal is almost undistorted. The performance of two different algorithms, normalized least-mean-square (NLMS) and fully connected backpropagation supervised neural network (MLP-NN), is evaluated. The proposed microphone array calibration scheme can also be used in other situations such as speech recognition devices.


Journal ArticleDOI
Masato Abe, K. Fujii, Y. Nagata, T. Sone, K. Kido
TL;DR: A method to estimate the waveform of sound sources by using a microphone array where the microphones are not situated close to the sound sources, using the sound velocity and positions of the microphones and sound sources as the only fixed parameters is proposed.
Abstract: This paper proposes a method to estimate the waveform of sound sources. This is accomplished by using a microphone array where the microphones are not situated close to the sound sources. The waveform is estimated using the sound velocity and positions of the microphones and sound sources as the only fixed parameters. The positions of the sound sources can be estimated in advance by using a conventional method such as multiple signal classification method (MUSIC). An iteration method is introduced to reduce the effect of direct sounds and of significant image sources due to reflected noise and/or other sound sources, whose positions are known. Computer simulations and an experiment with loudspeakers were conducted to demonstrate the effectiveness of our method.

Proceedings ArticleDOI
01 Nov 1998
TL;DR: The authors describe a method of correcting for the high-frequency gain rise that occurs when a close-talking differential microphone is not positioned at the proper distance from the mouth, without significantly degrading the noise-canceling properties of first- and second-order differential microphones.
Abstract: Directional microphones are best noted for their noise reduction properties in communication systems. Close-talking differential microphones are particularly useful when the noise environment disturbs the ability to communicate without error, such as in public and cellular telephony, aircraft communications, etc. These differential microphones work best when they are spaced within 1 cm from the lips of the talker where the sound field has a large gradient. For a plane-wave sound field, the sensitivity rises proportional to ω^n, where n is the order of the difference. Users of differential microphones do not always correctly position the sensor at the proper distance from the mouth and therefore the sensitivity of the microphone may also rise proportional to ω^n, especially at high frequencies. We describe a method of correcting for this high frequency gain without significantly degrading the noise canceling properties of first and second-order differential microphones.
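
The plane-wave sensitivity law quoted in the abstract above can be checked with a short worked equation. The sketch assumes two ideal omnidirectional sensors a distance d apart, a plane wave arriving along their axis, and sound speed c; these symbols are introduced here for illustration and do not come from the paper.

```latex
% First-order difference of a unit-amplitude on-axis plane wave
% sampled at two points a distance d apart:
\left| 1 - e^{-j\omega d/c} \right|
  = 2\left|\sin\!\left(\frac{\omega d}{2c}\right)\right|
  \approx \frac{\omega d}{c}
  \qquad \text{for } \omega d \ll c .

% Cascading n such differences with the same spacing gives
\left| 1 - e^{-j\omega d/c} \right|^{n}
  \approx \left(\frac{\omega d}{c}\right)^{n} \propto \omega^{n} .
```

Under these assumptions a first-order microphone gains about 6 dB per octave and a second-order microphone about 12 dB per octave for a plane wave. This is the high-frequency rise that the paper sets out to correct when the microphone is held too far from the mouth.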

Proceedings ArticleDOI
29 Sep 1998
TL;DR: This paper presents a new speech source localization method in an adverse environment for microphone array systems applied to an octave-band decomposition of the signals obtained by the wavelet transform which uses a fast algorithm with short prototype filters.
Abstract: This paper presents a new speech source localization method in an adverse environment for microphone array systems. This method is applied to an octave-band decomposition of the signals obtained by the wavelet transform which uses a fast algorithm with short prototype filters. This method is applied in sub-bands separately and consists of two stages. First, a coarse region where the speech source is present is detected. Then the multi-beamforming operation is used to pinpoint the speaker's location. Both stages are based on the examination of the energy level and its variation. The complete algorithm provides relatively small errors in the source localization estimate.

Proceedings Article
01 Jan 1998
TL;DR: The proposed structure, based on a broadband subband-nested array, performs real-time estimations of the spatial coherence in order to determine the coherent/diffuse nature of the different subbands, using different filters in each case, improving also the classical Wiener post-filter, typically used for diffuse noise suppression, for proper cancellation of coherent noises.
Abstract: In this paper, the acoustic characteristics of sound fields in enclosed rooms are studied in the joint presence of speech and noise, in order to design a broadband microphone array system capable of coping with both coherent and diffuse noises. Several state-of-the-art speech enhancement array structures are presented and compared to our new system in terms of correct word recognition rates in a simple command and control task. The proposed structure, based on a broadband subband-nested array, performs real-time estimations of the spatial coherence in order to determine the coherent/diffuse nature of the different subbands, using different filters in each case, improving also the classical Wiener post-filter, typically used for diffuse noise suppression, for proper cancellation of coherent noises. The results obtained with a 15-channel simultaneous recording database in different reverberation and noise conditions show better performance than other structures previously proposed.

Patent
03 Apr 1998
TL;DR: A desired multiple linear microphone beam is obtained by selectively weighting the difference signals between paired microphone outputs and combining them, so as to strictly control beam shaping and steering of the array.
Abstract: PROBLEM TO BE SOLVED: To obtain a desired multiple linear microphone beam by selectively weighting the difference signals between paired microphone outputs and combining them, so as to strictly control beam shaping and steering of the array. SOLUTION: The microphone array comprises six small pressure-sensing omnidirectional microphones 221-223 which are mounted flat on the surface of a hard nylon ball 220 whose diameter is, e.g., 1.9 cm (3/4 inch). The microphones 221-223 are preferably placed at the points where the vertices of a regular octahedron inscribed in the sphere touch its surface. One or more general linear differential microphone beams, directed to one or more arbitrary angles in 3-dimensional space, are formed by selectively combining the three Cartesian orthogonal pairs with proper scalar weights. Such a microphone array is intended for applications such as surround sound recording/reproduction and virtual reality. COPYRIGHT: (C)1998,JPO
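
The weighting of the three orthogonal microphone pairs described in the abstract above can be sketched in Python. This is a minimal illustration that assumes the six microphones sit on the positive and negative x, y and z axes and that the scalar weights are the direction cosines of the desired look direction. It forms only a pure dipole beam and omits any omnidirectional term. The dictionary keys and function names are hypothetical.

```python
import math


def steering_weights(azimuth_deg, elevation_deg):
    """Direction cosines (wx, wy, wz) of the desired look direction."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


def pair_differences(frames):
    """Difference signal of each orthogonal pair.

    frames: dict with keys '+x', '-x', '+y', '-y', '+z', '-z',
            each mapping to a list of samples of equal length.
    """
    return tuple([p - n for p, n in zip(frames["+" + axis], frames["-" + axis])]
                 for axis in "xyz")


def steered_dipole(frames, azimuth_deg, elevation_deg):
    """Weighted sum of the three pair differences (a steered dipole)."""
    wx, wy, wz = steering_weights(azimuth_deg, elevation_deg)
    dx, dy, dz = pair_differences(frames)
    return [wx * a + wy * b + wz * c for a, b, c in zip(dx, dy, dz)]
```

Steering to azimuth 0 and elevation 0 returns the x-pair difference unchanged, while other angles blend the three pair differences without moving any microphone.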

Proceedings ArticleDOI
06 Oct 1998
TL;DR: A visual-information-directed microphone array system uses a real-time mouth tracking system to direct a beamformer at the mouth, and captures sound with a better signal-to-noise ratio in a high-noise environment.
Abstract: A visual-information-directed microphone array system is presented in this paper. The system uses a real-time mouth tracking system to direct a beamformer focused on the mouth. The microphone array system is implemented on a PC with a Signalogic 8-channel DSP board and captures sound with a better signal-to-noise ratio in a high-noise environment.
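
To illustrate how a tracked mouth position could steer a beamformer, the following Python sketch computes per-channel focusing delays and applies a basic delay-and-sum. The abstract does not state which beamforming algorithm the system uses, so delay-and-sum is an assumption here, as are the function names, the speed of sound, and the rounding of delays to whole samples.

```python
import math


def steering_delays(focus, mics, c=343.0):
    """Delays (seconds) that align every channel with the farthest microphone.

    focus: (x, y, z) of the tracked mouth position, in metres.
    mics:  list of (x, y, z) microphone positions, in metres.
    c:     assumed speed of sound in m/s.
    """
    dists = [math.dist(focus, m) for m in mics]
    farthest = max(dists)
    return [(farthest - d) / c for d in dists]


def delay_and_sum(channels, delays_s, fs):
    """Average the channels after delaying each by a whole number of samples."""
    shifts = [round(d * fs) for d in delays_s]
    n = len(channels[0])
    out = [0.0] * n
    for ch, s in zip(channels, shifts):
        for i in range(s, n):
            out[i] += ch[i - s]
    return [v / len(channels) for v in out]
```

Each time the tracker reports a new mouth position, recomputing `steering_delays` refocuses the array without any acoustic localization step.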