
Showing papers on "Microphone array" published in 1994


PatentDOI
TL;DR: A teleconferencing system in which a video camera generates a video signal representative of a video image of a first station B, and a microphone array (150, 160) receives sound from one or more fixed non-overlapping volume zones (151-159) into which the first station is divided.
Abstract: A teleconferencing system (100) is disclosed having a video camera for generating a video signal representative of a video image of a first station B. A microphone array (150, 160) is also provided in the first station for receiving a sound from one or more fixed non-overlapping volume zones (151-159) into which the first station is divided. The microphone array is also provided for generating a monochannel audio signal (170) representative of the received sound and a direction signal indicating, based on the sound received from each zone, from which of the volume zones the sound originated. The teleconferencing system also includes a display device (120A) at a second station A for displaying a video image of the first station. A loudspeaker control device (140) is also provided at the second station for selecting a virtual location (121) on the displayed video image depending on the direction signal, and for generating stereo sound from the monochannel audio signal which stereo sound emanates from the virtual location on the displayed video image.

351 citations
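
The step from a direction signal to stereo sound at a virtual location can be illustrated with a small sketch. The abstract does not say how the loudspeaker control device derives the stereo signal, so everything here is an assumption: the zones are taken to map linearly from left to right across the display, a constant-power panning law is used, and the names `pan_gains` and `mono_to_stereo` are hypothetical.

```python
import math

def pan_gains(zone_index, num_zones):
    """Constant-power (left, right) gains for a zone index in [0, num_zones - 1].

    Assumption: zone 0 maps to the far left of the displayed image and the
    last zone to the far right. The source does not specify this mapping.
    """
    if num_zones < 2:
        return (math.sqrt(0.5), math.sqrt(0.5))
    position = zone_index / (num_zones - 1)  # 0.0 = far left, 1.0 = far right
    angle = position * math.pi / 2
    return (math.cos(angle), math.sin(angle))

def mono_to_stereo(samples, zone_index, num_zones):
    """Turn mono samples into (left, right) pairs panned toward the zone."""
    left_gain, right_gain = pan_gains(zone_index, num_zones)
    return [(s * left_gain, s * right_gain) for s in samples]
```

With nine zones, as in the numbering 151-159, the middle zone (index 4) yields equal left and right gains, and the squared gains always sum to one, so perceived loudness stays roughly constant as the virtual location moves.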


PatentDOI
Juergen Cezanne1, Gary W. Elko1
TL;DR: A method and apparatus for enhancing the signal-to-noise ratio of a microphone array that includes a plurality of microphones and has a directivity pattern adjustable through one or more parameters.
Abstract: The present invention is directed to a method and apparatus for enhancing the signal-to-noise ratio of a microphone array. The array includes a plurality of microphones and has a directivity pattern which is adjustable based on one or more parameters. The parameters are evaluated so as to realize an angular orientation of a directivity pattern null. This angular orientation of the directivity pattern null reduces microphone array output signal level. Parameter evaluation is performed under a constraint that the null be located within a predetermined region of space. Advantageously, the predetermined region of space is a region from which undesired acoustic energy is expected to impinge upon the array, and the angular orientation of a directivity pattern null substantially aligns with the angular orientation of undesired acoustic energy. Output signals of the array microphones are modified based on one or more evaluated parameters. An array output signal is formed based on modified and unmodified microphone output signals. The evaluation of parameters, the modification of output signals, and the formation of an array output signal may be performed a plurality of times to obtain an adaptive array response. Embodiments of the invention include those having a plurality of directivity patterns corresponding to a plurality of frequency subbands. Illustratively, the array may comprise a plurality of cardioid sensors.

188 citations
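
The idea of evaluating a parameter so that a null lands inside a constrained region can be sketched with two back-to-back cardioid signals. The abstract gives no formulas, so this is one plausible instance under idealized, frequency-independent assumptions: the forward cardioid has pattern (1 + cos θ)/2, the backward cardioid (1 − cos θ)/2, the output is `front - beta * back`, and `beta` is clipped to [0, 1] so the null stays in the rear half-plane (90° to 180°). The function names are hypothetical.

```python
import math

def estimate_beta(front, back):
    """Least-squares beta minimizing the power of front - beta * back.

    Clipping to [0, 1] is the constraint that keeps the null in the rear
    half-plane; this choice of region is an assumption for illustration.
    """
    num = sum(f * b for f, b in zip(front, back))
    den = sum(b * b for b in back)
    if den == 0.0:
        return 0.0
    return min(1.0, max(0.0, num / den))

def null_angle_deg(beta):
    """Angle at which (1 + cos t)/2 - beta * (1 - cos t)/2 equals zero."""
    return math.degrees(math.acos((beta - 1.0) / (beta + 1.0)))

def array_output(front, back, beta):
    """Combine the unmodified front signal with the scaled back signal."""
    return [f - beta * b for f, b in zip(front, back)]
```

For a single interferer at 120°, the unconstrained minimizer is beta = 1/3, which places the null at 120°. For an interferer in the front half-plane the clipped value is 1, and the null stays at 90°.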


Journal Article
TL;DR: The theory and design of a logarithmically spaced microphone array with a spatial filtering capability are presented; the spatial filtering reduces the effects of reverberation and interfering noise sources that degrade the quality of sound pickup in teleconference systems.
Abstract: The theory and the design of a logarithmically spaced microphone array with a spatial filtering capability are presented. The spatial filtering reduces the effects of reverberation and interfering noise sources that degrade the quality of sound pickup in teleconference systems. This feature is a result of the array's ability to discriminate against sound arrivals from all directions except that of the desired source.

73 citations


Journal ArticleDOI
TL;DR: The problem of combining the outputs of an array of microphones as a single input for a hearing aid is investigated and realistic predictions of speech enhancement provided by robust adaptive microphone array processors are discussed.
Abstract: The problem of combining the outputs of an array of microphones as a single input for a hearing aid is investigated. Emphasis is placed on the conservative prediction of realistically achievable performance gains provided by the array over a single microphone. Performance improvement is measured as a change in the speech reception threshold (SRT) between single microphone and multimicrophone conditions. Consistent with previous work, predictions of this change in SRT using intelligibility averaged gain, [symbol: see text] are shown to be good. Consequently, this measure is used, along with changes in signal-to-noise ratios (SNRs), to evaluate array performance. The results presented include the effects of acoustic headshadow, small room reverberation, microphone placement uncertainty, and desired speaker location uncertainty. It is in this context that realistic predictions of speech enhancement provided by robust adaptive microphone array processors are discussed. Performance improvements are demonstrated relative to the "best" single microphone in the array for three types of spatial filters: Fixed, robust block processed, and robust adaptive. The performance of the robust block processed arrays is shown to be attainable with adaptive implementations. One fundamental criterion employed in robust beamformer design directly limits the amount of cancellation of the desired signal that can occur.

70 citations



Journal ArticleDOI
TL;DR: In this article, a method for optimizing a beamformer for a one-dimensional microphone array, taking into consideration nonideal features of the sensors and the mounting, is presented.
Abstract: An array of sensors can be used in conjunction with a beamformer, which processes the sensor signals, to achieve a directional response. The beamformer has to be designed such that a beam pattern with certain desired characteristics like specific main beam direction, defined main beam shape, and desired sidelobe level is formed. Conventional methods for the design of beamformers assume sensors with ideal features and do not take the disturbance of the sound field, caused by the mounting of the array, into account. Therefore the predicted theoretical polar response and the measured response often differ significantly. This paper presents a method for optimizing a beamformer for a one‐dimensional microphone array, taking into consideration nonideal features of the sensors and the mounting. Thus the actual polar response can be improved. By evaluating cross‐correlation functions of the sensor signals during a calibration procedure in an anechoic chamber and minimizing the mean squared error between the beamformer output and a prescribed response, optimum parameters for the beamformer are assigned.

37 citations
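
The mean-squared-error step in the abstract above can be written out as a short derivation. The notation is assumed, not taken from the paper: x(t) stacks the (possibly tapped) sensor signals recorded during calibration, d(t) is the prescribed response, w is the real-valued beamformer parameter vector, R is the sensor correlation matrix, and p is the cross-correlation between the sensor signals and the prescribed response.

```latex
\begin{aligned}
y(t) &= \mathbf{w}^{T}\mathbf{x}(t), \\
J(\mathbf{w}) &= E\{(y(t) - d(t))^{2}\}
  = \mathbf{w}^{T}\mathbf{R}\,\mathbf{w} - 2\,\mathbf{w}^{T}\mathbf{p} + E\{d(t)^{2}\}, \\
\mathbf{R} &= E\{\mathbf{x}(t)\,\mathbf{x}(t)^{T}\}, \qquad
\mathbf{p} = E\{\mathbf{x}(t)\,d(t)\}, \\
\nabla_{\mathbf{w}} J &= 2\,\mathbf{R}\,\mathbf{w} - 2\,\mathbf{p} = \mathbf{0}
\;\Longrightarrow\;
\mathbf{w}_{\mathrm{opt}} = \mathbf{R}^{-1}\mathbf{p}.
\end{aligned}
```

Because R and p are estimated from measured signals, nonideal sensor responses and the disturbance caused by the mounting are absorbed into the solution. That is a plausible reading of why the calibrated response can improve on the idealized design.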


Proceedings ArticleDOI
19 Apr 1994
TL;DR: Potential implications of measuring the radiation pattern of a talker for the recognition and enhancement of speech are discussed, and parameter-estimation results using fixed-radius sources (loudspeakers) are presented.
Abstract: A microphone array has the capability of capturing the properties of a significant portion of a talker's radiation pattern. In this paper, potential implications of measuring the radiation pattern of a talker for the recognition and enhancement of speech are discussed. Current applications of microphone arrays entail their installation in small enclosures such as a conference room or an automobile, typically placing a talker in the array's near field. Fitting a nominal acoustic model to a sparse sampling of the radiation pattern yields parameter estimates which should prove useful as features for speech recognition. Parameter-estimation results using fixed-radius sources (loudspeakers) are presented. Other considerations unique to placement of a talker in close proximity to an array are discussed.

26 citations


Journal ArticleDOI
TL;DR: A broadband constant‐beamwidth, 4 oct steerable linear microphone array using directional elements has been designed and constructed, with FIR filters inserted in the delay‐sum beamformer after each element.
Abstract: The quality of audio teleconferencing in large rooms and noisy environments can be increased with the use of steerable directional microphone arrays. A minimum bandwidth of 4 oct is required to faithfully transmit the speech signal. In a typical teleconferencing arrangement, only discrete angular directions are of interest and therefore the microphone steering directions are quantized. A standard delay‐sum beamformer can result in noticeable frequency response changes as the talker moves between these steering locations. In an effort to mitigate this problem, a broadband constant‐directivity beamformer has been designed and constructed. A few of the algorithms developed in this work will be discussed and compared to existing techniques. Basically, the solution revolves around the design of FIR filters that are inserted in the delay‐sum beamformer after each element. A constant‐beamwidth 4 oct steerable linear array microphone using directional elements will be described. A real‐time implementation utilizing multiple AT&T DSP3210 digital signal processors is also described.

15 citations


Dissertation
01 Sep 1994
TL;DR: This work studies modified adaptive algorithms that improve the performance of adaptive microphone array hearing aids at high target-to-jammer ratio (TJR) and in reverberation.
Abstract: A common complaint of hearing aid users is the difficulty encountered when listening to a talker in a noisy environment. Conventional hearing aids amplify all sounds without discriminating between the desired source (target) and background noises (jammers). These devices increase the overall sound levels, but do nothing to improve target-to-jammer ratio (TJR). Research on microphone array hearing aids is motivated by the lack of success of single-microphone systems, as well as the documented advantages of binaural hearing and multiple-element sensing systems. Array processing can be classified as either fixed (time invariant) or adaptive (time varying). Previous work on microphone array hearing aids has demonstrated that under certain conditions, adaptive arrays can provide significantly better performance than simpler fixed arrays. The benefit of adaptive systems is realized when the input TJR is low and when the signals arriving via direct paths are stronger than the reflections. This benefit is reduced or eliminated at high TJR or in strong reverberation. This work studies modified adaptive algorithms to improve performance at high TJR and in reverberation; it also provides complete specifications for the design of an adaptive microphone array hearing aid. In particular, two previously proposed ad hoc methods for controlling adaptation at high TJR are analyzed and evaluated. The results confirm the usefulness of these methods and provide guidelines for selecting relevant parameters in anechoic and reverberant environments. In addition, an analysis of the specific causes of target cancellation in reverberation reveals that a simple set of parameter choices can solve this problem. Computer simulations of the complete system demonstrate its benefits in a variety of acoustic environments. Steady-state results show that the system provides very large improvements i

15 citations


Journal ArticleDOI
TL;DR: A practical real-time digital microphone array system that uses digital technology and consists of a 16-channel digitizing front end, a 6-processor AT&T SURFboard for signal processing, and a Sun4 workstation for array control and data recording.
Abstract: From the thesis "Digital Hardware and Control for a Beam-Forming Microphone Array" by Daniel V. Rabinkin (thesis director: James L. Flanagan). Microphone arrays can be used for high-quality sound pick up in reverberant and noisy environments. Conventional single microphone methods suffer severe degradation in quality under these conditions. The beamforming capabilities of the microphone array system allow highly directional sound capture, providing enhanced signal-to-noise ratio (SNR) when compared to single microphone performance. Single beamforming arrays operate by summing the delayed outputs of the component microphones. The array has a focus location that is determined by the geometry of microphone spacing and the individual delay values. The technique of beamforming allows the focus location to be shifted by insertion of variable delay lines between the microphones and the summing element. Directional steering of the array is achieved by control of the delay lines and requires no physical movement of the system. Previous microphone array implementations have been carried out using analog delay lines due to limitations in digital processing speed. Technical advances in processor speed and memory availability now allow construction of a microphone array system that uses digital technology. This provides precise control and easy modification of the beamforming algorithm. Additional techniques such as use of adaptive anti-reverberation filters, which were not feasible with the analog approach, can now be implemented. A practical real-time digital microphone array system is described in this thesis. The system consists of a 16-channel digitizing front end, a 6-processor AT&T SURFboard for signal processing, and a Sun4 workstation for array control and data recording. The system can easily be expanded to handle a greater channel count. The implemented system is portable and provides hands free untethered use. It can track a moving speaker and adapt to changing environments. The array is an ideal sound capture device for circumstances where it is costly or inconvenient to provide close talking microphones for all potential sound sources of interest. Possible applications for microphone arrays include conference centers, concerts, sporting events, and cellular radio in automobiles.
Acknowledgements: I have many people to thank for the generous help they provided me over the course of this project. I would first and foremost like to thank my advisor Dr. James Flanagan, who provided unlimited help and support. Without his guidance, goodwill and deep understanding of the subject matter, this work would not have been possible. Thanks also go to Dr. Joseph Wilder and Dr. Mark Kahrs who formed my thesis committee and reviewed my work. I owe deep gratitude to Jim Snyder of Bell Labs AT&T, the designer of the SURFboard, who gave me invaluable advice and insight regarding the SURFboard and other hardware issues. I would also like to thank Mel Melchner and Larry Cohen, also of AT&T Bell Labs who provided additional SURFboard related advice. Further thanks go to Art Dahl who did much of the layout work for the PC boards and provided help in debugging the hardware. Also to be thanked are John Scaffidi who manufactured the PC board, and Joe French who did the analog design and helped with the hardware debugging. I would like to thank Randy Goldberg and everyone at CAIP who provided advice and encouragement and a most pleasant atmosphere in which to work on this project. Support for this work was provided by a grant from the Circuit and Signal Processing Division of the National Science Foundation under Contract No. MIP-9121541, and by sustaining grants to the CAIP Center from the New Jersey Commission on Science and Technology.

13 citations
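
The delay-and-sum operation described in the thesis abstract above (delay each microphone channel, then sum, with the focus set by the delay values) can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: it uses whole-sample delays, an assumed speed of sound of 343 m/s, and hypothetical function names.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def steering_delays(mic_positions, focus, sample_rate):
    """Integer sample delays that align a source at `focus` across microphones.

    Nearer microphones hear the source earlier, so they are delayed more;
    the farthest microphone gets zero delay.
    """
    distances = [math.dist(p, focus) for p in mic_positions]
    farthest = max(distances)
    return [round((farthest - d) / SPEED_OF_SOUND * sample_rate) for d in distances]

def delay_and_sum(channels, delays):
    """Delay each channel by its sample count and average the results."""
    length = len(channels[0])
    out = [0.0] * length
    for samples, delay in zip(channels, delays):
        for n in range(delay, length):
            out[n] += samples[n - delay]
    return [v / len(channels) for v in out]
```

Moving the focus only changes the delay list, which matches the abstract's point that steering requires no physical movement of the array. A practical system would need fractional delays, which this sketch omits.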


Journal ArticleDOI
TL;DR: A method for delay estimation and talker location with a microphone array is described that preserves the low computational complexity and rapid tracking ability of the frequency‐domain delay estimator, while improving the coherence and stability of the estimated delays and derived source locations.
Abstract: A frequency‐domain delay estimator has been used as the basis of a microphone‐array talker location and beamforming system [M. S. Brandstein and H. F. Silverman, Techn. Rep. LEMS‐116 (1993)]. While the estimator has advantages over previously employed correlation‐based delay estimation methods [H. F. Silverman and S. E. Kirtman, Comput. Speech Lang. 6, 129–152 (1990)], including a shorter analysis window and greater accuracy at lower computational cost, it has the disadvantage that since delays between microphone pairs are estimated independently of one another, there is nothing to ensure that a set of estimated delays corresponds to a single location. This not only introduces errors in talker location but degrades the performance of the beamformer. A method for delay estimation and talker location with a microphone array is described that preserves the low computational complexity and rapid tracking ability of the frequency‐domain delay estimator, while improving the coherence and stability of the estimated delays and derived source locations. Experimental results using data from a real 16‐element array are presented to demonstrate the performance of the algorithms. [Early work principally funded by DARPA/NSF Grant IRI‐8901882, and current work by NSF Grant No. 9314625.]
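
The inconsistency problem named in the abstract above can be made concrete. Delays generated by a single source location always sum to zero around a closed loop of microphones, whereas independently estimated pairwise delays generally do not. The sketch below only illustrates that symptom and is not the method of the paper; the speed of sound and the function names are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def pair_delay(source, mic_i, mic_j):
    """Arrival time at mic_j minus arrival time at mic_i, in seconds."""
    return (math.dist(source, mic_j) - math.dist(source, mic_i)) / SPEED_OF_SOUND

def closure_residual(delay_12, delay_23, delay_13):
    """Zero when three pairwise delays are consistent with one source.

    For true delays, (t2 - t1) + (t3 - t2) equals (t3 - t1), so any
    nonzero residual signals an inconsistent set of estimates.
    """
    return delay_12 + delay_23 - delay_13
```

A nonzero residual indicates that at least one pairwise estimate is in error, which is the kind of incoherence the described method aims to reduce.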


Patent
22 Apr 1994
TL;DR: A sound-image visualizing device for sound source observation that reliably observes a sound source with a simple configuration, visualizes the sound image, and can be used for maintenance of the observed object.
Abstract: PURPOSE: To provide a sound-image visualizing device for sound source observation by which a sound source is reliably observed with a simple configuration, the sound image is visualized, and which can be used for maintenance of an observation object. CONSTITUTION: The system is provided with an observation object having a sound source, a parabolic reflector 1 opposed to the observation object, a microphone array 2 arranged in the focal plane of the parabolic reflector 1, and a display device that continuously converts the spatial distribution of sound pressure in front of the parabolic reflector 1 into a two-dimensional picture based on the output signal of the microphone array 2.

Journal ArticleDOI
TL;DR: A system of microphone arrays and neural networks (MANN) for robust hands‐free speech recognition has the advantage that existing speech recognition systems can directly be deployed in practical adverse environments where distant‐talking sound pickup is required.
Abstract: When speech recognition technology moves from the laboratory to real‐world applications, there is increasing need for robustness. This paper describes a system of microphone arrays and neural networks (MANN) for robust hands‐free speech recognition. MANN has the advantage that existing speech recognition systems can directly be deployed in practical adverse environments where distant‐talking sound pickup is required. No retraining nor modification of the recognizers is necessary. MANN consists of two synergistic components: (1) signal enhancement by microphone arrays and (2) feature adaptation by neural network computing. High‐quality sound capture by the microphone array enables successful feature adaptation by the neural network to mitigate environmental interference. Through neural network computation, a matched training and testing condition is approximated which typically elevates performance of speech recognition. Both computer‐simulated and real‐room speech input are used to evaluate the capability...

Journal ArticleDOI
TL;DR: Two adaptive microphone array schemes for hands‐free speech input in cars are presented: the spatial filtering generalized sidelobe canceller (SFGSC), which gives good noise suppression with little distortion of the speech but requires careful calibration, and an adaptive microphone array employing calibration signals recorded on‐site (AMAEC), which facilitates a simple built‐in calibration.
Abstract: By employing speech generation models and new algorithms more and more a priori information about speech signals is utilized in speech recognition and speech coding. A fair signal‐to‐noise ratio is therefore required to ensure that the a priori information is correct. This implies a need for noise reduction under adverse conditions, such as hands‐free operation of telephones in the car compartment or speech recognition in cars [S. Nordholm et al., ‘‘Adaptive Array Noise Suppression of Handsfree Speaker Input in Cars,’’ IEEE Trans. Veh. Tech. 42, 514–518 (1993)]. The paper presents two adaptive microphone array schemes, aimed for this situation. The first, denoted spatial filtering generalized sidelobe canceller (SFGSC), gives good noise suppression with little distortion of the speech but requires careful calibration. The second, denoted adaptive microphone array employing calibration signals recorded on‐site (AMAEC), facilitates a simple built‐in calibration. It is beneficial from a user point of view to use a calibration signal recorded on site eliminating amplifier tuning and microphone selection. The calibration can be done within 60 s. The AMAEC calibrates the array to the speakers’ location, microphone positions and lobe gains, amplifiers, and to the acoustic environment in the car. No a priori information about signal statistics or array geometry is utilized. [Work supported by Nutek.]

Proceedings ArticleDOI
19 Apr 1994
TL;DR: Criteria are proposed to evaluate and optimize the performance of an array of sensors through its geometric configuration (the number and the positions of the sensors); simulation results show that the criteria are related and allow proper optimization of the array configuration.
Abstract: Sensor positioning has an important influence on the performance of array processing. The main contribution of the paper is the proposal of criteria to evaluate and to optimize the performance of an array of sensors through its geometric configuration, that is, the number and the positions of the sensors. Each criterion evaluates the performance of the array for some applications, particularly for beamforming and source localization. The simulation results show that the criteria are related, and allow proper optimization of the array configuration.

Proceedings Article
01 Jan 1994
TL;DR: An adaptive microphone array that facilitates a simple built-in calibration to the environment and the electronic equipment used is presented; it is well-suited for speech enhancement in handsfree mobile telephones as well as in speech recognition devices.
Abstract: This paper presents an adaptive microphone array, which facilitates a simple built-in calibration to the environment and the electronic equipment used. The method is well-suited for speech enhancement in handsfree mobile telephones as well as for speech enhancement in speech recognition devices.

Journal ArticleDOI
TL;DR: The microphone array consists of 64 microphones which can be configured in a variety of geometries ranging from small patches of only a few square meters to an elongated array spanning 700 m.
Abstract: A sound field propagating outdoors is perturbed by the turbulence in the atmosphere. To study the fluctuations due to turbulence, the sound field is measured simultaneously at a large number of points using a microphone array. The array consists of 64 microphones which can be configured in a variety of geometries ranging from small patches of only a few square meters to an elongated array spanning 700 m. Remote ‘‘satellite’’ arrays are also possible. The data are collected and processed in a mobile equipment trailer. The microphones, electronics, data collection, and processing are described and practical aspects of deploying the array are discussed. The design criteria and example applications of the array are also discussed.

Proceedings ArticleDOI
13 Apr 1994
TL;DR: A two-stage multi-microphone speech enhancement system is developed: beamforming with the Frost algorithm or DS beamforming, followed by an adaptive filter using the MRSS (matrix ratio sub space) algorithm, which is based on orthogonal and coincidental subspace eigen analysis.
Abstract: A multi-microphone speech enhancement system including two stages of processing has been developed. In the first stage, the array is aimed at the desired signal source and the noise signal source(s) respectively by means of the Frost algorithm or DS beamforming. In the second stage, the difference between the SNRs of the array outputs is employed to improve the SNR of the desired signal further by an adaptive filter with MRSS (matrix ratio sub space) algorithm. This algorithm is based on the orthogonal and coincidental subspace eigen analysis. The mathematical derivation and the simulation results that illustrate the advantages of the system are presented.

Journal ArticleDOI
TL;DR: A multiple input–output (MINT) room model is applied to a microphone array used as a front‐end processor for speaker verification, with the G matrix inverted by row action projection (RAP), an iterative method that repeatedly projects the solution onto each hyperplane of the equation system until it converges.
Abstract: The impulse response of a reverberant environment is, in general, nonminimum phase and cannot be inverted. But an exact inverse of the environment can be obtained by modeling the room as a multiple input–output (MINT) system [M. Miyoshi and Y. Kaneda ICASSP (1986)]. In this report, this model is applied to a microphone array and is used as a front‐end processor for a speaker verification system. The G matrix is inverted using row action projection (RAP), an iterative approach to solving a system of linear equations. Starting from an initial guess, the solution is repeatedly projected onto each hyperplane of the equation system until it converges. The method is stable, robust to noise, and converges to the pseudo‐inverse solution. In computer‐simulated experiments, the signal‐to‐reverberant‐noise ratio is found to improve with the number of microphones in the array. A speaker verification system using the array is evaluated at various signal‐to‐competing‐noise ratios (SCNR). Results suggest that verification performance can be substantially elevated in adverse acoustic environments.
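
The projection step described above can be sketched as a generic row action (Kaczmarz-style) solver: each row of the system defines a hyperplane, and the current estimate is projected onto one hyperplane after another. This is a minimal illustration on a small dense system, not the paper's implementation; the zero initial guess, the `relaxation` parameter, and the function name are assumptions.

```python
def row_action_projection(matrix, rhs, sweeps=200, relaxation=1.0):
    """Solve matrix @ x = rhs by cyclic projection onto each row's hyperplane."""
    x = [0.0] * len(matrix[0])  # zero initial guess (assumed)
    for _ in range(sweeps):
        for row, b in zip(matrix, rhs):
            norm_sq = sum(a * a for a in row)
            if norm_sq == 0.0:
                continue  # a zero row defines no hyperplane
            residual = b - sum(a * xi for a, xi in zip(row, x))
            step = relaxation * residual / norm_sq
            # Move along the row direction until the equation is satisfied.
            x = [xi + step * a for xi, a in zip(x, row)]
    return x
```

Starting from zero on a consistent underdetermined system, the iterates stay in the row space, so the result is the minimum-norm solution. This agrees with the abstract's remark about convergence to the pseudo‐inverse solution.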

Journal ArticleDOI
TL;DR: From the experiments, it is found that the high‐pass filtered signal gives a more reliable estimate of end points than does the low‐pass filtered counterpart, consistent with the fact that reverberation and noise in rooms are typically more prominent at low frequencies and are relatively moderate at mid‐ and high frequencies.
Abstract: This paper describes algorithms for automatic end‐point detection of microphone‐array speech signals. Microphone arrays provide a hands‐free sound pickup. The captured sound typically has a higher signal‐to‐noise ratio (SNR) than that captured with conventional microphones used at distances, such as in teleconferencing environments. However, due to multipath distortion (room reverberation) and ambient noise, the detection of starting/ending points of array speech is more difficult than that of close‐talking speech. In this paper, short‐time energy and short‐time zero‐crossing rate are computed for the original speech waveform and its high‐pass filtered and low‐pass filtered versions. These six functions are then utilized in different combinations to determine the end points. Speech data used in the experiments are collected in a hard‐walled laboratory room, having a reverberation time of approximately 0.5 s with a one‐dimensional beamforming line array. From the experiments, it is found that the high‐pass filtered signal gives a more reliable estimate of end points than does the low‐pass filtered counterpart. This result is consistent with the fact that reverberation and noise in rooms are typically more prominent at low frequencies and are relatively moderate at mid‐ and high frequencies. The detection algorithms have been integrated into a dynamic‐time‐warping‐ (DTW) based speech recognizer. Recognition performance of the system is evaluated for both array speech and for close‐talking speech.
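
The two frame-level measures named in the abstract above, short‐time energy and short‐time zero‐crossing rate, can be sketched as follows. The frame length, hop size, and the single-threshold end-point rule are assumptions for illustration; the paper combines six such functions (original, high‐pass, and low‐pass versions) in ways the abstract does not detail.

```python
def short_time_energy(samples, frame_len, hop):
    """Sum of squared samples in each frame."""
    return [
        sum(s * s for s in samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]

def zero_crossing_rate(samples, frame_len, hop):
    """Fraction of adjacent sample pairs in each frame that change sign."""
    rates = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        rates.append(crossings / (frame_len - 1))
    return rates

def end_points(energies, threshold):
    """First and last frame index whose energy exceeds the threshold."""
    active = [i for i, e in enumerate(energies) if e > threshold]
    return (active[0], active[-1]) if active else None
```

Applying the same functions to high‐pass and low‐pass filtered copies of the waveform would give the six functions mentioned in the abstract.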

Journal ArticleDOI
TL;DR: This study investigates the feasibility of an adaptive super-directive microphone array technique for the detection and localization of anomalies in nuclear power plants and confirms its practical applicability to component monitoring.
Abstract: Acoustic sensing provides an efficient means of detecting and diagnosing incipient failures in plant components. The study reported here investigates the feasibility of an adaptive super-directive microphone array technique for the detection and localization of anomalies in nuclear power plants. The technique extracts anomaly information about target components from the acoustic signals obtained at multiple geometrically arranged microphones. Dedicated signal processing was used to obtain super-directive sensitivity of the microphone array and to adapt that sensitivity to changes in the acoustic environment. Two adaptive sensing algorithms have been developed and implemented on a personal computer, and their ability to extract the target acoustic signal has been tested through numerical simulation. With an appropriate number of microphones, satisfactory sound extraction performance was obtained within a reasonable computational load. The practical applicability of this technique to component monitoring in nuclear power plants has been confirmed through the present study.

Dissertation
01 Jan 1994
TL;DR: Noise source location in the built environment is investigated using a simple microphone array.
Abstract: Noise source location in the built environment, using a simple microphone array.

Journal ArticleDOI
TL;DR: In this paper, a fixed microphone array with user-controlled mainlobe spatial look direction and attenuation band(s) was developed with a flat frequency response over the speech bandwidth.
Abstract: Speech communication in environments with low signal/noise ratios (SNRs) is a primary complaint of the hearing impaired. Microphone beam formation techniques provide an effective approach to improving SNR in these environments. A novel, fixed microphone array is being developed with user‐controlled mainlobe spatial look direction and attenuation band(s), and with a flat frequency response over the speech bandwidth. The array of R microphones and L taps per microphone maximizes energy concentration over a spatial look region and frequency band, subject to spatial and frequency constraints. Constrained maximization of w*Aw/w*Bw is required, where A and B are matrices specifying spatial and frequency factors, and w is the RL dimensional weight vector. The constraining subspace is specified by the array values, derivative values, and spatial directional constraints; w is obtained as the solution of a tractable unconstrained full‐rank lower dimensional generalized eigenvalue problem. Numerical and simulation results for different values of R and L and for different bandwidths will be reported, as well as results of preliminary listening tests with normally hearing and hearing impaired individuals. The feasibility of real‐time acoustic beamformers with arrays for hearing aids, and the advantages of this scheme over conventional adaptive schemes will also be discussed.
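
The reduction from a constrained ratio maximization to a lower-dimensional generalized eigenvalue problem can be written out as follows. This is a sketch under stated assumptions, not the paper's derivation: the constraints are taken to be homogeneous and linear, C*w = 0, the columns of N form a basis for the null space of C*, and B is positive definite so that N*BN has full rank. The symbols C, N, and v are introduced here for illustration.

```latex
\begin{aligned}
&\max_{\mathbf{w}} \;
  \frac{\mathbf{w}^{*}\mathbf{A}\,\mathbf{w}}{\mathbf{w}^{*}\mathbf{B}\,\mathbf{w}}
  \quad \text{subject to} \quad \mathbf{C}^{*}\mathbf{w} = \mathbf{0}, \\
&\mathbf{w} = \mathbf{N}\mathbf{v}
  \;\Longrightarrow\;
  \max_{\mathbf{v}} \;
  \frac{\mathbf{v}^{*}(\mathbf{N}^{*}\mathbf{A}\mathbf{N})\,\mathbf{v}}
       {\mathbf{v}^{*}(\mathbf{N}^{*}\mathbf{B}\mathbf{N})\,\mathbf{v}}, \\
&(\mathbf{N}^{*}\mathbf{A}\mathbf{N})\,\mathbf{v}
  = \lambda_{\max}\,(\mathbf{N}^{*}\mathbf{B}\mathbf{N})\,\mathbf{v},
  \qquad \mathbf{w}_{\mathrm{opt}} = \mathbf{N}\mathbf{v}_{\max}.
\end{aligned}
```

The reduced problem is unconstrained, and its dimension is RL minus the number of independent constraints, which matches the abstract's description of a full‐rank lower dimensional problem.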