
Showing papers on "Microphone array published in 2009"


Journal ArticleDOI
01 Oct 2009-Ecology
TL;DR: This work shows that equally precise and unbiased estimates may be obtained from a single sampling interval, using only the spatial pattern of detections, and suggests detection models for binary acoustic data, and for continuous data comprising measurements of all signals above the threshold.
Abstract: The density of a closed population of animals occupying stable home ranges may be estimated from detections of individuals on an array of detectors, using newly developed methods for spatially explicit capture-recapture. Likelihood-based methods provide estimates for data from multi-catch traps or from devices that record presence without restricting animal movement ("proximity" detectors such as camera traps and hair snags). As originally proposed, these methods require multiple sampling intervals. We show that equally precise and unbiased estimates may be obtained from a single sampling interval, using only the spatial pattern of detections. This considerably extends the range of possible applications, and we illustrate the potential by estimating density from simulated detections of bird vocalizations on a microphone array. Acoustic detection can be defined as occurring when received signal strength exceeds a threshold. We suggest detection models for binary acoustic data, and for continuous data comprising measurements of all signals above the threshold. While binary data are often sufficient for density estimation, modeling signal strength improves precision when the microphone array is small.

194 citations
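
As a rough illustration of the acoustic detection model described above (a detection occurs when the received signal strength exceeds a threshold), the following Python sketch assumes the received level in dB decays linearly with distance plus Gaussian noise; the parameter values are illustrative, not taken from the paper.

import numpy as np
from scipy.stats import norm

def detection_prob(d, beta0=80.0, beta1=-0.3, sigma=5.0, threshold=60.0):
    # Probability that a call made at distance d (m) from a microphone is
    # detected, assuming an expected received level of beta0 + beta1*d (dB)
    # with Gaussian noise of s.d. sigma and a fixed detection threshold.
    mu = beta0 + beta1 * d
    return 1.0 - norm.cdf(threshold, loc=mu, scale=sigma)

# Example: detection probability at a few microphone-to-bird distances.
for d in (10, 50, 100):
    print(d, round(detection_prob(d), 3))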


Proceedings ArticleDOI
04 Dec 2009
TL;DR: This paper presents a novel method that aligns recorded signals while localizing microphones and sources, allowing independent recording devices to be used as a distributed microphone array, with simple iterative update rules derived through an auxiliary function approach.
Abstract: In this paper, aiming to utilize independent recording devices as a distributed microphone array, we present a novel method for alignment of recorded signals with localizing microphones and sources. Unlike conventional microphone array, signals recorded by independent devices have different origins of time, and microphone positions are generally unknown. In order to estimate both of them from only recorded signals, time differences between channels for each source are detected, which still include the differences of time origins, and an objective function defined by their square errors is minimized. For that, simple iterative update rules are derived through auxiliary function approach. The validity of our approach is evaluated by simulative experiment.

104 citations
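
The paper derives closed-form iterative updates through an auxiliary function; the sketch below only illustrates the same squared-error objective, handing it to a generic nonlinear least-squares solver instead. The 2-D setup, the packing of unknowns, and all names are assumptions for illustration.

import numpy as np
from scipy.optimize import least_squares

C = 343.0  # speed of sound (m/s)

def residuals(params, tdoa, n_mics, n_srcs):
    # Unknowns: 2-D mic positions, 2-D source positions, and per-device
    # time-origin offsets (device 0 fixed to zero offset).
    mics = params[:2 * n_mics].reshape(n_mics, 2)
    srcs = params[2 * n_mics:2 * (n_mics + n_srcs)].reshape(n_srcs, 2)
    offs = np.concatenate(([0.0], params[2 * (n_mics + n_srcs):]))
    res = []
    for k in range(n_srcs):
        arrival = np.linalg.norm(mics - srcs[k], axis=1) / C + offs
        for i in range(n_mics):
            for j in range(i + 1, n_mics):
                # tdoa[k, i, j]: measured time difference of source k between
                # devices i and j, still containing the time-origin differences.
                res.append((arrival[i] - arrival[j]) - tdoa[k, i, j])
    return np.array(res)

# sol = least_squares(residuals, x0, args=(tdoa, n_mics, n_srcs))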


Proceedings ArticleDOI
04 Dec 2009
TL;DR: It is shown that frequency smoothing can be performed efficiently with spherical arrays due to the decoupling of frequency and angular components, and an experimental comparison of DOA estimation between beamforming and MUSIC with frequency smoothing is performed on data measured in a real auditorium.
Abstract: Direction-of-arrival (DOA) estimation of coherent signals is considered of great importance in signal processing. To estimate both azimuth and elevation angle with the same accuracy, 3-dimensional (3-D) array must be used. Spherical arrays have the advantage of spherical symmetry, facilitating 3-D DOA estimation. To apply high resolution subspace DOA estimation algorithms, such as MUSIC, in a coherent environment, a smoothing technique is required. This paper presents the development of a smoothing technique in the frequency domain for spherical microphone arrays. We show that frequency smoothing can be efficiently performed using spherical arrays due to the decoupling of frequency and angular components. Experimental comparison of DOA estimation between beamforming and MUSIC with frequency smoothing is performed with data measured in a real auditorium.

99 citations
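
A minimal sketch of the frequency-smoothing idea: once the frequency-dependent mode strength has been compensated, spherical-harmonics-domain covariance matrices from different frequency bins can simply be averaged before applying MUSIC. The helpers below assume that compensation has already been done; names and shapes are hypothetical.

import numpy as np
from scipy.special import sph_harm

def sh_steering(order, az, pol):
    # Spherical-harmonic steering vector for a plane wave from (az, pol),
    # valid after the mode strength b_n(kr) has been divided out.
    return np.array([sph_harm(m, n, az, pol)
                     for n in range(order + 1) for m in range(-n, n + 1)])

def music_spectrum_freq_smoothed(anm, n_src, order, grid):
    # anm: (n_snapshots, (order+1)**2) compensated SH coefficients taken over
    # STFT frames and frequency bins; averaging their covariance is the
    # frequency smoothing (frequency and angle decouple in this domain).
    R = np.mean([np.outer(a, a.conj()) for a in anm], axis=0)
    _, vecs = np.linalg.eigh(R)          # eigenvalues in ascending order
    En = vecs[:, :-n_src]                # noise subspace
    spec = []
    for az, pol in grid:
        y = sh_steering(order, az, pol)
        spec.append(1.0 / np.real(y.conj() @ En @ En.conj().T @ y))
    return np.array(spec)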


Journal ArticleDOI
TL;DR: The use of an automated dialog-based PERS has the potential to provide users with more autonomy in decisions regarding their own health and more privacy in their own home.
Abstract: Demands on long-term-care facilities are predicted to increase at an unprecedented rate as the baby boomer generation reaches retirement age. Aging-in-place (i.e. aging at home) is the desire of most seniors and is also a good option to reduce the burden on an over-stretched long-term-care system. Personal Emergency Response Systems (PERSs) help enable older adults to age-in-place by providing them with immediate access to emergency assistance. Traditionally they operate with push-button activators that connect the occupant via speaker-phone to a live emergency call-centre operator. If occupants do not wear the push button or cannot access the button, then the system is useless in the event of a fall or emergency. Additionally, a false alarm or failure to check-in at a regular interval will trigger a connection to a live operator, which can be unwanted and intrusive to the occupant. This paper describes the development and testing of an automated, hands-free, dialogue-based PERS prototype. The prototype system was built using a ceiling mounted microphone array, an open-source automatic speech recognition engine, and a 'yes' and 'no' response dialog modelled after an existing call-centre protocol. Testing compared a single microphone versus a microphone array with nine adults in both noisy and quiet conditions. Dialogue testing was completed with four adults. The microphone array demonstrated improvement over the single microphone. In all cases, dialog testing resulted in the system reaching the correct decision about the kind of assistance the user was requesting. Further testing is required with elderly voices and under different noise conditions to ensure the appropriateness of the technology. Future developments include integration of the system with an emergency detection method as well as communication enhancement using features such as barge-in capability. The use of an automated dialog-based PERS has the potential to provide users with more autonomy in decisions regarding their own health and more privacy in their own home.

83 citations


Journal ArticleDOI
TL;DR: It is revealed that the spatial Nyquist criterion has little importance for microphone arrays, and the well-known steered response power (SRP) method is formulated with respect to stationary signals, and modifications are necessary to properly form steered beams in nonstationary signal environments.
Abstract: Microphone arrays sample the sound field in both space and time with the major objective being the extraction of the signal propagating from a desired direction-of-arrival (DOA). In order to reconstruct a spatial sinusoid from a set of discrete samples, the samples must be spaced less than half the wavelength of the sinusoid apart. This principle has long been adapted to the microphone array context: in order to form an unambiguous beampattern, the spacing between elements in a microphone array needs to conform to this spatial Nyquist criterion. The implicit assumption behind the narrowband beampattern is that one may use linearity and Fourier analysis to describe the response of the array to an arbitrary wideband plane wave. In this paper, this assumption is analyzed. A formula for the broadband beampattern is derived. It is shown that in order to quantify the spatial filtering abilities of a broadband array, the incoming signal's bifrequency spectrum must be taken into account, particularly for nonstationary signals such as speech. Multi-dimensional Fourier analysis is then employed to derive the broadband spatial transform, which is shown to be the limiting case of the broadband beampattern as the number of sensors tends to infinity. The conditions for aliasing in broadband arrays are then determined by analyzing the effect of computing the broadband spatial transform with a discrete spatial aperture. It is revealed that the spatial Nyquist criterion has little importance for microphone arrays. Finally, simulation results show that the well-known steered response power (SRP) method is formulated with respect to stationary signals, and that modifications are necessary to properly form steered beams in nonstationary signal environments.

75 citations
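
For reference, the conventional narrowband beampattern that the paper takes as its starting point can be evaluated as below for a uniform linear delay-and-sum array (the geometry and frequency grid are illustrative); the broadband beampattern derived in the paper additionally involves the signal's bifrequency spectrum and is not reproduced here.

import numpy as np

def ds_beampattern(n_mics=8, spacing=0.08, steer_deg=0.0,
                   freqs=np.linspace(100, 8000, 200),
                   angles_deg=np.linspace(-90, 90, 181), c=343.0):
    # Magnitude response of a delay-and-sum beamformer steered to steer_deg,
    # evaluated on a grid of arrival angles and frequencies.
    m = np.arange(n_mics)[:, None, None]
    th = np.deg2rad(angles_deg)[None, :, None]
    f = freqs[None, None, :]
    tau = m * spacing * (np.sin(th) - np.sin(np.deg2rad(steer_deg))) / c
    return np.abs(np.mean(np.exp(-2j * np.pi * f * tau), axis=0))  # (angle, freq)

B = ds_beampattern()
# Grating lobes (values near 1 far from the steering direction) appear once the
# spacing exceeds half a wavelength, i.e. above c / (2 * 0.08), roughly 2.1 kHz here.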


Patent
Ronald M. Aarts1
29 May 2009
TL;DR: In this article, a microphone array (12) is used to detect acoustic events (e.g., coughs, snores, impact sounds, verbalizations, etc.) relevant to the patient's status and timestamped.
Abstract: When monitoring a patient, acoustic events (e.g., coughs, snores, impact sounds, verbalizations, etc.) relevant to the patient's status are detected by a microphone array (12) and timestamped. Detected event signals generated by the microphone array (12) are filtered to identify signatures such as zero crossings, corner frequencies, amplitude, pitch, etc., for classification purposes. The filtered signals are digitized and classified into one of a plurality of acoustic event classes (e.g., snore, cough, wheeze, breath, etc.) and/or subclasses. The classified events are displayed to a user (e.g., graphically, textually, etc.) with their timestamps to indicate chronology. A user can review the acoustic events, select one or more events, and listen to a recording of the selected event(s). Additionally, specified acoustic events can trigger an alarm to alert a nurse or the like that the patient requires immediate attention.

63 citations


Proceedings ArticleDOI
10 Oct 2009
TL;DR: Results show that a direction refinement procedure can be used to improve localization accuracy and that more efficient and accurate direction searches can be performed using a uniform triangular element grid rather than the typical rectangular element grid.
Abstract: Although research on localization of sound sources using microphone arrays has been carried out for years, providing such capabilities on robots is rather new. Artificial audition systems on robots currently exist, but no evaluation of the methods used to localize sound sources has yet been conducted. This paper presents an evaluation of various real-time audio localization algorithms using a medium-sized microphone array which is suitable for applications in robotics. The techniques studied here are implementations and enhancements of steered response power - phase transform beamformers, which represent the most popular methods for time difference of arrival audio localization. In addition, two different grid topologies for implementing source direction search are also compared. Results show that a direction refinement procedure can be used to improve localization accuracy and that more efficient and accurate direction searches can be performed using a uniform triangular element grid rather than the typical rectangular element grid.

57 citations
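
A compact sketch of the SRP-PHAT family evaluated in the paper: GCC-PHAT is computed for every microphone pair and accumulated over a grid of candidate directions (which could be rectangular or triangular). The far-field assumption and all names are illustrative, and the direction-refinement step is omitted.

import numpy as np

def gcc_phat(x, y, nfft):
    # Generalized cross-correlation with phase transform between two channels.
    X, Y = np.fft.rfft(x, nfft), np.fft.rfft(y, nfft)
    S = X * np.conj(Y)
    return np.fft.irfft(S / (np.abs(S) + 1e-12), nfft)

def srp_phat(frames, mic_pos, directions, fs, c=343.0):
    # frames: (n_mics, frame_len); mic_pos: (n_mics, 3);
    # directions: (n_dirs, 3) unit vectors pointing toward candidate sources.
    n_mics, frame_len = frames.shape
    nfft = 2 * frame_len
    power = np.zeros(len(directions))
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            cc = gcc_phat(frames[i], frames[j], nfft)
            for d, u in enumerate(directions):
                tau = (mic_pos[j] - mic_pos[i]) @ u / c   # expected TDOA
                power[d] += cc[int(round(tau * fs)) % nfft]
    return directions[np.argmax(power)], power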


Patent
Elias Nemer1, Jes Thyssen1
24 Feb 2009
TL;DR: In this paper, a system and method for performing speaker localization is described, which utilizes speaker recognition to provide an estimate of the direction of arrival (DOA) of speech sound waves emanating from a desired speaker with respect to a microphone array included in the system.
Abstract: A system and method for performing speaker localization is described. The system and method utilizes speaker recognition to provide an estimate of the direction of arrival (DOA) of speech sound waves emanating from a desired speaker with respect to a microphone array included in the system. Candidate DOA estimates may be preselected or generated by one or more other DOA estimation techniques. The system and method is suited to support steerable beamforming as well as other applications that utilize or benefit from DOA estimation. The system and method provides robust performance even in systems and devices having small microphone arrays and thus may advantageously be implemented to steer a beamformer in a cellular telephone or other mobile telephony terminal featuring a speakerphone mode.

56 citations


Patent
23 Jan 2009
TL;DR: In this article, a communication system for a passenger compartment includes at least two microphone arrays arranged within first and second regions, respectively, in the passenger compartment, and a signal processor connected to the microphone arrays and to at least two loudspeakers.
Abstract: A communication system for a passenger compartment includes at least two microphone arrays arranged within first and second regions, respectively, in the passenger compartment, at least two loudspeakers, and a signal processor connected to the microphone arrays and to the loudspeakers. Each microphone array has at least two microphones and provides an audio signal. Each loudspeaker is located within a different one of the first and the second regions. The signal processor processes the audio signal from the microphone array within the first region and provides the processed audio signal to the loudspeaker located within the second region.

52 citations


Journal ArticleDOI
R.M.M. Derkx1, K. Janse1
TL;DR: The influence of spatial aliasing on the captured desired signal and the directivity index is analyzed, and the sensitivity for uncorrelated sensor noise and the sensitivity for phase- and magnitude-errors on the individual sensors are investigated.
Abstract: A first-order azimuth-steerable superdirectional microphone response can be constructed by means of a linear combination of three eigenbeams (monopole and two orthogonal dipoles). Via this method, we can construct any first-order directivity pattern (monopole, cardioid, hypercardioid, etc.) that can be electronically steered to a certain angle on the 2-D plane to capture the desired signal. In this paper, the superdirectional responses are generated via a planar microphone array with a square geometry. We analyze the influence of spatial aliasing on the captured desired signal and the directivity index. Furthermore, we investigate the sensitivity for uncorrelated sensor noise and the sensitivity for phase- and magnitude-errors on the individual sensors. Finally, two rules of thumb are derived to choose the size of the microphone array.

48 citations
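
The steerable first-order response described above can be sketched directly: a monopole eigenbeam plus two orthogonal dipole eigenbeams combined as alpha + (1 - alpha) cos(theta - steer). The sketch ignores the spatial-aliasing and sensor-noise effects that the paper analyzes.

import numpy as np

def first_order_response(theta, steer, alpha):
    # alpha = 1: monopole, 0.5: cardioid, 0.25: hypercardioid, 0: dipole.
    monopole = 1.0
    dipole_x, dipole_y = np.cos(theta), np.sin(theta)
    return alpha * monopole + (1 - alpha) * (np.cos(steer) * dipole_x +
                                             np.sin(steer) * dipole_y)

theta = np.linspace(0, 2 * np.pi, 360)
cardioid_at_30_deg = first_order_response(theta, np.deg2rad(30), 0.5)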


Proceedings ArticleDOI
01 Dec 2009
TL;DR: Two-layered audio-visual integration to make automatic speech recognition (ASR) more robust against speaker's distance and interfering talkers or environmental noises is presented.
Abstract: Robust, high-performance ASR is required for robot audition because people usually communicate with each other by speech. This paper presents a two-layered audio-visual integration framework that makes automatic speech recognition (ASR) more robust against speaker distance and against interfering talkers or environmental noises. It consists of Audio-Visual Voice Activity Detection (AV-VAD) and Audio-Visual Speech Recognition (AVSR). The AV-VAD layer integrates several AV features based on a Bayesian network to robustly detect voice activity, i.e., the speaker's utterance duration, because the performance of VAD strongly affects that of ASR. The AVSR layer integrates the reliability estimates of acoustic features and of visual features using a missing-feature-theory method: audio features are weighted more in a clean acoustic environment, while visual features are weighted more in a noisy environment. This layered integration can cope with dynamically changing acoustic or visual environments. The proposed AV-integrated ASR is implemented on HARK, our open-sourced robot audition software, with an 8-channel microphone array. Empirical results show that our system improves ASR results by 9.9 and 16.7 points with and without microphone array processing, respectively, and also improves robustness against several auditory/visual noise conditions.

Journal ArticleDOI
TL;DR: It is shown that the robust broadband beamforming using the statistics of the microphone characteristics recently proposed belongs to the class of white noise gain constraint-based techniques, which gives an insight into the nature of the state-of-the-art approach.
Abstract: Broadband beamformers are known to be highly sensitive to the errors in microphone array characteristics, especially for small-sized microphone arrays. This paper proposes an algorithm for the design of robust broadband beamformers with passband shaping characteristics using Tikhonov regularization method by taking into account the statistics of the microphone characteristics. To facilitate the derivation, a weighted least squares based broadband beamforming algorithm with passband shaping characteristics is also proposed, while keeping a minimal stopband level. Unlike the existing criteria for broadband beamforming for microphone arrays, the proposed approaches fully exploit the degrees of freedom in weighting functions of the criteria used to design broadband beamformers. By doing so, efficient and flexible broadband beamforming can be achieved with the desired passband shaping characteristics. In addition, this paper also shows that the robust broadband beamforming using the statistics of the microphone characteristics recently proposed (Doclo and Moonen, 2003 and 2007) belongs to the class of white noise gain constraint-based techniques, which gives an insight into the nature of the state-of-the-art approach. The performance of the proposed beamforming algorithms is illustrated by the design examples and is compared with the existing approaches for microphone arrays.
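
As a simplified, narrowband stand-in for the paper's broadband design, the sketch below solves a weighted least-squares fit to a desired passband/stopband response with a Tikhonov (ridge) term; the angular weighting, regularization constant, and geometry are all assumptions.

import numpy as np

def wls_tikhonov_weights(mic_pos, f, desired_deg, angles_deg,
                         passband_deg=20.0, lam=1e-2, c=343.0):
    # mic_pos: (n_mics, 2).  Desired response is 1 inside the passband around
    # desired_deg and 0 elsewhere; the weighting v emphasises the stopband.
    angles_deg = np.asarray(angles_deg, float)
    th = np.deg2rad(angles_deg)
    u = np.stack([np.cos(th), np.sin(th)], axis=1)
    A = np.exp(-2j * np.pi * f * (u @ mic_pos.T) / c)     # steering matrix
    d = (np.abs(angles_deg - desired_deg) <= passband_deg).astype(float)
    v = np.where(d > 0, 1.0, 10.0)
    AWA = A.conj().T @ (v[:, None] * A)
    return np.linalg.solve(AWA + lam * np.eye(A.shape[1]),
                           A.conj().T @ (v * d))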

Proceedings ArticleDOI
19 Apr 2009
TL;DR: A way to produce an acoustical map of the scene by computing the averaged directivity pattern of BSS demixing systems, which allows application for multiple dimensions, in the near field as well as in the far field.
Abstract: In this paper, we propose a versatile acoustic source localization framework exploiting the self-steering capability of Blind Source Separation (BSS) algorithms. We provide a way to produce an acoustical map of the scene by computing the averaged directivity pattern of BSS demixing systems. Since BSS explicitly accounts for multiple sources in its signal propagation model, several simultaneously active sound sources can be located using this method. Moreover, the framework is suitable to any microphone array geometry, which allows application for multiple dimensions, in the near field as well as in the far field. Experiments demonstrate the efficiency of the proposed scheme in a reverberant environment for the localization of speech sources.
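
A hedged sketch of the acoustic-map idea: evaluate each frequency-domain demixing filter against far-field steering vectors over a grid of angles and average over frequency; deep dips (spatial nulls) in an output's averaged pattern point toward the sources that output suppresses. Shapes, geometry, and names are assumptions.

import numpy as np

def bss_directivity_map(W, freqs, mic_pos, angles_deg, c=343.0):
    # W: (n_bins, n_out, n_mics) demixing matrices; mic_pos: (n_mics, 2).
    th = np.deg2rad(np.asarray(angles_deg, float))
    u = np.stack([np.cos(th), np.sin(th)], axis=1)
    maps = np.zeros((W.shape[1], len(th)))
    for fi, f in enumerate(freqs):
        steer = np.exp(-2j * np.pi * f * (u @ mic_pos.T) / c)  # (angles, mics)
        maps += np.abs(steer @ W[fi].conj().T).T               # (n_out, angles)
    return maps / len(freqs)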

Bruno Fazenda1, H. Atmoko1, Fengshou Gu1, Luyang Guan1, Andrew Ball1 
13 Nov 2009
TL;DR: The relaying of information to the driver as a warning signal has been investigated through the use of ambisonic technology and a 4-speaker array, which is ubiquitous in modern vehicles, showing that accurate warning information can be relayed and afford correct action.
Abstract: A system has been investigated for the detection of the incoming direction of an emergency vehicle. Acoustic detection methods based on a cross microphone array have been implemented. It is shown that source detection based on time delay estimation outperforms sound intensity techniques, although both techniques perform well for the application. The relaying of information to the driver as a warning signal has been investigated through the use of ambisonic technology and a 4-speaker array, which is ubiquitous in modern vehicles. Simulations show that accurate warning information may be relayed to the driver and afford correct action.
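
A minimal sketch of the time-delay-based direction finding used here, assuming a far-field source and one microphone pair per axis of the cross array with equal spacings (so the spacing cancels in the ratio); the sign convention depends on which microphone of each pair is taken as the reference.

import numpy as np

def bearing_from_cross_array(x_pair, y_pair, fs):
    # x_pair, y_pair: tuples of two time-aligned channels, one pair per axis.
    def tde(a, b):
        cc = np.correlate(a - a.mean(), b - b.mean(), mode='full')
        return (np.argmax(cc) - (len(b) - 1)) / fs
    tx = tde(*x_pair)      # delay across the x-axis pair, proportional to cos(bearing)
    ty = tde(*y_pair)      # delay across the y-axis pair, proportional to sin(bearing)
    return np.degrees(np.arctan2(ty, tx))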

Journal ArticleDOI
TL;DR: This work investigates the use of a weighted fuzzy c-means cluster algorithm for robust source localization using location cues extracted from a microphone array and incorporates observation weights to increase the algorithm's robustness against sound reflections.
Abstract: Successful localization of sound sources in reverberant enclosures is an important prerequisite for many spatial signal processing algorithms. We investigate the use of a weighted fuzzy c-means cluster algorithm for robust source localization using location cues extracted from a microphone array. In order to increase the algorithm's robustness against sound reflections, we incorporate observation weights to emphasize reliable cues over unreliable ones. The weights are computed from local feature statistics around sound onsets because it is known that these regions are least affected by reverberation. Experimental results illustrate the superiority of the method when compared with standard fuzzy clustering. The proposed algorithm successfully located two speech sources for a range of angular separations in room environments with reverberation times of up to 600 ms.
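
The weighted fuzzy c-means step can be sketched as follows: standard FCM membership updates, with observation weights (larger for cues taken near sound onsets) scaling each cue's contribution to the cluster centroids. Initialization and parameter values are illustrative.

import numpy as np

def weighted_fcm(X, w, n_clusters=2, m=2.0, n_iter=50, eps=1e-9):
    # X: (n_obs, dim) location cues; w: (n_obs,) reliability weights.
    rng = np.random.default_rng(0)
    C = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + eps
        U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1)), axis=2)
        Wm = w[:, None] * U ** m
        C = (Wm.T @ X) / Wm.sum(axis=0)[:, None]
    return C, U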

Journal ArticleDOI
TL;DR: A search space clustering method designed to speed up the SRP-PHAT based sound source localization algorithm for intelligent home robots equipped with small scale microphone arrays is proposed.
Abstract: Sound source localization (SSL) is a major function of robot auditory systems for intelligent home robots. The steered response power-phase transform (SRP-PHAT) is a widely used method for robust SSL. However, it is too slow to run in real time, since SRP-PHAT searches a large number of candidate sound source locations. This paper proposes a search space clustering method designed to speed up the SRP-PHAT based sound source localization algorithm for intelligent home robots equipped with small scale microphone arrays. The proposed method reduces the number of candidate sound source locations by 30.6% and achieves 46.7% error reduction compared to conventional methods.

Patent
11 Dec 2009
TL;DR: In this paper, a computer-implemented method for determining a time delay for time delay compensation of a microphone signal from a microphone array in a beamformer arrangement is presented.
Abstract: The invention provides a computer-implemented method for determining a time delay for time delay compensation of a microphone signal from a microphone array in a beamformer arrangement. For a given time, an instantaneous estimate of a position of a wanted sound source and/or of a direction of arrival of a signal originating from the wanted sound source is determined. The computer system then determines whether the instantaneous estimate deviates from a preset estimate of a position of the wanted sound source and/or of a direction of arrival of a signal originating from the wanted sound source according to a predetermined criterion. The predetermined criterion comprises a check whether the instantaneous estimate deviates from the preset estimate by at least a predetermined deviation threshold. If the predetermined criterion is fulfilled, the instantaneous estimate for the given time is set by the computer system as the preset estimate, and the computer system determines the time delay for time delay compensation of the microphone signal based on the instantaneous estimate.
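
A rough sketch of the claimed per-frame logic, with hypothetical names and a 2-D far-field geometry: the preset DOA is replaced only when the instantaneous estimate deviates from it by at least a threshold, and the compensation delays are then derived from the (possibly updated) preset DOA.

import numpy as np

def update_delays(inst_doa_deg, preset_doa_deg, mic_pos, threshold_deg=10.0,
                  c=343.0):
    # mic_pos: (n_mics, 2).  Returns the new preset DOA and per-microphone
    # compensation delays (seconds), made non-negative for causal filtering.
    if abs(inst_doa_deg - preset_doa_deg) >= threshold_deg:
        preset_doa_deg = inst_doa_deg
    u = np.array([np.cos(np.deg2rad(preset_doa_deg)),
                  np.sin(np.deg2rad(preset_doa_deg))])
    delays = mic_pos @ u / c
    return preset_doa_deg, delays - delays.min()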

Patent
06 Oct 2009
TL;DR: A wearable shooter localization system including a microphone array, processor, and output device for determining information about a gunshot is presented; the microphone array is worn on the upper arm of the user.
Abstract: A wearable shooter localization system including a microphone array, processor, and output device for determining information about a gunshot. The microphone array may be worn on the upper arm of the user. A second array, which may operate cooperatively or independently from the first array, may be worn on the other arm. The microphone array is sensitive to the acoustic effects of gunfire and provides a set of electrical signals to the processing unit, which identifies the origin of the fire. The system may include orientation and/or motion detection sensors, which the processor may use to either initially compute a direction to the origin of a projectile in a frame of reference meaningful to a wearer of the system or to subsequently update that direction as the wearer moves.

Patent
26 Aug 2009
TL;DR: In this article, a microphone array system for sound acquisition from multiple sound sources in a reception space surrounding a single microphone array that is interfaced with a beamformer module is described.
Abstract: A microphone array system for sound acquisition from multiple sound sources in a reception space surrounding a microphone array that is interfaced with a beamformer module is disclosed. The microphone array includes microphone transducers that are arranged relative to each other in N-fold rotationally symmetry, and the beamformer includes beamformer weights that are associated with one of a plurality of spatial reception sectors corresponding to the N-fold rotational symmetry of the microphone array. Microphone indexes of the microphone transducers are arithmetically displaceable angularly about the vertical axis during a process cycle, so that a same set of beamformer weights is used selectively for calculating a beamformer output signal associated with any one of the spatial reception sectors. A sound source location module is also disclosed that includes a modified steered power response sound source location method. A post filter module for a microphone array system is also disclosed.

Proceedings ArticleDOI
06 Oct 2009
TL;DR: In this paper, an approach to unsupervised shape calibration of microphone array networks is presented, where a hierarchical procedure is developed to first perform local shape calibration based on coherence analysis and then employ SRP-PHAT in a network calibration method.
Abstract: Microphone arrays represent the basis for many challenging acoustic sensing tasks. The accuracy of techniques like beamforming directly depends on a precise knowledge of the relative positions of the sensors used. Unfortunately, for certain use cases manually measuring the geometry of an array is not feasible due to practical constraints. In this paper we present an approach to unsupervised shape calibration of microphone array networks. We developed a hierarchical procedure that first performs local shape calibration based on coherence analysis and then employs SRP-PHAT in a network calibration method. Practical experiments demonstrate the effectiveness of our approach especially for highly reverberant acoustic environments.

Proceedings ArticleDOI
04 Dec 2009
TL;DR: A recording method using a circularly symmetric array of differential microphones and a reproduction method using a corresponding array of loudspeakers are presented, and objective results in the form of active intensity diagrams are given.
Abstract: Multichannel audio reproduction generally suffers from one or both of the following problems: i) the recorded audio has to be artificially manipulated to provide the necessary spatial cues, which reduces the consistency of the reproduced sound field with the actual one, and ii) reproduction is not panoramic, which degrades realism when the listener is not seated in a desired ideal position facing the center channel. A recording method using a circularly symmetric array of differential microphones, and a reproduction method using a corresponding array of loudspeakers is presented in this paper. Design of microphone directivity patterns to achieve a panoramic auditory scene is discussed. Objective results in the form of active intensity diagrams are presented.

Journal ArticleDOI
TL;DR: This paper studies the position and orientation estimation performance of the ANN for different input/output combinations (and different numbers of hidden units) and finds that the best combination of parameters yields a 21.8% reduction in the average position error compared to baseline methods.
Abstract: A method which automatically provides the position and orientation of a directional acoustic source in an enclosed environment is proposed. In this method, different combinations of the estimated parameters from the received signals and the microphone positions of each array are used as inputs to the artificial neural network (ANN). The estimated parameters are composed of time delay estimates (TDEs), source position estimates, distance estimates, and energy features. The outputs of the ANN are the source orientation (one of four possible orientations separated by 90 degrees, together with the best array, defined as the array nearest to the source) or the source position in two-dimensional/three-dimensional (2D/3D) space. This paper studies the position and orientation estimation performance of the ANN for different input/output combinations (and different numbers of hidden units). The best combination of parameters (TDEs and microphone positions) yields a 21.8% reduction in the average position error compared to the following baselines and a correct orientation ratio greater than 99%. Position localization baselines consist of a time-delay-of-arrival based method with an average position error of 34.1 cm and the steered response power with phase transform method with an average position error of 29.8 cm in 3D space.

Journal ArticleDOI
Xun Huang1
TL;DR: An innovative algorithm with real-time capability is proposed in this work; it is similar to a classical observer in the time domain, extended for array processing in the frequency domain.
Abstract: Acoustic phased arrays have become an important testing tool in aeroacoustic research, where the conventional beamforming algorithm has been adopted as a classical processing technique. The computation, however, has to be performed off-line due to its expensive cost. An innovative algorithm with real-time capability is proposed in this work. The algorithm is similar to a classical observer in the time domain, extended for array processing in the frequency domain. The observer-based algorithm is beneficial mainly for its capability of operating over sampling blocks recursively. The expensive experimental time can therefore be reduced extensively, since any defect in a test can be corrected instantaneously.
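
The recursive, block-by-block flavour of such processing can be sketched as below (the paper's observer formulation itself is not reproduced): an exponentially weighted cross-spectral-matrix update per frequency bin, after which a conventional beamforming map can be refreshed at any time during the measurement. Names and shapes are assumptions.

import numpy as np

def update_csm(csm, block, bin_idx, alpha=0.95):
    # block: (n_mics, block_len) time samples; csm: (n_mics, n_mics) running
    # cross-spectral matrix for one frequency bin, updated recursively.
    x = np.fft.rfft(block, axis=1)[:, bin_idx]
    return alpha * csm + (1 - alpha) * np.outer(x, x.conj())

def beamform_map(csm, steering):
    # Conventional beamforming output power per focus point;
    # steering: (n_points, n_mics).
    return np.real(np.einsum('gm,mn,gn->g', steering.conj(), csm, steering))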

Proceedings ArticleDOI
11 May 2009
TL;DR: In this article, the authors present a method to estimate the position of the microphones of a microphone array in the three-dimensional space using a global positioning system (GPS) based approach.
Abstract: The "delay and sum beamformer" algorithm ("DSB") is a powerful tool for the localisation and quantification of acoustic sources with microphone arrays. For the calculation of beamforming maps the DSB algorithm requires the following input data: time series of all microphones, a grid of focus points which includes the region of interest, parameters of the flow for boundary layer or shear layer corrections, and the accurate positions of all microphones. The present paper is focused on the last item: the accurate estimation of the microphone positions. Especially for aeroacoustic applications the number of microphones should be large enough in order to obtain good beamforming results. The estimation of accurate microphone positions can mean a huge, time-consuming effort. The method presented in this paper is similar to the well-known global positioning system: distances to satellites provide information about the position of a receiver. Here, several monopole-like acoustical point sources with known positions and a reference microphone installed close to the sound sources are used to compute the positions of the microphones of a microphone array in three-dimensional space. After pointing out the basic concepts and algorithms, a practical implementation of the test sources is described. Eight test sources and the reference microphone are integrated in a so-called calibration unit. Afterwards a calibration of a microphone array with known microphone positions is presented to verify the method and to assess the accuracy that can be achieved. Furthermore, the problem is addressed of how many test sources are necessary to achieve accurate results. Finally, the procedure is used to calibrate an out-of-flow microphone array with a layout of microphones whose positions are only known with some uncertainty. Investigations concerning the frequency dependence of the calibration are presented. Beamforming on a loudspeaker is performed to show how far more accurately known microphone positions can improve beamforming results, particularly in the higher frequency range.
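
A hedged sketch of the GPS-like calibration step for one microphone: with known test-source positions and a reference microphone close to the sources (which removes the unknown emission times), the microphone position follows from nonlinear least squares on the measured arrival-time differences. Names and the initial guess are illustrative.

import numpy as np
from scipy.optimize import least_squares

C = 343.0  # speed of sound (m/s)

def locate_microphone(src_pos, ref_pos, dtoa):
    # src_pos: (n_src, 3) known test-source positions; ref_pos: (3,) known
    # reference-microphone position; dtoa[k]: measured arrival-time difference
    # between this microphone and the reference for source k.
    def resid(p):
        return (np.linalg.norm(src_pos - p, axis=1)
                - np.linalg.norm(src_pos - ref_pos, axis=1)) / C - dtoa
    return least_squares(resid, x0=ref_pos + 0.1).x

# With enough well-placed sources the 3-D position is over-determined; the
# paper's calibration unit integrates eight test sources plus the reference.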

Patent
Cheng Yiou-Wen1, Hsi-Wen Nien1
21 Sep 2009
TL;DR: In this paper, a microphone array includes microphone units and a compensation module, which receives adjusted gains corresponding to the amplifier modules, obtains a gain difference between the adjusted gains, and adjusts one amplified signal according to the gain difference to obtain a compensated signal.
Abstract: An audio processing apparatus is provided. A microphone array includes microphone units. Amplifier modules each receive and amplify an input signal from one microphone unit to generate amplified signals. A compensation module receives adjusted gains corresponding to the amplifier modules, obtains a gain difference between the adjusted gains, and adjusts one amplified signal according to the gain difference to obtain a compensated signal.

Journal ArticleDOI
TL;DR: A distributed, self-organizing algorithm for ground target tracking using an unattended acoustic sensor network is developed; it can dynamically select proper sensor nodes to form localization sensor groups that work as a virtual microphone array to perform energy-efficient target localization and tracking.

Proceedings ArticleDOI
04 Dec 2009
TL;DR: A spherical harmonics domain microphone array beamforming approach that unifies 3D multi-beam forming with tractable mainlobe levels, automatic multi-null steering, sidelobe control, and robustness control into one optimization framework, using a single spherical microphone array is proposed.
Abstract: A spherical harmonics domain microphone array beamforming approach is proposed. It unifies 3D multi-beam forming with tractable mainlobe levels, automatic multi-null steering, sidelobe control, and robustness control into one optimization framework, using a single spherical microphone array. The optimum array weights are designed by maintaining distortionless responses in multiple mainlobe directions and guaranteeing all sidelobes below given threshold values, while minimizing the beamformer output power. A weight vector norm constraint is also employed to improve the robustness of the beamformer. A convex optimization formulation is derived, and implemented by the second order cone programming (SOCP) method. Design examples demonstrate a satisfactory performance.
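
The sketch below covers only the distortionless multi-mainlobe part of such a design, solved in closed form (LCMV in the spherical-harmonics domain) with diagonal loading standing in for the norm constraint; the sidelobe ceilings that the paper enforces via second order cone programming are omitted. All names are assumptions.

import numpy as np
from scipy.special import sph_harm

def sh_steering(order, az, pol):
    return np.array([sph_harm(m, n, az, pol)
                     for n in range(order + 1) for m in range(-n, n + 1)])

def multibeam_weights(R, mainlobes, order, loading=1e-2):
    # R: SH-domain covariance matrix; mainlobes: list of (az, pol) directions.
    # Minimise w^H R w subject to unit gain in every mainlobe direction.
    A = np.stack([sh_steering(order, az, pol) for az, pol in mainlobes], axis=1)
    g = np.ones(A.shape[1])
    Rl = R + loading * np.real(np.trace(R)) / R.shape[0] * np.eye(R.shape[0])
    Ri_A = np.linalg.solve(Rl, A)
    return Ri_A @ np.linalg.solve(A.conj().T @ Ri_A, g)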

Patent
03 Jun 2009
TL;DR: A voice enhancement method combining post-filtering based on nested subarrays with spectral subtraction is proposed for indoor and in-vehicle environments, addressing the wide bandwidth of speech, the inconsistent frequency response of microphone-array-based multi-channel enhancement across that bandwidth, and the correlation among channel noises in real noise fields.
Abstract: The invention discloses a voice enhancement method that combines post-filtering based on nested subarrays with spectral subtraction, suitable for indoor environments and for the enhancement of multi-channel speech signals in vehicles. Because speech signals are wideband, microphone-array-based multi-channel enhancement methods respond inconsistently across frequency, and the channel noises are correlated in real noise-field environments, the speech signals are collected with a microphone array composed of nested subarrays with different inter-element spacings. The signals formed by the subarray beams are divided into a high-frequency band and a low-frequency band, and different voice-enhancement algorithms are applied to each; their advantages are complementary, thus improving the effect of voice enhancement.
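
The spectral-subtraction component applied to one band can be sketched in single-channel form as below; parameter values and names are illustrative.

import numpy as np

def spectral_subtraction(frame, noise_mag, floor=0.02):
    # frame: one analysis frame; noise_mag: estimated noise magnitude spectrum
    # (same length as the rfft of the frame).  Subtract the noise magnitude,
    # keep a small spectral floor, and resynthesise with the noisy phase.
    X = np.fft.rfft(frame)
    mag = np.maximum(np.abs(X) - noise_mag, floor * np.abs(X))
    return np.fft.irfft(mag * np.exp(1j * np.angle(X)), len(frame))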

Proceedings Article
01 Aug 2009
TL;DR: Five methods for direction estimation based on sound intensity vectors are compared using real data from a concert hall, and the results indicate that the methods based on convolutive mixture models perform slightly better than some of the simple averaging methods.
Abstract: The direction of a sound source in an enclosure can be estimated with a microphone array and some proper signal processing. Earlier, time delay estimation methods such as the cross correlation were popular in applications and in research. Recently, techniques for direction estimation that involve sound intensity vectors have been developed and used in applications, e.g. in teleconferencing. Unlike time delay estimation methods, these methods have not been compared widely. In this article, five methods for direction estimation based on sound intensity vectors are compared using real data from a concert hall. The results of the comparison indicate that the methods based on convolutive mixture models perform slightly better than some of the simple averaging methods. The convolutive mixture model based methods are also more robust against additive noise.
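
One of the simple averaging methods compared in the paper can be sketched as follows: average the pressure-velocity products of first-order (B-format-like) signals and take the angle of the resulting intensity vector. Signal names are assumptions, and the convolutive-mixture-model-based methods are not reproduced here.

import numpy as np

def intensity_doa(w, x, y):
    # w: omnidirectional (pressure) channel; x, y: dipole (velocity) channels.
    Ix = np.mean(w * x)     # time-averaged intensity component along x
    Iy = np.mean(w * y)     # time-averaged intensity component along y
    return np.degrees(np.arctan2(Iy, Ix))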

Proceedings ArticleDOI
11 May 2009
TL;DR: In this article, an impedance eduction technique for ducts with plane waves at the source and duct termination planes was extended to support higher-order modes at these locations, and a microphone array located in a wall either adjacent or opposite to the test liner was presented.
Abstract: An impedance eduction technique, previously validated for ducts with plane waves at the source and duct termination planes, has been extended to support higher-order modes at these locations. Inputs for this method are the acoustic pressures along the source and duct termination planes, and along a microphone array located in a wall either adjacent or opposite to the test liner. A second impedance eduction technique is then presented that eliminates the need for the microphone array. The integrity of both methods is tested using three sound sources, six Mach numbers, and six selected frequencies. Results are presented for both a hardwall and a test liner (with known impedance) consisting of a perforated plate bonded to a honeycomb core. The primary conclusion of the study is that the second method performs well in the presence of higher-order modes and flow. However, the first method performs poorly when most of the microphones are located near acoustic pressure nulls. The negative effects of the acoustic pressure nulls can be mitigated by a judicious choice of the mode structure in the sound source. The paper closes by using the first impedance eduction method to design a rectangular array of 32 microphones for accurate impedance eduction in the NASA LaRC Curved Duct Test Rig in the presence of expected measurement uncertainties, higher order modes, and mean flow.