
Showing papers on "Microphone array published in 2013"


Journal ArticleDOI
TL;DR: DEMAND (Diverse Environments Multi-channel Acoustic Noise Database) provides a set of 16-channel noise files recorded in a variety of indoor and outdoor settings, to encourage research into algorithms beyond the stereo setup.
Abstract: Multi-microphone arrays allow for the use of spatial filtering techniques that can greatly improve noise reduction and source separation. However, for speech and audio data, work on noise reduction or separation has focused primarily on one- or two-channel systems. Because of this, databases of multichannel environmental noise are not widely available. DEMAND (Diverse Environments Multi-channel Acoustic Noise Database) addresses this problem by providing a set of 16-channel noise files recorded in a variety of indoor and outdoor settings. The data was recorded using a planar microphone array consisting of four staggered rows, with the smallest distance between microphones being 5 cm and the largest being 21.8 cm. DEMAND is freely available under a Creative Commons license to encourage research into algorithms beyond the stereo setup.

413 citations


Journal ArticleDOI
TL;DR: This work shows how to compute the shape of a convex polyhedral room from its response to a known sound recorded by a few microphones, reconstructing the full 3D geometry of the room from a single sound emission and with an arbitrary geometry of the microphone array.
Abstract: Imagine that you are blindfolded inside an unknown room. You snap your fingers and listen to the room’s response. Can you hear the shape of the room? Some people can do it naturally, but can we design computer algorithms that hear rooms? We show how to compute the shape of a convex polyhedral room from its response to a known sound, recorded by a few microphones. Geometric relationships between the arrival times of echoes enable us to “blindfoldedly” estimate the room geometry. This is achieved by exploiting the properties of Euclidean distance matrices. Furthermore, we show that under mild conditions, first-order echoes provide a unique description of convex polyhedral rooms. Our algorithm starts from the recorded impulse responses and proceeds by learning the correct assignment of echoes to walls. In contrast to earlier methods, the proposed algorithm reconstructs the full 3D geometry of the room from a single sound emission, and with an arbitrary geometry of the microphone array. As long as the microphones can hear the echoes, we can position them as we want. Besides answering a basic question about the inverse problem of room acoustics, our results find applications in areas such as architectural acoustics, indoor localization, virtual reality, and audio forensics.
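The echo-sorting idea above rests on the image-source model: a first-order reflection behaves like a mirror image of the source across the wall plane, so each echo's arrival time encodes the distance from a microphone to one such image source. A minimal sketch of that relation (not the paper's EDM-based algorithm; the wall plane, positions, and speed of sound are illustrative assumptions):

```python
import numpy as np

def image_source(src, wall_point, wall_normal):
    """Mirror a source across a wall plane (first-order image source)."""
    n = np.asarray(wall_normal, dtype=float)
    n = n / np.linalg.norm(n)
    d = float(np.dot(np.asarray(src, dtype=float) - wall_point, n))
    return src - 2.0 * d * n

src = np.array([1.0, 2.0, 1.5])
mic = np.array([3.0, 1.0, 1.2])
# wall: the plane x = 0 with normal (1, 0, 0)
img = image_source(src, np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]))
c = 343.0  # speed of sound, m/s (illustrative)
echo_delay = np.linalg.norm(mic - img) / c  # first-order echo arrival time (s)
```

Measuring `echo_delay` at several microphones constrains the image-source position, which is what the paper's Euclidean-distance-matrix test exploits to assign echoes to walls.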

241 citations


Journal ArticleDOI
TL;DR: A multiple sound source localization and counting method is presented that imposes relaxed sparsity constraints on the source signals. The method is shown to have excellent performance for DOA estimation and source counting, and to be highly suitable for real-time applications due to its low complexity.
Abstract: In this work, a multiple sound source localization and counting method is presented that imposes relaxed sparsity constraints on the source signals. A uniform circular microphone array is used to overcome the ambiguities of linear arrays; however, the underlying concepts (sparse component analysis and matching pursuit-based operation on the histogram of estimates) are applicable to any microphone array topology. Our method is based on detecting time-frequency (TF) zones where one source is dominant over the others. Using appropriately selected TF components in these “single-source” zones, the proposed method jointly estimates the number of active sources and their corresponding directions of arrival (DOAs) by applying a matching pursuit-based approach to the histogram of DOA estimates. The method is shown to have excellent performance for DOA estimation and source counting, and to be highly suitable for real-time applications due to its low complexity. Through simulations (in various signal-to-noise ratio conditions and reverberant environments) and real environment experiments, we show that our method outperforms other state-of-the-art DOA and source counting methods in terms of accuracy, while being significantly more efficient in terms of computational complexity.
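The histogram-plus-matching-pursuit step can be caricatured as greedy peak picking: repeatedly take the largest histogram peak as a source DOA, remove a window around it, and stop when little mass remains — which yields the source count for free. A toy sketch under assumed parameters (bin width, removal window, and stopping ratio are illustrative, not the paper's values):

```python
import numpy as np

def count_and_localize(doa_estimates, bin_width=5.0, window=15.0, stop_ratio=0.1):
    """Greedy (matching-pursuit-style) peak picking on a DOA histogram.

    doa_estimates: per time-frequency-point DOA estimates in degrees [0, 360).
    Returns the detected source DOAs (bin centers, degrees), sorted.
    """
    bins = np.arange(0.0, 360.0 + bin_width, bin_width)
    hist, _ = np.histogram(doa_estimates, bins=bins)
    hist = hist.astype(float)
    total = hist.sum()
    centers = bins[:-1] + bin_width / 2.0
    sources = []
    while hist.sum() > stop_ratio * total:
        k = int(np.argmax(hist))
        sources.append(centers[k])
        # remove the peak's contribution (circular angular window around it)
        diff = np.abs(centers - centers[k])
        dist = np.minimum(diff, 360.0 - diff)
        hist[dist <= window] = 0.0
    return sorted(sources)

# two simulated sources at 60 and 200 degrees, with estimation noise
rng = np.random.default_rng(1)
doas = np.concatenate((rng.normal(60.0, 3.0, 400), rng.normal(200.0, 3.0, 300)))
found = count_and_localize(np.mod(doas, 360.0))
```

The real method operates only on single-source TF zones and uses a proper atom-removal step, but the joint count-and-DOA output has this flavor.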

218 citations


Book
30 Apr 2013
TL;DR: In this comprehensive treatment of microphone arrays, the topics covered include an introduction to the theory, far-field and near-field array signal processing algorithms, practical implementations, and common applications: vehicles, computing and communications equipment, compressors, fans, household appliances, and hands-free speech.
Abstract: Presents a unified framework of far-field and near-field array techniques for noise source identification and sound field visualization, from theory to application. Acoustic Array Systems: Theory, Implementation, and Application provides an overview of microphone array technology with applications in noise source identification and sound field visualization. In the comprehensive treatment of microphone arrays, the topics covered include an introduction to the theory, far-field and near-field array signal processing algorithms, practical implementations, and common applications: vehicles, computing and communications equipment, compressors, fans, household appliances, and hands-free speech. The author concludes with other emerging techniques and innovative algorithms. The book encompasses theoretical background, implementation considerations, and application know-how; shows how to tackle broader problems in signal processing, control, and transducers; covers both far-field and near-field techniques in a balanced way; and introduces innovative algorithms, including equivalent source imaging (NESI) and high-resolution near-field arrays. Selected code examples are available for download for readers to practice on their own, and presentation slides are available for instructor use. A valuable resource for postgraduates and researchers in acoustics, noise control engineering, audio engineering, and signal processing.

80 citations


Journal ArticleDOI
TL;DR: The integration of the ManyEars Library with Willow Garage’s Robot Operating System is presented and the customized microphone board and sound card distributed as an open hardware solution for implementation of robotic audition systems are introduced.
Abstract: ManyEars is an open framework for microphone array-based audio processing. It consists of a sound source localization, tracking and separation system that can provide an enhanced speaker signal for improved speech and sound recognition in real-world settings. ManyEars software framework is composed of a portable and modular C library, along with a graphical user interface for tuning the parameters and for real-time monitoring. This paper presents the integration of the ManyEars Library with Willow Garage's Robot Operating System. To facilitate the use of ManyEars on various robotic platforms, the paper also introduces the customized microphone board and sound card distributed as an open hardware solution for implementation of robotic audition systems.

77 citations


Journal ArticleDOI
TL;DR: The GSC-form implementation, by separating the constraints and the minimization, enables adaptation of the beamformer during speech-absent time segments, and relaxes the requirement of other distributed LCMV-based algorithms to re-estimate the sources' RTFs after each iteration.
Abstract: This paper proposes a distributed multiple constraints generalized sidelobe canceler (GSC) for speech enhancement in an N-node fully connected wireless acoustic sensor network (WASN) comprising M microphones. Our algorithm is designed to operate in reverberant environments with constrained speakers (including both desired and competing speakers). Rather than broadcasting M microphone signals, a significant communication bandwidth reduction is obtained by performing local beamforming at the nodes and utilizing only N + P transmission channels. Each node processes its own microphone signals together with the N + P transmitted signals. The GSC-form implementation, by separating the constraints and the minimization, enables adaptation of the beamformer (BF) during speech-absent time segments, and relaxes the requirement of other distributed LCMV-based algorithms to re-estimate the sources' relative transfer functions (RTFs) after each iteration. We provide a full proof of convergence of the proposed structure to the centralized GSC-BF. An extensive experimental study with both narrowband and wideband speech signals verifies the theoretical analysis.

65 citations


Journal ArticleDOI
TL;DR: Simulation results proved that the proposed method effectively separated up to M(M-1)+1 sound sources if the fixed beamformers were appropriately selected, and was also effective in practical use.
Abstract: A method for separating underdetermined sound sources based on a novel power spectral density (PSD) estimation is proposed. The method enables up to M(M-1)+1 sources to be separated when we use a microphone array of M sensors and a Wiener post-filter calculated by the estimated PSDs. The PSD of a beamformer's output is modelled by a mixture of source PSDs multiplied by the beamformer's directivity gain in the particular angle where each source is located. Based on this model, the PSD of each sound source is estimated from the PSD of multiple fixed beamformers' outputs using the difference in the combination of directivity gains. Simulation results proved that the proposed method effectively separated up to M(M-1)+1 sound sources if the fixed beamformers were appropriately selected. Experiments were also conducted in a reverberant chamber to ensure the proposed method was also effective in practical use.
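Once the per-source PSDs are estimated from the fixed beamformers' outputs, the post-filter itself is a standard per-frequency Wiener gain. A minimal sketch of that final step (the gain floor is an illustrative assumption, not part of the paper):

```python
import numpy as np

def wiener_gain(psd_target, psd_interference, floor=1e-3):
    """Per-frequency Wiener post-filter gain from estimated PSDs."""
    gain = psd_target / (psd_target + psd_interference + 1e-12)
    return np.maximum(gain, floor)  # a small floor limits musical noise

# toy per-frequency PSD estimates: target strong, equal, and absent
psd_s = np.array([4.0, 1.0, 0.0])
psd_n = np.array([1.0, 1.0, 1.0])
g = wiener_gain(psd_s, psd_n)  # applied to a beamformer output per bin
```

The separation quality then hinges entirely on how well the PSDs were estimated from the directivity-gain combinations, which is the paper's actual contribution.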

64 citations


Journal ArticleDOI
TL;DR: An experimental study of spatial sound perception using a spherical microphone array for sound recording and headphone-based binaural sound synthesis shows that a source is perceived as spatially sharper and more externalized when represented by a binaural stimulus reconstructed with a higher spherical harmonics order.
Abstract: The area of sound field synthesis has significantly advanced in the past decade, facilitated by the development of high-quality sound-field capturing and re-synthesis systems. Spherical microphone arrays are among the most recently developed systems for sound field capturing, enabling processing and analysis of three-dimensional sound fields in the spherical harmonics domain. In spite of these developments, a clear relation between sound fields recorded by spherical microphone arrays and their perception with a re-synthesis system has not yet been established, although some relation to scalar measures of spatial perception was recently presented. This paper presents an experimental study of spatial sound perception with the use of a spherical microphone array for sound recording and headphone-based binaural sound synthesis. Sound field analysis and processing is performed in the spherical harmonics domain with the use of head-related transfer functions and simulated enclosed sound fields. The effects of several factors, such as spherical harmonics order, frequency bandwidth, and spatial sampling, are investigated by applying the repertory grid technique to the results of the experiment, forming a clearer relation between sound-field capture with a spherical microphone array and its perception using binaural synthesis regarding space, frequency, and additional artifacts. The experimental study clearly shows that a source is perceived as spatially sharper and more externalized when represented by a binaural stimulus reconstructed with a higher spherical harmonics order. This effect is apparent from low spherical harmonics orders. Spatial aliasing, a result of sound field capturing with a finite number of microphones, introduces unpleasant artifacts, which increase with the degree of aliasing error.

60 citations


Journal Article
TL;DR: A method is proposed to localize an acoustic source within a frequency band from 100 Hz to 4 kHz in two dimensions using a microphone array, by calculating the direction of arrival (DOA) of the acoustic signals with a set of spatially separated microphones.
Abstract: Source localization is a very well established technique that has a wide range of applications, from remote sensing to the Global Positioning System. Sound source localization techniques are used in commercial applications like improving speech quality in hands-free telephony and video conferencing, and in military applications like SONAR, surveillance systems, and devices to locate the sources of artillery fire. A method is proposed to localize an acoustic source within a frequency band from 100 Hz to 4 kHz in two dimensions using a microphone array, by calculating the direction of arrival (DOA) of the acoustic signals. DOA estimation with a set of spatially separated microphones uses the phase information present in the signals: the time delays are estimated for each pair of microphones in the array, and from the known array geometry and the direction of arrival, the location of the source can be obtained.
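For a far-field source and a single microphone pair, the delay-to-angle relation is tau = d·sin(theta)/c, which the sketch below inverts (spacing and speed of sound are illustrative values, not the paper's setup):

```python
import math

def doa_from_tdoa(tau, spacing, c=343.0):
    """Far-field DOA (radians) from the time delay between one mic pair.

    Model: tau = spacing * sin(theta) / c.
    """
    arg = max(-1.0, min(1.0, c * tau / spacing))  # clip noisy estimates to the asin domain
    return math.asin(arg)

# a source at 30 degrees with 5 cm microphone spacing produces this delay:
tau = 0.05 * math.sin(math.radians(30.0)) / 343.0
theta_est = math.degrees(doa_from_tdoa(tau, 0.05))
```

With more than two microphones, the pairwise angles (or delays) are combined via the known array geometry, as the abstract describes.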

57 citations


Book
06 Aug 2013
TL;DR: In this paper, a deconvolution approach for the mapping of acoustic sources (DAMAS) was developed which decouples the array design and processing influence from the noise being measured, using a simple and robust algorithm.
Abstract: At the 2004 AIAA/CEAS Aeroacoustics Conference, a breakthrough in acoustic microphone array technology was reported by the authors. A Deconvolution Approach for the Mapping of Acoustic Sources (DAMAS) was developed which decouples the array design and processing influence from the noise being measured, using a simple and robust algorithm. For several prior airframe noise studies, it was shown to permit an unambiguous and accurate determination of acoustic source position and strength. As a follow-on effort, this paper examines the technique for three-dimensional (3D) applications. First, the beamforming ability of arrays of different size and design to focus longitudinally and laterally is examined for a range of source positions and frequencies. Advantage is found for larger array designs with higher-density microphone distributions towards the center. After defining a 3D grid generalized with respect to the array's beamforming characteristics, DAMAS is employed in simulated and experimental noise test cases. It is found that spatial resolution is much less sharp in the longitudinal direction in front of the array compared to side-to-side lateral resolution. 3D DAMAS becomes useful for sufficiently large arrays at sufficiently high frequency, but this can be a challenge to computational capabilities with regard to the required expanse and number of grid points. Also, larger arrays can strain the basic physical modeling assumptions that DAMAS and all traditional array methodologies use. An important experimental result is that turbulent shear layers can negatively impact attainable beamforming resolution. Still, the usefulness of 3D DAMAS is demonstrated by the measurement of landing gear noise source distributions in a difficult hard-wall wind tunnel environment.
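At its core, DAMAS solves a linear system Y = AX relating the measured beamforming map Y to the unknown source strengths X, where A is the array's point-spread function, using Gauss-Seidel sweeps with a non-negativity clamp. A toy sketch on a 3-point grid (the matrix values are illustrative, not from the paper):

```python
import numpy as np

def damas(A, y, sweeps=200):
    """Gauss-Seidel iteration with a non-negativity clamp (DAMAS-style).

    A: point-spread-function matrix (beamformer response at each grid point
       to a unit source at every grid point), with positive diagonal.
    y: conventional beamforming map, flattened over the grid.
    """
    x = np.zeros_like(y, dtype=float)
    for _ in range(sweeps):
        for n in range(len(y)):
            # residual for row n excluding x[n]'s own contribution
            r = y[n] - A[n] @ x + A[n, n] * x[n]
            x[n] = max(0.0, r / A[n, n])
    return x

# toy 3-point grid with mild sidelobe leakage between neighbours
A = np.array([[1.0, 0.3, 0.1],
              [0.3, 1.0, 0.3],
              [0.1, 0.3, 1.0]])
x_true = np.array([2.0, 0.0, 1.0])
y = A @ x_true            # simulated "measured" beamforming map
x_hat = damas(A, y)       # deconvolved source strengths
```

On realistic grids A is large and frequency-dependent, which is exactly the computational burden the abstract flags for 3D applications.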

55 citations


Journal ArticleDOI
TL;DR: The wrapped Kalman filter (WKF) is presented for tracking the azimuth of a speaker with a compact, 3-channel microphone array; modeling the state variable with a wrapped Gaussian distribution achieves a lower mean squared error than 2-D methods.
Abstract: We present the wrapped Kalman filter (WKF) for tracking the azimuth of a speaker with a compact, 3-channel microphone array. Traditional extended and unscented filters assume that the observation is a rotating vector in R^2. However, the azimuth inhabits a 1-D manifold: the unit circle. We model the state variable with a wrapped Gaussian distribution and show that this achieves a lower mean squared error than 2-D methods. We demonstrate the superior tracking performance of the WKF in simulated and real reverberant environments.
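The key trick is that the innovation (measurement minus prediction) is wrapped to (-pi, pi] before the Kalman update, so a track crosses the +/-pi boundary instead of swinging the long way around. A minimal scalar sketch (noise variances are illustrative; the paper's full WKF also handles the wrapped Gaussian's mixture components):

```python
import math

def wrap(a):
    # map an angle to (-pi, pi]
    return math.atan2(math.sin(a), math.cos(a))

def wrapped_kf_update(x, P, z, Q, R):
    """One predict/update step of a scalar Kalman filter on the circle.

    A jump from +179 to -179 degrees reads as a 2-degree move,
    not a 358-degree one, because the innovation is wrapped.
    """
    P = P + Q                      # predict (random-walk azimuth model)
    K = P / (P + R)                # Kalman gain
    x = wrap(x + K * wrap(z - x))  # wrapped innovation, wrapped state
    P = (1.0 - K) * P
    return x, P

# near the boundary: state at 3.1 rad, measurement at -3.1 rad
x, P = wrapped_kf_update(3.1, 0.1, -3.1, 0.01, 0.1)
```

An unwrapped filter would move the estimate toward 0 rad here; the wrapped update instead nudges it across +/-pi, which is the behavior the MSE comparison in the paper rewards.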

Journal ArticleDOI
TL;DR: In this article, acoustic array measurements performed in a cryogenic wind tunnel are described for various Reynolds numbers using a 9.24% Dornier-728 half model; the background noise of the empty test section was also measured within the range of the measurements performed on the half model.
Abstract: The measurement of airframe noise on small-scale models is well known and common practice in conventional wind tunnels. Since conventional wind tunnels cannot generally achieve full-scale Reynolds numbers, measurements during the development process of modern aircraft are often performed in cryogenic and/or pressurized wind tunnels, which are capable of higher Reynolds number flows. Thus, the characteristics of the moving fluid are better adapted to the scale model. At the DLR Institute of Aerodynamics and Flow Technology, the microphone array measurement technique was further developed to perform measurements in a cryogenic wind tunnel at temperatures down to 100 K. A microphone array consisting of 144 microphones was designed and constructed for this purpose. In this paper, acoustic array measurements performed in a cryogenic wind tunnel are described for various Reynolds numbers using a 9.24% Dornier-728 half model. Additionally, the background noise of the empty test section was measured within the range of the measurements performed on the Dornier-728 half model. Our results seem to indicate a Reynolds number dependency of the measured sound power for various sources.


Journal ArticleDOI
TL;DR: A modified Matching Pursuit algorithm is proposed to estimate the positions of a small set of virtual sources, exploiting the key property that the early reflections of room impulse responses are sparse in the time domain, within a framework of model-based Compressed Sensing.
Abstract: This paper deals with the interpolation of the Room Impulse Responses (RIRs) within a whole volume, from as few measurements as possible, and without the knowledge of the geometry of the room. We focus on the early reflections of the RIRs, that have the key property of being sparse in the time domain: this can be exploited in a framework of model-based Compressed Sensing. Starting from a set of RIRs randomly sampled in the spatial domain of interest by a 3D microphone array, we propose a modified Matching Pursuit algorithm to estimate the position of a small set of virtual sources. Then, the reconstruction of the RIRs at interpolated positions is performed using a projection onto a basis of monopoles, which correspond to the estimated virtual sources. An extension of the proposed algorithm allows the interpolation of the positions of both source and receiver, using the acquisition of four different source positions. This approach is validated both by numerical examples, and by experimental measurements using a 3D array with up to 120 microphones.

Journal ArticleDOI
01 Mar 2013
TL;DR: A hybrid descent method is proposed, consisting of a genetic algorithm together with a gradient-based method: the gradient-based method rapidly locates the optimal solution around the start point, while the genetic algorithm is used to jump out of local minima.
Abstract: In beamformer design, the microphone locations are often fixed and only the filter coefficients are varied in order to improve the noise reduction performance. However, the positions of the microphone elements play an important role in the overall performance and should be optimized at the same time. This nonlinear optimization problem is non-convex, and local search techniques might not yield the best result. This problem is addressed in this paper. A hybrid descent method is proposed which consists of a genetic algorithm together with a gradient-based method. The gradient-based method can locate the optimal solution rapidly around the start point, while the genetic algorithm is used to jump out of local minima. This hybrid method has the descent property and can help us to find the optimal placement for better beamformer design. Numerical examples are provided to demonstrate the effectiveness of the method.
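The two stages can be sketched with random multi-start standing in for the genetic stage (a deliberate simplification of the paper's method) and plain gradient descent as the local refinement, demonstrated on a double-well objective:

```python
import numpy as np

def grad_descent(f, df, x, lr=0.01, steps=500):
    # local stage: refine one candidate toward its nearest minimum
    for _ in range(steps):
        x = x - lr * df(x)
    return x

def hybrid_minimize(f, df, rng, pop=20, span=4.0):
    # global stage: random candidates stand in for the genetic search here
    candidates = rng.uniform(-span, span, pop)
    refined = [grad_descent(f, df, c) for c in candidates]
    return min(refined, key=f)   # keep the best refined candidate

def f(x):
    # double well: local minimum near x ~ 1.35, global minimum near x ~ -1.47
    return x**4 - 4.0 * x**2 + x

def df(x):
    return 4.0 * x**3 - 8.0 * x + 1.0

x_best = hybrid_minimize(f, df, np.random.default_rng(0))
```

A pure local search started near the wrong well would stall at the shallow minimum; the global stage is what lets the hybrid escape it, mirroring the paper's argument for combining the two.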

Patent
15 Mar 2013
TL;DR: In this paper, an accelerometer included in a pair of earbuds is used to detect vibration of the user's vocal cords, filtered by the vocal tract, based on vibrations in the bones and tissue of the user's head.
Abstract: A method of improving voice quality in a mobile device starts by receiving acoustic signals from microphones included in earbuds and the microphone array included on a headset wire. The headset may include the pair of earbuds and the headset wire. An output from an accelerometer that is included in the pair of earbuds is then received. The accelerometer may detect vibration of the user's vocal cords, filtered by the vocal tract, based on vibrations in bones and tissue of the user's head. A spectral mixer included in the mobile device may then perform spectral mixing of the scaled output from the accelerometer with the acoustic signals from the microphone array to generate a mixed signal. Performing spectral mixing includes scaling the output from the inertial sensor by a scaling factor based on a power ratio between the acoustic signals from the microphone array and the output from the inertial sensor. Other embodiments are also described.
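The power-ratio scaling described above can be sketched as matching the accelerometer's mean power to that of the microphone array signal before the two are mixed (a simplified interpretation for illustration, not the patent's exact procedure):

```python
import numpy as np

def power_match_scale(mic, acc):
    """Scale factor aligning the accelerometer's level with the mic array's.

    Returns (alpha, alpha * acc) such that the scaled accelerometer
    signal has the same mean power as the microphone signal.
    """
    p_mic = np.mean(mic ** 2)
    p_acc = np.mean(acc ** 2) + 1e-12   # guard against silence
    alpha = np.sqrt(p_mic / p_acc)
    return alpha, alpha * acc

mic = np.array([0.5, -0.5, 0.5, -0.5])   # mean power 0.25
acc = np.array([0.1, -0.1, 0.1, -0.1])   # mean power 0.01
alpha, acc_scaled = power_match_scale(mic, acc)
```

In the patent's framing, the low-frequency band (where bone conduction is reliable) would come from the scaled accelerometer and the rest from the microphones.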

Journal ArticleDOI
TL;DR: A localization algorithm based on discrimination of cross-correlation functions is proposed that provides higher localization accuracy than the SRP-PHAT algorithm in reverberant, noisy environments.

Journal ArticleDOI
TL;DR: The proposed time- and frequency-domain adaptive algorithms based on the cross-relation formulation are regularized by incorporating an l1-norm sparseness constraint, which is realized using a split Bregman method.
Abstract: Localization of early room reflections can be achieved by estimating the time-differences-of-arrival (TDOAs) of reflected waves between elements of a microphone array. For an unknown source, we propose to apply sparse blind system identification (BSI) methods to identify the acoustic impulse responses, from which the TDOAs of temporally sparse reflections are estimated. The proposed time- and frequency-domain adaptive algorithms based on the cross-relation formulation are regularized by incorporating an l1-norm sparseness constraint, which is realized using a split Bregman method. These algorithms are shown to outperform standard cross-relation-based BSI techniques when estimating TDOAs of reflections in the presence of background noise.
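The l1 subproblem inside a split Bregman iteration reduces to elementwise soft thresholding (shrinkage), the proximal operator of the l1 norm, which is what drives the identified impulse responses toward temporal sparsity. A minimal sketch of that one step:

```python
import numpy as np

def soft_threshold(v, t):
    """Shrinkage operator: the proximal map of t * ||.||_1.

    Entries with magnitude below t are zeroed; the rest shrink toward zero.
    """
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# toy impulse-response estimate: two real taps buried in small noise
h = np.array([0.05, 1.2, -0.02, -0.9, 0.01])
h_sparse = soft_threshold(h, 0.1)
```

In the full algorithm this shrinkage alternates with a least-squares (cross-relation) update and a Bregman variable update; only the shrinkage step is shown here.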

Proceedings Article
01 Sep 2013
TL;DR: Experimental results show that a system combining the proposed equalizer and the adaptive beamformer improves noise reduction performance without degrading the speech quality.
Abstract: This paper discusses using dereverberation technologies to improve the robustness of beamforming microphone arrays against reverberation. It is known that, while conventional adaptive beamformers are able to effectively cancel the interference from spatially separate noise sources in the absence of reverberation, their efficacy deteriorates in reverberant rooms. To provide a widely applicable solution to this problem, we propose an adaptive multiple point blind equalizer that can shorten all the room impulse responses between the sound sources and microphones. This algorithm reduces the impact of reverberation on the interference cancellation performance when it is used to pre-process the microphone signals prior to beamforming. Experimental results using real conversation data show that a system combining the proposed equalizer and the adaptive beamformer improves noise reduction performance without degrading the speech quality.

Journal ArticleDOI
TL;DR: In this paper, an array of piezoelectric monocrystalline silicon microphones for audio-range acoustic sensing is presented; the ill effects of residual stress, which causes deformation in large compliant MEMS structures, are mitigated through stress-compensating layer thicknesses and a stress-free monocrystalline diaphragm.
Abstract: We report an array of piezoelectric monocrystalline silicon microphones for audio-range acoustic sensing. Thirteen cantilever-type diaphragm transducers make up the array, each having a closely spaced and precisely controlled resonant frequency. These overlapping resonances serve to greatly boost the sensitivity of the array when the signals are added; if the signals are taken individually, the array acts as a physical filter bank with a quality factor over 40. Such filtering would enhance the performance and the efficiency of speech-recognition systems. In the “summing mode,” the array demonstrates high response over a large bandwidth, with unamplified sensitivity greater than 2.5 mV/Pa from 240 Hz to 6.5 kHz. Both modes of operation rely on the precise control of resonant frequencies, often a challenge with large compliant microelectromechanical-system (MEMS) structures, where residual stress causes deformation. We mitigate these ill effects through the use of stress-compensating layer thicknesses and a stress-free monocrystalline diaphragm. For determining device geometry, we develop a simple analytical method that yields excellent agreement between designed and measured resonant frequency; all devices are within 4.5%, and four are within 0.5% (just several hertz). The technique could be useful not only for microphones but also for other low-frequency MEMS transducers designed for resonance operation at a specific frequency.

Patent
06 May 2013
TL;DR: In this article, a processor is configured to determine the social interaction between the participants based on the similarities between the first spatially filtered output and each of the second spatially filtering outputs.
Abstract: A system which performs social interaction analysis for a plurality of participants includes a processor. The processor is configured to determine a similarity between a first spatially filtered output and each of a plurality of second spatially filtered outputs. The processor is configured to determine the social interaction between the participants based on the similarities between the first spatially filtered output and each of the second spatially filtered outputs and display an output that is representative of the social interaction between the participants. The first spatially filtered output is received from a fixed microphone array, and the second spatially filtered outputs are received from a plurality of steerable microphone arrays each corresponding to a different participant.

Journal ArticleDOI
TL;DR: A general approach to acoustic scene analysis is proposed, based on a novel data structure (the ray-space image) that encodes the directional plenacoustic function over a line segment (Observation Window, OW), together with a method for acquiring it using a microphone array.
Abstract: In this work we propose a general approach to acoustic scene analysis based on a novel data structure (ray-space image) that encodes the directional plenacoustic function over a line segment (Observation Window, OW). We define and describe a system for acquiring a ray-space image using a microphone array and refer to it as ray-space (or “soundfield”) camera. The method consists of acquiring the pseudo-spectra corresponding to a grid of sampling points over the OW, and remapping them onto the ray space, which parameterizes acoustic paths crossing the OW. The resulting ray-space image displays the information gathered by the sensors in such a way that the elements of the acoustic scene (sources and reflectors) will be easy to discern, recognize and extract. The key advantage of this method is that ray-space images, irrespective of the application, are generated by a common (and highly parallelizable) processing layer, and can be processed using methods coming from the extensive literature of pattern analysis. After defining the ideal ray-space image in terms of the directional plenacoustic function, we show how to acquire it using a microphone array. We also discuss resolution and aliasing issues and show two simple examples of applications of ray-space imaging.

Journal ArticleDOI
TL;DR: This work presents a least squares method for temporal offset estimation of a static ad-hoc microphone array that utilizes the captured audio content without the need to emit calibration signals, provided that during the recording a sufficient number of sound sources surround the array.
Abstract: In recent years ad-hoc microphone arrays have become ubiquitous, and the capture hardware and quality are increasingly sophisticated. Ad-hoc arrays hold a vast potential for audio applications, but they are inherently asynchronous, i.e., a temporal offset exists in each channel, and furthermore the device locations are generally unknown. Therefore, the data is not directly suitable for traditional microphone array applications such as source localization and beamforming. This work presents a least squares method for temporal offset estimation of a static ad-hoc microphone array. The method utilizes the captured audio content without the need to emit calibration signals, provided that during the recording a sufficient number of sound sources surround the array. The Cramér-Rao lower bound of the estimator is given, and the effect of a limited number of surrounding sources on the solution accuracy is investigated. A practical implementation is then presented using non-linear filtering with automatic parameter adjustment. Simulations over a range of reverberation and noise levels demonstrate the algorithm's robustness. Using smartphones, an average RMS error of 3.5 samples (at 48 kHz) was reached when the algorithm's assumptions were met.

Journal ArticleDOI
TL;DR: This paper proposes a real-time method for capturing and reproducing spatial audio based on a circular microphone array that estimates the directions of arrival of the active sound sources on a per time-frame basis and performs source separation with a fixed superdirective beamformer, which results in more accurate modelling and reproduction of the recorded acoustic environment.
Abstract: This paper proposes a real-time method for capturing and reproducing spatial audio based on a circular microphone array. Following a different approach than other recently proposed array-based methods for spatial audio, the proposed method estimates the directions of arrival of the active sound sources on a per time-frame basis and performs source separation with a fixed superdirective beamformer, which results in more accurate modelling and reproduction of the recorded acoustic environment. The separated source signals are downmixed into one monophonic audio signal, which, along with side information, is transmitted to the reproduction side. Reproduction is possible using either headphones or an arbitrary loudspeaker configuration. The method is compared with other recently proposed array-based spatial audio methods through a series of listening tests for both simulated and real microphone array recordings. Reproduction using both loudspeakers and headphones is considered in the listening tests. As the results indicate, the proposed method achieves excellent spatialization and sound quality.

Patent
02 Jan 2013
TL;DR: In this paper, the authors proposed a sound source locating method and device for the technical field of sound processing, which comprises the steps of: collecting sound source signals by utilizing a microphone array and preprocessing the sound source signal collected by any two microphones, confirming a cross-power spectral density function of the two sound sources signals; confirming a weighting function adjusted along with the variation of the present signal to noise ratio; confirming the time delay of sound source singles to two microphones according to the maximum value of the cross-correlation function.
Abstract: The invention belongs to the technical field of sound processing and provides a sound source locating method and device. The method comprises the steps of: collecting sound source signals by utilizing a microphone array and preprocessing the sound source signals collected by any two microphones; determining a cross-power spectral density function of the two sound source signals; determining a weighting function that is adjusted along with the variation of the present signal-to-noise ratio; determining a sequence of values of the cross-correlation function of the two sound source signals according to the cross-power spectral density function and the weighting function; determining the time delay of the sound source signals to the two microphones according to the maximum value of the cross-correlation function; and locating the sound source position according to the arrangement of the microphone array and the time delays of the sound source signals to any two microphones. Because the adopted weighting function is adjusted along with the variation of the present signal-to-noise ratio, the time delay of the sound source can be accurately obtained even when the signal-to-noise ratio changes, and the sound source locating accuracy is therefore improved.
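The pipeline the abstract describes (cross-power spectral density, a weighting function, then the peak of the resulting cross-correlation) is the classic generalized cross-correlation scheme. A minimal sketch with the common PHAT weighting follows; note this uses a fixed weighting, whereas the patent's point is an SNR-adaptive one, and the function name is ours.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    # Generalized cross-correlation with PHAT weighting: returns the delay
    # of `sig` relative to `ref` in seconds (positive if sig lags ref).
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)       # cross-power spectral density
    R /= np.abs(R) + 1e-12       # PHAT weighting: keep only the phase
    cc = np.fft.irfft(R, n=n)    # cross-correlation sequence
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # re-centre so index max_shift corresponds to zero lag
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs
```

Passing a `max_tau` bound (e.g. the microphone spacing divided by the speed of sound) restricts the search to physically plausible delays.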


Patent
17 Dec 2013
TL;DR: In this article, a method and system are presented for directional enhancement of a microphone array comprising at least two microphones, based on analysis of the phase angle of the coherence between the microphone signals.
Abstract: Herein provided is a method and system for directional enhancement of a microphone array comprising at least two microphones by analysis of the phase angle of the coherence between at least two microphones. The method can further include communicating directional data with the microphone signal to a secondary device, and adjusting at least one parameter of the device in view of the directional data. Other embodiments are disclosed.
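To make the idea concrete: for a far-field source, the phase angle of the inter-microphone cross-spectrum grows linearly with frequency at a slope proportional to the inter-microphone delay, which maps to an arrival angle. The sketch below fits that slope (the function name, frequency band, and segment length are our illustrative choices, not the patent's):

```python
import numpy as np

def broadside_angle(x1, x2, fs, mic_dist, c=343.0, nseg=256):
    # DOA estimate from the phase of the averaged cross-spectrum of two mics.
    nblocks = len(x1) // nseg
    X1 = np.fft.rfft(x1[:nblocks * nseg].reshape(nblocks, nseg), axis=1)
    X2 = np.fft.rfft(x2[:nblocks * nseg].reshape(nblocks, nseg), axis=1)
    cpsd = (X1 * np.conj(X2)).mean(axis=0)        # averaged cross-spectrum
    freqs = np.fft.rfftfreq(nseg, 1.0 / fs)
    band = (freqs > 200.0) & (freqs < 2000.0)     # skip DC; limit the band
    phase = np.unwrap(np.angle(cpsd[band]))
    slope = np.polyfit(freqs[band], phase, 1)[0]  # d(phase)/d(freq) = 2*pi*tau
    tau = slope / (2.0 * np.pi)                   # inter-microphone delay (s)
    sin_theta = np.clip(tau * c / mic_dist, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))       # angle from broadside
```

A directional-enhancement system could then keep or attenuate time-frequency bins according to whether this per-bin phase is consistent with the desired direction.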

Proceedings ArticleDOI
26 May 2013
TL;DR: This paper proposes a method for estimating unknown time delays from Time-Difference-of-Arrival (TDOA) measurements, based on a novel rank constraint on a matrix that depends on the measurements and the unknown time delays, and recovers the time delays with good accuracy for noisy and missing data.
Abstract: Measurements with unknown time delays are common in different applications, such as microphone array and radio antenna array calibration, where the sources (e.g. sounds) are transmitted at unknown time instants. In this paper, we present a method for estimating unknown time delays from Time-Difference-of-Arrival (TDOA) measurements. We propose a novel rank constraint on a matrix depending on the measurements and the unknown time delays. The time delays are recovered by solving a truncated nuclear norm minimization problem using the alternating direction method of multipliers (ADMM). We show in synthetic experiments that the proposed method recovers the time delays with good accuracy for noisy and missing data.
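The paper's specific rank constraint and ADMM formulation are not reproduced here, but the underlying idea of relaxing a rank constraint into a nuclear-norm objective, solved with singular-value soft-thresholding (the proximal operator of the nuclear norm), can be sketched on a toy low-rank completion problem with missing data. All function names and parameters below are illustrative, and the solver is plain proximal gradient rather than ADMM:

```python
import numpy as np

def nuclear_norm_complete(M, mask, lam=0.2, step=1.0, iters=1000):
    # Minimise lam*||X||_* + 0.5*||mask*(X - M)||_F^2 by proximal gradient:
    # a gradient step on the data-fit term, then singular-value
    # soft-thresholding (the proximal operator of the nuclear norm).
    X = np.zeros_like(M)
    for _ in range(iters):
        G = X - step * mask * (X - M)                    # gradient step
        U, s, Vt = np.linalg.svd(G, full_matrices=False)
        X = (U * np.maximum(s - step * lam, 0.0)) @ Vt   # shrink spectrum
    return X
```

The same shrinkage step appears as one sub-problem inside an ADMM splitting, where it alternates with a data-consistency update and a dual update.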


Proceedings ArticleDOI
26 May 2013
TL;DR: The proposed calibration algorithm is derived from a maximum-likelihood approach employing circular statistics; since a sensor node consists of a microphone array with known intra-array geometry, the algorithm is able to obtain an absolute geometry estimate, including angles and distances.
Abstract: In this paper we propose an approach to retrieve the absolute geometry of an acoustic sensor network, consisting of spatially distributed microphone arrays, from reverberant speech input. The calibration relies on direction of arrival measurements of the individual arrays. The proposed calibration algorithm is derived from a maximum-likelihood approach employing circular statistics. Since a sensor node consists of a microphone array with known intra-array geometry, we are able to obtain an absolute geometry estimate, including angles and distances. Simulation results demonstrate the effectiveness of the approach.
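The circular-statistics ingredient can be illustrated in isolation: under a von Mises noise model, the maximum-likelihood estimate of a mean direction from noisy direction-of-arrival measurements is the angle of the (optionally weighted) resultant vector, a standard result. The function name below is ours, not the paper's:

```python
import numpy as np

def circular_mean(angles, weights=None):
    # ML mean direction under a von Mises model: the angle of the sum of
    # unit vectors, so measurements near +pi and -pi average to +/-pi
    # instead of cancelling to 0 as an arithmetic mean would.
    a = np.asarray(angles, dtype=float)
    w = np.ones_like(a) if weights is None else np.asarray(weights, dtype=float)
    return np.angle(np.sum(w * np.exp(1j * a)))
```

For example, two DOA estimates of 3.1 rad and -3.1 rad straddle the wrap-around point: their arithmetic mean is 0 (the opposite direction), while the circular mean is approximately pi, as intended.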