
Showing papers on "Acoustic source localization published in 2012"


Journal ArticleDOI
TL;DR: The performance of acoustic-FADE is evaluated using simulated fall and nonfall sounds performed by three stunt actors trained to behave like the elderly, under different environmental conditions; the system achieves 100% sensitivity at a specificity of 97%.
Abstract: More than a third of the elderly fall each year in the United States. It has been shown that the longer they lie on the floor, the poorer the outcome of the medical intervention. To reduce delay of the medical intervention, we have developed an acoustic fall detection system (acoustic-FADE) that automatically detects a fall and reports it promptly to the caregiver. Acoustic-FADE consists of a circular microphone array that captures the sounds in a room. When a sound is detected, acoustic-FADE locates the source, enhances the signal, and classifies it as “fall” or “nonfall.” The sound source is located using the steered response power with phase transform technique, which has been shown to be robust under noisy environments and resilient to reverberation effects. Signal enhancement is performed by the beamforming technique based on the estimated sound source location. Height information is used to increase the specificity. The mel-frequency cepstral coefficient features computed from the enhanced signal are utilized in the classification process. We have evaluated the performance of acoustic-FADE using simulated fall and nonfall sounds performed by three stunt actors trained to behave like the elderly under different environmental conditions. Using a dataset consisting of 120 falls and 120 nonfalls, acoustic-FADE achieves 100% sensitivity at a specificity of 97%.
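The localization step above, steered response power with phase transform (SRP-PHAT), can be sketched as follows. This is a minimal far-field illustration, not the acoustic-FADE implementation: the array geometry, azimuth grid, and function names are assumptions.

```python
import numpy as np

def gcc_phat(x, y):
    """PHAT-weighted cross-correlation: whiten the cross-spectrum so only
    phase (i.e., delay) information remains, then inverse-transform."""
    n = len(x) + len(y)
    R = np.fft.rfft(x, n) * np.conj(np.fft.rfft(y, n))
    return np.fft.irfft(R / (np.abs(R) + 1e-12), n)

def srp_phat(signals, mic_pos, fs, c=343.0, n_angles=360):
    """Steered response power: for each candidate azimuth, sum the PHAT
    correlations of all microphone pairs at the delays that azimuth
    implies; return the azimuth with maximum summed power."""
    angles = np.linspace(0.0, 2 * np.pi, n_angles, endpoint=False)
    power = np.zeros(n_angles)
    n = 2 * len(signals[0])
    for i in range(len(signals)):
        for j in range(i + 1, len(signals)):
            cc = gcc_phat(signals[i], signals[j])
            for k, a in enumerate(angles):
                d = np.array([np.cos(a), np.sin(a)])        # far-field direction
                tau = -((mic_pos[i] - mic_pos[j]) @ d) / c  # expected TDOA
                power[k] += cc[int(round(tau * fs)) % n]
    return angles[np.argmax(power)]
```

In acoustic-FADE the estimated location then steers a beamformer that enhances the signal before the MFCC-based fall/nonfall classification.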

275 citations


Journal ArticleDOI
TL;DR: For the first time, a technique is proposed to locate the acoustic source in large anisotropic plates with only six sensors, without knowing the direction-dependent velocity profile in the plate.

207 citations


Journal ArticleDOI
TL;DR: In this paper, an azimuthal ring array of six microphones, whose polar angle, θ, was progressively varied, allows the decomposition of the acoustic pressure into azimuthal Fourier modes.
Abstract: We present experimental results for the acoustic field of jets with Mach numbers between 0.35 and 0.6. An azimuthal ring array of six microphones, whose polar angle, θ, was progressively varied, allows the decomposition of the acoustic pressure into azimuthal Fourier modes. In agreement with past observations, the sound field for low polar angles (measured with respect to the jet axis) is found to be dominated by the axisymmetric mode, particularly at the peak Strouhal number. The axisymmetric mode of the acoustic field can be clearly associated with an axially non-compact source, in the form of a wavepacket: the sound pressure level for peak frequencies is found to be superdirective for all Mach numbers considered, with exponential decay as a function of (1 – M_c cos θ)^2, where M_c is the Mach number based on the phase velocity U_c of the convected wave. While the mode m = 1 spectrum scales with Strouhal number, suggesting that its energy content is associated with turbulence scales, the axisymmetric mode scales with Helmholtz number – the ratio between source length scale and acoustic wavelength. The axisymmetric radiation has a stronger velocity dependence than the higher-order azimuthal modes, again in agreement with predictions of wavepacket models. We estimate the axial extent of the source of the axisymmetric component of the sound field to be of the order of six to eight jet diameters. This estimate is obtained in two different ways, using, respectively, the directivity shape and the velocity exponent of the sound radiation. The analysis furthermore shows that compressibility plays a significant role in the wavepacket dynamics, even at this low Mach number. Velocity fluctuations on the jet centreline are reduced as the Mach number is increased, an effect that must be accounted for in order to obtain a correct estimation of the velocity dependence of sound radiation.
Finally, the higher-order azimuthal modes of the sound field are considered, and a model for the low-angle sound radiation by helical wavepackets is developed. The measured sound for azimuthal modes 1 and 2 at low Strouhal numbers is seen to correspond closely to the predicted directivity shapes.
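The azimuthal decomposition used above is simply a DFT of the ring-array pressures over azimuth; a sketch (array size and names are illustrative):

```python
import numpy as np

def azimuthal_modes(p_ring):
    """Decompose pressure samples from n equally spaced microphones on an
    azimuthal ring into Fourier modes a_m = (1/n) * sum_k p_k e^{-i m phi_k}.
    Rows index microphones; an optional second axis can carry time."""
    n = p_ring.shape[0]
    return np.fft.fft(p_ring, axis=0) / n
```

With six microphones, only modes |m| ≤ 2 are resolved without aliasing (m = 3 folds onto m = −3), which suffices here since the low-angle field is dominated by m = 0, 1, 2.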

195 citations


Journal ArticleDOI
TL;DR: This work investigates a multi-source localization framework in which monaural source segregation increases the robustness of azimuth estimates from a binaural input, showing that robust performance can be achieved with limited prior training.
Abstract: Sound source localization from a binaural input is a challenging problem, particularly when multiple sources are active simultaneously and reverberation or background noise is present. In this work, we investigate a multi-source localization framework in which monaural source segregation is used as a mechanism to increase the robustness of azimuth estimates from a binaural input. We demonstrate performance improvement relative to binaural-only methods assuming a known number of spatially stationary sources. We also propose a flexible azimuth-dependent model of binaural features that independently captures characteristics of the binaural setup and environmental conditions, allowing for adaptation to new environments or calibration to an unseen binaural setup. Results with both simulated and recorded impulse responses show that robust performance can be achieved with limited prior training.
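The basic binaural azimuth cue underlying such frameworks can be illustrated with a plain interaural-time-difference estimator. This is a simplification of the paper's azimuth-dependent feature model: the head is treated as two free-field microphones, and `ear_dist` is an assumed value.

```python
import numpy as np

def itd_azimuth(left, right, fs, ear_dist=0.18, c=343.0):
    """Estimate azimuth from the interaural time difference: cross-correlate
    the ear signals, take the peak lag within the physically possible range,
    and invert the far-field relation ITD = (d / c) * sin(azimuth).
    Positive azimuth means the source is on the right (right ear leads)."""
    n = len(left) + len(right)
    cc = np.fft.irfft(np.fft.rfft(left, n) * np.conj(np.fft.rfft(right, n)), n)
    max_lag = int(np.ceil(fs * ear_dist / c))
    lags = np.arange(-max_lag, max_lag + 1)
    itd = lags[np.argmax(cc[lags])] / fs  # cc wraps, so cc[-k] is lag -k
    return np.arcsin(np.clip(itd * c / ear_dist, -1.0, 1.0))
```

With multiple simultaneous sources this single global estimate breaks down, which is precisely where the paper's monaural segregation step comes in: azimuth is estimated only from time-frequency regions assigned to one source.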

131 citations


Journal ArticleDOI
TL;DR: Potential issues associated with scale-up of the structure are addressed, and the dynamic response of the array structures under discrete-frequency excitation is investigated using laser vibrometry to verify negative dynamic mass behavior.
Abstract: Metamaterials have emerged as promising solutions for manipulation of sound waves in a variety of applications. Locally resonant acoustic materials (LRAM) decrease sound transmission by 500% over acoustic mass law predictions at peak transmission loss (TL) frequencies with minimal added mass, making them appealing for weight-critical applications such as aerospace structures. In this study, potential issues associated with scale-up of the structure are addressed. TL of single-celled and multi-celled LRAM was measured using an impedance tube setup with systematic variation in geometric parameters to understand the effects of each parameter on acoustic response. Finite element analysis was performed to predict TL as a function of frequency for structures with varying complexity, including stacked structures and multi-celled arrays. Dynamic response of the array structures under discrete frequency excitation was investigated using laser vibrometry to verify negative dynamic mass behavior.

100 citations


Proceedings ArticleDOI
24 Dec 2012
TL;DR: A prototype system for auditory scene analysis based on the proposed MUltiple SIgnal Classification based on incremental Generalized EigenValue Decomposition (iGEVD-MUSIC) showed that dynamically changing noise is properly suppressed and that multiple human voice sources can be localized even when the AR.Drone is moving in an outdoor environment.
Abstract: This paper addresses auditory scene analysis, especially sound source localization using an aerial vehicle with a microphone array in an outdoor environment. Since such a vehicle is able to search sound sources quickly and widely, it is useful to detect outdoor sound sources, for instance, to find distressed people in a disaster situation. In such an environment, noise is quite loud and dynamically changing, and conventional microphone array techniques studied in the field of indoor robot audition are of less use. We thus propose MUltiple SIgnal Classification based on incremental Generalized EigenValue Decomposition (iGEVD-MUSIC). It can deal with high-power noise by introducing a noise correlation matrix and GEVD even when the signal-to-noise ratio is less than 0 dB. In addition, the noise correlation matrix is incrementally estimated to adapt to dynamic changes in noise. We developed a prototype system for auditory scene analysis based on the proposed method using the Parrot AR.Drone with a microphone array and a Kinect device. Experimental results using the prototype system showed that dynamically changing noise is properly suppressed with the proposed method and multiple human voice sources can be localized even when the AR.Drone is moving in an outdoor environment.
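The GEVD step at the heart of this method can be sketched as follows: the measured correlation matrix is whitened by a noise correlation matrix before the usual MUSIC subspace split, which is what lets the method cope with noise stronger than the signal. This is a batch illustration under assumed free-field linear-array steering vectors; the incremental noise-matrix update and the drone's measured transfer functions are omitted.

```python
import numpy as np

def gevd_music(R, K, steering, n_src):
    """MUSIC pseudospectrum with a generalized EVD: whiten the measured
    correlation matrix R by the noise correlation matrix K (via its
    Cholesky factor), split signal/noise subspaces as in standard MUSIC,
    and scan whitened steering vectors against the noise subspace."""
    L = np.linalg.cholesky(K)
    Li = np.linalg.inv(L)
    C = Li @ R @ Li.conj().T               # noise-whitened correlation
    vals, V = np.linalg.eigh(C)            # eigenvalues in ascending order
    En = V[:, : C.shape[0] - n_src]        # noise subspace
    P = np.empty(steering.shape[1])
    for k in range(steering.shape[1]):
        b = Li @ steering[:, k]            # whitened steering vector
        proj = En.conj().T @ b
        P[k] = (np.vdot(b, b) / np.vdot(proj, proj)).real
    return P
```

With K = I this reduces to standard (SEVD) MUSIC; the incremental variant of the paper replaces K by a running estimate that tracks the changing rotor and wind noise.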

95 citations


Journal ArticleDOI
TL;DR: A multi-target tracking (MTT) methodology that allows for an unknown and time-varying number of speakers in a fully probabilistic manner and, in doing so, does not resort to independent modules for new-target proposal or target-number estimation as in previous works.
Abstract: Particle filter-based acoustic source tracking algorithms track (online and in real time) the position of a sound source (a person speaking in a room) based on the current data from a distributed microphone array as well as the previously recorded data. This paper develops a multi-target tracking (MTT) methodology that allows for an unknown and time-varying number of speakers in a fully probabilistic manner and, in doing so, does not resort to independent modules for new-target proposal or target-number estimation as in previous works. The approach uses the concept of an existence grid to propose possible regions of activity before tracking is carried out with a variable-dimension particle filter, which also explicitly supports the concept of a null particle, containing no target states, when no speakers are active. Examples demonstrate typical tracking performance in a number of different scenarios with simultaneously active speech sources.
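The flavor of particle-filter tracking can be conveyed with a minimal bootstrap filter. This is a deliberate simplification of the paper's variable-dimension filter: a single target, a random-walk motion model, and a Gaussian likelihood around a noisy position observation stand in for the microphone-array likelihood, existence grid, and null particle.

```python
import numpy as np

def track(observations, n_particles=500, q=0.05, r=0.2, seed=1):
    """Bootstrap particle filter: predict with a random-walk motion model,
    weight particles by a Gaussian observation likelihood, estimate the
    position as the weighted mean, then resample."""
    rng = np.random.default_rng(seed)
    parts = rng.uniform(-3.0, 3.0, size=(n_particles, 2))  # room-sized prior
    est = np.empty((len(observations), 2))
    for t, z in enumerate(observations):
        parts = parts + rng.normal(0.0, q, parts.shape)    # predict
        w = np.exp(-np.sum((parts - z) ** 2, axis=1) / (2 * r ** 2))
        w = np.maximum(w, 1e-300)                          # guard degenerate weights
        w /= w.sum()
        est[t] = w @ parts                                 # weighted mean
        parts = parts[rng.choice(n_particles, n_particles, p=w)]  # resample
    return est
```

The paper's contribution is precisely what this sketch leaves out: letting the particle dimension (the number of tracked speakers) itself vary, including a null hypothesis of no active speaker.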

80 citations


Journal ArticleDOI
TL;DR: This paper describes a method of generating a controlled sound field for listeners inside a circular array of loudspeakers without appreciably disturbing people outside the array; the best compromise combines pure contrast maximization with a pressure-matching technique.
Abstract: This paper describes a method of generating a controlled sound field for listeners inside a circular array of loudspeakers without disturbing people outside the array appreciably. To achieve this objective, a double-layer array of loudspeakers is used. Several solution methods are suggested, and their performance is examined using computer simulations. Two performance indices are used in this work, (a) the level difference between the average sound energy density in the listening zone and that in the quiet zone (sometimes called “the acoustic contrast”), and (b) a normalized measure of the deviations between the desired and the generated sound field in the listening zone. It is concluded that the best compromise is obtained with a method that combines pure contrast maximization with a pressure matching technique.
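The "pure contrast maximization" ingredient has a compact linear-algebra core: given transfer matrices from the loudspeakers to points in the listening (bright) and quiet (dark) zones, the loudspeaker weights maximizing the energy ratio are the dominant eigenvector of a regularized generalized eigenproblem. A sketch (matrix names and the regularization constant are assumptions; the pressure-matching combination is omitted):

```python
import numpy as np

def contrast_weights(G_bright, G_dark, reg=1e-6):
    """Acoustic contrast control: maximize bright-zone energy
    w^H (Gb^H Gb) w relative to dark-zone energy w^H (Gd^H Gd + reg I) w,
    i.e., take the dominant eigenvector of (Gd^H Gd + reg I)^{-1} (Gb^H Gb)."""
    Rb = G_bright.conj().T @ G_bright
    Rd = G_dark.conj().T @ G_dark + reg * np.eye(G_dark.shape[1])
    vals, vecs = np.linalg.eig(np.linalg.solve(Rd, Rb))
    w = vecs[:, np.argmax(vals.real)]
    return w / np.linalg.norm(w)
```

Pure contrast maximization says nothing about *what* the bright zone hears, which is why the paper's best-performing method blends it with pressure matching toward a desired field.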

77 citations


Journal ArticleDOI
TL;DR: The present work derives the fundamental equations governing the acousto-optic effect in air, and demonstrates that it can be measured with a laser Doppler vibrometer in the audible frequency range.
Abstract: When sound propagates through a medium, it results in pressure fluctuations that change the instantaneous density of the medium. Under such circumstances, the refractive index that characterizes the propagation of light is not constant, but influenced by the acoustic field. This kind of interaction is known as the acousto-optic effect. The formulation of this physical phenomenon into a mathematical problem can be described in terms of the Radon transform, which makes it possible to reconstruct an arbitrary sound field using tomography. The present work derives the fundamental equations governing the acousto-optic effect in air, and demonstrates that it can be measured with a laser Doppler vibrometer in the audible frequency range. The tomographic reconstruction is tested by means of computer simulations and measurements. The main features observed in the simulations are also recognized in the experimental results. The effectiveness of the tomographic reconstruction is further confirmed with representations of the very same sound field measured with a traditional microphone array.

76 citations


Journal ArticleDOI
TL;DR: Results indicate that the acoustic cues used by fish during sound-source localization include the axes of particle motion of the local sound field: responding females followed paths aligned with the local particle-motion axes.
Abstract: Sound-source localization behavior was studied in the plainfin midshipman fish (Porichthys notatus) by making use of the naturally occurring phonotaxis response of gravid females to playback of the male's advertisement call. The observations took place outdoors in a circular concrete tank. A dipole sound projector was placed at the center of the tank and an 80-90 Hz tone (the approximate fundamental frequency of the male's advertisement call) was broadcast to gravid females that were released from alternative sites approximately 100 cm from the source. The phonotaxic responses of females to the source were recorded, analyzed and compared with the sound field. One release site was approximately along the vibratory axis of the dipole source, and the other was approximately orthogonal to the vibratory axis. The sound field in the tank was fully characterized through measurements of the sound pressure field using hydrophones and acoustic particle motion using an accelerometer. These measurements confirmed that the sound field was a nearly ideal dipole. When released along the dipole vibratory axis, the responding female fish took essentially straight paths to the source. However, when released approximately 90 deg to the source's vibratory axis, the responding females took highly curved paths to the source that were approximately in line with the local particle motion axes. These results indicate that the acoustic cues used by fish during sound-source localization include the axes of particle motion of the local sound field.

73 citations


Proceedings ArticleDOI
12 Nov 2012
TL;DR: This paper sonifies objects that do not intrinsically produce sound, with the purpose of revealing additional information about them; computer vision methods identify high-level features of interest in an RGB-D stream, which are then sonified as virtual objects at their respective real-world coordinates.
Abstract: Augmented reality applications have focused on visually integrating virtual objects into real environments. In this paper, we propose an auditory augmented reality, where we integrate acoustic virtual objects into the real world. We sonify objects that do not intrinsically produce sound, with the purpose of revealing additional information about them. Using spatialized (3D) audio synthesis, acoustic virtual objects are placed at specific real-world coordinates, obviating the need to explicitly tell the user where they are. Thus, by leveraging the innate human capacity for 3D sound source localization and source separation, we create an audio natural user interface. In contrast with previous work, we do not create acoustic scenes by transducing low-level (for instance, pixel-based) visual information. Instead, we use computer vision methods to identify high-level features of interest in an RGB-D stream, which are then sonified as virtual objects at their respective real-world coordinates. Since our visual and auditory senses are inherently spatial, this technique naturally maps between these two modalities, creating intuitive representations. We evaluate this concept with a head-mounted device, featuring modes that sonify flat surfaces, navigable paths and human faces.

Proceedings ArticleDOI
24 Dec 2012
TL;DR: This paper presents a sound source localization system for a MAV to locate narrowband sound sources on the ground, such as the sound of a whistle or personal alarm siren, and proposes a particle-filter-based method to combine information from the cross-correlation between the signals of four spatially separated microphones, the dynamics of the aerial platform, and the Doppler shift induced by the MAV's motion.
Abstract: In search and rescue missions, Micro Air Vehicles (MAVs) can assist rescuers to locate victims faster inside a large search area and to coordinate their efforts. Acoustic signals play an important role in outdoor rescue operations. Emergency whistles, as found on most aircraft life vests, are commonly carried by people engaging in outdoor activities, and are also used by rescue teams, as they allow reliable signaling over long distances and far beyond visibility. For a MAV involved in such missions, the ability to locate the source of a distress sound signal, such as an emergency whistle blown by a person in need of help, is therefore of significant importance and would allow the localization of victims and rescuers during night time, through foliage and in adverse conditions such as dust, fog and smoke. In this paper we present a sound source localization system for a MAV to locate narrowband sound sources on the ground, such as the sound of a whistle or personal alarm siren. We propose a method based on a particle filter to combine information from the cross correlation between signals of four spatially separated microphones mounted on the MAV, the dynamics of the aerial platform, and the Doppler shift in frequency of the sound due to the motion of the MAV. Furthermore, we evaluate our proposed method in a real-world experiment where a flying micro air vehicle is used to locate and track the position of a narrowband sound source on the ground.
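The Doppler cue the filter fuses is straightforward to compute: for a stationary narrowband source and a moving receiver, the observed frequency is f0(1 + v_r/c), where v_r is the MAV's closing speed toward the source. A sketch (function and variable names are illustrative):

```python
import numpy as np

def doppler_observed(f0, mav_pos, mav_vel, src_pos, c=343.0):
    """Frequency a moving receiver hears from a stationary narrowband
    source: f_obs = f0 * (1 + v_r / c), with v_r the component of the
    receiver velocity toward the source (positive when approaching)."""
    d = src_pos - mav_pos
    v_r = (mav_vel @ d) / np.linalg.norm(d)
    return f0 * (1.0 + v_r / c)
```

Inverted, a measured frequency shift constrains the angle between the MAV's velocity and the source direction, which is the extra information the particle filter combines with the microphone-pair cross-correlations.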

Journal ArticleDOI
TL;DR: A contaminated Gaussian (CG) noise model is proposed to characterize the impulsive, non-Gaussian nature of acoustic background noise observed in some real-world WSNs, providing robust estimation of acoustic source locations in the presence of outliers.
Abstract: A distributed, robust source location estimation method using acoustic signatures in a wireless sensor network (WSN) is presented. A contaminated Gaussian (CG) noise model is proposed to characterize the impulsive, non-Gaussian nature of acoustic background noise observed in some real-world WSNs. A bi-square M-estimate approach is then applied to provide robust estimation of acoustic source locations in the presence of outliers. Moreover, a Consensus-based Distributed Robust Acoustic Source Localization (C-DRASL) algorithm is proposed. With C-DRASL, individual sensor nodes solve for the bi-square M-estimate of the source location locally using a lightweight Iterative Nonlinear Reweighted Least Squares (INRLS) algorithm. These local estimates are then exchanged among nearest neighboring nodes via one-hop wireless channels. Finally, at each node, a robust consensus algorithm aggregates the local estimates of neighboring nodes iteratively and converges to a unified global estimate of the source location. The effectiveness and robustness of C-DRASL are clearly demonstrated through extensive simulation results.
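The bi-square (Tukey) M-estimate with iteratively reweighted least squares can be illustrated on the simplest version of the problem: fusing per-node location estimates of which some are outliers. The paper solves a nonlinear energy-based model with INRLS and then runs consensus; the scale estimate and tuning constant below are standard textbook choices, not taken from the paper.

```python
import numpy as np

def bisquare_location(obs, c_tune=4.685, n_iter=50):
    """Tukey bisquare M-estimate of a location from noisy per-node
    estimates, via iteratively reweighted least squares: residuals get
    weight (1 - u^2)^2 for u = r / (c * s) < 1 and weight 0 beyond the
    cutoff, so gross outliers are ignored entirely."""
    x = np.median(obs, axis=0)                 # robust starting point
    for _ in range(n_iter):
        r = np.linalg.norm(obs - x, axis=1)    # residual distances
        s = np.median(r) / 0.6745 + 1e-12      # MAD-style scale estimate
        u = r / (c_tune * s)
        w = np.where(u < 1.0, (1.0 - u ** 2) ** 2, 0.0)
        if w.sum() == 0.0:
            break
        x = (w[:, None] * obs).sum(axis=0) / w.sum()
    return x
```

Unlike a least-squares (mean) estimate, which a single corrupted node can drag arbitrarily far, the bisquare weights drive the influence of large residuals to exactly zero.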

Proceedings ArticleDOI
24 Dec 2012
TL;DR: This work proposes two methods, MUSIC based on Generalized Singular Value Decomposition (GSVD-MUSIC) and Hierarchical SSL (H-SSL), which drastically reduce the computational cost while maintaining noise robustness in localization.
Abstract: Sound Source Localization (SSL) is an essential function for robot audition and yields the location and number of sound sources, which are utilized for post-processes such as sound source separation. SSL for a robot in a real environment mainly requires noise-robustness, high resolution and real-time processing. A technique using microphone array processing, that is, Multiple Signal Classification based on Standard Eigen-Value Decomposition (SEVD-MUSIC), is commonly used for localization. We improved its robustness against noise with high power by incorporating Generalized EigenValue Decomposition (GEVD). However, GEVD-based MUSIC (GEVD-MUSIC) has mainly two issues: 1) the resolution of pre-measured Transfer Functions (TFs) determines the resolution of SSL, 2) its computational cost is expensive for real-time processing. For the first issue, we propose a TF interpolation method integrating time-domain-based and frequency-domain-based interpolation. The interpolation achieves super-resolution SSL, whose resolution is higher than that of the pre-measured TFs. For the second issue, we propose two methods, MUSIC based on Generalized Singular Value Decomposition (GSVD-MUSIC), and Hierarchical SSL (H-SSL). GSVD-MUSIC drastically reduces the computational cost while maintaining noise-robustness in localization. H-SSL also reduces the computational cost by introducing a hierarchical search algorithm instead of using greedy search in localization. These techniques are integrated into an SSL system using a robot-embedded microphone array. The experimental results showed that the proposed interpolation achieves approximately 1-degree resolution from TFs measured only at 30-degree intervals, that GSVD-MUSIC requires only 46.4% and 40.6% of the computational cost of SEVD-MUSIC and GEVD-MUSIC, respectively, and that H-SSL reduces the computational cost of localizing a single sound source by 59.2%.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed method simultaneously achieves higher accuracy and lower computation time than previous closed-form hyperbolic-intersection and other TDOA-based state-of-the-art location calculation methods, overcoming their major weaknesses.
Abstract: This paper investigates wideband sound source localization in outdoor cases. In such cases, time difference of arrival (TDOA)-based methods are commonly used for 2-D and 3-D wideband sound source localization. These methods have lower accuracy in comparison with direction of arrival-based approaches. However, they feature fewer microphones and less computation time. High accuracy sound source localization using these methods needs highly accurate time delay measurement, and therefore, high frequency signal sampling rates. Moreover, the need to use numerical analysis methods for local calculations (solving nonlinear equations of closed-form methods) will increase computation time while the calculations may still not converge. Also, a good initial guess close to the true solution is needed to avoid local minima. In this paper, a simple, fast (real time) and accurate pure geometric phase transform-based exact location calculation approach for 3-D localization of wideband sound sources in outdoor far-field and low degree reverberation cases, using only four microphones, is proposed. Based on the proposed method, a simple arrangement of microphones is implemented. Experimental results show that the proposed method simultaneously achieves higher accuracy and lower computation time than previous closed-form hyperbolic intersection and other TDOA-based state-of-the-art location calculation methods, overcoming their major weaknesses. Also, as the nonlinear closed-form equations are linearized, no initial guess is required. It features less than 0.2° error for angle of arrival, less than 5% error for 3-D location finding, and computation times as low as 250 ms for the localization of a typical wideband sound source such as a flying object (helicopter).
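For context, the standard linearized TDOA solver that such papers improve upon can be sketched as follows. Placing the reference microphone at the origin, each range-difference equation ||x − p_i|| = r0 + d_i expands (using ||x|| = r0) to the linear relation p_i·x + d_i r0 = (||p_i||² − d_i²)/2 in the unknowns (x, r0). Note this generic sketch needs five microphones in 3-D, unlike the paper's four-microphone geometric method.

```python
import numpy as np

def tdoa_locate(mics, tdoas, c=343.0):
    """Linearized closed-form TDOA localization (spherical-interpolation
    style, not the paper's geometric method). mics: (N, 3) positions with
    the reference microphone at the origin as row 0; tdoas: (N-1,) delays
    of microphones 1..N-1 relative to the reference. Needs N >= 5 in 3-D."""
    d = c * np.asarray(tdoas)               # range differences r_i - r0
    P = mics[1:]
    A = np.hstack([P, d[:, None]])          # unknowns: [x, y, z, r0]
    b = 0.5 * (np.sum(P ** 2, axis=1) - d ** 2)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    return sol[:3]
```

Because the system is linear, no initial guess is needed; the price is sensitivity to TDOA measurement error, which motivates both the robust and the geometric variants in the literature.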

Journal ArticleDOI
TL;DR: This paper presents a localization framework based on circular harmonics beamforming and T-F processing that provides accurate localization under very adverse acoustic conditions, discussing the validity of the proposed approach and comparing its performance with other state-of-the-art methods.
Abstract: The Spanish Ministry of Science and Innovation supported this work under Project No. TEC2009-14414-C03-01.

Proceedings ArticleDOI
12 Nov 2012
TL;DR: This work introduces a methodology to experimentally verify the existence of a locally-linear bijective mapping between sound-source positions and high-dimensional interaural data, using manifold learning, and proposes a novel method, probabilistic piecewise affine regression, that learns the localization-to-interaural mapping and its inverse.
Abstract: The problem of 2D sound-source localization based on a robotic binaural setup and audio-motor learning is addressed. We first introduce a methodology to experimentally verify the existence of a locally-linear bijective mapping between sound-source positions and high-dimensional interaural data, using manifold learning. Based on this local linearity assumption, we propose a novel method, namely probabilistic piecewise affine regression, that learns the localization-to-interaural mapping and its inverse. We show that our method outperforms two state-of-the-art mapping methods and achieves accurate 2D localization of natural sounds from real-world binaural recordings.

Proceedings ArticleDOI
18 Jul 2012
TL;DR: In this paper, the authors presented a real-time 3D wideband sound localization system with a miniature XYZO microphone array, which was designed with both bidirectional (pressure gradient) and omnidirectional microphones.
Abstract: This paper presents a real-time three-dimensional (3D) wideband sound localization system designed with a miniature XYZO microphone array. Unlike the conventional microphone arrays for sound localization using only omnidirectional microphones, the presented microphone array is designed with both bidirectional (pressure gradient) and omnidirectional microphones. Therefore, the array has significantly reduced size and is known as the world's smallest microphone array design for 3D sound source localization in air. In this paper, we describe the 3D array configuration and perform array calibration. For 3D sound localization, we provide studies on the array output model of the XYZO array, the widely known direction-of-arrival (DOA) estimation methods and the direction search in 3D space. To achieve real-time processing at 1° search resolution, we accelerate the parallel computations on a GPU platform with CUDA programming, and a 130X speedup is achieved compared to a multi-thread CPU implementation. The performance of the proposed system is studied under various reverberation lengths and signal-to-noise levels. We also demonstrate a real-time 3D sound localization demo, showing the system's applicability to virtual reality.

Journal ArticleDOI
TL;DR: The possibility of using the time-reversal technique to localize acoustic sources in a wind-tunnel flow is investigated and an application to a dipolar sound source shows that this type of source is also very satisfactorily characterized.
Abstract: The possibility of using the time-reversal technique to localize acoustic sources in a wind-tunnel flow is investigated. While the technique is widespread, it has scarcely been used in aeroacoustics up to now. The proposed method consists of two steps: in a first experimental step, the acoustic pressure fluctuations are recorded over a linear array of microphones; in a second numerical step, the experimental data are time-reversed and used as input data for a numerical code solving the linearized Euler equations. The simulation achieves the back-propagation of the waves from the array to the source and takes into account the effect of the mean flow on sound propagation. The ability of the method to localize a sound source in a typical wind-tunnel flow is first demonstrated using simulated data. A generic experiment is then set up in an anechoic wind tunnel to validate the proposed method with a flow at Mach number 0.11. Monopolar sources are first considered that are either monochromatic or have a narrow- or wide-band frequency content. The source position is estimated with an error smaller than the wavelength. An application to a dipolar sound source shows that this type of source is also very satisfactorily characterized.

Journal ArticleDOI
TL;DR: The double layer velocity method seems to be more robust to noise and flanking sound than the combined pressure-velocity method, although it requires an additional measurement surface.
Abstract: In conventional near-field acoustic holography (NAH) it is not possible to distinguish between sound from the two sides of the array, thus, it is a requirement that all the sources are confined to only one side and radiate into a free field. When this requirement cannot be fulfilled, sound field separation techniques make it possible to distinguish between outgoing and incoming waves from the two sides, and thus NAH can be applied. In this paper, a separation method based on the measurement of the particle velocity in two layers and another method based on the measurement of the pressure and the velocity in a single layer are proposed. The two methods use an equivalent source formulation with separate transfer matrices for the outgoing and incoming waves, so that the sound from the two sides of the array can be modeled independently. A weighting scheme is proposed to account for the distance between the equivalent sources and measurement surfaces and for the difference in magnitude between pressure and velocity. Experimental and numerical studies have been conducted to examine the methods. The double layer velocity method seems to be more robust to noise and flanking sound than the combined pressure-velocity method, although it requires an additional measurement surface. On the whole, the separation methods can be useful when the disturbance of the incoming field is significant. Otherwise the direct reconstruction is more accurate and straightforward.

Journal ArticleDOI
TL;DR: In this article, a weighted subspace-fitting matched field (WSF-MF) method for passive localization is presented by exploiting the physical properties of a reliable acoustic path (RAP) environment.
Abstract: The physical properties of a reliable acoustic path (RAP) are analysed and subsequently a weighted-subspace-fitting matched field (WSF-MF) method for passive localization is presented by exploiting the properties of the RAP environment. The RAP is an important acoustic duct in the deep ocean, which occurs when the receiver is placed near the bottom where the sound velocity exceeds the maximum sound velocity in the vicinity of the surface. It is found that in the RAP environment the transmission loss is rather low and no blind zone of surveillance exists in a medium range. The ray theory is used to explain these phenomena. Furthermore, the analysis of the arrival structures shows that the source localization method based on arrival angle is feasible in this environment. However, the conventional methods suffer from the complicated and inaccurate estimation of the arrival angle. In this paper, a straightforward WSF-MF method is derived to exploit the information about the arrival angles indirectly. The method is to minimize the distance between the signal subspace and the spanned space by the array manifold in a finite range-depth space rather than the arrival-angle space. Simulations are performed to demonstrate the features of the method, and the results are explained by the arrival structures in the RAP environment.

Proceedings ArticleDOI
25 Mar 2012
TL;DR: This work proposes a method for indoor acoustic source localization in which the physical modeling is implicit and demonstrates how exploiting the bandwidth leads to improved performance and surprising results, such as localization of multiple sources with one microphone, or hearing around corners.
Abstract: Acoustic source localization often relies on the free-space/far-field model. Recent work exploiting spatio-temporal sparsity promises to go beyond these scenarios. However, it requires the knowledge of the transfer functions from each possible source location to each microphone. We propose a method for indoor acoustic source localization in which the physical modeling is implicit. By approximating the wave equation with the finite element method (FEM), we naturally get a sparse recovery formulation of the source localization. We demonstrate how exploiting the bandwidth leads to improved performance and surprising results, such as localization of multiple sources with one microphone, or hearing around corners. Numerical simulation results show the feasibility of such schemes.
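The sparse-recovery formulation that such a discretization produces has the generic form y = Ax, where x holds the (mostly zero) source amplitudes on the mesh nodes and A collects the discretized propagation operator. A minimal sketch using a greedy solver, orthogonal matching pursuit, on a random matrix standing in for the FEM operator (all dimensions and values are made up):

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: greedily recover a k-sparse x from y = A x."""
    residual, support = y.copy(), []
    coef = np.zeros(0)
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        As = A[:, support]
        coef, *_ = np.linalg.lstsq(As, y, rcond=None)
        residual = y - As @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((40, 100))            # stand-in for the FEM operator
A /= np.linalg.norm(A, axis=0)                # unit-norm columns
x_true = np.zeros(100)
x_true[[7, 42]] = [1.0, -0.5]                 # two active "source" nodes
y = A @ x_true                                # measurements at 40 sensors
x_hat = omp(A, y, 2)
```

Exploiting bandwidth, as in the paper, amounts to stacking one such system per frequency bin with a common sparsity pattern, which is what makes localization with very few microphones possible.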

Proceedings ArticleDOI
01 Jan 2012
TL;DR: This paper focuses on implementing an accurate Sound Source Localizer (SSL) for estimating the position of a sound source on a digital signal processor, using as few CPU resources as possible, and describes a time-domain PHAT equivalent.
Abstract: Due to hardware and software progress, applications based on sound enhancement are gaining popularity. However, such applications are often still limited by hardware costs, energy, and real-time constraints, which bound the available complexity. One task often accompanying (multichannel) sound enhancement is the localization of the sound source. This paper focuses on implementing an accurate Sound Source Localizer (SSL) for estimating the position of a sound source on a digital signal processor, using as few CPU resources as possible. One of the least complex algorithms for SSL is a simple correlation, implemented in the frequency domain for efficiency and combined with a frequency-bin weighting for robustness; together these are called the Generalized Cross Correlation (GCC). One popular weighting, the PHAse Transform (GCC-PHAT), is considered here. This paper explains that for small microphone arrays the frequency-domain implementation is inferior to its time-domain alternative in terms of algorithmic complexity, and therefore describes a time-domain PHAT equivalent. Both implementations are compared in terms of complexity (clock cycles needed on a Texas Instruments C5515 DSP) and obtained results, showing a complexity gain by a factor of 146 with hardly any loss in localization accuracy.
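The frequency-domain GCC-PHAT discussed above can be sketched in a few lines: the PHAT weighting divides the cross-power spectrum by its magnitude, so that only phase information contributes to the correlation peak. (The paper's time-domain equivalent and DSP implementation are not reproduced here; the signal length, sampling rate, and delay below are made up.)

```python
import numpy as np

def gcc_phat(sig, ref, fs):
    """Estimate the delay of `sig` relative to `ref` via GCC-PHAT."""
    n = len(sig) + len(ref)                   # zero-pad to avoid circular wrap
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)
    R /= np.abs(R) + 1e-12                    # PHAT: keep phase, drop magnitude
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs                         # delay in seconds

fs = 16000
rng = np.random.default_rng(0)
x = rng.standard_normal(2048)
delay = 25                                    # true delay in samples
y = np.concatenate((np.zeros(delay), x))[:2048]
tau = gcc_phat(y, x, fs)                      # ~ 25 / 16000 s
```

With a microphone pair, the estimated delay maps to a direction of arrival through the array spacing and the speed of sound; the paper's point is that for very short arrays the same whitening can be done more cheaply in the time domain.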

01 Jan 2012
Abstract: The conventional triangulation technique cannot locate the acoustic source in an anisotropic plate, because it requires the wave speed to be independent of the propagation direction, which is not the case for an anisotropic plate. All methods proposed so far for source localization in anisotropic plates require either knowledge of the direction-dependent velocity profile or a dense array of sensors. In this paper a technique is proposed to locate the acoustic source in large anisotropic plates with the help of only six sensors, without knowing the direction-dependent velocity profile in the plate. The proposed technique should work equally well for monitoring large isotropic and anisotropic plates. For an isotropic plate the number of sensors required for acoustic source localization can be reduced to four.

Journal ArticleDOI
TL;DR: The object of the present work was to provide an equivalent source technique that allows the recovery of a free field in a noisy environment by using near-field acoustical holography based on the ESM, which makes the results equivalent to those that could be obtained from a free-field measurement.
Abstract: In previous studies, a sound field separation technique based on the equivalent source method (ESM) was successfully applied to separate the incoming and outgoing fields composing a non-free field. However, if the incoming wave is scattered by the source surface, the outgoing field is not the field that would be generated by the source in a free field. The object of the present work was to provide an equivalent source technique that allows the recovery of that free field in a noisy environment. In this approach, the incoming and outgoing fields, including the scattered and directly radiated fields on the measurement surface, are separated to obtain the free-field pressure that would be radiated by the source in an anechoic environment. The recovered free-field pressure is then used to reconstruct the whole free field of the source by using near-field acoustical holography based on the ESM, which makes the results equivalent to those that could be obtained from a free-field measurement. A theoretical description of the technique is given first, and then three numerical cases are investigated to demonstrate the ability of the proposed method.

Journal ArticleDOI
TL;DR: A novel acoustic source localization method based on a 3-D parameter space defined by the 2-D spatial location of a source and the range difference extracted from the time difference of arrival (TDOA), which offers excellent localization accuracy in the scenario of multiple unsynchronized arrays as well as in simpler single-array scenarios.
Abstract: In this paper, we propose a novel acoustic source localization method that accommodates the general scenario of multiple independent microphone arrays. The method is based on a 3-D parameter space defined by the 2-D spatial location of a source and the range difference extracted from the time difference of arrival (TDOA). In this space, the set of points that correspond to a given range lies on a circle; these circles expand as the range increases, forming a cone whose apex is the actual location of the source. In this parameter space, the lack of synchronization between arrays means that clusters of data associated with individual arrays are free to shift along the range axis. The cone constraint, in fact, enables the realignment of such clusters while positioning the cone apex (the source location), thus resulting in joint data re-synchronization and source localization. We also propose a novel and general analysis methodology for swiftly assessing the localization error as a function of the TDOA uncertainties, which is remarkably accurate for small localization bias. With the aid of this methodology, simulations, and experiments on real data, we show that the cone-fitting process offers excellent localization accuracy in the scenario of multiple unsynchronized arrays, as well as in simpler single-array scenarios, also in comparison with state-of-the-art techniques. We also show that the proposed method offers the desired flexibility for adapting to arbitrary geometries of microphone clusters.
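As a much-simplified single-array illustration of localizing from range differences (not the paper's cone fitting in the 3-D parameter space), one can search a 2-D grid for the position whose predicted range differences, relative to a reference microphone, best match the measured ones. Geometry and values below are made up:

```python
import numpy as np

def locate_from_rdiffs(mics, rdiffs, grid):
    """Pick the grid point whose range differences to mic 0 best match."""
    dists = np.linalg.norm(mics[None, :, :] - grid[:, None, :], axis=2)
    pred = dists[:, 1:] - dists[:, :1]        # predicted range differences
    errs = np.sum((pred - rdiffs) ** 2, axis=1)
    return grid[np.argmin(errs)]

# Four microphones on the corners of a 1 m square, source inside.
mics = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
src = np.array([0.6, 0.4])
d = np.linalg.norm(mics - src, axis=1)
rdiffs = d[1:] - d[0]                         # noiseless "measured" values
xs = np.linspace(0.0, 1.0, 101)               # 1 cm search grid
grid = np.array([[gx, gy] for gx in xs for gy in xs])
est = locate_from_rdiffs(mics, rdiffs, grid)  # ~ [0.6, 0.4]
```

The paper's contribution is what happens when several such arrays are mutually unsynchronized: each array's range differences carry an unknown offset, which the cone constraint absorbs while locating the apex.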

Proceedings ArticleDOI
22 Oct 2012
TL;DR: This paper addresses robot audition for dynamic environments, where speakers and/or the robot are moving within a dynamically-changing acoustic environment, and develops new techniques for a robot to listen to several things simultaneously using its own ears even in dynamic environments, including MUltiple SIgnal Classification based on Generalized Eigen-Value Decomposition (GEVD-MUSIC) and HRLE.
Abstract: This paper addresses robot audition for dynamic environments, where speakers and/or the robot are moving within a dynamically-changing acoustic environment. Robot audition studied so far has assumed only stationary human-robot interaction scenes, and thus has difficulties in coping with such dynamic environments. We recently developed new techniques for a robot to listen to several things simultaneously using its own ears even in dynamic environments: MUltiple SIgnal Classification based on Generalized Eigen-Value Decomposition (GEVD-MUSIC), Geometrically constrained High-order Decorrelation based Source Separation with Adaptive Step-size control (GHDSS-AS), Histogram-based Recursive Level Estimation (HRLE), and Template-based Ego Noise Suppression (TENS). GEVD-MUSIC provides noise-robust sound source localization. GHDSS-AS is a new sound source separation method that quickly adapts its separation parameters to dynamic changes. HRLE is a practical post-filtering method with a small number of parameters. TENS estimates the motor noise of the robot by using templates recorded in advance and eliminates it. These methods are implemented as modules for our open-source robot audition software HARK so that they can easily be integrated. We show that each of these methods and their combinations are effective for coping with dynamic environments through off-line experiments and on-line real-time demonstrations.
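For reference, standard MUSIC, which GEVD-MUSIC extends by replacing the eigendecomposition of the spatial correlation matrix with a generalized one that whitens a pre-measured noise correlation matrix, can be sketched for a uniform linear array as follows (array geometry, source angle, and noise level are made up):

```python
import numpy as np

def music_spectrum(X, n_src, d, wavelength, angles):
    """Narrowband MUSIC pseudo-spectrum for a uniform linear array.

    X: complex snapshots, shape (n_mics, n_snapshots)."""
    n_mics = X.shape[0]
    R = X @ X.conj().T / X.shape[1]           # sample spatial correlation
    _, V = np.linalg.eigh(R)                  # eigenvalues in ascending order
    En = V[:, : n_mics - n_src]               # noise subspace
    k = 2 * np.pi / wavelength
    steering = np.exp(1j * k * d * np.outer(np.arange(n_mics), np.sin(angles)))
    return 1.0 / np.linalg.norm(En.conj().T @ steering, axis=0) ** 2

rng = np.random.default_rng(1)
n_mics, wavelength = 8, 0.08                  # 8 mics, ~4.3 kHz tone in air
d = wavelength / 2                            # half-wavelength spacing
theta0 = np.deg2rad(20.0)                     # true direction of arrival
a0 = np.exp(1j * 2 * np.pi / wavelength * d * np.arange(n_mics) * np.sin(theta0))
s = rng.standard_normal(200) + 1j * rng.standard_normal(200)
noise = 0.01 * (rng.standard_normal((n_mics, 200))
                + 1j * rng.standard_normal((n_mics, 200)))
X = np.outer(a0, s) + noise                   # 200 snapshots
angles = np.deg2rad(np.linspace(-90.0, 90.0, 361))
spec = music_spectrum(X, 1, d, wavelength, angles)
est_deg = np.rad2deg(angles[np.argmax(spec)])  # ~ 20.0
```

GEVD-MUSIC solves R v = lambda K v with K a noise correlation matrix measured in advance, which is what makes the localization robust to the robot's own stationary noise.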

Proceedings ArticleDOI
25 Mar 2012
TL;DR: This work proposes a novel real-time adaptive localization approach for multiple sources using a circular array, in order to suppress the localization ambiguities faced with linear arrays, and assuming a weak sound source sparsity which is derived from blind source separation methods.
Abstract: We propose a novel real-time adaptive localization approach for multiple sources using a circular array, in order to suppress the localization ambiguities faced with linear arrays, under a weak sound-source sparsity assumption derived from blind source separation methods. Our proposed method performs very well both in simulations and in real conditions, running at 50% of real time.

Journal ArticleDOI
TL;DR: It is proposed to employ the well-known concept of decomposing the sound field under consideration into a continuum of plane waves, for which the secondary source selection is straightforward.
Abstract: Near-field compensated higher order Ambisonics (NFC-HOA) and wave field synthesis (WFS) constitute the two best-known analytic sound field synthesis methods. While WFS is typically used for the synthesis of virtual sound scenes, NFC-HOA is typically employed in order to synthesize sound fields that have been captured with appropriate microphone arrays. Such recorded sound fields are essentially represented by the coefficients of the underlying surface spherical harmonics expansion. A sound field described by such coefficients cannot be straightforwardly synthesized in WFS. This is a consequence of the fact that, unlike in NFC-HOA, it is critical in WFS to carefully select those loudspeakers that contribute to the synthesis of a given sound source in a sound field under consideration. In order to enable such a secondary source selection, it is proposed to employ the well-known concept of decomposing the sound field under consideration into a continuum of plane waves, for which the secondary source selection is straightforward. The plane wave representation is projected onto the horizontal plane and a closed form expression of the secondary source driving signals for horizontal WFS systems of arbitrary convex shape is derived.
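For a single plane-wave component, the secondary source selection mentioned above is straightforward: only those loudspeakers through which the plane wave would enter the listening area are driven. A minimal sketch for a circular contour (the array size and incidence angle are made up, and the actual WFS driving functions are not computed here):

```python
import numpy as np

n_ls = 32                                   # loudspeakers on a circle
phi = np.linspace(0.0, 2.0 * np.pi, n_ls, endpoint=False)
outward = np.stack([np.cos(phi), np.sin(phi)], axis=1)   # outward unit normals

# Virtual plane wave propagating toward 30 degrees.
k_pw = np.array([np.cos(np.deg2rad(30.0)), np.sin(np.deg2rad(30.0))])

# Active secondary sources: the wave enters the listening area through them,
# i.e. the propagation direction opposes the outward normal.
active = outward @ k_pw < 0.0               # half of the contour is selected
```

Decomposing an arbitrary recorded sound field into such plane waves, as the paper proposes, reduces the selection problem for the whole field to this per-component test.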