scispace - formally typeset

Showing papers on "Acoustic source localization published in 2013"


Journal ArticleDOI
TL;DR: In this article, the velocity field of unforced, high Reynolds number, subsonic jets, issuing from round nozzles with turbulent boundary layers, is measured using a hot-wire anemometer and a stereoscopic, time-resolved PIV system.
Abstract: We study the velocity fields of unforced, high Reynolds number, subsonic jets, issuing from round nozzles with turbulent boundary layers. The objective of the study is to educe wavepackets in such flows and to explore their relationship with the radiated sound. The velocity field is measured using a hot-wire anemometer and a stereoscopic, time-resolved PIV system. The field can be decomposed into frequency and azimuthal Fourier modes. The low-angle sound radiation is measured synchronously with a microphone ring array. Consistent with previous observations, the azimuthal wavenumber spectra of the velocity and acoustic pressure fields are distinct. The velocity spectrum of the initial mixing layer exhibits a peak at azimuthal wavenumbers ranging from 4 to 11, and the peak is found to scale with the local momentum thickness of the mixing layer. The acoustic pressure field is, on the other hand, predominantly axisymmetric, suggesting an increased relative acoustic efficiency of the axisymmetric mode of the velocity field, a characteristic that can be shown theoretically to be caused by the radial compactness of the sound source. This is confirmed by significant correlations, as high as 10 %, between the axisymmetric modes of the velocity and acoustic pressure fields, these values being significantly higher than those reported for two-point flow–acoustic correlations in subsonic jets. The axisymmetric and first helical modes of the velocity field are then compared with solutions of linear parabolized stability equations (PSE) to ascertain if these modes correspond to linear wavepackets. For all but the lowest frequencies close agreement is obtained for the spatial amplification, up to the end of the potential core. The radial shapes of the linear PSE solutions also agree with the experimental results over the same region. 
The results suggest that, despite the broadband character of the turbulence, the evolution of Strouhal numbers 0.3 ≤ St ≤ 0.9 and azimuthal modes 0 and 1 can be modelled as linear wavepackets, and these are associated with the sound radiated to low polar angles.
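The azimuthal Fourier decomposition used above can be illustrated with a minimal synthetic sketch (not the authors' processing chain): a ring of equally spaced probes samples the velocity field at one axial station, and an FFT over the azimuthal angle yields the mode amplitudes. All numbers here are illustrative.

```python
import numpy as np

# Ring of n probes at equally spaced azimuthal angles phi.
n = 16
phi = 2 * np.pi * np.arange(n) / n

# Synthetic velocity sample: mean flow plus a single azimuthal mode m = 3.
u = 2.0 + np.cos(3 * phi)

# Azimuthal Fourier modes; dividing by n normalizes the amplitudes.
modes = np.fft.rfft(u) / n
# mode m=0 (axisymmetric) has amplitude 2.0; mode m=3 has amplitude 0.5
print(np.abs(modes[:5]))
```

In practice the same transform is applied per frequency bin of the time-resolved PIV data, giving the frequency-azimuthal mode spectra discussed in the abstract.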

226 citations


Journal ArticleDOI
TL;DR: A multiple sound source localization and counting method is presented that imposes relaxed sparsity constraints on the source signals; the method is shown to have excellent performance for DOA estimation and source counting, and to be highly suitable for real-time applications due to its low complexity.
Abstract: In this work, a multiple sound source localization and counting method is presented that imposes relaxed sparsity constraints on the source signals. A uniform circular microphone array is used to overcome the ambiguities of linear arrays; however, the underlying concepts (sparse component analysis and matching pursuit-based operation on the histogram of estimates) are applicable to any microphone array topology. Our method is based on detecting time-frequency (TF) zones where one source is dominant over the others. Using appropriately selected TF components in these “single-source” zones, the proposed method jointly estimates the number of active sources and their corresponding directions of arrival (DOAs) by applying a matching pursuit-based approach to the histogram of DOA estimates. The method is shown to have excellent performance for DOA estimation and source counting, and to be highly suitable for real-time applications due to its low complexity. Through simulations (in various signal-to-noise ratio conditions and reverberant environments) and real environment experiments, we indicate that our method outperforms other state-of-the-art DOA and source counting methods in terms of accuracy, while being significantly more efficient in terms of computational complexity.
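The matching-pursuit-on-a-histogram idea can be sketched independently of the paper's exact formulation. In the toy version below, every parameter (bin count, kernel width, stopping threshold) is an illustrative assumption, not the authors' implementation: the largest histogram peak is taken as a source, a smooth circular kernel centred on it is subtracted, and the search repeats until no significant peak remains.

```python
import numpy as np

def count_sources(doa_estimates, n_bins=72, kernel_width=3, stop_ratio=0.3):
    """Matching-pursuit-style peak picking on a circular DOA histogram."""
    hist, edges = np.histogram(doa_estimates, bins=n_bins, range=(0.0, 360.0))
    hist = hist.astype(float)
    centres = 0.5 * (edges[:-1] + edges[1:])
    # Gaussian kernel over bin offsets, peak at offset 0 (index n_bins // 2).
    kernel = np.exp(-0.5 * (np.arange(-n_bins // 2, n_bins // 2) / kernel_width) ** 2)
    first_peak = hist.max()
    doas = []
    while hist.max() > stop_ratio * first_peak:
        k = int(np.argmax(hist))
        doas.append(centres[k])
        # Subtract this source's lobe (circular shift so the kernel peaks at bin k).
        hist -= hist[k] * np.roll(kernel, k - n_bins // 2)
        hist = np.clip(hist, 0.0, None)
    return doas

# Synthetic per-TF DOA estimates from two sources at 60 deg and 200 deg:
rng = np.random.default_rng(1)
est = np.concatenate([rng.normal(60, 5, 400), rng.normal(200, 5, 400)]) % 360
doas = count_sources(est)
print(len(doas), sorted(round(d, 1) for d in doas))  # two sources, near 60 and 200
```

The real method operates only on single-source TF zones and uses a more principled atom model, but the count-and-subtract loop is the same basic mechanism.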

218 citations


Journal ArticleDOI
TL;DR: A method for localizing an acoustic source with distributed microphone networks that turns out to exhibit a significantly lower computational cost than state-of-the-art techniques, while retaining excellent localization accuracy in fairly reverberant conditions.
Abstract: We propose a method for localizing an acoustic source with distributed microphone networks. Time Differences of Arrival (TDOAs) of signals pertaining to the same sensor are estimated through Generalized Cross-Correlation. After a TDOA filtering stage that discards measurements that are potentially unreliable, source localization is performed by minimizing a fourth-order polynomial that combines hyperbolic constraints from multiple sensors. The algorithm turns out to exhibit a significantly lower computational cost than state-of-the-art techniques, while retaining excellent localization accuracy in fairly reverberant conditions.
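The first stage, TDOA estimation via Generalized Cross-Correlation, can be sketched as follows. A PHAT-weighted variant is assumed for illustration; the TDOA filtering and the fourth-order-polynomial localization stages are not reproduced.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau):
    """PHAT-weighted generalized cross-correlation TDOA estimate, in seconds."""
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12        # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)
    max_shift = min(int(fs * max_tau), n // 2 - 1)
    # Reorder so lags run from -max_shift to +max_shift around zero.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs

# Synthetic check: white noise delayed by 25 samples.
fs = 16000
rng = np.random.default_rng(0)
x = rng.standard_normal(2048)
delay = 25
y = np.concatenate((np.zeros(delay), x))[:len(x)]
tau = gcc_phat(y, x, fs, max_tau=0.01)
print(tau * fs)  # ~ 25 samples
```

Each estimated TDOA fixes one hyperbolic constraint on the source position; the paper's contribution is combining those constraints into a single low-cost polynomial minimization.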

96 citations


Journal ArticleDOI
TL;DR: Realistic acoustic effects such as diffraction, low-passed sound behind obstructions, focusing, scattering, high-order reflections, and echoes on a variety of scenes are demonstrated.
Abstract: We present a novel approach for wave-based sound propagation suitable for large, open spaces spanning hundreds of meters, with a small memory footprint. The scene is decomposed into disjoint rigid objects. The free-field acoustic behavior of each object is captured by a compact per-object transfer function relating the amplitudes of a set of incoming equivalent sources to outgoing equivalent sources. Pairwise acoustic interactions between objects are computed analytically to yield compact inter-object transfer functions. The global sound field accounting for all orders of interaction is computed using these transfer functions. The runtime system uses fast summation over the outgoing equivalent source amplitudes for all objects to auralize the sound field for a moving listener in real time. We demonstrate realistic acoustic effects such as diffraction, low-passed sound behind obstructions, focusing, scattering, high-order reflections, and echoes on a variety of scenes.

90 citations


Proceedings ArticleDOI
01 Nov 2013
TL;DR: The method uses a Gaussian process regression to estimate the noise correlation matrix in each time period from the measurements of self-monitoring sensors attached to the UAV such as the pitch-roll-yaw tilt angles, xyz speeds, and motor control values.
Abstract: A method has been developed for improving sound source localization (SSL) using a microphone array from an unmanned aerial vehicle with multiple rotors, a “multirotor UAV”. One of the main problems in SSL from a multirotor UAV is that the ego noise of the rotors on the UAV interferes with the audio observation and degrades the SSL performance. We employ a generalized eigenvalue decomposition-based multiple signal classification (GEVD-MUSIC) algorithm to reduce the effect of ego noise. While the GEVD-MUSIC algorithm requires a noise correlation matrix corresponding to the auto-correlation of the multichannel observation of the rotor noise, the noise correlation is nonstationary due to the aerodynamic control of the UAV. Therefore, we need an adaptive estimation method for the noise correlation matrix to achieve robust SSL using the GEVD-MUSIC algorithm. Our method uses Gaussian process regression to estimate the noise correlation matrix in each time period from the measurements of self-monitoring sensors attached to the UAV, such as the pitch-roll-yaw tilt angles, xyz speeds, and motor control values. Experiments compare our method with existing SSL methods in terms of precision and recall rates of SSL. The results demonstrate that our method outperforms existing methods, especially under high signal-to-noise-ratio conditions.
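A toy sketch of the GEVD-MUSIC idea (narrowband, one source, synthetic correlation matrices; the Gaussian-process estimation of the noise correlation is not reproduced): the observation correlation R is whitened by the noise correlation K via its Cholesky factor, and candidate steering vectors are projected onto the noise subspace of the whitened matrix. All array parameters below are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular, eigh

def steering(theta_deg, n_mics):
    """Half-wavelength uniform linear array steering vector."""
    m = np.arange(n_mics)
    return np.exp(-1j * np.pi * m * np.sin(np.deg2rad(theta_deg)))

def gevd_music(R, K, angles, n_mics, n_src):
    """GEVD-MUSIC pseudospectrum over a grid of candidate angles."""
    L = cholesky(K, lower=True)                      # K = L L^H
    A = solve_triangular(L, R, lower=True)           # L^-1 R
    Rw = solve_triangular(L, A.conj().T, lower=True).conj().T  # L^-1 R L^-H
    _, V = eigh(Rw)                                  # ascending eigenvalues
    En = V[:, : n_mics - n_src]                      # noise subspace (whitened)
    p = np.empty(len(angles))
    for i, th in enumerate(angles):
        aw = solve_triangular(L, steering(th, n_mics), lower=True)
        p[i] = np.linalg.norm(aw) ** 2 / (np.linalg.norm(En.conj().T @ aw) ** 2 + 1e-12)
    return p

# One narrowband source at 40 deg; nonuniform (ego-)noise correlation K.
M = 8
a = steering(40.0, M)
K = np.diag(0.2 + 0.1 * np.arange(M)).astype(complex)
R = np.outer(a, a.conj()) + K
angles = np.arange(-90, 91)
spec = gevd_music(R, K, angles, M, n_src=1)
print(angles[np.argmax(spec)])  # peak at the source direction, 40
```

Whitening by K is what distinguishes GEVD-MUSIC from standard (SEVD) MUSIC: directions dominated by correlated rotor noise are suppressed before the subspace split.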

75 citations


Journal ArticleDOI
01 Nov 2013
TL;DR: A novel two-way pressure coupling technique at the interface of near-object and far-field regions that is able to simulate high-fidelity acoustic effects in large, complex indoor and outdoor environments, including scenes from the Half-Life 2 game engine.
Abstract: We present a novel hybrid approach that couples geometric and numerical acoustic techniques for interactive sound propagation in complex environments. Our formulation is based on a combination of spatial and frequency decomposition of the sound field. We use numerical wave-based techniques to precompute the pressure field in the near-object regions and geometric propagation techniques in the far-field regions to model sound propagation. We present a novel two-way pressure coupling technique at the interface of near-object and far-field regions. At runtime, the impulse response at the listener position is computed at interactive rates based on the stored pressure field and interpolation techniques. Our system is able to simulate high-fidelity acoustic effects such as diffraction, scattering, low-pass filtering behind obstruction, reverberation, and high-order reflections in large, complex indoor and outdoor environments, as well as in scenes from the Half-Life 2 game engine. The pressure computation requires orders of magnitude lower memory than standard wave-based numerical techniques.

62 citations


Journal ArticleDOI
TL;DR: An experimental study of spatial sound perception with the use of a spherical microphone array for sound recording and headphone-based binaural sound synthesis shows that a source will be perceived as more spatially sharp and more externalized when represented by binaural stimuli reconstructed with a higher spherical harmonics order.
Abstract: The area of sound field synthesis has significantly advanced in the past decade, facilitated by the development of high-quality sound-field capturing and re-synthesis systems. Spherical microphone arrays are among the most recently developed systems for sound field capturing, enabling processing and analysis of three-dimensional sound fields in the spherical harmonics domain. In spite of these developments, a clear relation between sound fields recorded by spherical microphone arrays and their perception with a re-synthesis system has not yet been established, although some relation to scalar measures of spatial perception was recently presented. This paper presents an experimental study of spatial sound perception with the use of a spherical microphone array for sound recording and headphone-based binaural sound synthesis. Sound field analysis and processing is performed in the spherical harmonics domain with the use of head-related transfer functions and simulated enclosed sound fields. The effects of several factors, such as spherical harmonics order, frequency bandwidth, and spatial sampling, are investigated by applying the repertory grid technique to the results of the experiment, forming a clearer relation between sound-field capture with a spherical microphone array and its perception using binaural synthesis regarding space, frequency, and additional artifacts. The experimental study clearly shows that a source will be perceived as more spatially sharp and more externalized when represented by binaural stimuli reconstructed with a higher spherical harmonics order. This effect is apparent even at low spherical harmonics orders. Spatial aliasing, as a result of sound field capturing with a finite number of microphones, introduces unpleasant artifacts, which increase with the degree of aliasing error.

60 citations


Journal ArticleDOI
TL;DR: In this article, the authors reviewed psychophysical studies with electric acoustic stimulation, along with the current state of the art in fitting, and experimental signal processing techniques for electric acoustic stimulation.
Abstract: The addition of acoustic stimulation to electric stimulation via a cochlear implant has been shown to be advantageous for speech perception in noise, sound quality, music perception, and sound source localization. However, the signal processing and fitting procedures of current cochlear implants and hearing aids were developed independently, precluding several potential advantages of bimodal stimulation, such as improved sound source localization and binaural unmasking of speech in noise. While there is a large and increasing population of implantees who use a hearing aid, there are currently no generally accepted fitting methods for this configuration. It is not practical to fit current commercial devices to achieve optimal binaural loudness balance or optimal binaural cue transmission for arbitrary signals and levels. There are several promising experimental signal processing systems specifically designed for bimodal stimulation. In this article, basic psychophysical studies with electric acoustic stimulation are reviewed, along with the current state of the art in fitting, and experimental signal processing techniques for electric acoustic stimulation.

59 citations


Journal Article
TL;DR: A method is proposed to localize an acoustic source within a frequency band from 100 Hz to 4 kHz in two dimensions using a microphone array, by calculating the direction of arrival (DOA) of the acoustic signals with a set of spatially separated microphones.
Abstract: Source localization is a very well established technique that has a wide range of applications, from remote sensing to the Global Positioning System. Sound source localization techniques are used in commercial applications, like improving speech quality in hands-free telephony and video conferencing, and in military applications, like SONAR, surveillance systems, and devices to locate the sources of artillery fire. A method is proposed to localize an acoustic source within a frequency band from 100 Hz to 4 kHz in two dimensions using a microphone array, by calculating the direction of arrival (DOA) of the acoustic signals. DOA estimation of acoustic signals using a set of spatially separated microphones exploits the phase information present in the signals. For this, the time delays are estimated for each pair of microphones in the array. From the known array geometry and the direction of arrival, the location of the source can be obtained.
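For a single far-field microphone pair, the delay-to-angle step described above reduces to one trigonometric identity. The spacing and speed of sound below are illustrative assumptions, not values from the paper.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, roughly at 20 deg C

def doa_from_tdoa(tau, mic_distance, c=SPEED_OF_SOUND):
    """Far-field DOA (radians, measured from broadside) for one microphone pair.
    tau is the inter-microphone time delay in seconds."""
    s = np.clip(c * tau / mic_distance, -1.0, 1.0)  # guard against |sin| > 1
    return np.arcsin(s)

# A source 30 degrees off broadside, microphones 10 cm apart:
d = 0.10
theta_true = np.deg2rad(30.0)
tau = d * np.sin(theta_true) / SPEED_OF_SOUND   # the delay such a source produces
theta_est = np.rad2deg(doa_from_tdoa(tau, d))
print(theta_est)  # ~ 30.0
```

With several pairs at known positions, each pair contributes one such bearing (or hyperbolic) constraint, and intersecting them yields the 2-D source location.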

57 citations


Journal ArticleDOI
TL;DR: Electroencephalography (EEG) in conjunction with sound field stimulus presentation was used to inform the development of an explicit computational model of human sound source localization and predicted the oft-reported decrease in spatial acuity measured psychophysically with increasing reference azimuth.
Abstract: Research with barn owls suggested that sound source location is represented topographically in the brain by an array of neurons each tuned to a narrow range of locations. However, research with small-headed mammals has offered an alternative view in which location is represented by the balance of activity in two opponent channels broadly tuned to the left and right auditory space. Both channels may be present in each auditory cortex, although the channel representing contralateral space may be dominant. Recent studies have suggested that opponent channel coding of space may also apply in humans, although these studies have used a restricted set of spatial cues or probed a restricted set of spatial locations, and there have been contradictory reports as to the relative dominance of the ipsilateral and contralateral channels in each cortex. The current study used electroencephalography (EEG) in conjunction with sound field stimulus presentation to address these issues and to inform the development of an explicit computational model of human sound source localization. Neural responses were compatible with the opponent channel account of sound source localization and with contralateral channel dominance in the left, but not the right, auditory cortex. A computational opponent channel model reproduced every important aspect of the EEG data and allowed inferences about the width of tuning in the spatial channels. Moreover, the model predicted the oft-reported decrease in spatial acuity measured psychophysically with increasing reference azimuth. Predictions of spatial acuity closely matched those measured psychophysically by previous authors.
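A toy version of the opponent-channel code can make the idea concrete. The sigmoid tuning and its width below are assumptions for illustration, not the paper's fitted model: azimuth maps monotonically onto the activity difference of two broadly tuned hemifield channels, so location can be decoded by inverting that mapping.

```python
import numpy as np

def opponent_balance(az_deg, width=30.0):
    """Activity difference of two opponent hemifield channels,
    each modelled as a broad sigmoid tuned to one side of space."""
    right = 1.0 / (1.0 + np.exp(-az_deg / width))
    left = 1.0 / (1.0 + np.exp(az_deg / width))
    return right - left

# The balance is monotonic in azimuth, so it can be inverted on a grid.
grid = np.linspace(-90, 90, 181)
code = opponent_balance(grid)
decoded = np.interp(opponent_balance(20.0), code, grid)
print(decoded)  # ~ 20.0
```

Because the sigmoids flatten toward the periphery, equal changes in azimuth produce smaller changes in the balance at large reference azimuths, which is one intuition for the decrease in spatial acuity the model predicts.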

55 citations


Journal ArticleDOI
TL;DR: A geometry-based spatial sound acquisition technique is proposed to compute virtual microphone signals that manifest a different perspective of the sound field using a parametric sound field model formulated in the time-frequency domain.
Abstract: Traditional spatial sound acquisition aims at capturing a sound field with multiple microphones such that at the reproduction side a listener can perceive the sound image as it was at the recording location. Standard techniques for spatial sound acquisition usually use spaced omnidirectional microphones or coincident directional microphones. Alternatively, microphone arrays and spatial filters can be used to capture the sound field. From a geometric point of view, the perspective of the sound field is fixed when using such techniques. In this paper, a geometry-based spatial sound acquisition technique is proposed to compute virtual microphone signals that manifest a different perspective of the sound field. The proposed technique uses a parametric sound field model that is formulated in the time-frequency domain. It is assumed that each time-frequency instant of a microphone signal can be decomposed into one direct and one diffuse sound component. It is further assumed that the direct component is the response of a single isotropic point-like source (IPLS) of which the position is estimated for each time-frequency instant using distributed microphone arrays. Given the sound components and the position of the IPLS, it is possible to synthesize a signal that corresponds to a virtual microphone at an arbitrary position and with an arbitrary pick-up pattern.

Journal ArticleDOI
TL;DR: A theoretical model of a micro-perforated panel backed by a finite cavity and flush-mounted in an infinite baffle is developed and its sound-absorption performance is analyzed; the analysis shows that the absorption coefficient is a function of the angle and frequency of the incident sound.
Abstract: In this paper, a theoretical model of a micro-perforated panel (MPP) backed by a finite cavity and flush-mounted in an infinite baffle is developed and its performance in terms of sound absorption is analyzed. The model allows an oblique incidence sound impinging upon the MPP absorber. The simplified Rayleigh integral method, thin plate theory and the acoustical impedance of the MPP are used to calculate the sound energy absorbed by the MPP's surface. Results show that the absorption coefficient of the absorber is a function of angle and frequency of the incident sound, and is controlled by the coupling between the MPP and the acoustical modes in the back cavity. In particular, grazing modes can be induced in the cavity by sound with an oblique angle of incidence, which may result in peak sound absorptions at the natural frequencies of the modes. The mechanism involved is used to explain the absorption properties of the MPP absorber for a diffuse incidence of sound.

Journal ArticleDOI
TL;DR: This study addresses a framework for a robot audition system, including sound source localization (SSL) and sound source separation (SSS), that can robustly recognize simultaneous speeches in a real environment and proposes two methods for SSL: MUSIC based on generalized singular value decomposition (GSVD-MUSIC) and hierarchical SSL (H-SSL).
Abstract: This study addresses a framework for a robot audition system, including sound source localization (SSL) and sound source separation (SSS), that can robustly recognize simultaneous speeches in a real environment. Because SSL estimates not only the location of speakers but also the number of speakers, such a robust framework is essential for simultaneous speech recognition. Moreover, improvement in the performance of SSS is crucial for simultaneous speech recognition because the robot has to recognize the individual source of speeches. For simultaneous speech recognition, current robot audition systems mainly require noise-robustness, high resolution, and real-time implementation. Multiple signal classification (MUSIC) based on standard eigenvalue decomposition (SEVD) and geometric-constrained high-order decorrelation-based source separation (GHDSS) are techniques utilizing microphone array processing, which are used for SSL and SSS, respectively. To enhance SSL robustness against noise while detec...

Journal ArticleDOI
TL;DR: In this paper, a high-power neodymium-yttrium-aluminum-garnet pulse laser is used in this system for generating the laser-induced breakdown in acoustic fields.

Journal ArticleDOI
TL;DR: This sound source localization task suggests that listeners with normal hearing perform with high reliability/repeatability, little response bias, and with performance measures that are normally distributed with a mean root-mean-square error of 6.2° and a standard deviation of 1.79°.
Abstract: Several measures of sound source localization performance of 45 listeners with normal hearing were obtained when loudspeakers were in the front hemifield. Localization performance was not statistically affected by filtering the 200-ms, 2-octave or wider noise bursts (125 to 500, 1500 to 6000, and 125 to 6000 Hz wide noise bursts). This implies that sound source localization performance for noise stimuli is not differentially affected by which interaural cue (interaural time or level difference) a listener with normal hearing uses for sound source localization, at least for relatively broadband signals. This sound source localization task suggests that listeners with normal hearing perform with high reliability/repeatability, little response bias, and with performance measures that are normally distributed with a mean root-mean-square error of 6.2° and a standard deviation of 1.79°.

Journal ArticleDOI
TL;DR: A localization algorithm based on discrimination of cross-correlation functions is proposed that provides higher localization accuracy than the SRP-PHAT algorithm in reverberant, noisy environments.

Journal ArticleDOI
12 Jul 2013-Sensors
TL;DR: This paper aims at estimating the azimuth, range and depth of a cooperative broadband acoustic source with a single vector sensor in a multipath underwater environment, where the received signal is assumed to be a linear combination of echoes of the source emitted waveform.
Abstract: This paper aims at estimating the azimuth, range and depth of a cooperative broadband acoustic source with a single vector sensor in a multipath underwater environment, where the received signal is assumed to be a linear combination of echoes of the source emitted waveform. A vector sensor is a device that measures the scalar acoustic pressure field and the vectorial acoustic particle velocity field at a single location in space. The amplitudes of the echoes in the vector sensor components allow one to determine their azimuth and elevation. Assuming that the environmental conditions of the channel are known, source range and depth are obtained from the estimates of elevation and relative time delays of the different echoes using a ray-based backpropagation algorithm. The proposed method is tested using simulated data and is further applied to experimental data from the Makai'05 experiment, where 8-14 kHz chirp signals were acquired by a vector sensor array. It is shown that for short ranges, the position of the source is estimated in agreement with the geometry of the experiment. The method has low computational demands and is thus well suited for mobile and light platforms, where space and power requirements are limited.
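The bearing-from-velocity step can be sketched directly, assuming time-averaged active-intensity-like components for a single dominant arrival (the ray-based range/depth backpropagation is not reproduced; all values are synthetic):

```python
import numpy as np

def azimuth_elevation(ix, iy, iz):
    """Bearing of a single acoustic arrival from the time-averaged
    pressure-velocity (intensity) components of a vector sensor."""
    az = np.degrees(np.arctan2(iy, ix)) % 360.0
    el = np.degrees(np.arctan2(iz, np.hypot(ix, iy)))
    return az, el

# Synthetic plane wave arriving from azimuth 120 deg, elevation 10 deg:
az0, el0 = np.deg2rad(120.0), np.deg2rad(10.0)
ix = np.cos(el0) * np.cos(az0)
iy = np.cos(el0) * np.sin(az0)
iz = np.sin(el0)
az, el = azimuth_elevation(ix, iy, iz)
print(round(az, 1), round(el, 1))  # -> 120.0 10.0
```

This is why a single vector sensor resolves direction where a single pressure hydrophone cannot: the particle-velocity components carry the arrival's direction cosines directly.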

Proceedings ArticleDOI
01 Nov 2013
TL;DR: A sound source azimuth estimation approach in reverberant environments that exploits binaural signals in a humanoid robotic context using Interaural Time and Level Differences and a neural network-based learning scheme.
Abstract: Sound source localization is an important feature designed and implemented on robots and intelligent systems. Like other artificial audition tasks, it is constrained by multiple problems, notably sound reflections and noise. This paper presents a sound source azimuth estimation approach for reverberant environments. It exploits binaural signals in a humanoid robotic context. Interaural Time and Level Differences (ITD and ILD) are extracted on multiple frequency bands and combined with a neural network-based learning scheme. A cue filtering process is used to reduce the effects of reverberation. The system has been evaluated with simulated and real data, in multiple aspects covering realistic robot operating conditions, and was proven satisfying and effective, as will be shown and discussed in the paper.

Journal ArticleDOI
TL;DR: The technique presented here provides a method to localize the relevant radiating surface areas on a vibrating structure that contribute to the radiated sound power.
Abstract: This paper presents a method to identify the surface areas of a vibrating structure that contribute to the radiated sound power. The surface contributions of the structure are based on the acoustic radiation modes and are computed for all boundaries of the acoustic domain. The surface contributions are compared to the acoustic intensity, which is a common measure for near-field acoustic energy. Sound intensity usually has positive and negative values that correspond to energy sources and sinks on the surface of the radiating structure. Sound from source and sink areas partially cancel each other and only a fraction of the near-field acoustic energy reaches the far-field. In contrast to the sound intensity, the surface contributions are always positive and no cancelation effects exist. The technique presented here provides a method to localize the relevant radiating surface areas on a vibrating structure. To illustrate the method, the radiated sound power from a baffled square plate is presented.

Journal ArticleDOI
TL;DR: A wave-trapping barrier (WTB), with its inner surface covered by wedge-shaped structures, has been proposed to confine waves within the area between the barrier and the reflecting surface, and thus improve the performance of a conventional sound barrier.
Abstract: The performance of a sound barrier is usually degraded if a large reflecting surface is placed on the source side. A wave-trapping barrier (WTB), with its inner surface covered by wedge-shaped structures, has been proposed to confine waves within the area between the barrier and the reflecting surface, and thus improve the performance. In this paper, the deterioration in performance of a conventional sound barrier due to the reflecting surface is first explained in terms of the resonance effect of the trapped modes. At each resonance frequency, a strong and mode-controlled sound field is generated by the noise source both within and in the vicinity outside the region bounded by the sound barrier and the reflecting surface. It is found that the peak sound pressures in the barrier's shadow zone, which correspond to the minimum values in the barrier's insertion loss, are largely determined by the resonance frequencies and by the shapes and losses of the trapped modes. These peak pressures usually result in high sound intensity component impinging normal to the barrier surface near the top. The WTB can alter the sound wave diffraction at the top of the barrier if the wavelengths of the sound wave are comparable or smaller than the dimensions of the wedge. In this case, the modified barrier profile is capable of re-organizing the pressure distribution within the bounded domain and altering the acoustic properties near the top of the sound barrier.

Journal ArticleDOI
TL;DR: This paper investigates the efficiency of a field separation method, based on spherical harmonic expansions, for the identification of sound sources in small and non-anechoic spaces; the influence of the walls' reflection coefficient is also studied.
Abstract: This paper investigates the efficiency of a field separation method for the identification of sound sources in small and non-anechoic spaces. When performing measurements in such environments, the acquired data contain information from the direct field radiated by the source of interest and reflections from walls. To get rid of the unwanted contributions and assess the field radiated by the source of interest, a field separation method is used. Acoustic data (pressure or velocity) are then measured on a hemispheric array whose base is lying on the surface of interest. Then, by using spherical harmonic expansions, contributions from outgoing and incoming waves can be separated if the impedance of the tested surface is high enough. Depending on the probe type, different implementations of the separation method are numerically compared. In addition, the influence of the walls' reflection coefficient is studied. Finally, measurements are performed using an array made up of 36 p-p probes. Results obtained in a car trunk mock-up with controlled sources are first presented, before reporting results measured in a real car running on a roller bench.

Journal ArticleDOI
TL;DR: In this paper, the authors examined an experimental method for locating aerodynamic sound sources from a bluff body in a stream using cross-correlation analysis of the pressure fluctuations on and around the flow field and the sound pressure fluctuations.

Journal ArticleDOI
TL;DR: Four acoustic Seagliders were deployed in the Philippine Sea November 2010 to April 2011 in the vicinity of an acoustic tomography array and measured acoustic arrival peaks were unambiguously associated with predicted ray arrivals.
Abstract: Four acoustic Seagliders were deployed in the Philippine Sea November 2010 to April 2011 in the vicinity of an acoustic tomography array. The gliders recorded over 2000 broadband transmissions at ranges up to 700 km from moored acoustic sources as they transited between mooring sites. The precision of glider positioning at the time of acoustic reception is important to resolve the fundamental ambiguity between position and sound speed. The Seagliders utilized GPS at the surface and a kinematic model below for positioning. The gliders were typically underwater for about 6.4 h, diving to depths of 1000 m and traveling on average 3.6 km during a dive. Measured acoustic arrival peaks were unambiguously associated with predicted ray arrivals. Statistics of travel-time offsets between received arrivals and acoustic predictions were used to estimate range uncertainty. Range (travel time) uncertainty between the source and the glider position from the kinematic model is estimated to be 639 m (426 ms) rms. Least-squares solutions for glider position estimated from acoustically derived ranges from 5 sources differed by 914 m rms from modeled positions, with estimated uncertainty of 106 m rms in horizontal position. Error analysis included 70 ms rms of uncertainty due to oceanic sound-speed variability.
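The least-squares position step can be sketched with a small Gauss-Newton solver. This is a 2-D toy with synthetic anchor positions and noiseless ranges, purely for illustration; the paper's solution is 3-D and uses travel-time-derived ranges with the error budget described above.

```python
import numpy as np

def trilaterate(anchors, ranges, iters=20):
    """Gauss-Newton least-squares 2-D position from ranges to known anchors."""
    x = anchors.mean(axis=0)                 # initial guess: centroid of anchors
    for _ in range(iters):
        diff = x - anchors                   # (N, 2) vectors anchor -> estimate
        dist = np.linalg.norm(diff, axis=1)
        r = dist - ranges                    # range residuals
        J = diff / dist[:, None]             # Jacobian of dist w.r.t. position
        x = x - np.linalg.lstsq(J, r, rcond=None)[0]
    return x

# Five "sources" (anchors) and a receiver at a known true position:
anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0], [5.0, 12.0]])
true_pos = np.array([3.0, 4.0])
ranges = np.linalg.norm(anchors - true_pos, axis=1)
print(trilaterate(anchors, ranges).round(3))  # recovers [3. 4.]
```

With noisy travel times, the same residual minimization yields both the position estimate and, via the residual statistics, the rms position uncertainty reported in the abstract.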

Journal ArticleDOI
TL;DR: A novel architecture for on-the-fly inference while collecting data from sparse sensor networks, considering source localization using acoustic sensors dispersed over a large area, with the individual sensors located too far apart for direct connectivity.
Abstract: We propose and demonstrate a novel architecture for on-the-fly inference while collecting data from sparse sensor networks. In particular, we consider source localization using acoustic sensors dispersed over a large area, with the individual sensors located too far apart for direct connectivity. An Unmanned Aerial Vehicle (UAV) is employed for collecting sensor data, with the UAV route adaptively adjusted based on data from sensors already visited, in order to minimize the time to localize events of interest. The UAV therefore acts as an information-seeking data mule, not only providing connectivity, but also making Bayesian inferences from the data gathered in order to guide its future actions. The system we demonstrate has a modular architecture, comprising efficient algorithms for acoustic signal processing, routing the UAV to the sensors, and source localization. We report on extensive field tests which not only demonstrate the effectiveness of our general approach, but also yield specific practical insights into GPS time synchronization and localization accuracy, acoustic signal and channel characteristics, and the effects of environmental phenomena.

Journal ArticleDOI
Xiaofei Li1, Hong Liu1
TL;DR: A fourth-order cumulant spectrum is derived, from which a time delay estimation (TDE) algorithm is proposed that is applicable to speech signals and immune to spatially correlated Gaussian noise; a spatial grid matching algorithm is also proposed for the localization step, which effectively handles several problems faced by geometric positioning methods.
Abstract: In human-robot interaction (HRI), speech sound source localization (SSL) is a convenient and efficient way to obtain the relative position between a speaker and a robot. However, implementing an SSL system based on the TDOA method encounters many problems, such as noise in real environments, solving nonlinear equations, and switching between far field and near field. In this paper, a fourth-order cumulant spectrum is derived, based on which a time delay estimation (TDE) algorithm is proposed that is applicable to speech signals and immune to spatially correlated Gaussian noise. Furthermore, the time difference feature of a sound source and its spatial distribution are analyzed, and a spatial grid matching (SGM) algorithm is proposed for the localization step, which effectively handles several problems that geometric positioning methods face. A valid-feature detection algorithm and a decision-tree method are also proposed to improve localization performance and reduce computational complexity. Experiments are carried out in real environments on a mobile robot platform, in which thousands of sets of noisy speech data collected by four microphones are tested in 3D space. The effectiveness of our TDE method and SGM algorithm is verified.
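For context, the standard second-order TDE baseline (GCC-PHAT) can be sketched as below; the paper's contribution replaces this second-order cross-spectrum with a fourth-order cumulant spectrum to suppress spatially correlated Gaussian noise, which this sketch does not implement:

```python
import numpy as np

def tde_gcc_phat(x, y, fs=1.0):
    """Estimate the delay of y relative to x (in seconds) with GCC-PHAT.
    Assumes equal-length signals and a circular delay model; a positive
    result means y is a delayed copy of x."""
    n = len(x)
    X, Y = np.fft.rfft(x), np.fft.rfft(y)
    R = X * np.conj(Y)
    R /= np.maximum(np.abs(R), 1e-12)        # PHAT weighting: keep phase only
    cc = np.fft.irfft(R, n)                  # generalized cross-correlation
    lags = np.arange(n)
    lags[lags > n // 2] -= n                 # map FFT indices to signed lags
    return -lags[np.argmax(cc)] / fs
```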

Journal ArticleDOI
TL;DR: This work investigates the effect of aperture size on the behavior of coupled-volume systems using both acoustic scale-models and diffusion equation models and reveals valid ranges and limitations of the diffusion equation model for room-acoustic modeling.
Abstract: Some recent concert hall designs have incorporated reverberation chambers coupled to the main hall, which has stimulated a range of research activities in architectural acoustics. The coupling apertures between two or more coupled volumes are of central importance for sound propagation and sound energy decay throughout the coupled-volume system. In addition, the positions of sound sources and receivers relative to the aperture have a profound influence on the sound energy distributions and decays. This work investigates the effect of aperture size on the behavior of coupled-volume systems using both acoustic scale models and diffusion equation models. In these physical and numerical models, the sound source and receiver positions relative to the aperture are also investigated. Through systematic comparisons between the results of the physical scale models and the numerical models, this work reveals valid ranges and limitations of the diffusion equation model for room-acoustic modeling.
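The role of the aperture can be illustrated with the classical statistical (Sabine-type) two-room energy balance: each room loses energy to its own absorption and exchanges energy through the aperture. This lumped model is a simplification of the paper's diffusion-equation treatment, and the room dimensions used in testing are invented:

```python
import numpy as np

def coupled_room_decay(V1, V2, A1, A2, S, w0=(1.0, 0.0),
                       c=343.0, dt=1e-4, t_end=1.0):
    """Forward-Euler integration of the two-room coupled energy balance.
    V1, V2: room volumes (m^3); A1, A2: absorption areas (m^2 Sabine);
    S: aperture area (m^2); w0: initial energy densities."""
    w = np.array(w0, float)
    hist = [w.copy()]
    for _ in range(int(t_end / dt)):
        ex = c * S / 4.0 * (w[0] - w[1])         # net power through the aperture
        dw1 = -(c * A1 / (4 * V1)) * w[0] - ex / V1
        dw2 = -(c * A2 / (4 * V2)) * w[1] + ex / V2
        w = w + dt * np.array([dw1, dw2])
        hist.append(w.copy())
    return np.array(hist)
```

With an acoustically dead main room coupled to a live chamber, the returned history shows the characteristic double-slope decay that motivates coupled-volume hall designs.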

Journal ArticleDOI
TL;DR: For sound source localization in patients with bilateral CIs and bilateral hearing preservation, interaural level difference cues may dominate interaural time difference cues, and hearing-preservation surgery can be of benefit to patients fit with bilateral CIs.
Abstract: OBJECTIVES The authors describe the localization and speech-understanding abilities of a patient fit with bilateral cochlear implants (CIs) for whom acoustic low-frequency hearing was preserved in both cochleae. DESIGN Three signals were used in the localization experiments: low-pass, high-pass, and wideband noise. Speech understanding was assessed with the AzBio sentences presented in noise. RESULTS Localization accuracy was best in the aided, bilateral acoustic hearing condition, and was poorer in both the bilateral CI condition and when the bilateral CIs were used in addition to bilateral low-frequency hearing. Speech understanding was best when low-frequency acoustic hearing was combined with at least one CI. CONCLUSIONS The authors found that (1) for sound source localization in patients with bilateral CIs and bilateral hearing preservation, interaural level difference cues may dominate interaural time difference cues and (2) hearing-preservation surgery can be of benefit to patients fit with bilateral CIs.

Proceedings ArticleDOI
01 Nov 2013
TL;DR: Results showed the effectiveness of using reflection information: depending on the position and orientation of the sound sources relative to the array and walls, and on the source type, it increased the source position detection rates by 10% on average and up to 60% in the best case.
Abstract: We propose a method for estimating sound source locations in a 3D space by integrating sound directions estimated by multiple microphone arrays and taking advantage of reflection information. Two types of sources with different directivity properties (human speech and loudspeaker speech) were evaluated at different positions and orientations. Experimental results showed the effectiveness of using reflection information, which depends on the position and orientation of the sound sources relative to the array and walls, and on the source type. The use of reflection information increased the source position detection rates by 10% on average and up to 60% in the best case.
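Integrating sound directions from several arrays reduces, at its core, to intersecting direction-of-arrival rays. A minimal 2-D least-squares triangulation (without the paper's reflection information) might look like:

```python
import numpy as np

def triangulate_bearings(array_pos, bearings):
    """Least-squares intersection of DOA rays: minimize the summed
    squared perpendicular distance from the source to each ray."""
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for p, theta in zip(array_pos, bearings):
        u = np.array([np.cos(theta), np.sin(theta)])
        P = np.eye(2) - np.outer(u, u)    # projector perpendicular to the ray
        A += P
        b += P @ np.asarray(p, float)
    return np.linalg.solve(A, b)
```

The 3D case is identical with 3x3 projectors; the paper's key extra step, mirroring arrays across walls to exploit reflection paths, is omitted here.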

Patent
21 Aug 2013
TL;DR: In this article, a method and system for investigating structure near a borehole are described: an acoustic beam is generated and directed at one or more azimuthal angles toward a selected location near the borehole; one or more receivers capture the signal reflected or refracted by material at that location; and the received signal is analyzed to characterize features of the material around the borehole.
Abstract: A method and system for investigating structure near a borehole are described herein. The method includes generating an acoustic beam by an acoustic source; directing at one or more azimuthal angles the acoustic beam towards a selected location in a vicinity of a borehole; receiving at one or more receivers an acoustic signal, the acoustic signal originating from a reflection or a refraction of the acoustic wave by a material at the selected location; and analyzing the received acoustic signal to characterize features of the material around the borehole.

Journal ArticleDOI
TL;DR: It was found that both methods are capable of reconstructing the free-field pressure radiated by the target source based on measurements made in a noisy environment, but the velocity-based method shows a large benefit in the reconstruction of thefree-field particle velocity.
Abstract: In a noisy environment, the sound field of a source is composed of three parts, which are: The field that would be radiated by the target source into free space, the incoming field from disturbing sources or reflections, and the scattered field that is created by the incoming wave falling on the target source. To accurately identify the sound source with nearfield acoustic holography in that situation, the last two parts must be removed from the mixed field. In a previous study, a method for recovering the free sound field in a noisy environment was proposed based on the equivalent source method and measurements of pressure [J. Acoust. Soc. Am. 131(2), 1260–1270 (2012)]. In the present paper, that method was modified by allowing the input data to be measurements of particle velocity instead of pressure. An experiment was carried out to examine both the pressure- and velocity-based methods, and the performance of the two methods was compared. It was found that both methods are capable of reconstructing the free-field pressure radiated by the target source based on measurements made in a noisy environment, but the velocity-based method shows a large benefit in the reconstruction of the free-field particle velocity.
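The equivalent source method underlying both variants can be sketched as fitting monopole amplitudes to the measured field and re-radiating them to new points. This bare-bones version uses the free-space Green's function and omits the separation of incoming and scattered components that the paper's method performs; all positions below are hypothetical:

```python
import numpy as np

def free_field_pressure(meas_pos, p_meas, eq_pos, rec_pos, k):
    """Fit monopole equivalent-source amplitudes (inside the target
    source) to measured pressures, then reconstruct the radiated field
    at new receiver points. k is the acoustic wavenumber."""
    def G(a, b):
        # Free-space Green's function e^{-ikr}/(4*pi*r) between point sets.
        r = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        return np.exp(-1j * k * r) / (4 * np.pi * r)
    q, *_ = np.linalg.lstsq(G(meas_pos, eq_pos), p_meas, rcond=None)
    return G(rec_pos, eq_pos) @ q
```

The same fit works with particle-velocity input by replacing the pressure Green's function with its normal derivative, which is the modification the paper investigates.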