
Showing papers on "Acoustic source localization published in 2010"


Journal ArticleDOI
TL;DR: The design and implementation of the HARK robot audition software system consisting of sound source localization modules, sound source separation modules and automatic speech recognition modules of separated speech signals that works on any robot with any microphone configuration are presented.
Abstract: This paper presents the design and implementation of the HARK robot audition software system consisting of sound source localization modules, sound source separation modules, and automatic speech recognition modules for separated speech signals, and shows that the system works on any robot with any microphone configuration.

209 citations


Journal ArticleDOI
TL;DR: A subset of children who use sequential BICIs can acquire sound localization abilities, even after long intervals between activation of hearing in the first- and second-implanted ears, suggesting that children with activation of the second implant later in life may be capable of developing spatial hearing abilities.
Abstract: Objectives To measure sound source localization in children who have sequential bilateral cochlear implants (BICIs); to determine if localization accuracy correlates with performance on a right-left discrimination task (i.e., spatial acuity); to determine if there is a measurable bilateral benefit on a sound source identification task (i.e., localization accuracy) by comparing performance under bilateral and unilateral listening conditions; to determine if sound source localization continues to improve with longer durations of bilateral experience.

136 citations


Journal ArticleDOI
TL;DR: The proposed method gives an improvement in the efficiency of radiation into the space in which the sound should be audible, while maintaining the acoustic pressure difference between two acoustic spaces.
Abstract: There has recently been an increasing interest in the generation of a sound field that is audible in one spatial region and inaudible in an adjacent region. The method proposed here ensures the control of the amplitude and phase of multiple acoustic sources in order to maximize the acoustic energy difference between two adjacent regions while also ensuring that evenly distributed source strengths are used. The performance of the method proposed is evaluated by computer simulations and experiments with real loudspeaker arrays in the shape of a circle and a sphere. The proposed method gives an improvement in the efficiency of radiation into the space in which the sound should be audible, while maintaining the acoustic pressure difference between two acoustic spaces. This is shown to give an improvement of performance compared to the contrast control method previously proposed.
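The energy-difference maximization described above reduces, in its simplest form, to an eigenvalue problem. The sketch below is illustrative rather than the paper's formulation: the transfer matrices from the sources to the bright and dark zones are random placeholders for measured ones, and the weighting kappa is an assumed trade-off parameter.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 8            # number of loudspeakers
Nb, Nd = 20, 20  # field points in the bright / dark zones

# Hypothetical transfer matrices (rows: field points, cols: sources).
Gb = rng.standard_normal((Nb, M)) + 1j * rng.standard_normal((Nb, M))
Gd = rng.standard_normal((Nd, M)) + 1j * rng.standard_normal((Nd, M))

A = Gb.conj().T @ Gb   # bright-zone energy matrix
B = Gd.conj().T @ Gd   # dark-zone energy matrix

kappa = 1.0  # assumed weighting between energy gain and energy suppression
# Maximizing q^H (A - kappa*B) q subject to ||q|| = 1 keeps the source
# strengths evenly distributed; the optimum is the principal eigenvector.
w, V = np.linalg.eigh(A - kappa * B)
q = V[:, -1]

contrast = np.real(q.conj() @ A @ q) / np.real(q.conj() @ B @ q)
print(contrast)
```

The unit-norm constraint on q is what enforces the "evenly distributed source strengths" requirement, in contrast to the pure contrast-control method, which can concentrate energy in a few sources.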

113 citations


Journal ArticleDOI
TL;DR: The waveguide invariant principle is used to estimate the range to a broadband acoustic source in a shallow-water waveguide using a single acoustic receiver towed along a path directly toward the acoustic source.
Abstract: The waveguide invariant principle is used to estimate the range to a broadband acoustic source in a shallow-water waveguide using a single acoustic receiver towed along a path directly toward the acoustic source. A relationship between the signal processing parameters and the ocean-acoustic environmental parameters is used to increase the effective signal-to-noise ratio without requiring detailed knowledge of the environment. Heuristics are developed to estimate the minimum source bandwidth and minimum horizontal aperture required for range estimation. A range estimation algorithm is tested on experimental and simulated data for source ranges of 500–2200 m and frequencies from 350 to 700 Hz. The algorithm is accurate to within approximately 25% for the cases tested and requires only a minimal amount of a priori environmental knowledge.

86 citations


Journal ArticleDOI
TL;DR: In this article, a high frequency boundary integral equation is employed to calculate the sound pressure in the acoustic field, and the structural damping is considered as Rayleigh damping, where the structural vibrations are excited by a time-harmonic external mechanical surface loading with prescribed excitation frequency, amplitude and spatial distribution.
Abstract: This paper deals with topological design optimization of vibrating bi-material elastic structures placed in an acoustic medium. The structural vibrations are excited by a time-harmonic external mechanical surface loading with prescribed excitation frequency, amplitude and spatial distribution. The design objective is minimization of the sound pressure generated by the vibrating structures on a prescribed reference plane or surface in the acoustic medium. The design variables are the volumetric densities of material in the admissible design domain for the structure. A high frequency boundary integral equation is employed to calculate the sound pressure in the acoustic field. This way the acoustic analysis and the corresponding sensitivity analysis can be carried out in a very efficient manner. The structural damping is considered as Rayleigh damping. Penalization models with respect to the acoustic transformation matrix and/or the damping matrix are proposed in order to eliminate intermediate material volume densities, which have been found to appear obstinately in some of the high frequency designs. The influences of the excitation frequency and the structural damping on optimum topologies are investigated by numerical examples. Also, the problem of maximizing (rather than minimizing) sound pressures in points on a reference plane in the acoustic medium is treated. Many interesting features of the examples are revealed and discussed.

83 citations


Journal ArticleDOI
TL;DR: A parametric array loudspeaker (PAL) is introduced which produces a spatially focused sound beam due to the attribute of ultrasound used for carrier waves, thereby allowing one to suppress the sound pressure at the designated point without causing spillover in the whole sound field.
Abstract: Active noise control enables sound suppression at designated control points, but the sound pressure away from the targeted locations is likely to increase. The reason is clear: a control source normally radiates sound omnidirectionally. To cope with this problem, this paper introduces a parametric array loudspeaker (PAL), which produces a spatially focused sound beam owing to the ultrasound used for the carrier waves, thereby allowing one to suppress the sound pressure at a designated point without causing spillover across the whole sound field. First, the fundamental characteristics of the PAL are reviewed. The scattered pressure in the near field contributed by the source strength of the PAL, which is needed for the design of an active noise control system, is then described. Furthermore, the optimal control law for minimizing the sound pressure at the control points is derived, and the control effect is investigated analytically and experimentally. With a view to tracking a moving target point, a steerable PAL based upon a phased-array scheme is presented, with which the generation of a moving zone of quiet becomes possible without mechanically rotating the PAL. An experiment is finally conducted, demonstrating the validity of the proposed method.
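The optimal control law for minimizing pressure at control points is, in its simplest multichannel form, a complex least-squares problem. A minimal sketch, assuming hypothetical (random) transfer functions from the secondary sources to the control points; the PAL-specific near-field scattering terms derived in the paper are not modeled here.

```python
import numpy as np

rng = np.random.default_rng(1)
L = 4   # control points
S = 2   # secondary (PAL) sources

# Hypothetical primary pressure at the control points and secondary-path
# transfer matrix at one frequency.
p_p = rng.standard_normal(L) + 1j * rng.standard_normal(L)
Z = rng.standard_normal((L, S)) + 1j * rng.standard_normal((L, S))

# Least-squares optimal control law: choose q to minimize ||p_p + Z q||^2.
q_opt = -np.linalg.lstsq(Z, p_p, rcond=None)[0]

residual = p_p + Z @ q_opt
before = np.sum(np.abs(p_p) ** 2)
after = np.sum(np.abs(residual) ** 2)
print(before, after)
```

Because q = 0 is always feasible, the optimal residual energy can never exceed the uncontrolled energy; the point of the PAL is that its focused beam keeps this cancellation from spilling over into the rest of the field.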

78 citations


Proceedings Article
01 Aug 2010
TL;DR: This paper presents an acoustic source localization method with low computational complexity which, instead of using individual microphone signals, combines them to form eigenbeams to compute a pseudointensity vector pointing in the direction of the sound source.
Abstract: The problem of acoustic source localization is important in many acoustic signal processing applications, such as distant speech acquisition and automated camera steering. In noisy and reverberant environments, the source localization problem becomes challenging and many existing algorithms deteriorate. Three-dimensional source localization presents advantages for certain applications such as beamforming, where we can steer a beam to both the desired azimuth and the desired elevation. In this paper, we present an acoustic source localization method with low computational complexity which, instead of using individual microphone signals, combines them to form eigenbeams. We then use the zero- and first-order eigenbeams to compute a pseudointensity vector pointing in the direction of the sound source. In an experimental study, the proposed method's localization accuracy is compared with that of a steered response power localization method, which uses the same eigenbeams. The results demonstrate that the proposed method has higher localization accuracy.
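A minimal sketch of the pseudointensity idea under idealized assumptions: the zero-order eigenbeam W behaves as an omnidirectional channel and the first-order eigenbeams X, Y, Z as dipoles aligned with the axes, so their time-averaged products with W point toward a single far-field source. Real eigenbeams would be formed from spherical microphone array signals; the noise level below is assumed.

```python
import numpy as np

rng = np.random.default_rng(2)
fs, T = 16000, 0.1
t = np.arange(int(fs * T)) / fs
s = rng.standard_normal(t.size)          # broadband source signal

az, el = np.deg2rad(40.0), np.deg2rad(10.0)
u = np.array([np.cos(el) * np.cos(az),   # unit vector toward the source
              np.cos(el) * np.sin(az),
              np.sin(el)])

# Idealized eigenbeam outputs for a far-field plane wave, plus light noise:
# zero-order (omni) W and first-order dipoles X, Y, Z.
W = s + 0.01 * rng.standard_normal(t.size)
XYZ = np.outer(u, s) + 0.01 * rng.standard_normal((3, t.size))

# Pseudointensity vector: time average of W times each dipole channel.
I = np.mean(W * XYZ, axis=1)
I /= np.linalg.norm(I)

az_hat = np.degrees(np.arctan2(I[1], I[0]))
el_hat = np.degrees(np.arcsin(np.clip(I[2], -1.0, 1.0)))
print(az_hat, el_hat)
```

The direction of I recovers both azimuth and elevation from a single vector, which is why the method avoids the grid search that a steered response power approach would need.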

77 citations


Journal ArticleDOI
TL;DR: It is shown that under certain conditions and limitations, reverberation can be used to improve SSL performance and even a simplistic room model is sufficient to produce significant improvements in range and elevation estimation, tasks which would be very difficult when relying only on direct path signal components.
Abstract: Sound source localization (SSL) is an essential task in many applications involving speech capture and enhancement. As such, speaker localization with microphone arrays has received significant research attention. Nevertheless, existing SSL algorithms for small arrays still have two significant limitations: lack of range resolution, and accuracy degradation with increasing reverberation. The latter is natural and expected, given that strong reflections can have amplitudes similar to that of the direct signal, but different directions of arrival. Therefore, correctly modeling the room and compensating for the reflections should reduce the degradation due to reverberation. In this paper, we show a stronger result. If modeled correctly, early reflections can be used to provide more information about the source location than would have been available in an anechoic scenario. The modeling not only compensates for the reverberation, but also significantly increases resolution for range and elevation. Thus, we show that under certain conditions and limitations, reverberation can be used to improve SSL performance. Prior attempts to compensate for reverberation tried to model the room impulse response (RIR). However, RIRs change quickly with speaker position, and are nearly impossible to track accurately. Instead, we build a 3-D model of the room, which we use to predict early reflections, which are then incorporated into the SSL estimation. Simulation results with real and synthetic data show that even a simplistic room model is sufficient to produce significant improvements in range and elevation estimation, tasks which would be very difficult when relying only on direct path signal components.

72 citations


Journal ArticleDOI
TL;DR: In this article, the vibro-acoustic performance of a rectangular double-panel partition with enclosed air cavity and simply mounted on an infinite acoustic rigid baffle is investigated analytically.
Abstract: The vibro-acoustic performance of a rectangular double-panel partition with enclosed air cavity and simply mounted on an infinite acoustic rigid baffle is investigated analytically. The sound velocity potential method rather than the commonly used cavity modal function method is employed, which possesses good expandability and has significant implications for further vibro-acoustic investigations. The simply supported boundary condition is accounted for by using the method of modal function and the double Fourier series solutions are obtained to characterize the vibro-acoustic behaviors of the structure. The results for sound transmission loss, panel vibration level, and sound pressure level are presented to explore the physical mechanisms of sound energy penetration across the finite double-panel partition. Specifically, focus is placed on the influence of several key system parameters on sound transmission including the thickness of air cavity, structural dimensions, and the elevation angle and azimuth angle of the incidence sound. Further extensions of the sound velocity potential method to typical framed double-panel structures are also proposed.

63 citations


Journal ArticleDOI
TL;DR: The method involving the Fourier transform and some processing in the frequency-wavenumber domain is suitable for the study of stationary acoustic sources, providing an image of the spatial acoustic field for one frequency when the behavior of acoustic sources fluctuates in time.
Abstract: Near-field acoustic holography (NAH) is an effective tool for visualizing acoustic sources from pressure measurements made in the near-field of sources using a microphone array. The method involving the Fourier transform and some processing in the frequency-wavenumber domain is suitable for the study of stationary acoustic sources, providing an image of the spatial acoustic field for one frequency. When the behavior of acoustic sources fluctuates in time, NAH may not be used. Unlike time-domain holography or transient methods, the method proposed in this paper needs no transformation into the frequency domain and no assumption of local stationarity. It is based on a time formulation of forward sound prediction or backward sound radiation in the time-wavenumber domain. The propagation is described by an analytic impulse response used to define a digital filter. Simulations of the implementation of one filter for forward propagation, and of its inverse to recover the acoustic field on the source plane, suggest that real-time NAH is viable. Since a numerical filter is used rather than a Fourier transform of the time signal, the emission at a point on the source may be rebuilt continuously and used for other post-processing applications.
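For contrast with the time-domain formulation above, the classical stationary frequency-wavenumber step can be sketched as angular-spectrum propagation between parallel planes; the grid, frequency, and distances below are arbitrary assumptions.

```python
import numpy as np

# Planar NAH, frequency-wavenumber step: Fourier transform the field over
# (x, y), multiply by the propagator exp(+-1j*kz*d), transform back.
c, f = 343.0, 1000.0
k = 2 * np.pi * f / c
N, dx, d = 64, 0.02, 0.05     # grid size, spacing, source-to-hologram distance

kx = 2 * np.pi * np.fft.fftfreq(N, dx)
KX, KY = np.meshgrid(kx, kx, indexing="ij")
kz = np.sqrt(np.maximum(k**2 - KX**2 - KY**2, 0.0) + 0j)

rng = np.random.default_rng(3)
# Band-limit a random source-plane field to propagating wavenumbers so the
# back-propagation is well conditioned (evanescent waves are discarded).
mask = (KX**2 + KY**2) < 0.5 * k**2
P_src = np.fft.ifft2(np.fft.fft2(rng.standard_normal((N, N))) * mask)

P_holo = np.fft.ifft2(np.fft.fft2(P_src) * np.exp(1j * kz * d))    # forward
P_back = np.fft.ifft2(np.fft.fft2(P_holo) * np.exp(-1j * kz * d))  # backward

err = np.linalg.norm(P_back - P_src) / np.linalg.norm(P_src)
print(err)
```

Masking out the evanescent components sidesteps the ill-conditioning of back-propagation that regularization normally handles; a practical NAH implementation would regularize instead of discarding them.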

61 citations


01 Jan 2010
TL;DR: In this thesis, acoustic source localization (ASL) using coincident and thus inherently space-saving and handy microphone arrays is tackled and a pattern recognition approach for ASL is presented, which outperforms the intensity vector approach at low SNR.
Abstract: A typical application of microphone arrays is to estimate the position of sound sources. The term microphone array usually refers to an arrangement of several microphones placed at different locations. Within this thesis, however, acoustic source localization (ASL) using coincident, and thus inherently space-saving and handy, microphone arrays is tackled. Besides an established ASL method based on analyzing the direction of the intensity vector, a pattern recognition approach for ASL is presented. A minimum distance classifier is employed, i.e., feature vectors calculated frame by frame from the array signals are compared with a prerecorded feature database. The characteristics of the presented approaches are discussed with the help of a mathematical model of first-order gradient microphones, as well as with measurements with a planar 4-channel coincident array prototype. Particular focus is given to robust single-speaker tracking in noisy environments. In this context, several advances to the basic algorithm for improving robustness and accuracy are proposed. In addition to source localization, a brief outline of beamforming using coincident arrays is provided. The performance of the presented ASL algorithms is experimentally evaluated using array recordings of static and moving sound sources. Different signal-to-noise ratios are considered. As a basis for quantifying the estimation error, the actual position of the sound source was captured with an optical tracking system. The results are very promising and show the practicability of the presented algorithms. The similarity approach outperforms the intensity vector approach, in particular at low SNR. At 0 dB SNR (1.8 s of male speech in a diffuse pink noise field), the azimuth of all (100%) individual frames is correctly estimated if 15° absolute error is allowed (82.5% at 5°, 98% at 10°). The corresponding mean absolute azimuth estimation error is 3°. Moreover, although evaluated mainly with static sources, the algorithm is also able to track rapid azimuth changes.

Proceedings ArticleDOI
14 Mar 2010
TL;DR: A novel room modeling algorithm is proposed, which uses a constrained room model and ℓ1-regularized least-squares to achieve good estimation of room geometry.
Abstract: Acoustic room modeling has several applications. Recent results using large microphone arrays show good performance, and are helpful in many applications. For example, when designing a better acoustic treatment for a concert hall, these large arrays can be used to help map the acoustic environment and aid in the design. However, in real-time applications - including de-reverberation, sound source localization, speech enhancement and 3D audio - it is desirable to model the room with existing small arrays and existing loudspeakers. In this paper we propose a novel room modeling algorithm, which uses a constrained room model and ℓ1-regularized least-squares to achieve good estimation of room geometry. We present experimental results on both real and synthetic data.

Journal Article
TL;DR: The most common techniques for sound source localization are described and how the methods have evolved over the years are explained, as well as how to assess the quality of the measurement result based on two criteria – spatial resolution and dynamic range.
Abstract: Sound source localization is a complex and cumbersome task that most acoustics engineers face on a daily basis. Today, a number of rather standard methods help accelerate this task. But there is no one-size-fits-all solution. This article describes the most common techniques and reviews vital criteria to select the best method for the job at hand. The toughest challenge facing any acoustics engineer is figuring out where the sound originates - especially when there is considerable interference and reverberation flying around. Since the early 1990s, some rather standard and highly functional methods based on microphone arrays have matured, and today they are used throughout many industries. In general, the methods fall into three categories: near-field acoustic holography, beamforming, and inverse methods. Even though these basic techniques have undergone constant improvement, the problem remains that there is no "magical" sound source localization technique that prevails over the others. Depending on the test object, the nature of the sound and the actual environment, engineers have to select one method or the other. This article reviews available techniques and explains how the methods have evolved over the years. Readers will be able to understand how to assess the quality of the measurement result based on two criteria - spatial resolution and dynamic range - as well as reviewing criteria to determine which method results in the best sound source localization for the job. The selection criteria discussed include frequency range, distance to the source, physical properties of the sound source, and operational conditions. A number of examples are presented regarding the different techniques.

Journal ArticleDOI
TL;DR: A multitarget methodology for acoustic source tracking based on the Track Before Detect (TBD) framework is introduced and allows for a vast increase in the number of particles used for a comparable computational load, which results in increased tracking stability in challenging recording environments.
Abstract: Particle Filter-based Acoustic Source Localization algorithms attempt to track the position of a sound source - one or more people speaking in a room - based on the current data from a microphone array as well as all previous data up to that point. This paper first discusses some of the inherent behavioral traits of the steered beamformer localization function. Using conclusions drawn from that study, a multitarget methodology for acoustic source tracking based on the Track Before Detect (TBD) framework is introduced. The algorithm also implicitly evaluates source activity using a variable appended to the state vector. Using the TBD methodology avoids the need to identify a set of source measurements and also allows for a vast increase in the number of particles used for a comparable computational load, which results in increased tracking stability in challenging recording environments. An evaluation of tracking performance is given using a set of real speech recordings with two simultaneously active speech sources.
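The tracking loop that such particle filter methods build on can be sketched in one dimension. This is a generic bootstrap filter, not the paper's TBD algorithm (which also carries a source-activity variable in the state); all noise levels and the drift model are assumed.

```python
import numpy as np

rng = np.random.default_rng(4)
Np, T = 500, 50
sigma_q, sigma_r = 0.05, 0.2   # assumed process / measurement noise

true_pos = 0.0
particles = rng.uniform(-2.0, 2.0, Np)
weights = np.full(Np, 1.0 / Np)
estimates = []

for _ in range(T):
    true_pos += 0.02                                  # source drifts slowly
    z = true_pos + sigma_r * rng.standard_normal()    # noisy localizer output

    particles += sigma_q * rng.standard_normal(Np)    # propagate the particles
    # Weight by the likelihood of the raw measurement; in the TBD spirit the
    # localization function value is used directly, with no hard detection step.
    weights *= np.exp(-0.5 * ((z - particles) / sigma_r) ** 2)
    weights /= weights.sum()

    estimates.append(float(np.sum(weights * particles)))

    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights**2) < Np / 2:
        idx = rng.choice(Np, size=Np, p=weights)
        particles = particles[idx]
        weights = np.full(Np, 1.0 / Np)

err = abs(estimates[-1] - true_pos)
print(err)
```

The computational point made in the abstract is that evaluating the localization function only at particle locations, rather than over a full search grid, is what frees up the budget for many more particles.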

Journal ArticleDOI
TL;DR: The proposed algorithm is robust to the effects of reverberation caused by multipath reflections and suitable for multiple acoustic source localization in a reverberant room and obtains the DOA estimates via one-dimensional (1-D) search instead of multidimensional search.
Abstract: In this paper, a new algorithm for high-resolution multiple wideband and nonstationary source localization using a sensor array is proposed. The received signals of the sensor array are first converted into the time-frequency domain via short-time Fourier transform (STFT) and we find that a set of short-time power spectrum matrices at different time instants have the joint diagonalization structure in each frequency bin. Based on such joint diagonalization structure, a novel cost function is designed and a new spatial spectrum for direction-of-arrival (DOA) estimation is derived. Compared to the maximum-likelihood (ML) method with high computational complexity, the proposed algorithm obtains the DOA estimates via one-dimensional (1-D) search instead of multidimensional search. Therefore, its computational complexity is much lower than that of the ML method. Unlike the subspace-based high-resolution DOA estimation techniques, it is not necessary to determine the number of sources in advance for the proposed algorithm. Moreover, the proposed method is robust to the effects of reverberation caused by multipath reflections. Hence it is suitable for multiple acoustic source localization in a reverberant room. The results of numerical simulations and experiments in a real room with moderate reverberation are provided to demonstrate the good performance of the proposed approach.

Journal ArticleDOI
TL;DR: It is shown that particular realizations of reverberation differ from an ideal isotropically diffuse field, and induce an estimation bias which is dependent upon the room impulse responses (RIRs), which is expressed statistically by employing the diffuse qualities of reverberations to extend Polack's statistical RIR model.
Abstract: An acoustic vector sensor provides measurements of both the pressure and particle velocity of a sound field in which it is placed. These measurements are vectorial in nature and can be used for the purpose of source localization. A straightforward approach towards determining the direction of arrival (DOA) utilizes the acoustic intensity vector, which is the product of pressure and particle velocity. The accuracy of an intensity vector based DOA estimator in the presence of noise has been analyzed previously. In this paper, the effects of reverberation upon the accuracy of such a DOA estimator are examined. It is shown that particular realizations of reverberation differ from an ideal isotropically diffuse field, and induce an estimation bias which is dependent upon the room impulse responses (RIRs). The limited knowledge available pertaining the RIRs is expressed statistically by employing the diffuse qualities of reverberation to extend Polack's statistical RIR model. Expressions for evaluating the typical bias magnitude as well as its probability distribution are derived.
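The reverberation-induced bias the paper analyzes can be illustrated with a single coherent reflection: because the reflection carries its own intensity along a different direction, the time-averaged intensity vector no longer points at the source. The amplitudes, delay, and 2-D geometry below are arbitrary assumptions, not the paper's statistical RIR model.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20000
s = rng.standard_normal(n)

def plane_wave(sig, az):
    """Pressure and 2-D particle velocity for a unit plane wave (arbitrary units)."""
    u = np.array([np.cos(az), np.sin(az)])
    return sig, np.outer(u, sig)

az_true, az_refl = np.deg2rad(0.0), np.deg2rad(60.0)
p_d, v_d = plane_wave(s, az_true)
# A delayed, attenuated reflection arriving from a different direction.
p_r, v_r = plane_wave(0.5 * np.roll(s, 300), az_refl)

p, v = p_d + p_r, v_d + v_r
I = np.mean(p * v, axis=1)                      # time-averaged intensity
az_hat = np.degrees(np.arctan2(I[1], I[0]))
print(az_hat)
```

Even though the reflection is essentially uncorrelated with the direct sound at this delay, its own energy pulls the intensity estimate roughly ten degrees toward the reflection direction, which is the bias mechanism the paper characterizes statistically.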

Journal ArticleDOI
TL;DR: A low frequency approach is combined with the high frequency approach giving a full bandwidth calibration procedure which can be used in free field conditions using a single calibration setup, and results are compared with results obtained with a standing wave tube.
Abstract: Calibration of acoustic particle velocity sensors is still difficult due to the lack of standardized sensors to compare with. Recently it was shown by Jacobsen and Jaud [J. Acoust. Soc. Am. 120, 830-837 (2006)] that it is possible to calibrate a sound pressure and particle velocity sensor in free field conditions at higher frequencies. This is done by using the known acoustic impedance at a certain distance from a spherical loudspeaker. When the sound pressure is measured with a calibrated reference microphone, the particle velocity can be calculated from the known impedance and the measured pressure. At lower frequencies, this approach gives unreliable results. The method is now extended to lower frequencies by measuring the acoustic pressure inside the spherical source. At lower frequencies, the sound pressure inside the sphere is proportional to the movement of the loudspeaker membrane. If the movement is known, the particle velocity in front of the loudspeaker can be derived. This low frequency approach is combined with the high frequency approach, giving a full bandwidth calibration procedure which can be used in free field conditions with a single calibration setup. The calibration results are compared with results obtained with a standing wave tube. © 2010 Acoustical Society of America.

Proceedings ArticleDOI
19 Jul 2010
TL;DR: It is shown that reverberation is not the enemy, and can be used to improve estimation, and is able to use early reflections to significantly improve range and elevation estimation.
Abstract: Sound Source Localization (SSL) based on microphone arrays has numerous applications, and has received significant research attention. Common to all published research is the observation that the accuracy of SSL degrades with reverberation. Indeed, early (strong) reflections can have amplitudes similar to the direct signal, and will often interfere with the estimation. In this paper, we show that reverberation is not the enemy, and can be used to improve estimation. More specifically, we are able to use early reflections to significantly improve range and elevation estimation. The process requires two steps: during setup, a loudspeaker integrated with the array emits a probing sound, which is used to obtain estimates of the ceiling height, as well as the locations of the walls. In a second step (e.g., during a meeting), the device incorporates this knowledge into a maximum likelihood SSL algorithm. Experimental results on both real and synthetic data show huge improvements in range estimation accuracy.
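A toy version of the geometry (not the paper's maximum likelihood algorithm): with the ceiling height known from the probing step, the extra path length of the ceiling reflection determines both the range and the source height from a single receiver position. All dimensions below are hypothetical.

```python
import numpy as np

c = 343.0
H, h_mic = 2.5, 1.0          # ceiling height, mic height (assumed known)
R_true, h_src = 3.0, 1.4     # ground-truth horizontal range and source height

d0 = np.hypot(R_true, h_src - h_mic)                 # direct path
d1 = np.hypot(R_true, 2 * H - h_src - h_mic)         # ceiling image-source path
tau0, tau1 = d0 / c, d1 / c                          # "measured" delays

# Invert the two path-length equations for source height and range:
#   d0^2 = R^2 + a^2,  d1^2 = R^2 + (2(H - h_mic) - a)^2,  a = h_src - h_mic.
d0m, d1m = c * tau0, c * tau1
a = (4 * (H - h_mic) ** 2 - (d1m**2 - d0m**2)) / (4 * (H - h_mic))
h_hat = h_mic + a
R_hat = np.sqrt(d0m**2 - a**2)
print(R_hat, h_hat)
```

With direct-path information alone, neither range nor elevation is observable from a compact array at this fidelity; it is the known reflection geometry that makes the system of equations solvable, which is the core claim of the paper.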

Proceedings ArticleDOI
Hong Liu1, Miao Shen1
03 Dec 2010
TL;DR: A novel approach named guided spectro-temporal (ST) position localization is presented for mobile robots; since the generalized cross-correlation (GCC) function based on time delay of arrival (TDOA) cannot produce an accurate peak, a new GCC weighting function named PHAT-ργ is proposed to weaken the effect of noise.
Abstract: It is a great challenge to build a sound source localization system for mobile robots, because noise and reverberation in a room pose a severe threat to continuous localization. This paper presents a novel approach named guided spectro-temporal (ST) position localization for mobile robots. Firstly, since the generalized cross-correlation (GCC) function based on time delay of arrival (TDOA) cannot produce an accurate peak, a new GCC weighting function named PHAT-ργ is proposed to weaken the effect of noise while avoiding intense computational complexity. Secondly, a rough location of the sound source is obtained by the PHAT-ργ method, and room reverberation is estimated using this location as a priori knowledge. Thirdly, ST position weighting functions are applied to each cell in the voice segment, and the correlation functions from all cells are integrated to obtain a more reliable location of the sound source. This paper also presents a fast, continuous localization method for mobile robots to determine the locations of a number of sources in real time. Experiments are performed with four microphones on a mobile robot. 2736 sets of data were collected for testing, and more than 2500 sets yielded accurate localization results: even when noise and reverberation are severe, 92% of the data are localized with an angle error of less than 15 degrees. Moreover, it takes less than 0.4 seconds to locate the position of the sound source for each data set.
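The plain PHAT weighting that PHAT-ργ modifies can be sketched as follows; the ργ modification itself is not reproduced here, and the signals are synthetic white noise with an assumed integer-sample delay.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 4096
s = rng.standard_normal(N)
delay = 23                                    # x2 lags x1 by 23 samples

x1 = s + 0.1 * rng.standard_normal(N)
x2 = np.roll(s, delay) + 0.1 * rng.standard_normal(N)

# GCC-PHAT: whiten the cross-spectrum so only the phase (the delay) remains.
n = 2 * N
X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
R = X2 * np.conj(X1)
R /= np.abs(R) + 1e-12
cc = np.fft.irfft(R, n)
cc = np.concatenate((cc[-N:], cc[:N]))        # reorder to lags -N .. N-1

tdoa = int(np.argmax(cc)) - N                 # peak lag = delay of x2 vs x1
print(tdoa)
```

PHAT's division by the cross-spectrum magnitude sharpens the correlation peak but amplifies noise-only frequency bins, which is the weakness the paper's modified weighting is designed to address.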

Journal ArticleDOI
TL;DR: The results of an MEG study utilizing realistic spatial sound stimuli presented in a stimulus-specific adaptation paradigm support a population rate code model where neurons in the right hemisphere are more often tuned to the left than to the right of the perceiver while in the left hemisphere these two neuronal populations are of equal size.

Journal ArticleDOI
TL;DR: In this article, the average amplitude and standard deviation of the output of the beamforming, obtained from different array locations, are calculated for beamforming output weighting, so as to enhance the source contribution, which is space invariant, and attenuate the mirrors and sidelobe peaks.

Journal ArticleDOI
11 Feb 2010
TL;DR: A multichannel acoustic data collection recorded under the European DICIT project, during Wizard of Oz experiments carried out at FAU and FBK-irst laboratories, to collect a database supporting efficient development and tuning of acoustic processing algorithms for signal enhancement.
Abstract: This paper describes a multichannel acoustic data collection recorded under the European DICIT project, during Wizard of Oz (WOZ) experiments carried out at FAU and FBK-irst laboratories. The application of interest in DICIT is a distant-talking interface for control of interactive TV working in a typical living room, with many interfering devices. The objective of the experiments was to collect a database supporting efficient development and tuning of acoustic processing algorithms for signal enhancement. In DICIT, techniques for sound source localization, multichannel acoustic echo cancellation, blind source separation, speech activity detection, speaker identification and verification as well as beamforming are combined to achieve a maximum possible reduction of the user speech impairments typical of distant-talking interfaces. The collected database made it possible to simulate a realistic scenario at a preliminary stage and to tailor the involved algorithms to the observed user behaviors. In order to match the project requirements, the WOZ experiments were recorded in three languages: English, German and Italian. Besides the user inputs, the database also contains non-speech related acoustic events, room impulse response measurements and video data, the latter used to compute three-dimensional positions of each subject. Sessions were manually transcribed and segmented at word level, also introducing specific labels for acoustic events.

Patent
26 Feb 2010
TL;DR: In this paper, sound and image data are sampled simultaneously using a sound/image sampling unit incorporating a plurality of microphones and a camera, and a graph of a time-series waveform of the sound pressure level is displayed on a display screen.
Abstract: Sound and image are sampled simultaneously using a sound/image sampling unit incorporating a plurality of microphones and a camera. Sound pressure waveform data and image data are stored in a storage means. Then the sound pressure waveform data are extracted from the storage means, and a graph of a time-series waveform of the sound pressure level is displayed on a display screen. A time point at which to carry out a calculation to estimate sound direction is designated on the graph, and then sound direction is estimated by calculating the phase differences between the sound pressure signals of the sound sampled by the microphones, using the sound pressure waveform data for a calculation time length having the time point at the center thereof. A sound source position estimation image having a graphic indicating an estimated sound direction is created and displayed by combining the estimated sound direction and the image data sampled at the time point.

Book ChapterDOI
01 Jan 2010
TL;DR: This chapter presents a family of broadband source localization algorithms based on parameterized spatiotemporal correlation, including the popular and robust steered response power (SRP) algorithm.
Abstract: Multiple microphone devices aimed at hands-free capabilities and speech recognition via acoustic beamforming require reliable estimates of the position of the acoustic source. Two-stage approaches based on the time-difference-of-arrival (TDOA) are computationally simple but lack robustness in practical environments. This chapter presents a family of broadband source localization algorithms based on parameterized spatiotemporal correlation, including the popular and robust steered response power (SRP) algorithm. Before forming the conventional spatial correlation matrix, the vector of microphone signals is time-aligned with respect to a hypothesized source location. It is shown that this parametrization easily generalizes classical narrowband techniques to the broadband setting. Methods based on minimum information entropy and temporally constrained minimum variance are developed. In order to ease the high computational demands imposed by a location-parameterized scheme, a sparse representation of the parameterized spatial correlation matrix is proposed.
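As a loose illustration of the steered response power idea (a sketch under far-field, free-field, linear-array assumptions, not the chapter's exact formulation), the scan below phase-aligns the channels toward each candidate angle in the frequency domain, sums them, and picks the angle with the largest output power:

```python
import numpy as np

def srp_far_field(signals, mic_x, fs, c=343.0, n_angles=181):
    """Steered response power over far-field angles for a linear array.

    signals: (num_mics, num_samples) array; mic_x: microphone
    coordinates along the array axis in metres. Each candidate angle
    defines per-microphone delays tau_m = x_m * sin(theta) / c; the
    channels are aligned in the frequency domain, summed
    (delay-and-sum), and the output power recorded."""
    spectra = np.fft.rfft(signals, axis=1)
    freqs = np.fft.rfftfreq(signals.shape[1], d=1.0 / fs)
    angles = np.linspace(-90.0, 90.0, n_angles)
    power = np.empty(n_angles)
    for i, ang in enumerate(angles):
        tau = np.asarray(mic_x) * np.sin(np.radians(ang)) / c
        steering = np.exp(2j * np.pi * freqs[None, :] * tau[:, None])
        y = (spectra * steering).sum(axis=0)   # steered beamformer output
        power[i] = np.sum(np.abs(y) ** 2)
    return angles[np.argmax(power)]
```

The chapter's contribution is to generalize exactly this kind of location-parameterized alignment into a spatial correlation matrix from which SRP and other criteria fall out as special cases.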

Journal ArticleDOI
01 Dec 2010-Robotica
TL;DR: A two-microphone technique for localization of sound sources to effectively guide robotic navigation, applying the mechanism of binaural cue selection observed in the mammalian hearing system to mitigate the effects of sound echoes.
Abstract: In nature, sounds from multiple sources, as well as reflections from the surfaces of the physical surroundings, arrive concurrently from different directions at the ears of a listener. Despite the fact that all of these waveforms sum at the eardrums, humans with normal hearing can effortlessly segregate interesting sounds from echoes and other sources of background noise. This paper presents a two-microphone technique for localization of sound sources to effectively guide robotic navigation. Its fundamental structure is adopted from a binaural signal-processing scheme employed in biological systems for the localization of sources using interaural time differences (ITDs). The two input signals are analyzed for coincidences along left/right-channel delay-line pairs. The coincidence time instants are presented as a function of the interaural coherence (IC). Specifically, we build a spherical head model for the selected robot and apply the mechanism of binaural cue selection observed in the mammalian hearing system to mitigate the effects of sound echoes. The sound source is found by determining the azimuth at which the maximum of the probability density function (PDF) of the ITD cues occurs. This eliminates the localization artifacts found during tests. The experimental results of a systematic evaluation demonstrate the superior performance of the proposed method.
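The abstract's echo-suppression idea, selecting reliable binaural cues and taking the azimuth at the peak of an ITD probability density, can be illustrated with a loose single-pair analogue (a sketch, not the paper's algorithm: it replaces the coincidence-detection delay lines with frame-wise GCC-PHAT peaks and a histogram vote, and all parameters are assumptions):

```python
import numpy as np

def itd_histogram_azimuth(left, right, fs, mic_distance,
                          frame_len=1024, c=343.0):
    """Azimuth from the mode of frame-wise ITD estimates.

    Each frame votes with its GCC-PHAT peak lag; echo-dominated frames
    scatter across lags, so the mode of the histogram (an empirical
    'PDF' of ITD cues) tends to follow the direct path. The winning
    lag is converted to azimuth via ITD = (d / c) * sin(theta)."""
    max_lag = int(np.ceil(mic_distance / c * fs))
    votes = []
    for start in range(0, len(left) - frame_len + 1, frame_len):
        L = np.fft.rfft(left[start:start + frame_len])
        R = np.fft.rfft(right[start:start + frame_len])
        cross = L * np.conj(R)
        cross /= np.abs(cross) + 1e-12          # PHAT weighting
        cc = np.fft.irfft(cross)
        cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
        votes.append(int(np.argmax(cc)) - max_lag)  # peak lag, samples
    counts = np.bincount(np.asarray(votes) + max_lag,
                         minlength=2 * max_lag + 1)
    itd = (int(np.argmax(counts)) - max_lag) / fs   # mode of the votes
    s = np.clip(itd * c / mic_distance, -1.0, 1.0)
    return np.degrees(np.arcsin(s))
```

The paper additionally weights cues by interaural coherence, so that only frames dominated by the direct sound contribute; the histogram vote above is a crude stand-in for that selection.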

Journal ArticleDOI
TL;DR: In this article, the authors investigated the distribution of sound in a horizontal plane and showed that sound propagating at a particular angle to the normal is the dominant component of the sound radiated by a railway rail.

Proceedings ArticleDOI
26 May 2010
TL;DR: In this article, the authors investigated the use of acoustic emission (AE) for a simulated crack in a stainless steel pipeline and proposed an approach to linear sound source localization in a pipeline using two or more acoustic sensors.
Abstract: This paper investigates the use of acoustic emission (AE) for a simulated crack in a stainless steel pipeline. An approach to linear sound source localization in a pipeline using two or more acoustic sensors is described, and rules to define the timing parameters of the acoustic emission signal, namely the peak definition time (PDT), hit definition time (HDT) and hit lockout time (HLT), are studied in order to improve the sound localization accuracy. Experiments were conducted on a stainless steel pipeline with no special features, such as branches, welds or flanges, on the experimental part. The detailed process of the Nielsen-Hsu pencil lead break method, which is used to simulate the crack, and the main considerations in acoustic emission transducer selection are also introduced in this paper.
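For the linear (one-dimensional) localization described above, a standard two-sensor arrival-time relation suffices. The sketch below assumes the source lies between the sensors and a single known wave speed; the sensor positions and the speed in the example are illustrative, not taken from the paper:

```python
def locate_on_pipe(x1, x2, t1, t2, wave_speed):
    """1-D AE source location between two sensors along a pipe.

    With sensors at x1 < x2 and arrival times t1, t2, the travel
    times are t1 = (x - x1) / v and t2 = (x2 - x) / v, so
        x = (x1 + x2) / 2 + v * (t1 - t2) / 2."""
    return 0.5 * (x1 + x2) + 0.5 * wave_speed * (t1 - t2)
```

For example, sensors at 0 m and 2 m, a wave speed of 5000 m/s, and arrivals at 100 µs and 300 µs place the source at 0.5 m. The timing parameters (PDT, HDT, HLT) studied in the paper matter precisely because t1 and t2 must be picked consistently from the two AE hits.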

Patent
28 Sep 2010
TL;DR: In this paper, a region-of-interest setting unit sets a region of interest on an ultrasound image, and a transmission focus control unit gives a transmission focusing instruction, so that a transmitting circuit performs transmission focusing on the area of interest.
Abstract: The presently disclosed subject matter is intended to correctly determine an ambient sound velocity at each level of pixels or line images constituting an ultrasound image, and construct a high-precision ultrasound image. A region-of-interest setting unit sets a region of interest on an ultrasound image. A transmission focus control unit gives a transmission focus instruction, so that a transmitting circuit performs transmission focusing on the region of interest. A set sound velocity specification unit specifies a set sound velocity used to perform reception focusing on RF data. A focus index calculation unit performs reception focusing on the RF data for each of a plurality of set sound velocities to calculate the focus index of the RF data. An ambient sound velocity determination unit determines the ambient sound velocity of the region of interest on the basis of the focus index for each of the plurality of set sound velocities. In other words, the local sound velocity of the region of interest is determined by optimizing the focusing of the ultrasound image.

Journal ArticleDOI
TL;DR: A unifying study of the most popular cross-correlation-based techniques, such as the SRP, MV, and MCCC, using an arbitrary number of microphones and the eigenanalysis of the parameterized spatial correlation matrix (PSCM) to classify these methods and gain some insight into their performance.
Abstract: Broadband source localization has several applications ranging from automatic video camera steering to target signal tracking and enhancement through beamforming. Consequently, there has been a considerable amount of effort to develop reliable methods for accurate localization over the last few decades. Essentially, the localization process consists in finding the candidate source location that maximizes the synchrony between the properly time-shifted microphone outputs. In addition to using well-known cross-correlation-based criteria such as the steered response power (SRP), minimum variance (MV), and multichannel cross-correlation (MCCC), this synchrony can also be measured using the averaged magnitude difference function (AMDF) and the averaged magnitude sum function (AMSF), whose calculations involve low computational cost. In earlier related works, the latter techniques have been used for time delay estimation (TDE) of a target source observed by only one pair of microphones. Their generalization to the multiple-microphone case and application to source localization have not been studied yet. In this paper, we consider both categories, i.e., cross-correlation- and AMDF (with AMSF)-based approaches, using an arbitrary number of microphones, and analyze their performance. Specifically, we first provide a unifying study of the most popular cross-correlation-based techniques, such as the SRP, MV, and MCCC, using the eigenanalysis of the parameterized spatial correlation matrix (PSCM) to classify these methods and gain some insight into their performance. We demonstrate, for instance, that the MV and SRP consist in searching for the major eigenvalue of the PSCM, while the MCCC essentially combines its minor eigenvalues when scanning for the source location. Inspired by this analysis, we show, in the second part of this work, the efficiency of the AMDF and AMSF in localizing an acoustic source using multiple microphones. Indeed, we propose two new parameterized matrices, named the parameterized averaged magnitude difference matrix (PAMDM) and the parameterized averaged magnitude sum matrix (PAMSM). The eigenanalysis of these matrices also reveals new criteria for acoustic source localization. Simulation results are provided to illustrate the effectiveness of all the investigated and proposed methods.
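The eigenanalysis view can be made concrete with a minimal numerical sketch (not the paper's implementation; integer-sample delays and white noise are assumptions): at the true candidate location the aligned channels become coherent, so the normalized PSCM approaches a rank-one matrix, its major eigenvalue approaches the number of microphones (the SRP/MV view), and its determinant, the product of all eigenvalues, collapses toward zero (the MCCC view).

```python
import numpy as np

def pscm(signals, delays):
    """Normalized parameterized spatial correlation matrix: align each
    channel by its hypothesized integer-sample delay, then correlate."""
    aligned = np.stack([np.roll(s, -d) for s, d in zip(signals, delays)])
    R = aligned @ aligned.T
    norm = np.sqrt(np.outer(np.diag(R), np.diag(R)))
    return R / norm

def srp_like(R):
    """Major-eigenvalue criterion (SRP/MV family)."""
    return np.linalg.eigvalsh(R)[-1]

def mccc(R):
    """MCCC criterion: 1 - det(R), i.e. one minus the product of all
    eigenvalues; large when the minor eigenvalues collapse."""
    return 1.0 - np.linalg.det(R)
```

Scanning candidate delay sets with either criterion and keeping the maximizer is then a complete, if naive, localizer; both criteria peak at the true delays.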

Patent
28 Jun 2010
TL;DR: In this paper, a room model is used to estimate wall and ceiling locations for a sound source by incorporating reflections of a known sound at a microphone array, with their corresponding signals processed to estimate the wall locations.
Abstract: Described is modeling a room to obtain estimates for walls and a ceiling, and using the model to improve sound source localization by incorporating reflection (reverberation) data into the location estimation computations. In a calibration step, reflections of a known sound are detected at a microphone array, with their corresponding signals processed to estimate wall (and ceiling) locations. In a sound source localization step, when an actual sound (including reverberations) is detected, the signals are processed into hypotheses that include reflection data predictions based upon possible locations, given the room model. The location corresponding to the hypothesis that matches (maximum likelihood) the actual sound data is the estimated location of the sound source.
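The hypotheses above predict reflection arrival times from the estimated wall positions. Under an image-source model (a generic sketch, not the patent's method; the 2-D geometry and names are assumptions), the extra delay of a first-order wall reflection relative to the direct path is straightforward to predict for any hypothesized source position:

```python
import math

def reflection_excess_delay(src, mic, wall_x, c=343.0):
    """Extra arrival delay (seconds) of the first-order reflection from
    a wall at x = wall_x, relative to the direct path, in 2-D.

    The reflection is modeled by mirroring the source across the wall
    (image-source model) and comparing the two path lengths."""
    image = (2.0 * wall_x - src[0], src[1])
    return (math.dist(image, mic) - math.dist(src, mic)) / c
```

A candidate location whose predicted direct-path and reflection delays best match the peaks observed in the measured correlation data then scores highest in the maximum-likelihood comparison the patent describes.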