
Showing papers on "Acoustic source localization published in 2011"


Patent
20 Apr 2011
TL;DR: In this article, a self-calibrating dipole microphone consisting of two omni-directional acoustic sensors and a processor is proposed to compensate for differences in the sensitivities of the acoustic sensors.
Abstract: A self-calibrating dipole microphone formed from two omni-directional acoustic sensors. The microphone includes a sound source acoustically coupled to the acoustic sensors and a processor. The sound source is excited with a test signal, exposing the acoustic sensors to acoustic calibration signals. The responses of the acoustic sensors to the calibration signals are compared by the processor, and one or more correction factors are determined. Digital filter coefficients are calculated based on the one or more correction factors and applied to the output signals of the acoustic sensors to compensate for differences in the sensitivities of the acoustic sensors. The filtered signals provide acoustic sensor outputs having matching responses, which are subtractively combined to form the dipole microphone output.
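The calibration step can be sketched numerically. In this sketch the patent's digital filter coefficients are reduced to a single gain correction, and the calibration tone and sensitivity mismatch are illustrative values, not anything from the patent:

```python
import numpy as np

def calibrate_gain(cal_a, cal_b):
    """Estimate a scalar correction factor from the two sensors'
    responses to the same calibration signal. (The patent derives full
    digital filter coefficients; a single gain is the simplest case.)"""
    return np.sqrt(np.mean(cal_a ** 2) / np.mean(cal_b ** 2))

def dipole_output(sig_a, sig_b, gain):
    # Match sensor B to sensor A, then combine subtractively.
    return sig_a - gain * sig_b

# Two sensors exposed to the same calibration tone, but sensor B
# has only 0.8x the sensitivity of sensor A.
t = np.arange(8000) / 8000.0
tone = np.sin(2 * np.pi * 440 * t)
g = calibrate_gain(tone, 0.8 * tone)            # ~1.25
residual = dipole_output(tone, 0.8 * tone, g)   # ~0 for common-mode sound
```

After correction, a signal that reaches both sensors identically cancels in the subtractive combination, which is exactly the matched-response behaviour the patent describes.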

362 citations


Journal ArticleDOI
TL;DR: This letter introduces an effective strategy that extends the conventional SRP-PHAT functional with the aim of considering the volume surrounding the discrete locations of the spatial grid, increasing its robustness and allowing for a coarser spatial grid.
Abstract: The Steered Response Power - Phase Transform (SRP-PHAT) algorithm has been shown to be one of the most robust sound source localization approaches operating in noisy and reverberant environments. However, its practical implementation is usually based on a costly fine grid-search procedure, making the computational cost of the method a real issue. In this letter, we introduce an effective strategy that extends the conventional SRP-PHAT functional with the aim of considering the volume surrounding the discrete locations of the spatial grid. As a result, the modified functional performs a full exploration of the sampled space rather than computing the SRP at discrete spatial positions, increasing its robustness and allowing for a coarser spatial grid. To this end, the Generalized Cross-Correlation (GCC) function corresponding to each microphone pair must be properly accumulated according to the defined microphone setup. Experiments carried out under different acoustic conditions confirm the validity of the proposed approach.
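The accumulation of pair-wise GCC values over a spatial grid can be sketched in NumPy. Summing a small window of lags around each predicted TDOA is only a crude stand-in for the paper's volume-integrated functional, and the geometry and sampling rate below are illustrative:

```python
import numpy as np

def gcc_phat(x, y, n_fft):
    """GCC-PHAT between two channels; lag 0 sits at index n_fft // 2."""
    X, Y = np.fft.rfft(x, n_fft), np.fft.rfft(y, n_fft)
    S = X * np.conj(Y)
    S = S / (np.abs(S) + 1e-12)          # PHAT weighting
    return np.fft.fftshift(np.fft.irfft(S, n_fft))

def srp_phat(frames, mics, grid, fs, c=343.0, halfwidth=1):
    """Score each candidate point by accumulating the GCC-PHAT of every
    microphone pair at (and around) the TDOA predicted by geometry."""
    n_fft = 2 * frames.shape[1]
    centre = n_fft // 2
    pairs = [(i, j) for i in range(len(mics)) for j in range(i + 1, len(mics))]
    ccs = {p: gcc_phat(frames[p[0]], frames[p[1]], n_fft) for p in pairs}
    scores = np.zeros(len(grid))
    for k, q in enumerate(grid):
        for i, j in pairs:
            tau = (np.linalg.norm(q - mics[i]) - np.linalg.norm(q - mics[j])) / c
            lag = centre + int(round(tau * fs))
            scores[k] += ccs[(i, j)][lag - halfwidth:lag + halfwidth + 1].sum()
    return grid[np.argmax(scores)]
```

A full implementation would evaluate a dense grid; here the grid is just a handful of candidate points to keep the sketch short.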

159 citations


Patent
28 Nov 2011
TL;DR: In this article, multiple microphone arrays of different physical sizes are used to acquire signals for spatial tracking of one or more sound sources within the augmented reality environment, and accuracy of the spatial location is improved by selecting different sized arrays.
Abstract: An augmented reality environment allows interaction between virtual and real objects. Multiple microphone arrays of different physical sizes are used to acquire signals for spatial tracking of one or more sound sources within the environment. A first array with a larger size may be used to track an object beyond a threshold distance, while a second array having a size smaller than the first may be used to track the object up to the threshold distance. By selecting different sized arrays, accuracy of the spatial location is improved.
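The selection rule itself is a one-liner; the threshold distance and array labels below are illustrative, not values disclosed in the patent:

```python
def select_array(source_distance_m, threshold_m=2.0):
    """A larger aperture gives finer angular resolution at range, so the
    large array tracks sources beyond the threshold while the small
    array handles nearby ones (threshold value is illustrative)."""
    return "large" if source_distance_m > threshold_m else "small"
```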

109 citations


Journal ArticleDOI
TL;DR: It is shown that minimization of the statistical dependence using broadband independent component analysis (ICA) can be successfully exploited for acoustic source localization and is tested in highly noisy and reverberant environments.
Abstract: In this paper, we show that minimization of the statistical dependence using broadband independent component analysis (ICA) can be successfully exploited for acoustic source localization. As the ICA signal model inherently accounts for the presence of several sources and multiple sound propagation paths, the ICA criterion offers a theoretically more rigorous framework than conventional techniques based on an idealized single-path and single-source signal model. This leads to algorithms which outperform other localization methods, especially in the presence of multiple simultaneously active sound sources and under adverse conditions, notably in reverberant environments. Three methods are investigated to extract the time difference of arrival (TDOA) information contained in the filters of a two-channel broadband ICA scheme. While for the first, the blind system identification (BSI) approach, the number of sources should be restricted to the number of sensors, the other methods, the averaged directivity pattern (ADP) and composite mapped filter (CMF) approaches can be used even when the number of sources exceeds the number of sensors. To allow fast tracking of moving sources, the ICA algorithm operates in block-wise batch mode, with a proportionate weighting of the natural gradient to speed up the convergence of the algorithm. The TDOA estimation accuracy of the proposed schemes is assessed in highly noisy and reverberant environments for two, three, and four stationary noise sources with speech-weighted spectral envelopes as well as for moving real speech sources.

100 citations


Journal ArticleDOI
TL;DR: In this article, an acoustic liquid sensor based on phononic crystals, consisting of a steel plate with an array of liquid-filled holes, was proposed; the frequency of the resonant transmission peak is shown to depend on the speed of sound in the liquid, so the resonant frequency can be used as a measure of the speed of sound and related properties, such as the concentration of a component in a liquid mixture.
Abstract: We introduce an acoustic liquid sensor based on phononic crystals consisting of a steel plate with an array of holes filled with liquid. We demonstrate the sensor properties both theoretically and experimentally, considering the mechanism of extraordinary acoustic transmission as the underlying phenomenon. The frequency of this resonant transmission peak is shown to depend on the speed of sound of the liquid, and the resonant frequency can be used as a measure of the speed of sound and related properties, such as the concentration of a component in the liquid mixture. The finite-difference time-domain method has been applied for sensor design. Ultrasonic transmission experiments are performed. Good consistency of the resonant frequency shift has been found between theoretical results and experiments. The proposed scheme offers a platform for an acoustic liquid sensor.

88 citations


Journal ArticleDOI
TL;DR: This work proposes a framework that simultaneously localizes the mobile robot and multiple sound sources using a microphone array on the robot and an eigenstructure-based generalized cross correlation method for estimating time delays between microphones under multi-source environments.
Abstract: Sound source localization is an important function in robot audition. Most existing works perform sound source localization using static microphone arrays. This work proposes a framework that simultaneously localizes the mobile robot and multiple sound sources using a microphone array on the robot.

73 citations


Proceedings ArticleDOI
05 Dec 2011
TL;DR: Assessment of robust tracking of humans based on intelligent Sound Source Localization for a robot in a real environment shows GEVD-MUSIC improved the noise-robustness of SSL by a signal-to-noise ratio of 5–6 dB, and audio-visual integration improved the average tracking error by approximately 50%.
Abstract: We have assessed robust tracking of humans based on intelligent Sound Source Localization (SSL) for a robot in a real environment. SSL is fundamental for robot audition, but faces three issues in a real environment: robustness against high-power noise, the lack of a general framework for selective listening to sound sources, and tracking of inactive and/or noisy sound sources. To address the first issue, we extended MUltiple SIgnal Classification by incorporating Generalized EigenValue Decomposition (GEVD-MUSIC) so that it can deal with high-power noise and can select target sound sources. To address the second issue, we proposed Sound Source Identification (SSI) based on hierarchical Gaussian mixture models and integrated it with GEVD-MUSIC to realize a selective listening function. To address the third issue, we integrated audio-visual human tracking using particle filtering. Integration of these three techniques into an intelligent human tracking system showed that: 1) GEVD-MUSIC improved the noise robustness of SSL by a signal-to-noise ratio of 5–6 dB; 2) SSI achieved an F-measure above 70% even in a noisy environment; and 3) audio-visual integration improved the average tracking error by approximately 50%.

64 citations


Patent
16 Mar 2011
TL;DR: In this paper, a method and system for enhancing a target sound signal from multiple sound signals using sound source localization, adaptive beamforming, and noise reduction is presented.
Abstract: A method and system for enhancing a target sound signal from multiple sound signals is provided. An array of an arbitrary number of sound sensors positioned in an arbitrary configuration receives the sound signals from multiple disparate sources. The sound signals comprise the target sound signal from a target sound source, and ambient noise signals. A sound source localization unit, an adaptive beamforming unit, and a noise reduction unit are in operative communication with the array of sound sensors. The sound source localization unit estimates a spatial location of the target sound signal from the received sound signals. The adaptive beamforming unit performs adaptive beamforming by steering a directivity pattern of the array of sound sensors in a direction of the spatial location of the target sound signal, thereby enhancing the target sound signal and partially suppressing the ambient noise signals, which are further suppressed by the noise reduction unit.
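The patent does not disclose the internals of its adaptive beamforming unit; as the simplest stand-in, steering a delay-and-sum beamformer toward the localized position can be sketched as below. Integer-sample delays are assumed, and the geometry in the usage is illustrative:

```python
import numpy as np

def delay_and_sum(frames, mics, target, fs, c=343.0):
    """Steer the array toward `target` by advancing each channel by its
    relative propagation delay and averaging. Integer-sample delays are
    used here; a practical system would use fractional-delay filters."""
    delays = np.linalg.norm(mics - target, axis=1) / c
    delays = delays - delays.min()              # keep the shifts causal
    n = frames.shape[1]
    out = np.zeros(n)
    for ch, d in zip(frames, delays):
        k = int(round(d * fs))
        out[:n - k] += ch[k:] / len(frames)     # advance channel by k samples
    return out
```

Signals arriving from the steered direction add coherently, while signals from other directions are averaged incoherently and thereby partially suppressed, which is the directivity-pattern behaviour the abstract describes.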

61 citations


Journal ArticleDOI
TL;DR: A theoretical analysis of the spherical harmonics spectrum of spatially translated sources and defines four measures for the misalignment of the acoustic center of a radiating source to promote optimal alignment.
Abstract: The radiation patterns of acoustic sources have great significance in a wide range of applications, such as measuring the directivity of loudspeakers and investigating the radiation of musical instruments for auralization. Recently, surrounding spherical microphone arrays have been studied for sound field analysis, facilitating measurement of the pressure around a sphere and the computation of the spherical harmonics spectrum of the sound source. However, the sound radiation pattern may be affected by the location of the source inside the microphone array, which is an undesirable property when aiming to characterize source radiation in a unique manner. This paper presents a theoretical analysis of the spherical harmonics spectrum of spatially translated sources and defines four measures for the misalignment of the acoustic center of a radiating source. Optimization is used to promote optimal alignment based on the proposed measures and the errors caused by numerical and array-order limitations are investigated. This methodology is examined using both simulated and experimental data in order to investigate the performance and limitations of the different alignment methods.

57 citations


Journal ArticleDOI
TL;DR: The EPF scheme is adapted to the multiple-hypothesis model to track a single acoustic source in reverberant environments and it is shown that splitting the array into several sub-arrays improves the robustness of the estimated source location.
Abstract: Particle filtering has been shown to be an effective approach to solving the problem of acoustic source localization in reverberant environments. In a reverberant environment, the direct-path arrival of the single source is accompanied by multiple spurious arrivals. A multiple-hypothesis model associated with these arrivals can be used to alleviate the unreliability often attributed to the acoustic source localization problem. Until recently, this multiple-hypothesis approach was only applied to bootstrap-based particle filter schemes. Recently, the extended Kalman particle filter (EPF) scheme, which allows for an improved tracking capability, was proposed for the localization problem. The EPF scheme utilizes a global extended Kalman filter (EKF) which strongly depends on prior knowledge of the correct hypotheses, so extending the multiple-hypothesis model to this scheme is not trivial. In this paper, the EPF scheme is adapted to the multiple-hypothesis model to track a single acoustic source in reverberant environments. Our work is supported by an extensive experimental study using both simulated data and data recorded in our acoustic lab. Various algorithms and array constellations were evaluated. The results demonstrate the superiority of the proposed algorithm in both tracking and switching scenarios. It is further shown that splitting the array into several sub-arrays improves the robustness of the estimated source location.
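The bootstrap particle filtering that this line of work builds on can be illustrated with a minimal scalar example: a random-walk source position observed in noise. This sketch omits the EPF and multiple-hypothesis machinery entirely, and all noise parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter(obs, n_p=500, q=0.05, r=0.3):
    """Bootstrap particle filter for a random-walk source position with
    direct noisy position observations; q and r are the process and
    measurement noise standard deviations (illustrative values)."""
    particles = rng.uniform(-3.0, 3.0, n_p)
    est = []
    for z in obs:
        particles = particles + q * rng.standard_normal(n_p)   # propagate
        w = np.exp(-0.5 * ((z - particles) / r) ** 2)          # likelihood
        w /= w.sum()
        est.append(np.sum(w * particles))                      # MMSE estimate
        particles = particles[rng.choice(n_p, n_p, p=w)]       # resample
    return np.array(est)
```

With a measurement model matched to the observation noise, the filtered trajectory tracks the source more accurately than the raw observations.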

52 citations


Journal ArticleDOI
TL;DR: A new method for converting the signal of the original multichannel sound system into that of an alternative system with a different number of channels while maintaining the physical properties of sound at the listening point in the reproduced sound field is described.
Abstract: In this paper, we describe a new method for converting the signal of the original multichannel sound system into that of an alternative system with a different number of channels while maintaining the physical properties of sound at the listening point in the reproduced sound field. Such a conversion problem can be described by the underdetermined linear equation. To obtain an analytical solution to the equation, the method partitions the sound field of the alternative system on the basis of the positions of three loudspeakers and solves the “local solution” in each subfield. As a result, the alternative system localizes each channel signal of the original sound system at the corresponding loudspeaker position as a phantom source. The composition of the local solutions introduces the “global solution,” that is, the analytical solution to the conversion problem. 22-channel signals of a 22.2 multichannel sound system without the two low-frequency effect channels were converted into 10-, 8-, and 6-channel signals by the method. Subjective evaluations showed that the proposed method could reproduce the spatial impression of the original 22-channel sound with eight loudspeakers.

Journal ArticleDOI
TL;DR: This paper analyzes the performance of several weighting functions in the generalized cross-correlation time-delay estimation algorithm and presents simulation results.
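The paper's specific weighting functions are not listed here; as an illustration, a GCC delay estimator with two representative weightings, the classic (unweighted) cross-correlation and PHAT, might look like:

```python
import numpy as np

def gcc_delay(x, y, weighting="phat"):
    """Estimate the delay of x relative to y (in samples) via GCC with a
    selectable frequency weighting; only the classic (unweighted) and
    PHAT weightings are shown here."""
    n = 2 * len(x)
    X, Y = np.fft.rfft(x, n), np.fft.rfft(y, n)
    S = X * np.conj(Y)
    if weighting == "phat":
        S = S / (np.abs(S) + 1e-12)      # whiten: keep phase, drop magnitude
    cc = np.fft.fftshift(np.fft.irfft(S, n))
    return int(np.argmax(cc)) - n // 2
```

PHAT whitens the cross-spectrum so the correlation peak depends only on phase, which is what makes it comparatively robust to reverberation.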

Journal ArticleDOI
TL;DR: In this article, an assessment of the time-domain equivalent source method for the prediction of acoustic scattering is presented, including the sensitivity of the method to numerical parameters, such as the number of the surface collocation points, the number and position of the equivalent sources, the time step, and the cut-off singular value.
Abstract: Equivalent source methods have been developed in the frequency and time domain to provide a fast and efficient computation for acoustic scattering. Although the advantages and capabilities of the method have been demonstrated, the limitations and drawbacks of the method have not yet been explored in detail. A detailed understanding of the equivalent source method is needed to use the method in a wide range of applications with more confidence. This paper presents an assessment of the time-domain equivalent source method for the prediction of acoustic scattering. The sensitivity of the method to numerical parameters, including the number of the surface collocation points, the number and position of the equivalent sources, the time step, and the cut-off singular value, is investigated and suggestions for these parameters are given for accurate predictions. A numerical instability issue is shown and a way to stabilize the solution with a time-averaging scheme is introduced. The sound power is calculated using the equivalent source strength to demonstrate the redistribution of the sound intensity by a scattering body and the conservation of the total power. Finally, scattering of sound from a source in a short duct is tested to demonstrate the utility of the tool for a more complicated shape of the scattering surface.

Proceedings ArticleDOI
09 May 2011
TL;DR: The design and implementation of selectable sound separation functions on the telepresence system “Texai” using the robot audition software “HARK” are presented and it is shown that the resulting system localizes sound sources with a tolerance of 5 degrees.
Abstract: This paper presents the design and implementation of selectable sound separation functions on the telepresence system “Texai” using the robot audition software “HARK.” An operator of Texai can “walk” around a faraway office to attend a meeting or talk with people through video-conference instead of meeting in person. With a normal microphone, the operator has difficulty recognizing the auditory scene of the Texai, e.g., he/she cannot know the number and the locations of sounds. To solve this problem, we design selectable sound separation functions with 8 microphones in two modes, overview and filter modes, and implement them using HARK's sound source localization and separation. The overview mode visualizes the direction-of-arrival of surrounding sounds, while the filter mode provides sounds that originate from the range of directions he/she specifies. The functions enable the operator to be aware of a sound even if it comes from behind the Texai, and to concentrate on a particular sound. The design and implementation was completed in five days due to the portability of HARK. Experimental evaluations with actual and simulated data show that the resulting system localizes sound sources with a tolerance of 5 degrees.

Journal Article
TL;DR: Sound Field Reconstruction (SFR), as described in this paper, expresses the reproduction of a continuous sound field as an inversion of the discrete acoustic channel from a loudspeaker array to a grid of control points, exploiting the observation that sound fields are band-limited both temporally and spatially.
Abstract: Sound fields are essentially band-limited phenomena, both temporally and spatially. This implies that a spatially sampled sound field respecting the Nyquist criterion is effectively equivalent to its continuous original. We describe Sound Field Reconstruction (SFR)---a technique that uses the previously stated observation to express the reproduction of a continuous sound field as an inversion of the discrete acoustic channel from a loudspeaker array to a grid of control points. The acoustic channel is inverted using truncated singular value decomposition (SVD) in order to provide optimal sound field reproduction subject to a limited effort constraint. Additionally, a detailed procedure for obtaining loudspeaker driving signals that involves selection of active loudspeakers, coverage of the listening area with control points, and frequency domain FIR filter design is described. Extensive simulations comparing SFR with Wave Field Synthesis show that on average, SFR provides higher sound field reproduction accuracy.
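The core inversion step, a truncated-SVD pseudo-inverse of the loudspeaker-to-control-point channel matrix, can be sketched as follows. The matrix sizes and truncation tolerance are illustrative, and the active-loudspeaker selection and FIR design stages of the full procedure are omitted:

```python
import numpy as np

def truncated_svd_inverse(H, rel_tol=1e-3):
    """Pseudo-invert the channel matrix H, discarding singular values
    below rel_tol * s_max so that the driving-signal effort stays
    bounded for ill-conditioned channels."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    keep = s > rel_tol * s[0]
    return (Vt[keep].T / s[keep]) @ U[:, keep].T

# Desired pressures p at control points; driving signals d = H+ @ p.
rng = np.random.default_rng(0)
H = rng.standard_normal((8, 6))       # 8 control points, 6 loudspeakers
p = H @ rng.standard_normal(6)        # a reproducible target field
d = truncated_svd_inverse(H) @ p
```

When the target field lies in the range of the channel, the truncated inverse reproduces it exactly; truncation only trades accuracy for effort when the channel is near-singular.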

Proceedings Article
01 Aug 2011
TL;DR: This work implemented online 2-source acoustic event detection and localization algorithms in a Smart-room, a closed space equipped with multiple microphones, showing high recognition accuracy for most acoustic events, both isolated and overlapped with speech.
Abstract: Real-time processing is a requirement for many practical signal processing applications. In this work we implemented online 2-source acoustic event detection and localization algorithms in a Smart-room, a closed space equipped with multiple microphones. Acoustic event detection is based on HMMs, which make it possible to process the input audio signal with very low latency; acoustic source localization is based on the SRP-PHAT localization method, which is known to perform robustly in most scenarios. The experimental results from online tests show high recognition accuracy for most acoustic events, both isolated and overlapped with speech.

Journal ArticleDOI
TL;DR: The proposed technique is based on the reciprocal time reversal approach (inverse filtering) applied to a number of waveforms stored into a database containing the experimental Green's function of the structure, using only one passive transducer.
Abstract: This paper presents an imaging method for the localization of the impact point in complex anisotropic structures with diffuse field conditions, using only one passive transducer. The proposed technique is based on the reciprocal time reversal approach (inverse filtering) applied to a number of waveforms stored into a database containing the experimental Green’s function of the structure. Unlike most acoustic emission monitoring systems, the present method exploits the benefits of multiple scattering, mode conversion, and boundaries reflections to achieve the focusing of the source with high resolution. Compared to a standard time reversal approach, the optimal refocusing of the back propagated wave field at the impact point is accomplished through a “virtual” imaging process. The robustness of the inverse filtering technique is experimentally demonstrated on a dissipative stiffened composite panel and the source position can be retrieved with a high level of accuracy in any position of the structure. Its very simple configuration and minimal processing requirements make this method a valid alternative to the conventional imaging Structural Health Monitoring systems for the acoustic emission source localization.
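A much-simplified stand-in for the reciprocal time-reversal imaging step is to correlate the measured waveform against a database of responses recorded for known impact points and pick the best match. The grid keys and waveforms below are synthetic, purely for illustration:

```python
import numpy as np

def localize_impact(measured, database):
    """Return the database key (impact point) whose stored response has
    the highest normalized cross-correlation with the measurement."""
    def ncc(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(database, key=lambda k: ncc(measured, database[k]))
```

In a diffuse-field structure the stored responses at different points are nearly uncorrelated, which is why a single passive transducer suffices for focusing.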

Journal ArticleDOI
TL;DR: The experimental results indicate that significant amplification of the directional cues and directional sensitivity can be achieved with the fly-ear inspired sensor design, which can provide a basis for the development of miniature sound localization sensors in two dimensions.
Abstract: Inspired by the hearing organ of the fly Ormia ochracea, a miniature sound localization sensor is developed, which can be used to pinpoint a sound source in two dimensions described by the azimuth and elevation angles. The sensor device employs an equilateral triangle configuration consisting of three mechanically coupled circular membranes whose oscillations are detected by a fiber-optic system. The experimental results indicate that significant amplification of the directional cues and directional sensitivity can be achieved with the fly-ear inspired sensor design. This work can provide a basis for the development of miniature sound localization sensors in two dimensions.

Journal ArticleDOI
TL;DR: In this article, a simple acoustic model for centrifugal pumps that considers ideal sound sources of arbitrary position and properties is presented. The model is implemented in software that applies it systematically to a pump previously tested in the laboratory, until identifying the set of ideal sources that best reproduces the pressure fluctuation measurements.

Proceedings ArticleDOI
05 Dec 2011
TL;DR: The experimental results showed that microphone locations and clock differences were estimated properly with 10–15 sound events (handclaps), and the error of sound source localization with the estimated information was less than the grid size of beamforming, that is, the lowest error was theoretically attained.
Abstract: This paper addresses the online calibration of an asynchronous microphone array for robots. Conventional microphone array technologies require many measurements of transfer functions to calibrate microphone locations, and a multi-channel A/D converter for inter-microphone synchronization. We solve these two problems using a framework combining Simultaneous Localization and Mapping (SLAM) and beamforming in an online manner. To do this, we assume that estimations of the microphone locations, the sound source location, and the microphone clock differences correspond to mapping, self-localization, and observation errors in SLAM, respectively. In our framework, the SLAM process calibrates the locations and clock differences of the microphones every time the microphone array observes a sound such as a human's clapping, and a beamforming process works as a cost function to decide the convergence of calibration by localizing the sound with the estimated locations and clock differences. After calibration, beamforming is used for sound source localization. We implemented a prototype system using Extended Kalman Filter (EKF) based SLAM and Delay-and-Sum Beamforming (DS-BF). The experimental results showed that microphone locations and clock differences were estimated properly with 10–15 sound events (handclaps), and the error of sound source localization with the estimated information was less than the grid size of beamforming, that is, the theoretically lowest error was attained.

Journal ArticleDOI
TL;DR: The double layer velocity method circumvents some of the drawbacks of the pressure-velocity based reconstruction, and it can successfully recover the normal velocity radiated by the source, even in the presence of strong disturbing sound.
Abstract: In near-field acoustic holography sound field separation techniques make it possible to distinguish between sound coming from the two sides of the array. This is useful in cases where the sources are not confined to only one side of the array, e.g., in the presence of additional sources or reflections from the other side. This paper examines a separation technique based on measurement of the particle velocity in two closely spaced parallel planes. The purpose of the technique is to recover the particle velocity radiated by a source in the presence of disturbing sound from the opposite side of the array. The technique has been examined and compared with direct velocity based reconstruction, as well as with a technique based on the measurement of the sound pressure and particle velocity. The double layer velocity method circumvents some of the drawbacks of the pressure-velocity based reconstruction, and it can successfully recover the normal velocity radiated by the source, even in the presence of strong disturbing sound.

Journal ArticleDOI
TL;DR: In this paper, a virtual target grid is introduced, and source mapping and strength estimation are obtained disregarding, as much as possible, the influence of reflections; two simple problems are used to compare the performance of the generalized inverse with a fixed regularization factor to that obtained using the optimized regularization strategy.

Journal ArticleDOI
TL;DR: This paper presents global active noise control using a parametric beam focusing source (PBFS), which projects a sound beam onto the primary sound source to create a collocated virtual control source, thereby realizing the otherwise trivial case of zero total acoustic power.
Abstract: By exploiting a case usually regarded as trivial, this paper presents global active noise control using a parametric beam focusing source (PBFS). In a dipole model, where one source serves as the primary sound source and the other as a control sound source, the control effect for minimizing the total acoustic power depends on the distance between the two. When the distance becomes zero, the total acoustic power becomes null, which is nothing less than the trivial case. Because of practical constraints, it is difficult to place a control source close enough to a primary source. However, by projecting the sound beam of a parametric array loudspeaker onto the target sound source (primary source), a virtual sound source may be created on the target sound source, thereby enabling the collocation of the sources. To further ensure the feasibility of the trivial case, a PBFS is then introduced in an effort to match the size of the two sources. The reflected sound wave of the PBFS, which is tantamount to the virtual sound source output, aims to suppress the primary sound. Finally, a numerical analysis as well as an experiment is conducted, verifying the validity of the proposed methodology.

Journal ArticleDOI
TL;DR: The authors suggest an alternating air-flow method based on the ratio of sound pressures measured at frequencies higher than 2 Hz inside two cavities coupled through a conventional loudspeaker; the method showed that the imaginary part of the sound pressure ratio is useful for the evaluation of the air-flow resistance.
Abstract: Air-flow resistivity is a main parameter governing the acoustic behavior of porous materials for sound absorption. The international standard ISO 9053 specifies two different methods to measure the air-flow resistivity, namely a steady-state air-flow method and an alternating air-flow method. The latter is realized by the measurement of the sound pressure at 2 Hz in a small rigid volume closed partially by the test sample. This cavity is excited with a known volume-velocity sound source implemented often with a motor-driven piston oscillating with prescribed area and displacement magnitude. Measurements at 2 Hz require special instrumentation and care. The authors suggest an alternating air-flow method based on the ratio of sound pressures measured at frequencies higher than 2 Hz inside two cavities coupled through a conventional loudspeaker. The basic method showed that the imaginary part of the sound pressure ratio is useful for the evaluation of the air-flow resistance. Criteria are discussed about the choice of a frequency range suitable to perform simplified calculations with respect to the basic method. These criteria depend on the sample thickness, its nonacoustic parameters, and the measurement apparatus as well. The proposed measurement method was tested successfully with various types of acoustic materials.

Journal ArticleDOI
TL;DR: A pressure-sensitive paint (PSP) system capable of measuring high-frequency acoustic fields with non-periodic, acoustic-level pressure changes is described and shows that the paint could resolve the spatial details of the mode shape at the given resonance condition.
Abstract: A pressure-sensitive paint (PSP) system capable of measuring high-frequency acoustic fields with non-periodic, acoustic-level pressure changes is described. As an optical measurement technique, PSP provides the experimenter with a global distribution of pressure on a painted surface. To demonstrate frequency response and enhanced sensitivity to pressure changes, a PSP system consisting of a polymer/ceramic matrix binder with platinum tetra(pentafluorophenyl) porphyrin (PtTFPP) as the oxygen probe was applied to a wall inside an acoustic resonance cavity excited at 1.3 kHz. A data acquisition technique based on the luminescent decay lifetime of the oxygen sensors excited by a single pulse of light afforded the ability to capture instantaneous pressure fields with no phase-averaging. Superimposed wave-like structures were observed with a wavelength corresponding to a 4.7% difference from the theoretical value for a sound wave emanating from the speaker. High sound pressure cases upwards of 145 dB (re 20 μPa) exhibited skewed nodal lines attributed to a nonlinear acoustic field. The lowest sound pressure level of 125.4 dB—corresponding to an amplitude of 52.7 Pa, or approximately 0.05% of standard sea-level atmospheric pressure—showed that the paint could resolve the spatial details of the mode shape at the given resonance condition.

Journal ArticleDOI
TL;DR: A new expectation-maximization (EM) algorithm is designed for source localization under the realistic assumption that the sources are corrupted by noises with nonuniform variances, and the root-mean-square error of the EM algorithm is closer to the derived CRLB than that of both the SC-ML and AC-ML methods.
Abstract: Wideband source localization using acoustic sensor networks has drawn considerable research interest recently. Maximum likelihood is the predominant objective, leading to a variety of source localization approaches. However, robust and efficient optimization algorithms are still being pursued, since different aspects of their effectiveness must be addressed under different circumstances. In this paper, we address source localization under the realistic assumption that the sources are corrupted by noises with nonuniform variances. We focus on two popular source localization methods for this problem, namely the SC-ML (stepwise-concentrated maximum-likelihood) and AC-ML (approximately-concentrated maximum-likelihood) algorithms. We explore the respective limitations of these two methods and design a new expectation-maximization (EM) algorithm. Furthermore, we provide the Cramer-Rao lower bound (CRLB) for all three methods. Through Monte Carlo simulations, we demonstrate that the proposed EM algorithm outperforms the SC-ML and AC-ML methods in terms of localization accuracy, and that its root-mean-square (RMS) error is closer to the derived CRLB than that of either competing method.
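To see why nonuniform noise variances matter, consider a toy maximum-likelihood localizer in which each sensor's squared residual is weighted by the inverse of its own noise variance. This grid-search sketch is purely illustrative (the paper's EM algorithm operates on wideband acoustic signals, not range measurements); the sensor geometry, variances, and grid here are invented for the example.

```python
import numpy as np

def ml_grid_localize(sensors, ranges, sigmas, grid):
    """Toy ML localization under nonuniform (heteroscedastic) noise.
    Each sensor's squared range residual is weighted by 1/sigma_i^2,
    which is the ML objective for independent Gaussian noise."""
    best, best_cost = None, np.inf
    for x in grid:
        pred = np.linalg.norm(sensors - x, axis=1)
        cost = np.sum(((ranges - pred) ** 2) / sigmas ** 2)
        if cost < best_cost:
            best, best_cost = x, cost
    return best

# Hypothetical setup: four sensors, source at (2, 3), per-sensor noise levels.
rng = np.random.default_rng(0)
sensors = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0], [5.0, 5.0]])
source = np.array([2.0, 3.0])
sigmas = np.array([0.01, 0.05, 0.01, 0.2])   # nonuniform variances
ranges = np.linalg.norm(sensors - source, axis=1) + rng.normal(0.0, sigmas)
grid = [np.array([gx, gy]) for gx in np.arange(0.0, 5.05, 0.1)
                           for gy in np.arange(0.0, 5.05, 0.1)]
est = ml_grid_localize(sensors, ranges, sigmas, grid)
```

Down-weighting the noisy sensors is exactly what a uniform-variance ML objective fails to do, which is the gap the SC-ML, AC-ML, and EM formulations all target.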

Patent
25 Aug 2011
TL;DR: In this article, a beamformer unit of a sound source separator device attenuates each sound source signal arriving from the region containing the general direction of a target sound source, and from the region opposite to it, in a plane that intersects the line segment joining the two microphones.
Abstract: With conventional source separator devices, specific frequency bands are significantly reduced in environments containing dispersed noise that does not arrive from any particular direction; as a result, that noise may be filtered irregularly without regard to the sound source separation results, giving rise to musical noise. In an embodiment of the present invention, a beamformer unit ( 3 ) of a sound source separator device ( 1 ) computes weighting coefficients in a complex-conjugate relation for the spectrum-analyzed output signals of the microphones ( 10, 11 ), thereby carrying out a beamformer process that attenuates each sound source signal arriving from the region containing the general direction of a target sound source and from the region opposite to said region, in a plane that intersects the line segment joining the two microphones ( 10, 11 ). A weighting coefficient computation unit ( 50 ) computes a weighting coefficient on the basis of the difference between the power spectrum information calculated by power calculation units ( 40, 41 ).
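The complex-conjugate weighting idea can be illustrated with a minimal two-microphone frequency-domain beamformer: one weight vector places a spatial null toward +theta, and its complex conjugate places the null toward -theta. This is a generic sketch under a free-field plane-wave assumption, not the patent's exact formulation.

```python
import numpy as np

def null_beamformer_pair(X1, X2, freqs, mic_dist, theta, c=343.0):
    """Two complementary null beamformers built from weights in a
    complex-conjugate relation (illustrative sketch).

    X1, X2   -- frequency-domain signals of the two microphones
    freqs    -- analysis frequencies [Hz]
    mic_dist -- microphone spacing [m]
    theta    -- steering angle from array broadside [rad]
    """
    tau = mic_dist * np.sin(theta) / c        # inter-microphone delay
    w = np.exp(-2j * np.pi * freqs * tau)     # expected phase for angle +theta
    y_pos = X2 - w * X1            # cancels a plane wave arriving from +theta
    y_neg = X2 - np.conj(w) * X1   # conjugate weight: cancels -theta arrivals
    return y_pos, y_neg
```

Comparing the power of `y_pos` and `y_neg` per frequency bin is one way such a device can decide which region a component came from before applying its weighting.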

Proceedings ArticleDOI
Gokhan Ince1, Keisuke Nakamura1, Futoshi Asano1, Hirofumi Nakajima1, Kazuhiro Nakadai1 
09 May 2011
TL;DR: In this paper, an estimation method based on instantaneous prediction of ego noise using parameterized templates is proposed to tackle the nonstationary ego-motion noise and the direction changes of fan noise.
Abstract: Noise generated by the motion of a robot degrades the quality of the desired sounds recorded by robot-embedded microphones. In addition, a moving robot is vulnerable to its loud fan noise, whose orientation changes relative to the moving limbs on which the microphones are mounted. To tackle the non-stationary ego-motion noise and the direction changes of the fan noise, we propose an estimation method based on instantaneous prediction of ego noise using parameterized templates. We verify the ego-noise suppression capability of the proposed estimation method on a humanoid robot by evaluating it on two important applications in the framework of robot audition: (1) automatic speech recognition and (2) sound source localization. We demonstrate that our method considerably improves recognition and localization performance during both head and arm motions.
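A minimal sketch of template-based ego-noise suppression, assuming a library of (motor-feature, noise-spectrum) pairs recorded offline while the robot moved in a quiet room: the current motor state selects the nearest template, which is then spectrally subtracted from the observation. The paper's parameterized, instantaneous prediction is more elaborate; every name below is hypothetical.

```python
import numpy as np

def suppress_ego_noise(obs_spec, joint_state, templates, beta=0.0):
    """Nearest-template ego-noise subtraction (illustrative sketch).

    obs_spec    -- observed magnitude spectrum (1-D array)
    joint_state -- current motor feature vector (positions/velocities)
    templates   -- list of (feature_vector, noise_spectrum) pairs
    beta        -- spectral floor to limit musical noise
    """
    # Nearest-neighbour lookup of the noise template for this motion state
    feats = np.array([f for f, _ in templates])
    idx = int(np.argmin(np.linalg.norm(feats - joint_state, axis=1)))
    noise = templates[idx][1]
    # Spectral subtraction with flooring
    return np.maximum(obs_spec - noise, beta * obs_spec)
```

Because the template is indexed by the motor state rather than estimated from past audio frames, the subtraction can track noise that changes as fast as the limbs move.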

Patent
27 Apr 2011
TL;DR: In this paper, an apparatus for simultaneously controlling a near sound field and a far sound field classifies the two fields based on the distance between an array speaker and a listener, making it possible to perform focusing even when the listener is located adjacent to the array speaker.
Abstract: An apparatus and method for forming a Personal Sound Zone (PSZ) at the location of a listener are provided. An apparatus for simultaneously controlling a near sound field and a far sound field may classify the near and far sound fields based on the distance between an array speaker and a listener, and may control both fields; thus, it is possible to perform focusing even when the listener is located adjacent to the array speaker. Additionally, the apparatus may generate a directive sound source using the array speaker while at the same time reducing the sound pressure in the far field, thereby reducing the sound spreading to the far field while focusing is performed at the location of the listener.
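Near-field focusing with an array speaker can be illustrated by simple delay-and-sum focusing: each driver is delayed so that all wavefronts arrive at the listener's position simultaneously. This sketch covers only the focusing half of the idea and ignores the patent's simultaneous far-field pressure reduction; the geometry is invented for the example.

```python
import numpy as np

def focusing_delays(speaker_xs, focus_point, c=343.0):
    """Per-driver delays that focus an array loudspeaker at a near-field
    point (plain delay-and-sum focusing, illustrative only).

    speaker_xs  -- (N, 2) driver positions [m]
    focus_point -- (2,) listener location [m]
    c           -- speed of sound [m/s]
    """
    d = np.linalg.norm(speaker_xs - focus_point, axis=1)
    # Delay the closest drivers the most so all wavefronts arrive together
    return (d.max() - d) / c
```

For a listener in front of the array's center, the central drivers (closest to the listener) receive the largest delays and the edge drivers fire first, bending the wavefront inward toward the focus.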