
Showing papers on "Acoustic source localization published in 2021"


Posted Content
TL;DR: In this article, a survey on deep learning methods for single and multiple sound source localization is presented, where the authors provide an exhaustive topography of the neural-based localization literature in this context, organized according to several aspects.
Abstract: This article is a survey on deep learning methods for single and multiple sound source localization. We are particularly interested in sound source localization in indoor/domestic environments, where reverberation and diffuse noise are present. We provide an exhaustive topography of the neural-based localization literature in this context, organized according to several aspects: the neural network architecture, the type of input features, the output strategy (classification or regression), the types of data used for model training and evaluation, and the model training strategy. This way, an interested reader can easily comprehend the vast panorama of deep learning-based sound source localization methods. Tables summarizing the literature survey are provided at the end of the paper for a quick search of methods with a given set of target characteristics.

37 citations


Journal ArticleDOI
TL;DR: A fully convolutional neural network with an encoder-decoder structure is proposed to estimate multiple sources' positions and strengths precisely, and it is demonstrated that the proposed deep learning model significantly outperforms model-based methods.

28 citations


Journal ArticleDOI
TL;DR: The experimental results show that the proposed block Hermitian matrix completion (BHMC) method is an accurate and fast beamforming algorithm for non-synchronous measurements, paving the way for real-time sound source localization in industrial settings.

24 citations


Journal ArticleDOI
TL;DR: In this paper, a two-stream network structure that handles each modality with an attention mechanism is developed for sound source localization, and the network naturally reveals the localized response in the scene without human annotation.
Abstract: Visual events are usually accompanied by sounds in our daily lives. However, can machines learn to correlate the visual scene and sound, and localize the sound source, only by observing them as humans do? To investigate this empirical learnability, we first present a novel unsupervised algorithm to address the problem of localizing sound sources in visual scenes. To achieve this goal, a two-stream network structure that handles each modality with an attention mechanism is developed for sound source localization. The network naturally reveals the localized response in the scene without human annotation. In addition, a new sound source dataset is developed for performance evaluation. Nevertheless, our empirical evaluation shows that the unsupervised method generates false conclusions in some cases. We show that these false conclusions cannot be fixed without human prior knowledge, owing to the well-known mismatch between correlation and causality. To fix this issue, we extend our network to supervised and semi-supervised settings via a simple modification, enabled by the general architecture of our two-stream network. We show that the false conclusions can be effectively corrected even with a small amount of supervision, i.e., in the semi-supervised setup. Furthermore, we present the versatility of the learned audio and visual embeddings for cross-modal content alignment, and we extend the proposed algorithm to a new application: sound-saliency-based automatic camera view panning in 360 degree videos.

24 citations


Journal ArticleDOI
TL;DR: In this article, a square-shaped cluster composed of four densely-spaced sensors forming the four vertices of a square is proposed to improve the estimation accuracy of the incidence angle.

17 citations


Journal ArticleDOI
TL;DR: This is the first systematic work on observability analysis of SLAM-based microphone array calibration and sound source localization with a Fisher information matrix approach; it presents necessary and sufficient conditions guaranteeing full column rankness, which lead to parameter identifiability.
Abstract: Sensor array-based systems, which adopt time difference of arrival (TDOA) measurements among the sensors, have found many robotic applications. However, for existing frameworks and systems to be useful, the sensor array needs to be calibrated accurately. Of particular interest in this article are microphone array-based robot audition systems. In our recent work, by using a moving sound source and the graph-based formulation of simultaneous localization and mapping (SLAM), we proposed a framework for joint sound source localization and calibration of microphone array geometrical information, together with the estimation of microphone time offset and clock difference/drift rates. However, a thorough study of the identifiability question, termed observability analysis here, in the SLAM framework for microphone array calibration and sound source localization is still lacking in the literature. In this article, we fill this gap via a Fisher information matrix approach. Motivated by the equivalence between the full column rankness of the Fisher information matrix and of the Jacobian matrix, we leverage the structure of the latter associated with the SLAM formulation, and present necessary and sufficient conditions guaranteeing its full column rankness, which lead to parameter identifiability. We thoroughly discuss the 3-D case with asynchronous (with both time offset and clock drift, or with only one of them) and synchronous microphone arrays, respectively. These conditions are closely related to the motion varieties of the sound source and the microphone array configuration, and have intuitive physical interpretations. Based on the established conditions, we also identify particular cases where observability is impossible. Connections with the calibration of other sensors are also discussed.
To the best of our knowledge, this is the first systematic work on observability analysis of SLAM-based microphone array calibration and sound source localization. The tools and concepts used in this article are also applicable to other TDOA sensing modalities such as ultra-wideband (UWB) sensors.
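The rank condition at the core of this analysis can be illustrated numerically. The sketch below is not the paper's derivation: it assumes a simplified synchronous, offset-free 2-D model with four microphones and six source positions (all values illustrative), fixes the rigid-motion gauge by pinning microphone 0 at the origin and microphone 1 to the x-axis, and checks full column rank of a finite-difference Jacobian of the TDOA map:

```python
import numpy as np

C = 343.0  # speed of sound (m/s)

def tdoas(mics, srcs):
    """TDOAs of every microphone relative to mic 0, stacked over all
    source positions along the trajectory."""
    d = np.linalg.norm(mics[None, :, :] - srcs[:, None, :], axis=2)  # (K, M)
    return ((d[:, 1:] - d[:, :1]) / C).ravel()

def unpack(theta, K):
    """Parameter vector -> geometry; mic 0 = (0,0) and mic 1 on the x-axis
    fix the rigid-motion gauge (translation + rotation)."""
    mics = np.zeros((4, 2))
    mics[1, 0] = theta[0]
    mics[2:] = theta[1:5].reshape(2, 2)
    srcs = theta[5:].reshape(K, 2)
    return mics, srcs

def jacobian(theta, K, eps=1e-6):
    """Finite-difference Jacobian of the TDOA map w.r.t. all parameters."""
    f0 = tdoas(*unpack(theta, K))
    J = np.empty((f0.size, theta.size))
    for i in range(theta.size):
        t = theta.copy()
        t[i] += eps
        J[:, i] = (tdoas(*unpack(t, K)) - f0) / eps
    return J

rng = np.random.default_rng(0)
K = 6                                     # source positions along the motion
theta = rng.uniform(1.0, 4.0, 5 + 2 * K)  # generic geometry, 17 parameters
J = jacobian(theta, K)                    # 18 measurements x 17 parameters
rank = np.linalg.matrix_rank(J)
print(rank == theta.size)                 # full column rank -> locally identifiable
```

With fewer source positions (e.g. K = 2) the Jacobian has fewer rows than columns, so the rank test necessarily fails, mirroring the paper's point that identifiability hinges on sufficient source motion.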

15 citations


Journal ArticleDOI
TL;DR: Artificial neural networks (ANNs) are used to localize and quantify multiple sound sources in a grid-less way, taking the microphones' cross-spectral matrix as input to the network and providing as output both the locations and strengths of the sources contributing to the acoustic field.

14 citations


Journal ArticleDOI
TL;DR: In this paper, a pair of fly Ormia ochracea inspired MEMS directional microphones was designed identically in circular shape and operated using piezoelectric sensing in the 3–3 transducer mode.
Abstract: The majority of fly Ormia ochracea inspired sound source localization (SSL) works are limited to 1D, so SSL in 2D can open a new vision for ambiguous acoustic applications. This article reports analytical and experimental work on SSL in 2D using a pair of fly O. ochracea inspired MEMS directional microphones. The reported directional microphones were designed identically in circular shape and operated using piezoelectric sensing in the 3–3 transducer mode. In the X–Y plane, they were canted with a 90° offset, i.e., one microphone along the X–axis and the other along the Y–axis. As a result, their X–axis (cosine) and Y–axis (sine) directionalities formulated the tangent-dependent 2D SSL in the X–Y plane. The highest accuracy of the 2D SSL was found to be ±2.92° at the bending frequency (11.9 kHz), followed by ±3.25° at the rocking frequency (6.4 kHz), ±4.68° at 1 kHz, and ±6.91° at 18 kHz. These frequencies were selected based on the measured inter-aural sensitivity difference (mISD), which showed a proportional impact on the directionality cue of 2D SSL. Besides, the basic acoustic functionalities, such as sensitivity, SNR, and self-noise, were found to be 20.86 mV/Pa, 66.4 dB, and 27.6 dB SPL, respectively, at 1 kHz and 1 Pa sound pressure. The outstanding contribution of this work is SSL in 2D with high accuracy using a pair of high-performing bio-inspired directional microphones.

13 citations


Journal ArticleDOI
TL;DR: In this article, a multiresolution deep learning approach is proposed to encode relevant information contained in unprocessed time-domain acoustic signals captured by microphone arrays for real-time sound source two-dimensional localization tasks.
Abstract: Sound source localization using multichannel signal processing has been a subject of active research for decades. In recent years, the use of deep learning in audio signal processing has significantly improved machine hearing performance. This has motivated the scientific community to also develop machine learning strategies for source localization applications. This paper presents BeamLearning, a multiresolution deep learning approach that encodes the relevant information contained in unprocessed time-domain acoustic signals captured by microphone arrays. The use of raw data aims at avoiding the simplifying hypotheses that most traditional model-based localization methods rely on. Benefits of its use are shown for real-time two-dimensional sound source localization tasks in reverberant and noisy environments. Since supervised machine learning approaches require large, physically realistic, precisely labelled datasets, a fast graphics processing unit-based computation of room impulse responses was developed using fractional delays for image source models. A thorough analysis of the network representation and extensive performance tests are carried out using the BeamLearning network with synthetic and experimental datasets. The obtained results demonstrate that the BeamLearning approach significantly outperforms the wideband MUSIC and steered response power-phase transform methods in terms of localization accuracy and computational efficiency in the presence of heavy measurement noise and reverberation.

13 citations


Journal ArticleDOI
TL;DR: The improved forward model is time-invariant for a fast-rotating source owing to the time-domain de-Doppler technique, and high-resolution acoustic maps can be obtained by the proposed LASSO-RS and LAR-RS thanks to sparsity-based regularization, even at very low signal-to-noise ratios (SNR).

13 citations


Journal ArticleDOI
TL;DR: In this paper, a feature-based approach is proposed to tackle the data association problem and achieve multisource localization in 3D in a distributed microphone array, where features are generated by using interchannel phase difference (IPD) information, which indicates the number of times each frequency bin across all time frames has been assigned to sources.
Abstract: Multisource localization using time difference of arrival (TDOA) is challenging because the correct combination of TDOA estimates across different microphone pairs, corresponding to the same source, is usually unknown, which is termed as the data association problem. Moreover, many existing multisource localization techniques are originally demonstrated in two dimensions, and their extensions to three dimensions (3D) are not straightforward and would lead to much higher computational complexity. In this paper, we propose an efficient, feature-based approach to tackle the data association problem and achieve multisource localization in 3D in a distributed microphone array. The features are generated by using interchannel phase difference (IPD) information, which indicates the number of times each frequency bin across all time frames has been assigned to sources. Based on such features, the data association problem is addressed by correlating most similar features across different microphone pairs, which is executed by solving a two-dimensional assignment problem successively. Thereafter, the locations of multiple sources can be obtained by imposing a single-source location estimator on the resulting TDOA combinations. The proposed approach is evaluated using both simulated data and real-world recordings.
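The final step described above, a single-source location estimator applied to an associated TDOA combination, can be sketched with a plain Gauss-Newton solver. This is a generic textbook formulation, not the paper's estimator; the geometry, microphone count, and noiseless TDOAs are illustrative assumptions:

```python
import numpy as np

C = 343.0  # speed of sound (m/s)

def locate_tdoa(mics, taus, x0, iters=30):
    """Gauss-Newton on TDOA residuals r_k = (||x-m_k|| - ||x-m_0||)/C - tau_k
    for a single source and a synchronous microphone array."""
    x = x0.astype(float)
    for _ in range(iters):
        d = np.linalg.norm(mics - x, axis=1)
        u = (x - mics) / d[:, None]          # unit vectors mic -> source
        r = (d[1:] - d[0]) / C - taus        # residuals (one per non-ref mic)
        J = (u[1:] - u[0]) / C               # Jacobian of residuals w.r.t. x
        x = x - np.linalg.lstsq(J, r, rcond=None)[0]
    return x

# Five microphones, a source, and noiseless TDOAs relative to mic 0
mics = np.array([[0., 0., 0.], [4., 0., 0.], [0., 4., 0.],
                 [0., 0., 3.], [4., 4., 3.]])
src = np.array([2.0, 1.5, 1.0])
d = np.linalg.norm(mics - src, axis=1)
taus = (d[1:] - d[0]) / C

est = locate_tdoa(mics, taus, x0=np.array([1.0, 1.0, 1.0]))
print(np.round(est, 3))   # recovers the source position
```

In the paper's 3-D distributed-array setting, such a solver would be run once per TDOA combination produced by the feature-based data association.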

Journal ArticleDOI
TL;DR: In this paper, a sound field morphological component analysis (SF-MCA) model and an enhanced alternating direction method of multipliers (ADMM) algorithm are proposed to achieve accurate acoustic source localization in a reverberant environment.
Abstract: Acoustic imaging results in a diffuse field are seriously affected by reverberation. Acoustic source localization algorithms based on the free-field assumption produce many artifacts in reverberant environments. To achieve accurate acoustic source localization in a reverberant environment, a sound field morphological component analysis (SF-MCA) model and an enhanced alternating direction method of multipliers (ADMM) algorithm are proposed in this article. First, the solution of the inhomogeneous Helmholtz equation is analyzed to characterize the acoustic components. Second, Green's function and plane-wave basis functions serve as dictionaries for sparse representation of the acoustic signal in the frequency domain, and the corresponding decomposition coefficients are obtained by the enhanced ADMM algorithm. Finally, accurate acoustic imaging results are obtained in a reverberant chamber with strong reverberation ($T_{20} = 174.30~\text{ms}$). Experiments validate the proposed SF-MCA method. The acoustic imaging obtained by the proposed SF-MCA dereverberation model is far superior to acoustic imaging that treats the reverberation as general Gaussian noise.

Journal ArticleDOI
15 Mar 2021-Robotica
TL;DR: The experimental results confirm that the proposed hidden Markov model-based voice command detection system is capable of recognizing the voice commands and properly performs the task or expresses the right answer.
Abstract: Human–robot interaction (HRI) is becoming more important nowadays. In this paper, a low-cost communication system for HRI is designed and implemented on the Scout robot and a robotic face. A hidden Markov model-based voice command detection system is proposed, and a non-native database containing 10 desired English commands has been collected from Persian speakers. The experimental results confirm that the proposed system is capable of recognizing the voice commands and properly performs the task or expresses the right answer. Compared with a system trained on the native Julius database, the proposed system shows about 10% better true detection.

Journal ArticleDOI
TL;DR: In this paper, a multi-tone phase coding (MTPC) scheme was proposed to encode the interaural time difference (ITD) between binaural pure tones into discriminative spike patterns that can be directly classified by SNNs.
Abstract: Mammals exhibit a remarkable capability of detecting and localizing sound sources in complex acoustic environments by using binaural cues in a spiking manner. Emulating the auditory process of mammals for sound source localization (SSL), we propose a computational model for accurate and robust SSL under the neuromorphic spiking neural network (SNN) framework. The center of this model is a Multi-Tone Phase Coding (MTPC) scheme, which encodes the interaural time difference (ITD) between binaural pure tones into discriminative spike patterns that can be directly classified by SNNs. As such, SSL can be implemented as an event-driven task on highly efficient, neuromorphic parallel processors. We evaluate the proposed computational model on a directional audio dataset recorded from a microphone array in a realistic acoustic environment with background noise, obstruction, reflection, and other interferences. We report superior localization capability with a mean absolute error (MAE) of $1.02^\circ$ or 100% classification accuracy with an angle resolution of $5^\circ$, which surpasses other SNN-based biologically plausible neuromorphic approaches by a relatively large margin and is on par with human performance in similar tasks. This study opens up many application opportunities in human-robot interaction where energy efficiency is crucial. As a case study, we successfully deploy the proposed SSL system in a robotic platform to track the speaker and orient the robot's attention.
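The MTPC/SNN pipeline itself is not reproduced here, but the underlying cue, the ITD of binaural pure tones, is easy to demonstrate with a conventional cross-correlation estimate. The sample rate, 20 cm microphone spacing, and far-field geometry below are assumptions of this sketch, not values from the paper:

```python
import numpy as np

FS, C, D = 48_000, 343.0, 0.2   # sample rate (Hz), speed of sound (m/s), mic spacing (m)

def itd_from_azimuth(theta_deg):
    """Far-field interaural time difference for a source at azimuth theta."""
    return D * np.sin(np.radians(theta_deg)) / C

def estimate_itd(left, right, fs=FS):
    """ITD from the peak of the cross-correlation; positive when left leads."""
    xc = np.correlate(left, right, mode="full")
    lag = np.argmax(xc) - (len(right) - 1)
    return -lag / fs

t = np.arange(0, 0.05, 1 / FS)
tone = np.sin(2 * np.pi * 500 * t)
shift = int(round(itd_from_azimuth(30.0) * FS))           # ~14 samples at 48 kHz
left = tone
right = np.concatenate([np.zeros(shift), tone[:-shift]])  # right channel lags

itd_est = estimate_itd(left, right)
azimuth = np.degrees(np.arcsin(np.clip(itd_est * C / D, -1, 1)))
print(round(azimuth, 1))   # close to 30
```

Note the pure-tone ambiguity: a 500 Hz tone has a 2 ms period, so cross-correlation peaks repeat every period; the finite-frame envelope picks the true one here, while MTPC resolves this by combining multiple tones.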

Proceedings ArticleDOI
06 Jun 2021
TL;DR: SSLIDE applies deep learning to localize sound sources with random positions in a continuous space; the spatial features of the sound signals received by each microphone are extracted and represented as likelihood surfaces for the sound source location at each point.
Abstract: This paper presents SSLIDE, Sound Source Localization for Indoors using DEep learning, which applies deep neural networks (DNNs) with an encoder-decoder structure to localize sound sources with random positions in a continuous space. The spatial features of the sound signals received by each microphone are extracted and represented as likelihood surfaces for the sound source location at each point. Our DNN consists of an encoder network followed by two decoders. The encoder obtains a compressed representation of the input likelihoods. One decoder resolves the multipath caused by reverberation, and the other estimates the source location. Experiments on both simulated and experimental data show that our method can not only outperform multiple signal classification (MUSIC), steered response power with phase transform (SRP-PHAT), sparse Bayesian learning (SBL), and a competing convolutional neural network (CNN) approach in reverberant environments, but also achieve good generalization performance.

Journal ArticleDOI
TL;DR: In this article, the acoustic source localization in 3D structures is a challenging task especially if the structure is heterogeneous, and a large number of unknown parameters in a 3D heterogeneous structure require a...
Abstract: The acoustic source localization in 3D structures is a challenging task especially if the structure is heterogeneous. A large number of unknown parameters in a 3D heterogeneous structure require a ...

Journal ArticleDOI
TL;DR: In this paper, a probabilistic focalization approach associates detected directions of arrival (DOAs) to modeled DOAs and jointly estimates the time-varying source location.
Abstract: Localizing and tracking an underwater acoustic source is a key task for maritime situational awareness. This paper presents a sequential Bayesian estimation method for passive acoustic source localization in shallow water. The proposed probabilistic focalization approach associates detected directions of arrival (DOAs) to modeled DOAs and jointly estimates the time-varying source location. Embedded ray tracing makes it possible to incorporate environmental parameters that characterize the acoustic waveguide. Due to its statistical model, the proposed method can provide robustness in scenarios with severe environmental uncertainty. We demonstrate performance advantages compared to matched field processing using data collected during the SWellEx-96 experiment.

Posted Content
TL;DR: In this paper, the authors proposed several configurations with more convolutional layers and smaller pooling sizes in between, so that less information is lost across the layers, leading to better feature extraction.
Abstract: In this work, we propose to extend a state-of-the-art multi-source localization system based on a convolutional recurrent neural network and Ambisonics signals. We significantly improve the performance of the baseline network by changing the layout between convolutional and pooling layers. We propose several configurations with more convolutional layers and smaller pooling sizes in between, so that less information is lost across the layers, leading to better feature extraction. In parallel, we test the system's ability to localize up to 3 sources, in which case the improved feature extraction provides the most significant boost in accuracy. We evaluate and compare these improved configurations on synthetic and real-world data. The obtained results show a substantial improvement in multiple sound source localization performance over the baseline network.

Journal ArticleDOI
TL;DR: Although AFISTA-CG consumes more time per iteration, it achieves the fastest convergence rate among these algorithms at low iteration counts, and thus visualizes the accurate locations of sound sources most rapidly in all cases.

Journal ArticleDOI
TL;DR: In this paper, a localization method of multi-aperture acoustic array based on time difference of arrival (TDOA) is proposed, where the positions of the target sound source in two local coordinates are separately estimated by the two sub microphone arrays based on the generalized trust-region sub-problem in the modified polar representation (GTRS-MPR) method.
Abstract: When the background noise is large or the sound source is far away from the microphone array, the localization accuracy for target sound sources decreases dramatically. To solve this problem, a localization method for a multi-aperture acoustic array based on time difference of arrival (TDOA) is proposed. The multi-aperture acoustic array is constructed from two small sub microphone arrays, and is expected to surpass the performance of the equivalent large array composed of all microphones. The positions of the target sound source in the two local coordinate frames are separately estimated by the two sub microphone arrays using the generalized trust-region sub-problem in the modified polar representation (GTRS-MPR) method. Through a spatial coordinate transformation, the two estimated positions of the target sound source in the local coordinate frames are transferred into one global coordinate frame. To optimize the final estimated position of the target sound source, two spatial lines are formed between the reference microphones of the two sub-arrays and the two estimated source positions in the global frame, and a least-squares search for the shortest-distance point is proposed to fuse the localization results of the two sub-arrays. A simulation and an experimental platform are set up to verify the proposed method. The results show that the proposed method has higher localization accuracy than existing methods. In particular, the accuracy of range estimation is greatly improved, making the method well suited for far-field sound source localization.
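The fusion step, finding the shortest-distance point between the two source-direction lines by least squares, can be sketched compactly. This is a generic closest-point-of-two-lines computation under an assumed noiseless geometry, not the paper's exact search algorithm:

```python
import numpy as np

def fuse_bearings(p1, u1, p2, u2):
    """Least-squares closest point between two 3D lines p_i + t_i * u_i:
    solve for (t1, t2) minimizing the gap, return the midpoint of the
    common perpendicular."""
    u1 = u1 / np.linalg.norm(u1)
    u2 = u2 / np.linalg.norm(u2)
    A = np.column_stack([u1, -u2])
    t1, t2 = np.linalg.lstsq(A, p2 - p1, rcond=None)[0]
    return (p1 + t1 * u1 + p2 + t2 * u2) / 2

# Reference microphones of the two sub-arrays and their estimated source
# directions (noiseless here; with noise the lines are skew and the
# midpoint of the common perpendicular fuses the two estimates).
p1, p2 = np.array([0., 0., 0.]), np.array([2., 0., 0.])
src = np.array([5., 4., 2.])
fused = fuse_bearings(p1, src - p1, p2, src - p2)
print(fused)   # -> [5. 4. 2.]
```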

Journal ArticleDOI
TL;DR: In this article, a gradient acoustic metamaterial (GAM) with tunability for acoustic sensing is proposed; it consists of an array of plates coupled with springs, which enables the working frequency band to be broadened and adjusted by compressing the springs to change the air-gap distance between every two plates.

Journal ArticleDOI
TL;DR: It is demonstrated that a system using a small number of hydrophones is capable of producing robust accuracy over a large frequency band in the presence of noise interference.
Abstract: This paper mainly studies the performance of an acoustic beamforming technique applied to low-cost hydrophones in a linear array of two to four elements for the detection and localization of underwater acoustic waves. It also evaluates the integration of the array into an energy-efficient real-time monitoring system architecture, allowing marine sensing to be conducted without human intervention. The architecture consists of vertical linear arrays of two or four RHSA-10 hydrophone models attached to a buoy or a vessel for sound detection; a frequency-domain beamformer (FDB) implemented in a Xilinx Spartan-7 field programmable gate array (FPGA) for sound source localization; and a LoRa wireless sensor network mote to provide convenient access from a base center. The architecture aims to ease sea traffic control for countries that lack the financial resources to properly address illegal fishing or piracy, mostly committed from small, fast motorized boats. In our experiment, the sound waves emitted by a small motorized boat were successfully detected and tracked over three data acquisitions at a 1 km range. It is demonstrated that a system using a small number of hydrophones is capable of producing robust accuracy over a large frequency band in the presence of noise interference.

Proceedings ArticleDOI
23 Jun 2021
TL;DR: Zhu et al. proposed a simple yet efficient model for visual sound source separation using only a single video frame, which exploits the information of the sound source category in the separation process.
Abstract: Visual sound source separation aims at identifying sound components from a given sound mixture with the presence of visual cues. Prior works have demonstrated impressive results, but at the expense of large multi-stage architectures and complex data representations (e.g. optical flow trajectories). In contrast, we study simple yet efficient models for visual sound separation using only a single video frame. Furthermore, our models are able to exploit the information of the sound source category in the separation process. To this end, we propose two models where we assume that i) the category labels are available at training time, or ii) we know whether the training sample pairs are from the same or different categories. The experiments with the MUSIC dataset show that our model obtains comparable or better performance compared to several recent baseline methods. The code is available at https://github.com/ly-zhu/Leveraging-Category-Information-for-Single-Frame-Visual-Sound-Source-Separation.

Journal ArticleDOI
06 Aug 2021
TL;DR: This study focuses on acoustic signal detection with a drone detecting targets where sounds occur, unlike image-based detection, and implements a system in which a drone detects acoustic sources above the ground by applying a phase difference microphone array technique.
Abstract: Currently, the detection of targets using drone-mounted imaging equipment is a very useful technique and is utilized in many areas. In this study, we focus on acoustic signal detection, with a drone detecting targets where sounds occur, unlike image-based detection. We implement a system in which a drone detects acoustic sources above the ground by applying a phase-difference microphone array technique. The localization of acoustic sources is based on beamforming methods. The background and self-induced noise generated when a drone flies reduces the signal-to-noise ratio for detecting acoustic signals of interest, making it difficult to analyze signal characteristics. Furthermore, the strongly correlated noise generated by rotating propellers degrades the direction-of-arrival estimation performance of the beamforming method. Spectral reduction methods have been effective in reducing noise by adjusting to specific frequencies in the acoustically harsh situations where drones are always exposed to their own noise. Since the direction of arrival of acoustic sources estimated by the beamforming method is expressed in the drone's body-frame coordinate system, we implement a method to estimate acoustic sources above the ground by fusing flight information output from the drone's flight navigation system. The proposed method is experimentally validated with a drone equipped with a 32-channel time-synchronized MEMS microphone array. The verification of the sound source location detection method was limited to explosion sounds generated by fireworks. We confirm that the acoustic source location can be detected with an error of approximately 10 degrees in azimuth and elevation at a ground distance of about 150 m between the drone and the explosion location.

Journal ArticleDOI
TL;DR: In this article, a Bayesian network model is proposed to jointly assign the measurements to different sources and estimate the acoustic source locations, which is able to mitigate missed-detection issues in adverse environments.
Abstract: The problem of multiple acoustic source localization using observations from a microphone array network is investigated in this article. Multiple source signals are assumed to be window-disjoint-orthogonal (WDO) on the time-frequency (TF) domain, and time difference of arrival (TDOA) measurements are extracted at each TF bin. A Bayesian network model is then proposed to jointly assign the measurements to different sources and estimate the acoustic source locations. Considering that the WDO assumption is usually violated in reverberant and noisy environments, we construct a relational network by coding the distance information between the distributed microphone arrays such that adjacent arrays have higher probabilities of observing the same acoustic source, which mitigates missed-detection issues in adverse environments. A Laplace approximate variational inference method is introduced to estimate the hidden variables in the proposed Bayesian network model. Both simulations and real-data experiments are performed. The results show that our proposed method achieves better source localization accuracy than existing methods.

Journal ArticleDOI
TL;DR: Simulations and experiments in a real environment verified the high localization accuracy and small computational load of ODB-SRP-PHAT, while its anti-noise and anti-reverberation advantages remained.
Abstract: Sound source localization has seen increasing use recently. Among existing techniques, the steered response power-phase transform (SRP-PHAT) exhibits considerable advantages in noise and reverberation robustness. When applied in real-time situations, however, the heavy computational load makes it impossible to localize the sound source in a reasonable time, since SRP-PHAT employs a grid search scheme. To solve this problem, an improved procedure called ODB-SRP-PHAT, i.e., steered response power and phase transformation with an offline database (ODB), was proposed by the authors. The basic idea of ODB-SRP-PHAT is to determine the possible sound source positions using SRP-PHAT and density peak clustering before real-time localization, and to store the identified positions in an ODB. At the online positioning stage, only the power values of the positions in the ODB are calculated. When used in real-time monitoring, e.g., locating the speaker in a video conference, the computational load of ODB-SRP-PHAT is significantly smaller than that of SRP-PHAT. Simulations and experiments in a real environment verified the high localization accuracy and small computational load of ODB-SRP-PHAT, while the anti-noise and anti-reverberation advantages remained. The suggested procedure displayed good applicability in a real environment.
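The grid-search stage that ODB-SRP-PHAT accelerates can be sketched as plain SRP-PHAT: GCC-PHAT between microphone pairs, then for each candidate grid point a sum of the correlations at the expected pairwise lags. The room geometry, grid, and simulation by exact fractional delays below are illustrative assumptions; the ODB clustering stage is not reproduced:

```python
import numpy as np

FS, C = 16_000, 343.0   # sample rate (Hz), speed of sound (m/s)

def gcc_phat(x, y):
    """GCC-PHAT between two equal-length frames (circular correlation)."""
    n = len(x)
    G = np.fft.rfft(x) * np.conj(np.fft.rfft(y))
    G /= np.abs(G) + 1e-12                             # PHAT weighting
    cc = np.fft.irfft(G, n)
    half = n // 2
    return np.concatenate([cc[-half:], cc[:half]])     # index l -> lag l - half

def srp_phat(frames, mics, grid):
    """Steered response power with PHAT weighting over candidate grid points."""
    n = frames.shape[1]
    half = n // 2
    score = np.zeros(len(grid))
    for i in range(len(mics)):
        for j in range(i + 1, len(mics)):
            cc = gcc_phat(frames[i], frames[j])
            tdoa = (np.linalg.norm(grid - mics[i], axis=1)
                    - np.linalg.norm(grid - mics[j], axis=1)) / C
            lags = np.clip(np.round(tdoa * FS).astype(int) + half, 0, n - 1)
            score += cc[lags]
    return score

# Simulate 4 mics receiving a white-noise source via exact fractional delays
rng = np.random.default_rng(2)
mics = np.array([[0., 0.], [3., 0.], [0., 3.], [3., 3.]])
src = np.array([1.2, 2.1])
n = 4096
S = np.fft.rfft(rng.standard_normal(n))
w = 2 * np.pi * np.fft.rfftfreq(n, 1 / FS)
frames = np.stack([np.fft.irfft(S * np.exp(-1j * w * np.linalg.norm(src - m) / C), n)
                   for m in mics])

gx, gy = np.meshgrid(np.linspace(0, 3, 31), np.linspace(0, 3, 31))
grid = np.column_stack([gx.ravel(), gy.ravel()])
best = grid[np.argmax(srp_phat(frames, mics, grid))]
print(best)   # near [1.2, 2.1]
```

The ODB idea replaces the dense `grid` above with the small set of pre-clustered candidate positions, so only those steered-power values are computed online.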

Journal ArticleDOI
TL;DR: The proposed scheme is of great potential for applications in the underwater environment, such as trajectory tracking, oil/gas pipeline security monitoring and coastal defense, and the three-dimensional localization feasibility of the proposed system is experimentally verified.
Abstract: This paper demonstrates an underwater localization system based on an improved phase-sensitive optical time domain reflectometry (φ-OTDR). To localize the underwater acoustic source, 3D-printed materials with relatively high Poisson's ratio and low elastic modulus are wrapped by single-mode optical fibers to serve as an L-shaped planar sensing array, yielding a high-fidelity retrieval of acoustic wave signals. Based on the time difference of arrival (TDOA) algorithm, the time delay of signals detected by multiple sensing elements is used to locate the underwater acoustic source. Consequently, the three-dimensional localization feasibility of the proposed system is experimentally verified, showing a measurement error of about 2% in the localization range. It indicates that the proposed scheme is of great potential for applications in the underwater environment, such as trajectory tracking, oil/gas pipeline security monitoring and coastal defense.

Journal ArticleDOI
Gaomi Wu, Linsen Xiong, Zhifei Dong, Xin Liu, Chen Cai, Zhi-mei Qi
TL;DR: In this paper, a metal diaphragm-based omnidirectional fiber-optic acoustic sensor with high sensitivity was developed to detect and track a small drone flying in the field.
Abstract: A metal diaphragm-based omnidirectional fiber-optic acoustic sensor with high sensitivity has been developed in this work. The acousto-optic transducer of the sensor is a single-wavelength extrinsic Fabry–Perot interferometer (EFPI) that is highly sensitive to the displacement of the diaphragm’s center. The sensor can stably work in the linear response region of the EFPI over a wide temperature range from −20 to 60 °C. The pressure sensitivity of the sensor is larger than 800 mV/Pa, and the sensitivity fluctuation in the frequency range from 100 Hz to 6 kHz is smaller than 3 dB. The noise-limited minimum detectable pressure of the sensor at 1 kHz is 126 μPa/√Hz. In addition, the prepared fiber-optic acoustic sensors exhibit excellent phase consistency with one another, which facilitates forming a sensor array for sound source localization. In this work, a cross-shaped fiber-optic sensor array was prepared and then used to detect and track a small drone flying in the field. The experimental results show that the sensor array can capture the acoustic fingerprint of the drone at a distance as far as 300 m. This detection distance is more than ten times longer than that of a conventional electret condenser microphone. The azimuth angle of the drone obtained with the fiber-optic acoustic sensor array has a deviation smaller than 10° relative to the GPS data from the drone.

Journal ArticleDOI
02 Jun 2021-Energies
TL;DR: A novel sound localization routine has been formulated which uses both the direction of arrival (DOA) of the sound signal along with the location estimation in three-dimensional space to precisely locate a sound source.
Abstract: Sound localization is a field of signal processing that deals with identifying the origin of a detected sound signal, i.e., determining the direction and distance of the source. Useful applications of this phenomenon exist in speech enhancement, communication, radar, and the medical field. The experimental arrangement requires microphone arrays to record the sound signal; some methods use ad-hoc microphone arrays because of their demonstrated advantages over other array types. In this research project, existing sound localization methods were explored to analyze the advantages and disadvantages of each. A novel sound localization routine has been formulated which uses both the direction of arrival (DOA) of the sound signal and location estimation in three-dimensional space to precisely locate a sound source. The experimental arrangement consists of four microphones and a single sound source. Previously, sound sources have been localized using six or more microphones, and localization precision has been shown to increase with the number of microphones. In this research, however, we minimized the number of microphones to reduce both the complexity of the algorithm and the computation time. The method is novel in using fewer resources while providing results on par with more complex methods that require more microphones and additional tools to locate the sound source. The average accuracy of the system is 96.77% with an error factor of 3.8%.
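For the far-field part of such a four-microphone scheme, the DOA follows linearly from the inter-microphone delays: a plane wave travelling along unit vector k reaches microphone i at t_0 + (p_i − p_0)·k / c, so three independent baselines (four non-coplanar microphones) determine k. A hypothetical sketch of this relation, not the paper's algorithm:

```python
import numpy as np

def doa_from_tdoa(mic_pos, tdoas, c=343.0):
    """Far-field direction of arrival from inter-microphone delays.

    mic_pos -- (M, 3) microphone coordinates; mic 0 is the reference
    tdoas   -- (M-1,) delays t_i - t_0 for mics 1..M-1
    Returns a unit vector pointing FROM the array TOWARD the source.
    """
    B = mic_pos[1:] - mic_pos[0]        # baseline vectors, shape (M-1, 3)
    # plane-wave model: c * tdoa_i = (p_i - p_0) . k, solved for k
    k, *_ = np.linalg.lstsq(B, c * np.asarray(tdoas), rcond=None)
    k /= np.linalg.norm(k)
    return -k                           # source lies opposite the propagation
```

With exactly four microphones the system is square; extra microphones would make it overdetermined, which is one reason precision grows with array size as the abstract notes.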

Journal ArticleDOI
TL;DR: The proposed SSL method, which employs a deep neural network and computer-aided engineering and is applicable to a structure’s interior, can estimate the position of a sound source inside the structure from the spectrum measured by an accelerometer on the structure’s surface.