
Showing papers on "Microphone published in 2017"


Journal ArticleDOI
TL;DR: It is found that training on different noise environments and different microphones barely affects the ASR performance, especially when several environments are present in the training data: only the number of microphones has a significant impact.

345 citations


Proceedings ArticleDOI
16 Jun 2017
TL;DR: This paper presents BackDoor, a system that develops the technical building blocks for harnessing non-linearities in microphone hardware, achieving upwards of 4 kbps for proximate data communication as well as room-level privacy protection against electronic eavesdropping.
Abstract: Consider sounds, say at 40kHz, that are completely outside the human's audible range (20kHz), as well as a microphone's recordable range (24kHz). We show that these high frequency sounds can be designed to become recordable by unmodified microphones, while remaining inaudible to humans. The core idea lies in exploiting non-linearities in microphone hardware. Briefly, we design the sound and play it on a speaker such that, after passing through the microphone's non-linear diaphragm and power-amplifier, the signal creates a "shadow" in the audible frequency range. The shadow can be regulated to carry data bits, thereby enabling an acoustic (but inaudible) communication channel to today's microphones. Other applications include jamming spy microphones in the environment, live watermarking of music in a concert, and even acoustic denial-of-service (DoS) attacks. This paper presents BackDoor, a system that develops the technical building blocks for harnessing this opportunity. Reported results achieve upwards of 4kbps for proximate data communication, as well as room-level privacy protection against electronic eavesdropping.
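As a rough numerical illustration of the non-linearity principle the abstract describes (using a toy quadratic microphone model and arbitrarily chosen tone frequencies, not the paper's hardware characterization), a short Python sketch shows how two inaudible tones produce an audible difference-frequency "shadow":

```python
import numpy as np

fs = 192_000                      # simulation rate, high enough to represent the tones
t = np.arange(0, 0.05, 1 / fs)

# Two ultrasonic tones, both above the 20 kHz hearing and 24 kHz recording limits.
f1, f2 = 40_000, 45_000
x = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)

# Toy microphone front end: linear term plus a weak second-order non-linearity.
y = x + 0.1 * x**2

# The quadratic term creates intermodulation products, including |f2 - f1| = 5 kHz,
# which lands inside the audible/recordable band -- the "shadow" in the abstract.
spectrum = np.abs(np.fft.rfft(y))
freqs = np.fft.rfftfreq(len(y), 1 / fs)
band = (freqs > 1_000) & (freqs < 20_000)
peak = freqs[band][np.argmax(spectrum[band])]
print(f"strongest in-band component: {peak:.0f} Hz")   # ~5000 Hz
```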

181 citations


Journal ArticleDOI
TL;DR: The ferroelectret nanogenerators' intrinsic properties that allow for the bidirectional conversion of energy between electrical and mechanical domains are reported, extending their potential use in wearable electronics beyond the power-generation realm.
Abstract: Ferroelectret nanogenerators were recently introduced as a promising alternative technology for harvesting kinetic energy. Here we report the device’s intrinsic properties that allow for the bidirectional conversion of energy between electrical and mechanical domains, thus extending its potential use in wearable electronics beyond the power generation realm. This electromechanical coupling, combined with their flexibility and thin film-like form, bestows dual-functional transducing capabilities on the device that are used in this work to demonstrate its use as a thin, wearable and self-powered loudspeaker or microphone patch. To determine the device’s performance and applicability, sound pressure level is characterized in both space and frequency domains for three different configurations. The confirmed high performance of the device is further validated through its integration in three different systems: a music-playing flag, a sound-recording film and a flexible microphone for security applications. Self-powered nanogenerators that harvest energy from the environment are desirable for future portable and wearable electronics. Li et al. show the use of ferroelectret nanogenerators to build microphones or loudspeakers, which convert electrical signals to mechanical motions in a reversible manner.

165 citations


Patent
21 Feb 2017
TL;DR: In this paper, various aspects of systems and methods for voice control and related features and functionality for various embodiments of media playback devices, networked microphone devices, microphone-equipped media playback devices, and speaker-equipped networked microphone devices are disclosed and described.
Abstract: Multiple aspects of systems and methods for voice control and related features and functionality for various embodiments of media playback devices, networked microphone devices, microphone-equipped media playback devices, and speaker-equipped networked microphone devices are disclosed and described herein, including but not limited to designating and managing default networked devices, audio response playback, room-corrected voice detection, content mixing, music service selection, metadata exchange between networked playback systems and networked microphone systems, handling loss of pairing between networked devices, actions based on user identification, and other voice control of networked devices.

136 citations


Journal ArticleDOI
TL;DR: A first proof of concept for EEG-informed attended speaker extraction and denoising is provided, showing that AAD-based speaker extraction from microphone array recordings is feasible and robust, even in noisy acoustic environments, and without access to the clean speech signals to perform EEG-based AAD.
Abstract: Objective: We aim to extract and denoise the attended speaker in a noisy two-speaker acoustic scenario, relying on microphone array recordings from a binaural hearing aid, which are complemented with electroencephalography (EEG) recordings to infer the speaker of interest. Methods: In this study, we propose a modular processing flow that first extracts the two speech envelopes from the microphone recordings, then selects the attended speech envelope based on the EEG, and finally uses this envelope to inform a multichannel speech separation and denoising algorithm. Results: Strong suppression of interfering (unattended) speech and background noise is achieved, while the attended speech is preserved. Furthermore, EEG-based auditory attention detection (AAD) is shown to be robust to the use of noisy speech signals. Conclusions: Our results show that AAD-based speaker extraction from microphone array recordings is feasible and robust, even in noisy acoustic environments, and without access to the clean speech signals to perform EEG-based AAD. Significance: Current research on AAD always assumes the availability of the clean speech signals, which limits the applicability in real settings. We have extended this research to detect the attended speaker even when only microphone recordings with noisy speech mixtures are available. This is an enabling ingredient for new brain–computer interfaces and effective filtering schemes in neuro-steered hearing prostheses. Here, we provide a first proof of concept for EEG-informed attended speaker extraction and denoising.
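For intuition about the EEG-based selection step, here is a toy sketch of envelope-based auditory attention detection using a least-squares stimulus-reconstruction decoder, a standard AAD ingredient. The data sizes and coupling model are fabricated, and the paper's full pipeline additionally extracts the candidate envelopes from noisy microphone recordings rather than assuming them given:

```python
import numpy as np

rng = np.random.default_rng(4)
T, C = 4_000, 32                      # time samples, EEG channels (assumed sizes)

# Dummy candidate speech envelopes; dummy EEG that weakly encodes speaker 0.
env = np.abs(rng.standard_normal((2, T)))
mix = rng.standard_normal((C, 1))     # per-channel coupling of the attended envelope
eeg = mix * env[0] + 2.0 * rng.standard_normal((C, T))

# Train a linear stimulus-reconstruction decoder on the first half by least
# squares (a common AAD choice), then decode the held-out second half.
half = T // 2
decoder, *_ = np.linalg.lstsq(eeg[:, :half].T, env[0, :half], rcond=None)
recon = decoder @ eeg[:, half:]

# Attention decision: the candidate envelope that correlates best with the
# EEG reconstruction is selected as the attended speaker.
corrs = [np.corrcoef(recon, e[half:])[0, 1] for e in env]
print("attended speaker:", int(np.argmax(corrs)))   # expected: 0
```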

119 citations


Journal ArticleDOI
TL;DR: Common approaches for source localization in WASNs that are focused on different types of acoustic features, namely, the energy of the incoming signals, their time of arrival or time difference of arrival, the direction of arrival (DOA), and the steered response power (SRP) resulting from combining multiple microphone signals are reviewed.
Abstract: Wireless acoustic sensor networks (WASNs) are formed by a distributed group of acoustic-sensing devices featuring audio playing and recording capabilities. Current mobile computing platforms offer great possibilities for the design of audio-related applications involving acoustic-sensing nodes. In this context, acoustic source localization is one of the application domains that have attracted the most attention of the research community along the last decades. In general terms, the localization of acoustic sources can be achieved by studying energy and temporal and/or directional features from the incoming sound at different microphones and using a suitable model that relates those features with the spatial location of the source (or sources) of interest. This paper reviews common approaches for source localization in WASNs that are focused on different types of acoustic features, namely, the energy of the incoming signals, their time of arrival (TOA) or time difference of arrival (TDOA), the direction of arrival (DOA), and the steered response power (SRP) resulting from combining multiple microphone signals. Additionally, we discuss methods not only aimed at localizing acoustic sources but also designed to locate the nodes themselves in the network. Finally, we discuss current challenges and frontiers in this field.
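Of the feature types surveyed, TDOA is among the most widely used in practice. As a generic illustration (a standard estimator, not anything specific to this review), the GCC-PHAT method recovers the inter-microphone delay from which source position can then be inferred:

```python
import numpy as np

def gcc_phat(x, y, fs):
    # Generalized cross-correlation with phase transform (GCC-PHAT):
    # whiten the cross-spectrum, then locate the peak lag.
    n = len(x) + len(y)
    X, Y = np.fft.rfft(x, n), np.fft.rfft(y, n)
    R = Y * np.conj(X)                                   # peak at delay of y w.r.t. x
    cc = np.fft.irfft(R / (np.abs(R) + 1e-12), n)
    cc = np.concatenate([cc[-(n // 2):], cc[: n // 2]])  # centre zero lag
    return (np.argmax(cc) - n // 2) / fs                 # TDOA in seconds

fs = 16_000
rng = np.random.default_rng(5)
s = rng.standard_normal(fs)                              # 1 s of broadband source
delay = 25                                               # samples (~1.56 ms)
x, y = s, np.concatenate([np.zeros(delay), s[:-delay]])  # delayed copy at mic 2
print(f"estimated TDOA: {gcc_phat(x, y, fs) * 1e3:.2f} ms")   # ~1.56 ms
```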

117 citations


Proceedings ArticleDOI
02 May 2017
TL;DR: In this paper, a convolutional neural network (CNN) based classification method for broadband DOA estimation is proposed, where the phase component of the short-time Fourier transform coefficients of the received microphone signals is directly fed into the CNN and the features required for DOA estimation are learned during training.
Abstract: A convolutional neural network (CNN) based classification method for broadband DOA estimation is proposed, where the phase component of the short-time Fourier transform coefficients of the received microphone signals is directly fed into the CNN and the features required for DOA estimation are learned during training. Since only the phase component of the input is used, the CNN can be trained with synthesized noise signals, thereby making the preparation of the training data set easier compared to using speech signals. Through experimental evaluation, the ability of the proposed noise-trained CNN framework to generalize to speech sources is demonstrated. In addition, the robustness of the system to noise, small perturbations in microphone positions, as well as its ability to adapt to different acoustic conditions is investigated using experiments with simulated and real data.
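A minimal PyTorch sketch of the kind of architecture the abstract describes: the per-frame STFT phase map (microphones × frequency bins) is fed to a small CNN that classifies among discrete DOAs. The layer shapes, class count, and array size below are assumptions for illustration, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

# Toy CNN for broadband DOA classification from STFT phase: input is the phase
# of M microphones x K frequency bins for one time frame, output is a posterior
# over I discrete DOA classes. All sizes are assumptions.
M, K, I = 4, 256, 37          # 4 mics, 256 freq bins, DOA classes at 5-degree steps

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=(2, 3), padding=(0, 1)),  # learn inter-mic phase relations
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=(2, 3), padding=(0, 1)),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * (M - 2) * K, 128),
    nn.ReLU(),
    nn.Linear(128, I),         # logits over DOA classes
)

# Phase of the STFT coefficients of one frame (values in [-pi, pi]); synthesized
# noise suffices for training since only phase is used, as the abstract notes.
phase = (torch.rand(8, 1, M, K) * 2 - 1) * torch.pi   # batch of 8 dummy frames
logits = model(phase)
print(logits.shape)            # torch.Size([8, 37])
```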

104 citations


Journal ArticleDOI
TL;DR: An approach is developed for the design of any desired symmetric directivity pattern, where the resulting beampattern is almost frequency invariant and its main beam can be pointed in any desired direction in the sensor plane.
Abstract: This paper deals with two critical issues about uniform circular arrays (UCAs): frequency-invariant response and steering flexibility. It focuses on some optimal design of frequency-invariant beampatterns in any desired direction along the sensor plane. The major contributions are as follows. 1) We explain how to include the steering information in the desired directivity pattern. 2) We show that the optimal approximation of the beamformer's beampattern with a UCA from a least-squares error perspective is the Jacobi–Anger expansion. 3) We develop an approach to the design of any desired symmetric directivity pattern, where the deduced beampattern is almost frequency invariant and its main beam can be pointed to any wanted direction in the sensor plane. 4) With the proposed approach, we derive an explicit form of the white noise gain (WNG) and the directivity factor (DF), and explain clearly the white noise amplification problem at low frequencies and the DF degradation at high frequencies. The analysis also indicates that increasing the number of microphones can always improve the WNG. We show that the proposed method is a generalization of circular differential microphone arrays. The relationship between the proposed method and the so-called circular harmonics beamformers is also discussed.

94 citations


Proceedings ArticleDOI
05 Mar 2017
TL;DR: The core of the algorithm estimates a time-frequency mask which represents the target speech and uses masking-based beamforming to enhance the corrupted speech; a masking-based post-filter is proposed to further suppress the noise in the beamformer output.
Abstract: We propose a speech enhancement algorithm based on single- and multi-microphone processing techniques. The core of the algorithm estimates a time-frequency mask which represents the target speech and use masking-based beamforming to enhance corrupted speech. Specifically, in single-microphone processing, the received signals of a microphone array are treated as individual signals and we estimate a mask for the signal of each microphone using a deep neural network (DNN). With these masks, in multi-microphone processing, we calculate a spatial covariance matrix of noise and steering vector for beamforming. In addition, we propose a masking-based post-filter to further suppress the noise in the output of beamforming. Then, the enhanced speech is sent back to DNN for mask re-estimation. When these steps are iterated for a few times, we obtain the final enhanced speech. The proposed algorithm is evaluated as a frontend for automatic speech recognition (ASR) and achieves a 5.05% average word error rate (WER) on the real environment test set of CHiME-3, outperforming the current best algorithm by 13.34%.
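The mask-to-beamformer step can be sketched as follows: the estimated T-F mask weights frames when estimating speech and noise spatial covariances, from which an MVDR beamformer is formed. The covariance estimator and eigenvector-based steering vector below are common choices assumed for illustration (the DNN mask estimation and iteration loop are omitted), not necessarily the paper's exact design:

```python
import numpy as np

def mask_based_mvdr(Y, mask):
    """Y: (M, T) STFT coefficients of M mics at one frequency bin;
    mask: (T,) speech-presence mask in [0, 1] for the same bin."""
    # Mask-weighted spatial covariance estimates for speech and noise.
    w_s, w_n = mask, 1.0 - mask
    Phi_s = (w_s * Y) @ Y.conj().T / np.maximum(w_s.sum(), 1e-8)
    Phi_n = (w_n * Y) @ Y.conj().T / np.maximum(w_n.sum(), 1e-8)
    # Steering vector: principal eigenvector of the speech covariance (a common choice).
    _, eigvecs = np.linalg.eigh(Phi_s)
    d = eigvecs[:, -1]
    # MVDR weights: w = Phi_n^{-1} d / (d^H Phi_n^{-1} d), with diagonal loading.
    Phi_n_inv_d = np.linalg.solve(Phi_n + 1e-6 * np.eye(len(d)), d)
    w = Phi_n_inv_d / (d.conj() @ Phi_n_inv_d)
    return w.conj() @ Y          # enhanced single-channel output, shape (T,)

# Dummy data: 6 mics, 100 frames at one frequency bin, random mask.
Y = np.random.randn(6, 100) + 1j * np.random.randn(6, 100)
mask = np.random.rand(100)
print(mask_based_mvdr(Y, mask).shape)   # (100,)
```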

71 citations


Journal ArticleDOI
TL;DR: In this article, a fiber-optic microphone (FOM) based on a graphene oxide (GO) membrane is presented. The results indicate the excellent suitability of this FOM for acoustic detection in the audible range with high sensitivity and high fidelity.
Abstract: We present a fiber-optic microphone (FOM) based on a graphene oxide (GO) membrane in this study. A Fabry–Perot cavity consisting of a single-mode fiber and a piece of GO membrane works as the acoustic sensing structure. Using the GO as the core acoustic sensing component, the fabricating process of the FOM is demonstrated to be simple and efficient. Acoustic tests show that this FOM achieves an average minimum detectable pressure of 10.2 μPa/√Hz, while maintaining a linear acoustic pressure response and a flat frequency response in the range of 100 Hz to 20 kHz. These results indicate the excellent suitability of this FOM for acoustic detection in the audible range with high sensitivity and high fidelity.

Journal ArticleDOI
01 Apr 2017
TL;DR: A new acoustic cryptanalysis key extraction attack, applicable to GnuPG’s implementation of RSA, can extract full 4096-bit RSA decryption keys from laptop computers within an hour, using the sound generated by the computer during the decryption of some chosen ciphertexts.
Abstract: Many computers emit a high-pitched noise during operation, due to vibration in some of their electronic components. These acoustic emanations are more than a nuisance: They can convey information about the software running on the computer and, in particular, leak sensitive information about security-related computations. In a preliminary presentation (Eurocrypt'04 rump session), we have shown that different RSA keys induce different sound patterns, but it was not clear how to extract individual key bits. The main problem was the very low bandwidth of the acoustic side channel (under 20 kHz using common microphones, and a few hundred kHz using ultrasound microphones), and several orders of magnitude below the GHz-scale clock rates of the attacked computers. In this paper, we describe a new acoustic cryptanalysis key extraction attack, applicable to GnuPG's implementation of RSA. The attack can extract full 4096-bit RSA decryption keys from laptop computers (of various models), within an hour, using the sound generated by the computer during the decryption of some chosen ciphertexts. We experimentally demonstrate such attacks, using a plain mobile phone placed next to the computer, or a more sensitive microphone placed 10 meters away.

Patent
08 Feb 2017
TL;DR: In this paper, a system, method and one or more wireless earpieces for performing self-configuration is presented, where a user is identified utilizing the one/more wireless earpiece.
Abstract: A system, method and one or more wireless earpieces for performing self-configuration. A user is identified utilizing the one or more wireless earpieces. Noises from an environment of the user are received utilizing the one or more wireless earpieces. An audio profile associated with the noises of the environment of the user is determined. The components of the one or more wireless earpieces are automatically configured in response to the audio profile and the user identified as utilizing the one or more wireless earpieces.

Journal ArticleDOI
TL;DR: Object-oriented design based on Python allows for easy-to-use scripting and graphical user interfaces, the practical combination with other data handling and scientific computing libraries, and the possibility to extend the software by implementing new processing methods with minimal effort.
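A hypothetical sketch of the object-oriented pattern the summary refers to: small processing objects composed into a pipeline and extended by subclassing. All class names here are invented for illustration and are not the software's actual API:

```python
import numpy as np

# Invented, minimal illustration of composable processing objects.
class Source:
    def __init__(self, data, fs):
        self.data, self.fs = data, fs

class Processor:
    def process(self, source):
        raise NotImplementedError

class Gain(Processor):
    def __init__(self, factor):
        self.factor = factor
    def process(self, source):
        return Source(self.factor * source.data, source.fs)

class RMSLevel(Processor):
    def process(self, source):
        return float(np.sqrt(np.mean(source.data ** 2)))

# A new processing method is added with minimal effort: subclass and override.
sig = Source(np.sin(2 * np.pi * 1000 * np.arange(48_000) / 48_000), fs=48_000)
level = RMSLevel().process(Gain(0.5).process(sig))
print(f"RMS after gain: {level:.3f}")   # ~0.354
```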

Journal ArticleDOI
TL;DR: A hybrid impedance control architecture for an electroacoustic absorber that combines an improved microphone-based feedforward control with a current-driven electrodynamic loudspeaker system is proposed, and an application to damping of resonances in a duct is presented.
Abstract: This paper proposes a hybrid impedance control architecture for an electroacoustic absorber that combines an improved microphone-based feedforward control with a current-driven electrodynamic loudspeaker system. Feedforward control architecture enables stable control to be achieved, and current driving method discards the effect of the voice coil inductance. A method is given for designing the transfer function to be implemented in the controller, according to a target-specific acoustic impedance and mechanical parameters of the transducer. Numerical simulations present the expected acoustic performance, introducing global performance indicators such as the bandwidth of efficient absorption. Experimental assessments in a waveguide confirmed the accuracy of the model and the efficiency of the hybrid control technique for achieving broadband stable low-frequency electroacoustic absorbers. An application to damping of resonances in a duct is also presented, and the application to the modal equalization in actual listening rooms is finally discussed.

Journal ArticleDOI
TL;DR: Simulations show that the best RIR interpolation is obtained when combining the novel time-domain acoustic model with the spatio-temporal sparsity regularization, outperforming the results of the plane wave decomposition model even when far fewer microphone measurements are available.
Abstract: Room Impulse Responses (RIRs) are typically measured using a set of microphones and a loudspeaker. When RIRs spanning a large volume are needed, many microphone measurements must be used to spatially sample the sound field. In order to reduce the number of microphone measurements, RIRs can be spatially interpolated. In the present study, RIR interpolation is formulated as an inverse problem. This inverse problem relies on a particular acoustic model capable of representing the measurements. Two different acoustic models are compared: the plane wave decomposition model and a novel time-domain model, which consists of a collection of equivalent sources creating spherical waves. These acoustic models can both approximate any reverberant sound field created by a far-field sound source. In order to produce an accurate RIR interpolation, sparsity regularization is employed when solving the inverse problem. In particular, by combining different acoustic models with different sparsity promoting regularizations, spatial sparsity, spatio-spectral sparsity, and spatio-temporal sparsity are compared. The inverse problem is solved using a matrix-free large-scale optimization algorithm. Simulations show that the best RIR interpolation is obtained when combining the novel time-domain acoustic model with the spatio-temporal sparsity regularization, outperforming the results of the plane wave decomposition model even when far fewer microphone measurements are available.
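A much-simplified, single-frequency illustration of the general recipe (measurements = dictionary × sparse coefficients, solved with an l1 penalty, then evaluated at unmeasured positions). The real-valued plane-wave dictionary and off-the-shelf Lasso solver below are stand-ins for the paper's time-domain equivalent-source model and matrix-free large-scale algorithm:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
c, f = 343.0, 500.0                      # speed of sound, analysis frequency
k = 2 * np.pi * f / c

# Microphone positions on a line; a dense grid of candidate plane-wave directions.
x_meas = rng.uniform(0, 1, 15)           # 15 measured positions (metres)
angles = np.linspace(0, np.pi, 90)       # candidate arrival angles

def dictionary(x):
    # Real-valued plane-wave dictionary: one cos and one sin column per direction.
    phase = np.outer(x, k * np.cos(angles))
    return np.hstack([np.cos(phase), np.sin(phase)])

# Synthesize a field made of 3 plane waves, then fit a sparse coefficient vector
# to the underdetermined system (15 measurements, 180 unknowns).
true_coef = np.zeros(2 * len(angles))
true_coef[[10, 40, 150]] = [1.0, -0.7, 0.5]
p_meas = dictionary(x_meas) @ true_coef + 0.01 * rng.standard_normal(len(x_meas))

model = Lasso(alpha=1e-2, max_iter=50_000).fit(dictionary(x_meas), p_meas)

# Interpolate the field at unmeasured positions using the sparse solution.
x_new = np.linspace(0, 1, 5)
print(dictionary(x_new) @ model.coef_ + model.intercept_)
```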

Patent
14 Mar 2017
TL;DR: In this paper, a first earpiece having an earpiece housing configured to isolate an ambient environment from a tympanic membrane by physically blocking ambient sound, a microphone disposed within the housing and configured to receive a first ambient audio signal from the ambient environment, a processor operatively connected to the microphone wherein the processor is configured to determine if the first ambient signal exceeds a threshold sound level, and a speaker operating with the processor.
Abstract: A system includes a first earpiece having an earpiece housing configured to isolate an ambient environment from a tympanic membrane by physically blocking ambient sound, a microphone disposed within the housing and configured to receive a first ambient audio signal from the ambient environment, a processor operatively connected to the microphone wherein the processor is configured to receive the first ambient audio signal from the microphone and determine if the first ambient audio signal exceeds a threshold sound level, and a speaker operatively connected to the processor. In a first mode of operation, the processor determines that the first ambient audio signal exceeds the threshold sound level and processes the first ambient audio signal to modify the first ambient audio signal. In a second mode of operation, the processor determines that the first ambient audio signal does not exceed the threshold sound level and reproduces the first ambient audio signal at the speaker.

Patent
21 Mar 2017
TL;DR: In this paper, a first acoustic response of a room in which the audio playback device is located is determined based on the received indication of first audio content, and a mapping is applied to the first acoustic response to determine a second acoustic response.
Abstract: An audio playback device comprises a microphone, a speaker, and a processor. The processor is arranged to output by the speaker first audio content and receive by the microphone an indication of the first audio content. A first acoustic response of a room in which the audio playback device is located is determined based on the received indication of first audio content. A mapping is applied to the first acoustic response to determine a second acoustic response. The second acoustic response is indicative of an approximated acoustic response of the room at a spatial location different from a spatial location of the microphone. The second audio content output by the speaker is adjusted based on the second response.

Journal ArticleDOI
TL;DR: In this article, the authors propose to use the time-frequency processing approach, which formulates a spatial filter that can enhance a target direction based on local direction-of-arrival estimates at individual time-frequency bins.
Abstract: When a micro aerial vehicle (MAV) captures sounds emitted by a ground or aerial source, its motors and propellers are much closer to the microphone(s) than the sound source, thus leading to extremely low signal-to-noise ratios (SNR), e.g., −15 dB. While microphone-array techniques have been investigated intensively, their application to MAV-based ego-noise reduction has been rarely reported in the literature. To fill this gap, we implement and compare three types of microphone-array algorithms to enhance the target sound captured by an MAV. These algorithms include a recently emerged technique, time-frequency spatial filtering, and two well-known techniques, beamforming and blind source separation. In particular, based on the observation that the target sound and the ego-noise usually have concentrated energy at sparsely isolated time-frequency bins, we propose to use the time-frequency processing approach, which formulates a spatial filter that can enhance a target direction based on local direction of arrival estimates at individual time-frequency bins. By exploiting the time-frequency sparsity of the acoustic signal, this spatial filter works robustly for sound enhancement in the presence of strong ego-noise. We analyze in detail the three techniques and conduct a comparative evaluation with real-recorded MAV sounds. Experimental results show the superiority of blind source separation and time-frequency filtering in low-SNR scenarios.
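A toy version of the time-frequency spatial filtering idea for a two-microphone pair under a far-field narrowband model: estimate a local DOA at every T-F bin from the inter-microphone phase difference and retain only the bins near the target direction (under T-F sparsity, those dominated by the target source). The paper's filter is considerably more elaborate; this sketch only conveys the mechanism:

```python
import numpy as np

def tf_spatial_mask(X1, X2, freqs, d, target_deg, width_deg=15.0, c=343.0):
    """X1, X2: (F, T) STFTs of two mics separated by d metres;
    freqs: (F,) bin frequencies. Returns a binary mask over T-F bins."""
    # Inter-mic phase difference -> time difference -> local DOA per bin.
    phase = np.angle(X1 * np.conj(X2))
    tau = phase / (2 * np.pi * freqs[:, None] + 1e-12)   # per-bin delay (s)
    arg = np.clip(tau * c / d, -1.0, 1.0)
    local_doa = np.degrees(np.arccos(arg))               # 0..180 degrees
    # Keep bins whose local DOA estimate is close to the target direction.
    return np.abs(local_doa - target_deg) < width_deg

F, T = 257, 100
freqs = np.linspace(1, 8000, F)
X1 = np.random.randn(F, T) + 1j * np.random.randn(F, T)  # dummy STFTs
X2 = np.random.randn(F, T) + 1j * np.random.randn(F, T)
mask = tf_spatial_mask(X1, X2, freqs, d=0.05, target_deg=90.0)
X_enhanced = X1 * mask        # apply the mask to one reference channel
print(f"fraction of bins kept: {mask.mean():.2f}")
```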

Journal ArticleDOI
TL;DR: In this article, the authors report on the relevance and accuracy of using mobile phones in participatory noise-pollution monitoring studies in an urban context, where 60 participants used the same smartphone model to measure environmental noise at 28 different locations in Paris.

Journal ArticleDOI
TL;DR: In this paper, a non-contacting measurement technique based on acoustic monitoring is proposed to detect cracks or damage within a structure by observing sound radiation using a single microphone or a b...
Abstract: This article proposes a non-contacting measurement technique based on acoustic monitoring to detect cracks or damage within a structure by observing sound radiation using a single microphone or a b...

Patent
21 Feb 2017
TL;DR: In this article, systems and methods are presented for disabling the microphone of a networked microphone device, either in response to a command received via the microphone or a network interface of the device, or upon determining that a reference device is in a specific state.
Abstract: Systems and methods disclosed herein include, while a microphone of a first networked microphone device is enabled, determining whether a first reference device is in a specific state, and in response to determining that the first reference device is in the specific state, disabling the microphone of the first networked microphone device. Some embodiments further include, while the microphone of the first networked microphone device is enabled, receiving a command to disable the microphone of the first networked microphone device via one of the microphone of the networked microphone device or a network interface of the networked microphone device, and in response to receiving the command to disable the microphone of the networked microphone device via one of the microphone of the networked microphone device or the network interface of the networked microphone device, disabling the microphone of the networked microphone device.

Journal ArticleDOI
TL;DR: An approach that combines different rings of microphones with appropriate radii can mitigate both the white noise amplification and deep-null problems, and simulation results justify the superiority of the robust CCDMA approach over traditional CDMAs and robust CDMAs.
Abstract: Circular differential microphone arrays (CDMAs) have been extensively studied in speech and audio applications for their steering flexibility, potential to achieve frequency-invariant directivity patterns, and high directivity factors (DFs). However, CDMAs suffer from both white noise amplification and deep nulls in the DF and in the white noise gain (WNG) due to spatial aliasing, which considerably restricts their use in practical systems. The minimum-norm filter can improve the WNG by using more microphones than required for a given differential array order; but this filter increases the array aperture (radius), which exacerbates the spatial aliasing problem and worsens the nulls problem in the DF. Through theoretical analysis, this research finds that the nulls of the CDMAs are caused by the zeros in the denominators of the filters' coefficients, i.e., the zeros of the Bessel function. To deal with both the white noise amplification and deep-null problems, this paper develops an approach that combines different rings of microphones with appropriate radii.
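The diagnosis can be checked numerically: with k = 2πf/c, the zeros of J_n(kr) pin down the null frequencies for each order n. A short sketch, assuming an illustrative 4 cm array radius:

```python
import numpy as np
from scipy.special import jn_zeros

c, r = 343.0, 0.04                 # speed of sound; assumed 4 cm array radius
# The abstract traces the deep nulls in DF/WNG to the zeros of the Bessel
# functions in the denominators of the CDMA filter coefficients: J_n(k r) = 0
# with k = 2*pi*f/c fixes the null frequencies for each order n.
for n in range(3):
    k_zeros = jn_zeros(n, 2)       # first two zeros of J_n
    f_nulls = k_zeros * c / (2 * np.pi * r)
    print(f"order {n}: nulls near {np.round(f_nulls).astype(int)} Hz")
```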

Journal ArticleDOI
TL;DR: A multiple sound source localization and counting method based on a relaxed sparsity of the speech signal achieves higher accuracy of DOA estimation and source counting than existing techniques, and its higher efficiency and lower complexity make it suitable for real-time applications.
Abstract: In this work, a multiple sound source localization and counting method based on a relaxed sparsity of speech signal is presented. A soundfield microphone is adopted to overcome the redundancy and complexity of microphone array in this paper. After establishing an effective measure, the relaxed sparsity of speech signals is investigated. According to this relaxed sparsity, we can obtain an extensive assumption that “single-source” zones always exist among the soundfield microphone signals, which is validated by statistical analysis. Based on “single-source” zone detecting, the proposed method jointly estimates the number of active sources and their corresponding DOAs by applying a peak searching approach to the normalized histogram of estimated DOA. The cross distortions caused by multiple simultaneously occurring sources are solved by estimating DOA in these “single-source” zones. The evaluations reveal that the proposed method achieves a higher accuracy of DOA estimation and source counting compared with the existing techniques. Furthermore, the proposed method has higher efficiency and lower complexity, which makes it suitable for real-time applications.
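The counting step reduces to peak-searching a normalized DOA histogram. A sketch on fabricated per-zone estimates, with scipy's generic peak finder standing in for the paper's peak-searching approach:

```python
import numpy as np
from scipy.signal import find_peaks

rng = np.random.default_rng(1)
# Dummy per-"single-source-zone" DOA estimates: two true sources at 60 and 150
# degrees, plus outliers from zones where sources overlap.
doas = np.concatenate([
    rng.normal(60, 4, 300), rng.normal(150, 4, 200), rng.uniform(0, 360, 80),
])

# Normalized histogram of the estimates, then peak search to jointly count
# the sources and read off their DOAs, as the abstract describes.
hist, edges = np.histogram(doas, bins=72, range=(0, 360))
hist = hist / hist.max()
peaks, _ = find_peaks(hist, height=0.3, distance=4)
print("estimated source count:", len(peaks))
print("estimated DOAs (deg):", (edges[peaks] + edges[peaks + 1]) / 2)
```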

Patent
14 Jul 2017
TL;DR: In this article, the authors present a method for detecting, via a microphone, calibration sounds as emitted by one or more playback devices of one or multiple zones during a calibration sequence.
Abstract: Example techniques may involve calibration with multiple recording devices. An implementation may include detecting, via a microphone, one or more calibration sounds as emitted by one or more playback devices of one or more zones during a calibration sequence. The implementation may further include determining a first response, the first response representing a response of a given environment to the one or more calibration sounds as detected by the first recording device and receiving data indicating a second response, the second response representing a response of the given environment to the one or more calibration sounds as detected by a second recording device. The implementation may also include determining a calibration for the one or more playback devices based on the first response and the second response and sending, to the one or more zones, an instruction that applies the calibration to playback by the one or more playback devices.


Journal ArticleDOI
TL;DR: This paper proposes a new approach to blind SRO estimation for an asynchronous wireless acoustic sensor network that exploits the phase drift of the coherence between the asynchronous microphone signals; the resulting least-squares coherence drift (LCD) method is effective even for short signal segments.
Abstract: Microphone arrays make it possible to exploit the spatial coherence between simultaneously recorded microphone signals, e.g., to perform speech enhancement, i.e., to extract a speech signal and reduce background noise. However, in systems where the microphones are not sampled in a synchronous fashion, as it is often the case in wireless acoustic sensor networks, a sampling rate offset (SRO) exists between signals recorded in different nodes, which severely affects the speech enhancement performance. To avoid this performance reduction, the SRO should be estimated and compensated for. In this paper, we propose a new approach to blind SRO estimation for an asynchronous wireless acoustic sensor network, which exploits the phase drift of the coherence between the asynchronous microphone signals. We utilize the fact that the SRO causes a linearly increasing time delay between two signals and hence a linearly increasing phase-shift in the short-time Fourier transform domain. The increasing phase shift, observed as a phase drift of the coherence between the signals, is used in a weighted least-squares framework to estimate the SRO. This method is referred to as least-squares coherence drift (LCD). Experimental results in different real-world recording and simulated scenarios show the effectiveness of LCD compared to different benchmark methods. The LCD is effective even for short signal segments. We finally demonstrate that the use of the LCD within a conventional compensation approach eliminates the performance loss due to SRO in a speech enhancement algorithm based on the multichannel Wiener filter.
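A toy of the core observation: an SRO yields a time delay that grows linearly with time, hence a cross-spectrum phase that drifts linearly across STFT frames, and fitting the drift slope recovers the offset. This sketch uses a single frequency bin and an ordinary (unweighted) least-squares fit, i.e. a simplification of the paper's weighted LCD:

```python
import numpy as np
from scipy.signal import stft

fs = 16_000
sro = 50e-6                          # assumed 50 ppm sampling-rate offset
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(2)

# One underlying signal observed by two nodes; node 2's clock runs 50 ppm fast,
# i.e. it effectively samples the signal on a stretched time axis.
src = rng.standard_normal(len(t) + 1_000)
def sample(times):
    idx = times * fs
    lo = idx.astype(int)
    return src[lo] + (idx - lo) * (src[lo + 1] - src[lo])   # linear interpolation
x1, x2 = sample(t), sample(t * (1 + sro))

# The cross-spectrum phase drifts linearly across frames; its slope at
# frequency f encodes the SRO.
f, frames, X1 = stft(x1, fs, nperseg=1024)
_, _, X2 = stft(x2, fs, nperseg=1024)
b = 200                              # frequency bin used for the fit
phi = np.unwrap(np.angle(X1[b] * np.conj(X2[b])))
slope = np.polyfit(frames, phi, 1)[0]                 # phase drift in rad/s
print(f"estimated |SRO|: {abs(slope) / (2 * np.pi * f[b]):.1e}")   # ~5.0e-05
```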

Journal ArticleDOI
TL;DR: Grid-free compressive beamforming can provide high-resolution and low-contamination imaging, allowing accurate and fast estimation of two-dimensional DOAs and quantification of source strengths, even with non-uniform arrays and noisy measurements.
Abstract: Compressive beamforming realizes the direction-of-arrival (DOA) estimation and strength quantification of acoustic sources by solving an underdetermined system of equations relating microphone pressures to a source distribution via compressive sensing. The conventional method assumes DOAs of sources to lie on a grid. Its performance degrades due to basis mismatch when the assumption is not satisfied. To overcome this limitation for the measurement with plane microphone arrays, a two-dimensional grid-free compressive beamforming is developed. First, a continuum based atomic norm minimization is defined to denoise the measured pressure and thus obtain the pressure from sources. Next, a positive semidefinite programming is formulated to approximate the atomic norm minimization. Subsequently, a reasonably fast algorithm based on alternating direction method of multipliers is presented to solve the positive semidefinite programming. Finally, the matrix enhancement and matrix pencil method is introduced to process the obtained pressure and reconstruct the source distribution. Both simulations and experiments demonstrate that under certain conditions, the grid-free compressive beamforming can provide high-resolution and low-contamination imaging, allowing accurate and fast estimation of two-dimensional DOAs and quantification of source strengths, even with non-uniform arrays and noisy measurements.

Posted Content
TL;DR: In this article, the authors proposed a data-driven approach to select the best sensor subset using a greedy strategy to minimize the transmission cost while constraining the output noise power (or signal-to-noise ratio).
Abstract: In large-scale wireless acoustic sensor networks (WASNs), many of the sensors will only have a marginal contribution to a certain estimation task. Involving all sensors increases the energy budget unnecessarily and decreases the lifetime of the WASN. Using microphone subset selection, also termed as sensor selection, the most informative sensors can be chosen from a set of candidate sensors to achieve a prescribed inference performance. In this paper, we consider microphone subset selection for minimum variance distortionless response (MVDR) beamformer based noise reduction. The best subset of sensors is determined by minimizing the transmission cost while constraining the output noise power (or signal-to-noise ratio). Assuming the statistical information on correlation matrices of the sensor measurements is available, the sensor selection problem for this model-driven scheme is first solved by utilizing convex optimization techniques. In addition, to avoid estimating the statistics related to all the candidate sensors beforehand, we also propose a data-driven approach to select the best subset using a greedy strategy. The performance of the greedy algorithm converges to that of the model-driven method, while it displays advantages in dynamic scenarios as well as on computational complexity. Compared to a sparse MVDR or radius-based beamformer, experiments show that the proposed methods can guarantee the desired performance with significantly less transmission costs.
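A sketch of the greedy selection loop under simplifying assumptions: known noise covariance, uniform transmission costs, and a fixed output-noise-power target in place of the paper's full cost/constraint formulation:

```python
import numpy as np

rng = np.random.default_rng(3)
M = 12                                        # candidate microphones
d = np.exp(1j * rng.uniform(0, 2 * np.pi, M))           # dummy steering vector
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_n = A @ A.conj().T + M * np.eye(M)        # dummy Hermitian noise covariance

def noise_power(sel):
    # MVDR output noise power restricted to the selected mics: 1 / (d^H Phi^-1 d).
    dS, PhiS = d[sel], Phi_n[np.ix_(sel, sel)]
    return 1.0 / np.real(dS.conj() @ np.linalg.solve(PhiS, dS))

target = 1.5 * noise_power(list(range(M)))    # allow 1.5x the all-mic noise power
selected, remaining = [], list(range(M))
while remaining:
    # Greedily add the microphone that lowers the output noise power the most.
    best = min(remaining, key=lambda m: noise_power(selected + [m]))
    selected.append(best)
    remaining.remove(best)
    if noise_power(selected) <= target:
        break
print(f"selected {len(selected)} of {M} microphones:", sorted(selected))
```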

Journal ArticleDOI
TL;DR: This work assessed whether microphone sensitivity impacted the probability of detecting bird vocalizations by broadcasting a sequence of 12 calls toward an array of commercially available ARUs equipped with microphones of varying sensitivities under three levels of experimentally induced noise conditions selected to reflect the range of noise levels commonly encountered during avian surveys.
Abstract: Autonomous recording units (ARUs) are emerging as an effective tool for avian population monitoring and research. Although ARU technology is being rapidly adopted, there is a need to establish whether variation in ARU components and their degradation with use might introduce detection biases that would affect long-term monitoring and research projects. We assessed whether microphone sensitivity impacted the probability of detecting bird vocalizations by broadcasting a sequence of 12 calls toward an array of commercially available ARUs equipped with microphones of varying sensitivities under three levels (32 dBA, 42 dBA, and 50 dBA) of experimentally induced noise conditions selected to reflect the range of noise levels commonly encountered during avian surveys. We used binomial regression to examine factors influencing probability of detection for each species and used these to examine the impact of microphone sensitivity on the effective detection area (ha) for each species. Microphone sensitivity loss reduced detection probability for all species examined, but the magnitude of the effect varied between species and often interacted with distance. Microphone sensitivity loss reduced the effective detection area by an average of 25% for microphones just beyond manufacturer specifications (-5 dBV) and by an average of 66% for severely compromised microphones (-20 dBV). Microphone sensitivity loss appeared to be more problematic for low frequency calls where reduction in the effective detection area occurred most rapidly. Microphone degradation poses a source of variation in avian surveys made with ARUs that will require regular measurement of microphone sensitivity and criteria for microphone replacement to ensure scientifically reproducible results. We recommend that research and monitoring projects employing ARUs test their microphones regularly, replace microphones with declining sensitivity, and record sensitivity as a potential covariate in statistical analyses of acoustic data.
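The analysis recipe the abstract describes, binomial (logistic) regression of detection on sensitivity loss, distance, and noise, can be sketched on fabricated data; all variable names and effect sizes below are invented for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n = 2_000
# Fabricated survey data: sensitivity loss (dBV), distance (m), noise (dBA).
df = pd.DataFrame({
    "sens_loss": rng.uniform(0, 20, n),
    "distance": rng.uniform(25, 200, n),
    "noise": rng.choice([32, 42, 50], n),
})
# Assumed true model: detection gets less likely with loss, distance, and noise.
logit = 4.0 - 0.12 * df.sens_loss - 0.02 * df.distance - 0.04 * df.noise
df["detected"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Binomial regression of detection probability, as in the abstract.
fit = smf.logit("detected ~ sens_loss + distance + noise", data=df).fit(disp=0)
print(fit.params.round(3))
```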