scispace - formally typeset

Showing papers on "Microphone published in 2007"


Proceedings ArticleDOI
Chunyi Peng1, Guobin Shen1, Yongguang Zhang1, Yanlin Li1, Kun Tan1 
06 Nov 2007
TL;DR: The design, implementation, and evaluation of BeepBeep is presented, a high-accuracy acoustic-based ranging system that operates in a spontaneous, ad-hoc, and device-to-device context without leveraging any pre-planned infrastructure.
Abstract: We present the design, implementation, and evaluation of BeepBeep, a high-accuracy acoustic-based ranging system. It operates in a spontaneous, ad-hoc, and device-to-device context without leveraging any pre-planned infrastructure. It is a pure software-based solution and uses only the most basic set of commodity hardware -- a speaker, a microphone, and some form of device-to-device communication -- so that it is readily applicable to many low-cost sensor platforms and to most commercial-off-the-shelf mobile devices like cell phones and PDAs. It achieves high accuracy through a combination of three techniques: two-way sensing, self-recording, and sample counting. The basic idea is the following. To estimate the range between two devices, each will emit a specially-designed sound signal ("Beep") and collect a simultaneous recording from its microphone. Each recording should contain two such beeps, one from its own speaker and the other from its peer. By counting the number of samples between these two beeps and exchanging the time duration information with its peer, each device can derive the two-way time of flight of the beeps at the granularity of sound sampling rate. This technique cleverly avoids many sources of inaccuracy found in other typical time-of-arrival schemes, such as clock synchronization, non-real-time handling, software delays, etc. Our experiments on two common cell phone models have shown that we can achieve around one or two centimeters accuracy within a range of more than ten meters, despite a series of technical challenges in implementing the idea.
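The sample-counting arithmetic described above reduces to a short formula. The sketch below is a simplified model: it ignores each device's own speaker-to-microphone offset (which BeepBeep does account for), and the function name and default constants are illustrative, not from the paper.

```python
def beepbeep_distance(n_a, n_b, fs=44100, c=343.0):
    """Estimate the device-to-device distance D from two sample counts.

    n_a: samples device A counts between its own beep and B's beep.
    n_b: samples device B counts between its own beep and A's beep.
    Subtracting the two elapsed times cancels the unknown emission
    instants, leaving only the two-way time of flight:
        n_a / fs - n_b / fs = 2 * D / c
    """
    return c * (n_a - n_b) / (2.0 * fs)
```

One sample at 44.1 kHz corresponds to c / (2 * fs) ≈ 3.9 mm of range, consistent with the centimeter-level accuracy the paper reports; at D = 5 m the count difference n_a − n_b is about 1286 samples.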

519 citations


Journal ArticleDOI
TL;DR: The use of classic acoustic beamforming techniques is proposed, together with several novel algorithms, to create a complete frontend for speaker diarization in the meeting room domain; the same techniques also show improvements in a speech recognition task.
Abstract: When performing speaker diarization on recordings from meetings, multiple microphones of different qualities are usually available and distributed around the meeting room. Although several approaches have been proposed in recent years to take advantage of multiple microphones, they are either too computationally expensive and not easily scalable or they cannot outperform the simpler case of using the best single microphone. In this paper, the use of classic acoustic beamforming techniques is proposed together with several novel algorithms to create a complete frontend for speaker diarization in the meeting room domain. New techniques we are presenting include blind reference-channel selection, two-step time delay of arrival (TDOA) Viterbi postprocessing, and a dynamic output signal weighting algorithm, together with using such TDOA values in the diarization to complement the acoustic information. Tests on speaker diarization show a 25% relative improvement on the test set compared to using a single most centrally located microphone. Additional experimental results show improvements using these techniques in a speech recognition task.
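The TDOA values discussed above are commonly estimated with generalized cross-correlation with phase transform (GCC-PHAT); the minimal sketch below illustrates that standard technique, not the paper's exact implementation.

```python
import numpy as np

def gcc_phat_tdoa(x, ref, fs):
    """Delay of signal x relative to ref, in seconds, via GCC-PHAT.
    Whitening the cross-spectrum sharpens the correlation peak, which
    makes the estimate robust in reverberant meeting rooms."""
    n = len(x) + len(ref)
    X = np.fft.rfft(x, n)
    R = np.fft.rfft(ref, n)
    cs = X * np.conj(R)
    cc = np.fft.irfft(cs / (np.abs(cs) + 1e-12), n)
    cc = np.concatenate((cc[-(n // 2):], cc[:n // 2 + 1]))  # center lag 0
    lag = np.argmax(np.abs(cc)) - n // 2
    return lag / fs
```

A delay-and-sum frontend would then advance each channel by its estimated delay before summing.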

444 citations


Journal Article
TL;DR: Directional audio coding (DirAC) as discussed by the authors is a method for spatial sound representation, applicable to different sound reproduction systems. In the analysis part, the diffuseness and direction of arrival of sound are estimated in a single location, depending on time and frequency.
Abstract: Directional audio coding (DirAC) is a method for spatial sound representation, applicable to different sound reproduction systems. In the analysis part, the diffuseness and direction of arrival of sound are estimated in a single location, depending on time and frequency. In the synthesis part, microphone signals are first divided into nondiffuse and diffuse parts, and are then reproduced using different strategies. DirAC is developed from an existing technology for impulse response reproduction, spatial impulse response rendering (SIRR), and implementations of DirAC for different applications are described.

408 citations


DOI
01 Jan 2007
TL;DR: Novel single- and multimicrophone speech dereverberation algorithms are developed that aim at the suppression of late reverberation, i.e., signal processing techniques to reduce the detrimental effects of reflections.
Abstract: In speech communication systems, such as voice-controlled systems, hands-free mobile telephones, and hearing aids, the received microphone signals are degraded by room reverberation, background noise, and other interferences. This signal degradation may lead to total unintelligibility of the speech and decreases the performance of automatic speech recognition systems. In the context of this work, reverberation is the process of multi-path propagation of an acoustic sound from its source to one or more microphones. The received microphone signal generally consists of a direct sound, reflections that arrive shortly after the direct sound (commonly called early reverberation), and reflections that arrive after the early reverberation (commonly called late reverberation). Reverberant speech can be described as sounding distant with noticeable echo and colouration. These detrimental perceptual effects are primarily caused by late reverberation, and generally increase with increasing distance between the source and microphone. Conversely, early reverberation tends to improve the intelligibility of speech; in combination with the direct sound it is sometimes referred to as the early speech component. Reduction of the detrimental effects of reflections is evidently of considerable practical importance, and is the focus of this dissertation. More specifically, the dissertation deals with dereverberation techniques, i.e., signal processing techniques to reduce the detrimental effects of reflections. In the dissertation, novel single- and multi-microphone speech dereverberation algorithms are developed that aim at the suppression of late reverberation, i.e., at estimation of the early speech component. This is done via so-called spectral enhancement techniques that require a specific measure of the late reverberant signal. This measure, called spectral variance, can be estimated directly from the received (possibly noisy) reverberant signal(s) using a statistical reverberation model
and a limited amount of a priori knowledge about the acoustic channel(s) between the source and the microphone(s). In our work, an existing single-channel statistical reverberation model serves as a starting point. The model is characterized by one parameter that depends on the acoustic characteristics of the environment. We show that the spectral variance estimator that is based on this model can only be used when the source-microphone distance is larger than the so-called critical distance. This is, crudely speaking, the distance where the direct sound power is equal to the total reflective power. A generalization of the statistical reverberation model in which the direct sound is incorporated is developed. This model requires one additional parameter that is related to the ratio between the direct sound energy and the sound energy of all reflections. The generalized model is used to derive a novel spectral variance estimator. When the novel estimator is used for dereverberation rather than the existing estimator, and the source-microphone distance is smaller than the critical distance, the dereverberation performance is significantly increased. Single-microphone systems only exploit the temporal and spectral diversity of the received signal. Reverberation, of course, also induces spatial diversity. To additionally exploit this diversity, multiple microphones must be used, and their outputs must be combined by a suitable spatial processor such as the so-called delay-and-sum beamformer. It is not a priori evident whether spectral enhancement is best done before or after the spatial processor. For this reason we investigate both possibilities, as well as a merge of the spatial processor and the spectral enhancement technique. An advantage of the latter option is that the spectral variance estimator can be further improved. Our experiments show that the use of multiple microphones affords a significant improvement of the perceptual speech quality. The applicability of the theory
developed in this dissertation is demonstrated using a hands-free communication system. Since hands-free systems are often used in a noisy and reverberant environment, the received microphone signal does not only contain the desired signal but also interferences such as room reverberation that is caused by the desired source, background noise, and a far-end echo signal that results from a sound that is produced by the loudspeaker. Usually an acoustic echo canceller is used to cancel the far-end echo. Additionally, a post-processor is used to suppress background noise and residual echo, i.e., echo which could not be cancelled by the echo canceller. In this work, a novel structure and post-processor for an acoustic echo canceller are developed. The post-processor suppresses late reverberation caused by the desired source, residual echo, and background noise. The late reverberation and late residual echo are estimated using the generalized statistical reverberation model. Experimental results convincingly demonstrate the benefits of the proposed system for suppressing late reverberation, residual echo, and background noise. The proposed structure and post-processor have a low computational complexity and a highly modular structure, can be seamlessly integrated into existing hands-free communication systems, and afford a significant increase of the listening comfort and speech intelligibility.
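The critical distance mentioned above can be approximated from the room volume and reverberation time. The constant in this sketch comes from the standard Sabine-based rule of thumb, not from the dissertation itself.

```python
import math

def critical_distance(volume_m3, rt60_s):
    """Approximate critical distance: the source-microphone distance at
    which direct sound power equals total reflected power.
    Sabine-based rule of thumb for an omnidirectional source:
        d_c ~ 0.057 * sqrt(V / T60)   (V in m^3, T60 in s, d_c in m)
    """
    return 0.057 * math.sqrt(volume_m3 / rt60_s)
```

A 100 m³ room with T60 = 0.5 s gives d_c ≈ 0.8 m, so a talker more than a meter from the microphone is already beyond the critical distance, the regime where the original single-parameter model applies.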

239 citations


Patent
09 Mar 2007
TL;DR: In this paper, a directional microphone array having at least two microphones generates forward and backward cardioid signals from two (e.g., omnidirectional) microphone signals, and an adaptation factor is applied to the backward signal, and the resulting adjusted backward signal is subtracted from the forward cardioid signal to generate a (first-order) output audio signal corresponding to a beampattern having no nulls for negative values of the adaptation factor.
Abstract: In one embodiment, a directional microphone array having (at least) two microphones generates forward and backward cardioid signals from two (e.g., omnidirectional) microphone signals. An adaptation factor is applied to the backward cardioid signal, and the resulting adjusted backward cardioid signal is subtracted from the forward cardioid signal to generate a (first-order) output audio signal corresponding to a beampattern having no nulls for negative values of the adaptation factor. After low-pass filtering, spatial noise suppression can be applied to the output audio signal. Microphone arrays having one (or more) additional microphones can be designed to generate second- (or higher-) order output audio signals.
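The forward/backward cardioid construction can be sketched as a delay-and-subtract pair on two omnidirectional signals. The spacing, the integer-sample delay, and the fixed beta below are simplifying assumptions; the patent adapts the factor over time and allows negative values.

```python
import numpy as np

def first_order_dma(x1, x2, fs, d=0.02, c=343.0, beta=0.0):
    """First-order differential array from two omni mics spaced d meters
    apart (x1 toward the front, x2 toward the rear). Delay-and-subtract
    yields forward/backward cardioids; the output subtracts beta times
    the backward cardioid from the forward one."""
    n = max(1, int(round(fs * d / c)))   # inter-mic travel time, in samples
    cf = x1[n:] - x2[:-n]                # forward cardioid (null toward rear)
    cb = x2[n:] - x1[:-n]                # backward cardioid (null toward front)
    return cf - beta * cb
```

For a source directly ahead, x2 is just x1 delayed by n samples, so the backward cardioid vanishes and the output is independent of beta.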

187 citations


Journal ArticleDOI
TL;DR: Analysis of aliasing for spherical microphone arrays, which have been recently studied for a range of applications, is presented, showing how high-order spherical harmonic coefficients are aliased into the lower orders.
Abstract: Performance of microphone arrays at the high-frequency range is typically limited by aliasing, which is a result of the spatial sampling process. This paper presents an analysis of aliasing for spherical microphone arrays, which have recently been studied for a range of applications. The paper presents theoretical analysis of spatial aliasing for various sphere sampling configurations, showing how high-order spherical harmonic coefficients are aliased into the lower orders. Spatial antialiasing filters on the sphere are then introduced, and the performance of spatially constrained filters is compared to that of the ideal antialiasing filter. A simulation example shows how the effect of aliasing on the beam pattern can be reduced by the use of the antialiasing filters.
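A common rule of thumb, kr ≲ N (which the paper's detailed analysis refines rather than states in this form), ties the alias-free bandwidth of an order-N spherical array to its radius:

```python
import math

def max_alias_free_freq(order, radius_m, c=343.0):
    """Rough upper frequency for an order-N spherical array: spatial
    aliasing becomes significant once kr = 2*pi*f*r/c exceeds N,
    i.e. f ~ N * c / (2 * pi * r)."""
    return order * c / (2.0 * math.pi * radius_m)
```

An order-4 array of radius 5 cm is alias-free only up to roughly 4.4 kHz, which is why high-frequency performance is aliasing-limited and antialiasing filters are of interest.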

174 citations


Journal ArticleDOI
TL;DR: A methodology is described with which a spherical microphone array can have very flexible layouts of microphones on the spherical surface, yet optimally approximate a desired beampattern of higher order within a specified robustness constraint.
Abstract: This paper describes a methodology for designing a flexible and optimal spherical microphone array for beamforming. Using the approach presented, a spherical microphone array can have very flexible layouts of microphones on the spherical surface, yet optimally approximate a desired beampattern of higher order within a specified robustness constraint. Depending on the specified beampattern order, our approach automatically achieves optimal performance in two cases: when the specified beampattern order is reachable within the robustness constraint, we achieve a beamformer with optimal approximation of the desired beampattern; otherwise, we achieve a beamformer with maximum directivity, both robustly. For efficient implementation, we also developed an adaptive algorithm for computing the beamformer weights. It converges to the optimal performance quickly while exactly satisfying the specified frequency response and robustness constraint in each step. One application of the method is to allow the building of a real-world system, where microphones may not be placeable in regions such as near cable outlets and/or a mounting base, while having a minimal effect on the performance. Simulation results are presented.

172 citations


Journal ArticleDOI
TL;DR: This paper treats a microphone array as a multiple-input multiple-output (MIMO) system and study its signal-enhancement performance, and develops a general framework for analyzing performance of beamforming algorithms based on the acoustic MIMO channel impulse responses.
Abstract: Although many microphone-array beamforming algorithms have been developed over the past few decades, most such algorithms so far can only offer limited performance in practical acoustic environments. The reason behind this has not been fully understood and further research on this matter is indispensable. In this paper, we treat a microphone array as a multiple-input multiple-output (MIMO) system and study its signal-enhancement performance. Our major contribution is fourfold. First, we develop a general framework for analyzing performance of beamforming algorithms based on the acoustic MIMO channel impulse responses. Second, we study the bounds for the length of the beamforming filter, which in turn shows the performance bounds of beamforming in terms of speech dereverberation and interference suppression. Third, we address the connection between beamforming and the multiple-input/output inverse theorem (MINT). Finally, we discuss the intrinsic relationships among different classical beamforming techniques and explain, from the channel condition perspective, what the prerequisites are for those techniques to work.

144 citations


Patent
Carlos Avendano1
29 Jan 2007
TL;DR: In this article, inter-microphone level differences (ILD) are used to attenuate noise and enhance speech; the ILD, obtained by dividing the energy level of the primary signal by that of the secondary signal, drives a noise reduction system that enhances the speech of the primary acoustic signal.
Abstract: Systems and methods for utilizing inter-microphone level differences (ILD) to attenuate noise and enhance speech are provided. In exemplary embodiments, primary and secondary acoustic signals are received by omni-directional microphones, and converted into primary and secondary electric signals. A differential microphone array module processes the electric signals to determine a cardioid primary signal and a cardioid secondary signal. The cardioid signals are filtered through a frequency analysis module which takes the signals and mimics a cochlea implementation (i.e., cochlear domain). Energy levels of the signals are then computed, and the results are processed by an ILD module using a non-linear combination to obtain the ILD. In exemplary embodiments, the non-linear combination comprises dividing the energy level associated with the primary microphone by the energy level associated with the secondary microphone. The ILD is utilized by a noise reduction system to enhance the speech of the primary acoustic signal.
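The energy-ratio step can be sketched as below. The FFT band split is a stand-in for the cochlear filterbank the patent describes, and all names are illustrative.

```python
import numpy as np

def ild_per_band(primary, secondary, eps=1e-12):
    """Inter-microphone level difference per frequency band: the energy
    of the primary channel divided by that of the secondary channel
    (the 'non-linear combination' in the abstract)."""
    w = np.hanning(len(primary))                       # analysis window
    p_energy = np.abs(np.fft.rfft(primary * w)) ** 2
    s_energy = np.abs(np.fft.rfft(secondary * w)) ** 2
    return p_energy / (s_energy + eps)
```

Bands where the ILD is large are dominated by the nearby (primary) talker; bands where it is near unity are mostly noise and can be attenuated.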

144 citations



Journal ArticleDOI
TL;DR: This correspondence presents a binaural extension of a monaural multichannel noise reduction algorithm for hearing aids based on Wiener filtering that preserves the interaural time delay (ITD) cues of the speech component, thus allowing the user to correctly localize the speech source.
Abstract: Binaural hearing aids use microphone inputs from both the left and right hearing aid to generate an output for each ear. On the other hand, a monaural hearing aid generates an output by processing only its own microphone inputs. This correspondence presents a binaural extension of a monaural multichannel noise reduction algorithm for hearing aids based on Wiener filtering. In addition to significantly suppressing the noise interference, the algorithm preserves the interaural time delay (ITD) cues of the speech component, thus allowing the user to correctly localize the speech source. Unfortunately, binaural multichannel Wiener filtering distorts the ITD cues of the noise source. By adding a parameter to the cost function, the amount of noise reduction performed by the algorithm can be controlled, and traded off for the preservation of the noise ITD cues.

Patent
05 Jan 2007
TL;DR: In this article, a single-prong, multiple-signal-conducting plug and plug detection circuitry are provided; the circuitry detects whether a microphone type of plug (e.g., a four-region plug including a microphone region and two audio regions, or a three-region plug including a microphone region and only one audio region) or a non-microphone type of plug is inserted into the jack of an electronic device.
Abstract: A single prong, multiple signal conducting plug and plug detection circuitry is provided. The plug may be electrically coupled to a stereo headset including a microphone. The plug may include four signal conducting regions arranged in a predetermined order along the length of the prong. Detection circuitry may be operative to determine whether a microphone type of plug (e.g., a four region plug including a microphone region and two audio regions, or a three region plug including microphone region and only one audio region) or a non-microphone type of plug (e.g., stereo plug) is inserted into the jack of an electronic device (e.g., mobile phone). Detection circuitry may also detect user activated functions performed in response to user activation of one or more switches included with the headset. For example, the headset may include a single switch for performing a function with respect to a microphone (e.g., end-call function).

Patent
20 Dec 2007
TL;DR: In this paper, a system determines the position of a sound source with a microphone in a fixed coordinate system by measuring audio signals that are analyzed and processed to determine the location of the sound source.
Abstract: A system determines the position of a sound source with a microphone in a fixed coordinate system. The microphone measures audio signals that are analyzed and processed to determine the position of the sound source in the fixed coordinate system. The system may adjust the direction of the microphone in the fixed coordinate system based on the processed audio signals and the position of the sound source. The microphone direction may be identified through an optical source that may be adjusted based on the processed audio signals and the position of the sound source.

Journal ArticleDOI
TL;DR: This paper discusses the design of robust superdirective beamformers by taking into account the statistics of the microphone characteristics, and shows how to determine a suitable parameter range for the other design procedures such that both a high directivity and a high level of robustness are obtained.
Abstract: Fixed superdirective beamformers using small-sized microphone arrays are known to be highly sensitive to errors in the assumed microphone array characteristics (gain, phase, position). This paper discusses the design of robust superdirective beamformers by taking into account the statistics of the microphone characteristics. Different design procedures are considered: applying a white noise gain constraint, trading off the mean noise and distortion energy, minimizing the mean deviation from the desired superdirective directivity pattern, and maximizing the mean or the worst case directivity factor. When computational complexity is not an issue, maximizing the mean or the worst case directivity factor is the preferred design procedure. In addition, it is shown how to determine a suitable parameter range for the other design procedures such that both a high directivity and a high level of robustness are obtained

Patent
19 Nov 2007
TL;DR: An electronic audio device for use with at least one earpiece, or a pair of earpieces in a headphone, each earpiece having a microphone and a speaker located therein; the device includes circuitry operatively coupled to the microphone and speaker, and a processor that evaluates the seal quality of the earpiece to a user's ear based on measurements made while driving or exciting a signal into the speaker located in the earpiece.
Abstract: An electronic audio device for use with at least one earpiece, or a pair of earpieces in a headphone, each earpiece having a microphone and a speaker located therein, including circuitry operatively coupled to the microphone and speaker, and a processor operatively coupled to evaluate the seal quality of the earpiece to a user's ear based on seal quality measurements made while driving or exciting a signal into the speaker located in the earpiece, where the processor is configured to generate a visual or audio message identifying whether at least one earpiece is properly sealed based on the seal quality measurements.

Proceedings ArticleDOI
21 May 2007
TL;DR: In this paper, four different approaches are used to determine experimentally the sources of jet noise: spectral and directional information measured by a single microphone in the far field, two-microphone coherence measurements of the sound field, correlations between turbulence fluctuations inside the jet and the radiated noise, and a mirror microphone that measures the noise source distribution along the lengths of high-speed jets; all four indicate two distinct sources, the fine-scale turbulence and the large turbulence structures of the jet flow.

Abstract: The primary object of this investigation is to determine experimentally the sources of jet noise. In the present study, four different approaches are used. It is reasonable to assume that the characteristics of the noise sources are imprinted on their radiation fields. Under this assumption, it becomes possible to analyze the characteristics of the far field sound and then infer back on the characteristics of the sources. The first approach is to make use of the spectral and directional information measured by a single microphone in the far field. A detailed analysis of a large collection of far field noise data has been carried out. The purpose is to identify special characteristics that can be linked directly to those of the sources. The second approach is to measure the coherence of the sound field using two microphones. The autocorrelations and cross-correlations of these measurements offer not only valuable information on the spatial structure of the noise field in the radial and polar angle directions, but also on the sources inside the jet. The third approach involves measuring the correlation between turbulence fluctuations inside a jet and the radiated noise in the far field. This is the most direct and unambiguous way of identifying the sources of jet noise. In the fourth approach, a mirror microphone is used to measure the noise source distribution along the lengths of high-speed jets. Features and trends observed in noise source strength distributions are expected to shed light on the source mechanisms. It will be shown that all four types of data indicate clearly the existence of two distinct noise sources in jets. One source of noise is the fine scale turbulence and the other source is the large turbulence structures of the jet flow. Some of the salient features of the sound field associated with the two noise sources are reported in this paper.

Patent
Hideyuki Kirigaya1
11 May 2007
TL;DR: In this paper, a hands-free telephone conversation apparatus reduces the negative influence of an echo on the sound transmitted to a far-end user without deteriorating that sound.
Abstract: A hands-free telephone conversation apparatus can reduce the negative influence of an echo on a sound to be transmitted to a far-end user without deteriorating the sound. When an MPU determines that a mobile phone receives an incoming call, and a user starts to talk on the mobile phone in accordance with an operation of an input device connected to an in-vehicle bus line, the MPU controls the mobile phone to have the mobile phone output, to the DSP, a sound signal indicative of a sound to be transmitted to the far-end user, and to have a switch select a loudspeaker, installed in the vehicle without being directed to the microphone, as the loudspeaker for outputting a sound received from the far-end user.

Journal ArticleDOI
TL;DR: A method for overcoming the problem of numerical ill-conditioning at frequencies which correspond to the nodal values of the spatial spherical modes with the result of excessive noise at these frequencies is proposed and investigated.
Abstract: Spherical microphone arrays have been studied for a wide range of applications, one of which is acoustic measurement and analysis. Since a minimal interaction between the array and the measured sound field is an advantage in this case, open-sphere arrays are preferable compared to rigid-sphere arrays. However, it has been shown that open-sphere arrays suffer from numerical ill-conditioning at frequencies which correspond to the nodal values of the spatial spherical modes, with the result of excessive noise at these frequencies. A method for overcoming this problem using an open dual-sphere array is proposed in this correspondence and then investigated and compared to an array configured around a rigid sphere and an array composed of cardioid microphones. An optimal value for the ratio of the two spheres is derived, and simulation examples illustrating the advantage of the dual-sphere array are finally presented.

Patent
28 Mar 2007
TL;DR: In this paper, the authors used a microphone mounted in the ear of the patient for detecting breathing sounds and a second external microphone together with an oximetric sensor to detect periods of apnea and/or hypopnea.
Abstract: Apparatus for use in detection of apnea includes a microphone mounted in the ear of the patient for detecting breathing sounds and a second external microphone together with an oximetric sensor. A transmitter at the patient compresses and transmits the signals to a remote location where there is provided a detector module for receiving and analyzing the signals to extract data relating to the breathing. The detector uses the entropy or range of the signal to generate an estimate of air flow while extracting extraneous snoring and heart sounds, and to analyze the estimate of air flow using Otsu's threshold to detect periods of apnea and/or hypopnea. A display provides data of the detected apnea/hypopnea episodes and related information for a clinician.

Journal ArticleDOI
TL;DR: Theoretical considerations and a simple but realistic model of the function of the cantilever-based photo-acoustic trace gas system are presented in this article, where the essential features of the model, including the dynamics, thermal characteristics, and noise models are derived.
Abstract: Theoretical considerations and a simple but realistic model of the function of the cantilever-based photoacoustic trace gas system are presented. The essential features of the cantilever dynamics, thermal characteristics, and noise models are derived. Some related constructions are shown, together with practical implementations of the real system.

Journal ArticleDOI
TL;DR: This paper describes the design, fabrication, and characterization of a bulk-micromachined piezoelectric microphone for aeroacoustic applications and indicates a sensitivity of 1.66 µV/Pa, a dynamic range greater than six orders of magnitude (35.7-169 dB, re 20 µPa), a capacitance of 10.8 nF, and a resonant frequency of 59.0 kHz.
Abstract: This paper describes the design, fabrication, and characterization of a bulk-micromachined piezoelectric microphone for aeroacoustic applications. Microphone design was accomplished through a combination of piezoelectric composite plate theory and lumped element modeling. The device consists of a 1.80-mm-diam, 3-µm-thick silicon diaphragm with a 267-nm-thick ring of piezoelectric material placed near the boundary of the diaphragm to maximize sensitivity. The microphone was fabricated by combining a sol-gel lead zirconate-titanate deposition process on a silicon-on-insulator wafer with deep reactive-ion etching for the diaphragm release. Experimental characterization indicates a sensitivity of 1.66 µV/Pa, a dynamic range greater than six orders of magnitude (35.7-169 dB, re 20 µPa), a capacitance of 10.8 nF, and a resonant frequency of 59.0 kHz.
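As a quick arithmetic check on the quoted dynamic range: sound pressure amplitude scales at 20 dB per decade, so the 35.7-169 dB span corresponds to about 6.7 orders of magnitude in pressure, matching the "greater than six orders of magnitude" claim.

```python
def db_span_to_orders(low_db, high_db):
    """Orders of magnitude in sound pressure amplitude covered by a dB SPL
    range (amplitude quantities scale at 20 dB per decade)."""
    return (high_db - low_db) / 20.0
```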

PatentDOI
TL;DR: In this article, a method for personalized listening with an earpiece is provided that can include capturing ambient sound from an Ambient Sound Microphone (ASM) of an earpiece partially or fully occluded in an ear canal, monitoring the ambient sound for a target sound, and adjusting, by way of an Ear Canal Receiver (ECR) in the earpiece, a delivery of audio to the ear canal based on a detected target sound.
Abstract: At least one exemplary embodiment is directed to a method for personalized listening with an earpiece, which can include capturing ambient sound from an Ambient Sound Microphone (ASM) of an earpiece partially or fully occluded in an ear canal, monitoring the ambient sound for a target sound, and adjusting, by way of an Ear Canal Receiver (ECR) in the earpiece, a delivery of audio to the ear canal based on a detected target sound. A volume of audio content can be adjusted upon the detection of a target sound, and an audible notification can be presented to provide a warning.

Journal ArticleDOI
TL;DR: Findings provide direct evidence that bats adjust pulse intensity to compensate for changes in echo intensity to maintain a constant intensity of the echo returned from the approaching target at an optimal range.
Abstract: An onboard microphone (Telemike) was developed to examine changes in the basic characteristics of echolocation sounds of small frequency-modulated echolocating bats, Pipistrellus abramus. Using a dual high-speed video camera system, spatiotemporal observations of echolocation characteristics were conducted on bats during a landing flight task in the laboratory. The Telemike allowed us to observe emitted pulses and returning echoes to which the flying bats listened during flight, and the acoustic parameters could be precisely measured without traditional problems such as the directional properties of the recording microphone and the emitted pulse, or traveling loss of the sound in the air. Pulse intensity in bats intending to land exhibited a marked decrease by 30 dB within 2 m of the target wall, and the reduction rate was approximately 6.5 dB per halving of distance. The intensity of echoes returning from the target wall indicated a nearly constant intensity (−42.6 ± 5.5 dB weaker than the pulse emitted in sea...

Journal ArticleDOI
TL;DR: The first-order analytical results show that the proposed solution reaches the Cramer-Rao lower bound (CRLB) accuracy for Gaussian noise as the signal-to-noise ratio tends to infinity.
Abstract: The localization of an acoustic source can be based on the energy measurements at a number of spatially separated microphones. This is because the amount of source energy attenuation at a microphone is proportional to the square of the distance between the source and the microphone. This paper develops an algebraic closed-form solution for the acoustic source localization problem using energy measurements, under the condition of direct line-of-sight and free-space propagation. First-order analysis is applied to the proposed solution to study its performance, where only the linear noise terms are kept in obtaining the mean-square localization error. The first-order analytical results show that the proposed solution reaches the Cramer-Rao lower bound (CRLB) accuracy for Gaussian noise as the signal-to-noise ratio tends to infinity. In addition, the proposed solution provides much better accuracy than other closed-form solutions available in the literature. The proposed solution was further improved by imposing nonnegativity constraints, which extend its operating range beyond the threshold noise level. Simulations are included to corroborate the performance of the proposed method.
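The inverse-square energy model behind such methods can be turned into a linear least-squares problem by treating the squared source norm and the source energy as extra unknowns. A minimal sketch of one such algebraic solver (an illustration of the general approach, not the paper's exact closed form):

```python
import numpy as np

def energy_localize(mic_pos, energies):
    # Model: E_i * ||x - p_i||^2 = S  (free space, inverse-square decay).
    # Expanding the norm and treating R = ||x||^2 and S as extra unknowns
    # gives one linear equation per microphone in (x, R, S):
    #   -2*E_i*(p_i . x) + E_i*R - S = -E_i*||p_i||^2
    mic_pos = np.asarray(mic_pos, dtype=float)
    energies = np.asarray(energies, dtype=float)
    n, d = mic_pos.shape
    A = np.zeros((n, d + 2))
    b = np.zeros(n)
    for i in range(n):
        A[i, :d] = -2.0 * energies[i] * mic_pos[i]
        A[i, d] = energies[i]        # coefficient of R = ||x||^2
        A[i, d + 1] = -1.0           # coefficient of S (source energy)
        b[i] = -energies[i] * (mic_pos[i] @ mic_pos[i])
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    return sol[:d]  # estimated source position
```

In 2-D this needs at least four microphones (four unknowns); with noiseless energies it recovers the source exactly, while noisy energies make it a least-squares fit.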

Journal ArticleDOI
TL;DR: In this article, the authors describe a system that gives a mobile robot the ability to perform automatic speech recognition with simultaneous speakers, where a microphone array is used along with a real-time implementation of geometric source separation (GSS) and a postfilter that gives further reduction of interference from other sources.
Abstract: This paper describes a system that gives a mobile robot the ability to perform automatic speech recognition with simultaneous speakers. A microphone array is used along with a real-time implementation of geometric source separation (GSS) and a postfilter that gives a further reduction of interference from other sources. The postfilter is also used to estimate the reliability of spectral features and compute a missing feature mask. The mask is used in a missing feature theory-based speech recognition system to recognize the speech from simultaneous Japanese speakers in the context of a humanoid robot. Recognition rates are presented for three simultaneous speakers located at 2 m from the robot. The system was evaluated on a 200-word vocabulary at different azimuths between sources, ranging from 10° to 90°. Compared to the use of the microphone array source separation alone, we demonstrate an average reduction in relative recognition error rate of 24% with the postfilter and of 42% when the missing features approach is combined with the postfilter. We demonstrate the effectiveness of our multisource microphone array postfilter and the improvement it provides when used in conjunction with the missing features theory.
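One common way to derive a missing-feature mask from a post-filter, sketched below, is to mark a time-frequency bin as reliable when the post-filter preserved most of its energy, i.e. when the bin is dominated by the target speaker rather than interference. The thresholding rule and its value are illustrative assumptions, not the paper's exact criterion:

```python
import numpy as np

def reliability_mask(separated_spec, postfiltered_spec, threshold=0.25):
    # Binary missing-feature mask over a time-frequency spectrogram:
    # 1 = reliable (post-filter kept most of the bin's energy),
    # 0 = missing (bin was heavily attenuated, likely interference).
    gain = postfiltered_spec / np.maximum(separated_spec, 1e-12)
    return (gain >= threshold).astype(float)
```

The recognizer then marginalizes over the bins flagged 0 instead of trusting their corrupted values.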

Journal ArticleDOI
TL;DR: In this paper, a dual-backplate capacitive microphone for aeroacoustic measurements is presented, which consists of a 0.46-mm-diameter, 2.25-µm-thick circular diaphragm and two circular backplates.
Abstract: This paper presents the development of a micro-machined dual-backplate capacitive microphone for aeroacoustic measurements. The device theory, fabrication, and characterization are discussed. The microphone is fabricated using the five-layer planarized-polysilicon SUMMiT V process at Sandia National Laboratories. The microphone consists of a 0.46-mm-diameter, 2.25-µm-thick circular diaphragm and two circular backplates. The diaphragm is separated from each backplate by a 2-µm air gap. Experimental characterization of the microphone shows a sensitivity of 390 µV/Pa. The dynamic range of the microphone interfaced with a charge amplifier extends from the noise floor of 41 dB/√Hz up to 164 dB, and the resonant frequency is 178 kHz.
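These specifications allow a quick sanity check: at the quoted 164 dB SPL upper limit, a 390 µV/Pa sensitivity corresponds to roughly 1.2 V at the microphone output. A small sketch of that arithmetic (standard dB SPL definitions; the linear small-signal model is an assumption):

```python
import math

P_REF = 20e-6  # Pa, reference pressure for dB SPL

def spl_to_pressure(spl_db):
    # dB SPL -> pressure amplitude in pascals.
    return P_REF * 10.0 ** (spl_db / 20.0)

def mic_output_volts(spl_db, sensitivity_v_per_pa=390e-6):
    # Linear small-signal model: output = sensitivity * pressure.
    return sensitivity_v_per_pa * spl_to_pressure(spl_db)
```

For example, 94 dB SPL is about 1 Pa, and the 164 dB maximum maps to about 3.2 kPa and thus ~1.24 V of output.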

Patent
29 Nov 2007
TL;DR: A microphone system has a base with at least one electrical port for electrically communicating with an external device, a solid metal lid coupled to the base to form an internal chamber, and a silicon microphone secured to the lid within the chamber.
Abstract: A microphone system has a base with at least one electrical port for electrically communicating with an external device. The system also has a solid metal lid coupled to the base to form an internal chamber, and a silicon microphone secured to the lid within the chamber. The lid has an aperture for receiving an audible signal, while the microphone is electrically connected to the electrical port of the base.

Patent
02 Apr 2007
TL;DR: In this article, the authors present a voice coder for voice communication that employs a multi-microphone system as part of an improved approach to enhancing signal quality and improving the signal to noise ratio for such voice communications.
Abstract: The present invention provides a voice coder for voice communication that employs a multi-microphone system as part of an improved approach to enhancing signal quality and improving the signal-to-noise ratio for such voice communications. There is a special relationship between the positions of a first microphone and a second microphone that provides the communication device with certain advantageous physical and acoustic properties. In addition, the communication device can have certain physical characteristics and design features. In a two-microphone arrangement, the first microphone is located in a position directed toward the speech source, while the second microphone is located in a position that provides a voice signal with a significantly lower signal-to-noise ratio (SNR).
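A classic way to exploit such a primary/reference microphone pair, not specified by the patent itself, is an adaptive noise canceller: estimate the noise in the speech-facing channel from the noise-dominated channel and subtract it. A minimal NLMS sketch:

```python
import numpy as np

def nlms_cancel(primary, reference, taps=16, mu=0.5, eps=1e-8):
    # Normalized LMS adaptive noise canceller.  The filter w learns the
    # reference-to-primary noise path; the error signal e (primary minus
    # the noise estimate) is the cleaned output.
    w = np.zeros(taps)
    buf = np.zeros(taps)          # most recent reference samples
    out = np.zeros(len(primary))
    for n in range(len(primary)):
        buf = np.roll(buf, 1)
        buf[0] = reference[n]
        noise_est = w @ buf
        e = primary[n] - noise_est
        w += (mu / (eps + buf @ buf)) * e * buf
        out[n] = e
    return out
```

Because the speech is uncorrelated with the reference noise, the adaptation cancels the noise while (on average) leaving the speech component intact.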

Patent
19 Mar 2007
TL;DR: In this article, a wearable surround sound system (300) is described, which includes a processor (310) adapted to receive input signals representative of requested audio signals to be heard by the user (401) and in response to generate multiple output signals, and multiple bone conduction speakers (330).
Abstract: A wearable surround sound system (300), that includes: (a) a processor (310), adapted to receive input signals representative of requested audio signals to be heard by the user (401) and in response to generate multiple output signals; and (b) multiple bone conduction speakers (330), coupled to the processor (310), adapted to convey the multiple output signals to at least one bone of a user (401); wherein the bone conduction speakers (330) are arrayed so as to stimulate an encompassing sound perception of the user (401). A wearable ambient sound reduction system (200), that includes: (a) a microphone (220), adapted to detect an ambient sound signal; (b) a processor (210) adapted to generate an output signal in response to the ambient sound signal; wherein the output signal, when conveyed to a bone of the user (403), reduces an effect that the ambient sound signal has upon the user (403); wherein the microphone (220) is coupled to the processor (210); and (c) a bone conduction speaker (230), coupled to the processor (210), adapted to convey the output signal to a bone of a user (403).
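The ambient sound reduction path can be sketched, in its most stripped-down form, as emitting a phase-inverted copy of the microphone signal. A real system must also equalize for the bone-conduction transfer path and processing latency, which this toy model ignores:

```python
import numpy as np

def anti_noise(ambient, gain=0.9):
    # Phase-inverted copy of the microphone signal.  When it sums with the
    # ambient sound at the user, the residual is (1 - gain) of the
    # original; gain = 1 would cancel it completely.
    return -gain * np.asarray(ambient, dtype=float)
```

For example, with a gain of 0.9 the residual heard by the user is one tenth of the ambient amplitude, a 20 dB reduction.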

Patent
05 Sep 2007
TL;DR: In this paper, a MEMS microphone is presented comprising a MEMS transducer having a back plate and a diaphragm, as well as a controllable bias voltage generator providing a DC bias voltage between the back plate and the diaphragm of the transducer.
Abstract: A MEMS microphone comprising a MEMS transducer having a back plate and a diaphragm, as well as a controllable bias voltage generator providing a DC bias voltage between the back plate and the diaphragm. The microphone also has an amplifier with a controllable gain, and a memory for storing information for determining the bias voltage to be provided by the bias voltage generator and the gain of the amplifier.
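One plausible use of the stored information, sketched below, is to split a target overall sensitivity between the DC bias (capacitive sensitivity scales roughly linearly with bias, well below pull-in) and the amplifier gain. The interface and numbers here are hypothetical illustrations, not the patent's:

```python
def calibrate_settings(measured_sens_v_per_pa, measured_bias_v,
                       target_sens_v_per_pa, nominal_bias_v):
    # Assume transducer sensitivity is proportional to DC bias voltage.
    # From one factory measurement at measured_bias_v, predict sensitivity
    # at the nominal operating bias, then pick the amplifier gain that
    # brings the overall sensitivity to the target.  These two values are
    # the kind of settings such a microphone could store in memory.
    sens_per_volt = measured_sens_v_per_pa / measured_bias_v
    sens_at_nominal = sens_per_volt * nominal_bias_v
    gain = target_sens_v_per_pa / sens_at_nominal
    return nominal_bias_v, gain
```

This kind of per-unit calibration compensates for process spread: two transducers with different raw sensitivities end up with the same overall output level.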