
Showing papers on "Microphone published in 2016"


PatentDOI
TL;DR: In this paper, an automated equalizing system is described, including a microphone unit and a memory that stores characteristic data of the loudspeaker units as well as the frequency response measurements.
Abstract: An automated process for equalizing an audio system and an apparatus for implementing the process. An audio system includes a microphone unit, for receiving the sound waves radiated from a plurality of speakers; acoustic measuring circuitry, for calculating frequency response measurements; a memory, for storing characteristic data of the loudspeaker units and further for storing the frequency response measurements; and equalization calculation circuitry, for calculating an equalization pattern responsive to the digital data and responsive to the characteristic data of the plurality of loudspeaker units. Also described is an automated equalizing system including acoustic measuring circuitry including a microphone for measuring frequency response at a plurality of locations; a memory, for storing the frequency responses at the plurality of locations; and equalization calculation circuitry, for calculating, from the frequency responses, an optimized equalization pattern.
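As a rough, hypothetical illustration of how an equalization pattern could be derived from frequency responses measured at several locations, the sketch below averages the measured magnitude responses and computes a correction toward a flat target. The flat target, the boost limit, and the function name are assumptions made for illustration, not the patent's method.

```python
import numpy as np

def equalization_pattern(responses_db, target_db=0.0, max_boost_db=6.0):
    """Compute a correction curve (dB per frequency bin) from frequency-response
    measurements taken at several microphone locations.

    responses_db: array of shape (n_locations, n_bins), magnitude responses in dB.
    """
    # Average the room/speaker response over all measurement locations.
    mean_response = np.mean(responses_db, axis=0)
    # The equalization pattern is the gain needed to reach the target response,
    # limited so the equalizer never boosts excessively.
    return np.clip(target_db - mean_response, -np.inf, max_boost_db)

# Example: three measurement positions, four frequency bins.
measured = np.array([[ 2.0, -3.0, 1.0, -6.0],
                     [ 1.0, -4.0, 0.0, -5.0],
                     [ 3.0, -2.0, 2.0, -7.0]])
print(equalization_pattern(measured))
```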

198 citations


Proceedings ArticleDOI
20 Mar 2016
TL;DR: This work proposes to represent the stages of acoustic processing including beamforming, feature extraction, and acoustic modeling, as three components of a single unified computational network that obtained a 3.2% absolute word error rate reduction compared to a conventional pipeline of independent processing stages.
Abstract: Despite the significant progress in speech recognition enabled by deep neural networks, poor performance persists in some scenarios. In this work, we focus on far-field speech recognition which remains challenging due to high levels of noise and reverberation in the captured speech signals. We propose to represent the stages of acoustic processing including beamforming, feature extraction, and acoustic modeling, as three components of a single unified computational network. The parameters of a frequency-domain beamformer are first estimated by a network based on features derived from the microphone channels. These filter coefficients are then applied to the array signals to form an enhanced signal. Conventional features are then extracted from this signal and passed to a second network that performs acoustic modeling for classification. The parameters of both the beamforming and acoustic modeling networks are trained jointly using back-propagation with a common cross-entropy objective function. In experiments on the AMI meeting corpus, we observed improvements by pre-training each sub-network with a network-specific objective function before joint training of both networks. The proposed method obtained a 3.2% absolute word error rate reduction compared to a conventional pipeline of independent processing stages.
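As a minimal sketch of the unified-network idea (not the paper's actual architecture or layer sizes), the following PyTorch code estimates per-frequency beamformer weights from the microphone-channel STFT, applies them to form an enhanced signal, extracts log-power features, and feeds a small acoustic model, with both sub-networks trained jointly under a single cross-entropy objective:

```python
import torch
import torch.nn as nn

class JointBeamformerAM(nn.Module):
    """Minimal sketch of a unified beamforming + acoustic-modeling network.
    Shapes and layer sizes are illustrative, not those of the paper."""
    def __init__(self, n_mics=8, n_bins=257, n_classes=42):
        super().__init__()
        # Predicts per-frequency beamformer weights (real and imaginary parts)
        # from features derived from the microphone channels.
        self.bf_net = nn.Sequential(
            nn.Linear(n_mics * n_bins * 2, 512), nn.ReLU(),
            nn.Linear(512, n_mics * n_bins * 2))
        # Acoustic model operating on log-power features of the enhanced signal.
        self.am_net = nn.Sequential(
            nn.Linear(n_bins, 1024), nn.ReLU(),
            nn.Linear(1024, n_classes))
        self.n_mics, self.n_bins = n_mics, n_bins

    def forward(self, stft):                       # stft: (batch, mics, bins, 2)
        b = stft.shape[0]
        w = self.bf_net(stft.reshape(b, -1))        # beamformer filter coefficients
        w = w.reshape(b, self.n_mics, self.n_bins, 2)
        # Complex multiply-and-sum over microphones: y = sum_m w_m * x_m
        xr, xi = stft[..., 0], stft[..., 1]
        wr, wi = w[..., 0], w[..., 1]
        yr = (wr * xr - wi * xi).sum(dim=1)
        yi = (wr * xi + wi * xr).sum(dim=1)
        log_power = torch.log(yr ** 2 + yi ** 2 + 1e-8)   # conventional features
        return self.am_net(log_power)               # class scores

# Both sub-networks are trained jointly with one cross-entropy objective:
model = JointBeamformerAM()
stft = torch.randn(4, 8, 257, 2)                    # toy batch of STFT frames
labels = torch.randint(0, 42, (4,))
loss = nn.CrossEntropyLoss()(model(stft), labels)
loss.backward()
```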

190 citations


Patent
08 Mar 2016
Abstract: At a first electronic device with a display and a microphone: sampling audio input using the first microphone; in accordance with the sampling of audio input using the first microphone, sending stop instructions to a second electronic device with a second microphone, the second electronic device external to the first electronic device, wherein the second electronic device is configured to respond to audio input received using the second microphone, and wherein the stop instructions instruct the second electronic device to forgo responding to audio input received using the second microphone, wherein responding to audio input received using the second microphone comprises providing perceptible output.

137 citations


Patent
08 Mar 2016
TL;DR: In this article, an electronic device with a display, a microphone, and an input device is considered: while the display is on, user input is received via the input device, and when that input meets a predetermined condition, audio input is sampled via the microphone.
Abstract: At an electronic device with a display, a microphone, and an input device: while the display is on, receiving user input via the input device, the user input meeting a predetermined condition; in accordance with receiving the user input meeting the predetermined condition, sampling audio input received via the microphone; determining whether the audio input comprises a spoken trigger; and in accordance with a determination that audio input comprises the spoken trigger, triggering a virtual assistant session.

132 citations


Patent
15 Sep 2016
TL;DR: In this paper, a virtual assistant is used to arbitrate among and/or control a set of electronic devices: a first electronic device samples audio input with its microphone and broadcasts a first set of one or more values based on that sampled input.
Abstract: This relates to systems and processes for using a virtual assistant to arbitrate among and/or control electronic devices. In one example process, a first electronic device samples an audio input using a microphone. The first electronic device broadcasts a first set of one or more values based on the sampled audio input. Furthermore, the first electronic device receives a second set of one or more values, which are based on the audio input, from a second electronic device. Based on the first set of one or more values and the second set of one or more values, the first electronic device determines whether to respond to the audio input or forego responding to the audio input.
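A hypothetical sketch of the arbitration step: each device derives a small set of values from its own sampled audio, exchanges them with the other devices, and responds only if its values are not beaten by any other device's. The specific fields (wake-word confidence, signal energy) and the lexicographic comparison are illustrative assumptions, not the patent's criteria.

```python
from typing import List, NamedTuple

class ArbitrationValues(NamedTuple):
    wake_confidence: float   # how clearly this device heard the trigger phrase
    signal_energy: float     # crude proxy for proximity to the talker

def should_respond(own: ArbitrationValues, others: List[ArbitrationValues]) -> bool:
    # Respond only if no other device reports a better value tuple
    # (lexicographic comparison; exact ties would let both devices respond).
    return all(own >= other for other in others)

mine = ArbitrationValues(0.92, 0.61)
received = [ArbitrationValues(0.88, 0.70), ArbitrationValues(0.45, 0.10)]
print(should_respond(mine, received))   # True: highest wake confidence, so this device responds
```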

124 citations


Proceedings ArticleDOI
12 Sep 2016
TL;DR: The results show that AudioGest can detect six hand gestures with an accuracy up to 96%, and by distinguishing the gesture attributions, it can provide up to 162 control commands for various applications.
Abstract: Hand gesture is becoming an increasingly popular means of interacting with consumer electronic devices, such as mobile phones, tablets and laptops. In this paper, we present AudioGest, a device-free gesture recognition system that can accurately sense the hand in-air movement around user's devices. Compared to the state-of-the-art, AudioGest is superior in using only one pair of built-in speaker and microphone, without any extra hardware or infrastructure support and with no training, to achieve fine-grained hand detection. Our system is able to accurately recognize various hand gestures, estimate the hand in-air time, as well as average moving speed and waving range. We achieve this by transforming the device into an active sonar system that transmits inaudible audio signal and decodes the echoes of the hand at its microphone. We address various challenges including cleaning the noisy reflected sound signal, interpreting the echo spectrogram into hand gestures, decoding the Doppler frequency shifts into the hand waving speed and range, as well as being robust to the environmental motion and signal drifting. We implement the proof-of-concept prototype in three different electronic devices and extensively evaluate the system in four real-world scenarios using 3,900 hand gestures collected by five users for more than two weeks. Our results show that AudioGest can detect six hand gestures with an accuracy up to 96%, and by distinguishing the gesture attributions, it can provide up to 162 control commands for various applications.
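A rough sketch of the sonar principle described above: track the Doppler shift of an inaudible carrier around its nominal frequency in the STFT and convert it to radial hand speed. The carrier frequency, window length, and search band below are illustrative assumptions, not AudioGest's actual parameters.

```python
import numpy as np
from scipy.signal import stft

def hand_speed_from_echo(recording, fs=48000, f0=19000.0, c=343.0):
    """Track the Doppler shift of an inaudible carrier reflected by the hand
    and convert it to radial speed (m/s). Parameter values are assumptions."""
    f, t, Z = stft(recording, fs=fs, nperseg=4096)
    band = (f > f0 - 500) & (f < f0 + 500)           # look only around the carrier
    peak_freqs = f[band][np.argmax(np.abs(Z[band, :]), axis=0)]
    doppler = peak_freqs - f0
    # Two-way (sonar) Doppler: delta_f = 2 * v * f0 / c  =>  v = c * delta_f / (2 * f0)
    return c * doppler / (2.0 * f0), t

# Toy usage with a synthetic carrier (no moving hand, so estimated speed ~ 0 m/s).
fs = 48000
sig = np.sin(2 * np.pi * 19000.0 * np.arange(fs) / fs)
speeds, times = hand_speed_from_echo(sig, fs)
print(np.round(speeds[:5], 3))
```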

120 citations


Journal ArticleDOI
Abstract: An Austrian start-up describes how its membrane-free optical microphone technology is being put to good use in ultrasonic non-destructive testing and process control.

89 citations


Journal ArticleDOI
TL;DR: This hydrogel device responds with a transient modulation of electric double layers, resulting in an extraordinary sensitivity; it can detect static loads and air breezes from different angles, as well as underwater acoustic signals from 20 Hz to 3 kHz at amplitudes as low as 4 Pa.
Abstract: Exploring the abundant resources in the ocean requires underwater acoustic detectors with a high-sensitivity reception of low-frequency sound from greater distances and zero reflections. Here we address both challenges by integrating an easily deformable network of metal nanoparticles in a hydrogel matrix for use as a cavity-free microphone. Since metal nanoparticles can be densely implanted as inclusions, and can even be arranged in coherent arrays, this microphone can detect static loads and air breezes from different angles, as well as underwater acoustic signals from 20 Hz to 3 kHz at amplitudes as low as 4 Pa. Unlike dielectric capacitors or cavity-based microphones that respond to stimuli by deforming the device in thickness directions, this hydrogel device responds with a transient modulation of electric double layers, resulting in an extraordinary sensitivity (217 nF kPa⁻¹ or 24 μC N⁻¹ at a bias of 1.0 V) without using any signal amplification tools. Conventional ceramic SONAR suffers from large acoustic impedance mismatch with water, rendering it easily detectable. Here, Gao et al. report a new design using a hydrogel filled with stimuli-responsive metal nanoparticles, which detects low-frequency sound with high sensitivity and zero reflection.

86 citations


Proceedings ArticleDOI
06 Nov 2016
TL;DR: TapSkin is presented, an interaction technique that recognizes up to 11 distinct tap gestures on the skin around the watch using only the inertial sensors and microphone on a commodity smartwatch without requiring further on-body instrumentation.
Abstract: The touchscreen has been the dominant input surface for smartphones and smartwatches. However, its small size compared to a phone limits the richness of the input gestures that can be supported. We present TapSkin, an interaction technique that recognizes up to 11 distinct tap gestures on the skin around the watch using only the inertial sensors and microphone on a commodity smartwatch. An evaluation with 12 participants shows our system can provide classification accuracies from 90.69% to 97.32% in three gesture families -- number pad, d-pad, and corner taps. We discuss the opportunities and remaining challenges for widespread use of this technique to increase input richness on a smartwatch without requiring further on-body instrumentation.
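A toy sketch of the tap-classification idea, assuming simple hand-rolled features from the inertial and audio windows and an off-the-shelf SVM; the feature set, window sizes, and synthetic data below are placeholders, not TapSkin's actual pipeline.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def window_features(imu, audio):
    """Toy features for one tap window: per-axis IMU spread and peak, plus audio
    energy and zero-crossing rate. The real feature set is not reproduced here."""
    zcr = np.mean(np.abs(np.diff(np.sign(audio)))) / 2.0
    return np.hstack([imu.std(axis=0), np.abs(imu).max(axis=0),
                      [np.mean(audio ** 2), zcr]])

rng = np.random.default_rng(0)
# Synthetic stand-in data: 60 tap windows, 6-axis IMU (200 samples) + audio (2000 samples).
X = np.array([window_features(rng.normal(size=(200, 6)), rng.normal(size=2000))
              for _ in range(60)])
y = rng.integers(0, 11, size=60)        # 11 distinct tap gesture classes
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y)
print(clf.predict(X[:3]))
```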

82 citations


Journal ArticleDOI
TL;DR: It is shown that respiratory rates from the nasal sound can be accurately estimated even if a smartphone's microphone is as far as 30 cm away from the nose, and tracheal breath sounds recorded by the built-in microphone of a smartphone placed on the paralaryngeal space were also examined for estimating respiratory rates.
Abstract: This paper proposes accurate respiratory rate estimation using nasal breath sound recordings from a smartphone. Specifically, the proposed method detects nasal airflow using a built-in smartphone microphone or a headset microphone placed underneath the nose. In addition, we also examined if tracheal breath sounds recorded by the built-in microphone of a smartphone placed on the paralaryngeal space can also be used to estimate different respiratory rates ranging from as low as 6 breaths/min to as high as 90 breaths/min. The true breathing rates were measured using inductance plethysmography bands placed around the chest and the abdomen of the subject. Inspiration and expiration were detected by averaging the power of nasal breath sounds. We investigated the suitability of using the smartphone-acquired breath sounds for respiratory rate estimation using two different spectral analyses of the sound envelope signals: The Welch periodogram and the autoregressive spectrum. To evaluate the performance of the proposed methods, data were collected from ten healthy subjects. For the breathing range studied (6–90 breaths/min), experimental results showed that our approach achieves an excellent performance accuracy for the nasal sound as the median errors were less than 1% for all breathing ranges. The tracheal sound, however, resulted in poor estimates of the respiratory rates using either spectral method. For both nasal and tracheal sounds, significant estimation outliers resulted for high breathing rates when subjects had nasal congestion, which often resulted in the doubling of the respiratory rates. Finally, we show that respiratory rates from the nasal sound can be accurately estimated even if a smartphone's microphone is as far as 30 cm away from the nose.
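A minimal sketch of the envelope-plus-Welch-periodogram approach: square the breath sound, low-pass it to obtain a breathing envelope, and take the dominant frequency of the envelope's Welch spectrum as the respiratory rate. The filter order, cutoff, and search band below are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
from scipy.signal import welch, butter, filtfilt

def respiratory_rate_bpm(breath_sound, fs):
    """Estimate breaths/min as the dominant frequency of the sound-power envelope."""
    power = breath_sound.astype(float) ** 2
    # Smooth the instantaneous power to obtain a breathing envelope (< 2 Hz).
    b, a = butter(2, 2.0, btype="low", fs=fs)
    envelope = filtfilt(b, a, power)
    f, pxx = welch(envelope, fs=fs, nperseg=min(len(envelope), 8 * fs))
    band = (f >= 0.1) & (f <= 1.5)               # 6-90 breaths/min
    return 60.0 * f[band][np.argmax(pxx[band])]

# Toy usage: envelope-modulated noise at 0.25 Hz, i.e. 15 breaths/min.
fs = 2000
t = np.arange(0, 60, 1 / fs)
sound = (0.5 + 0.5 * np.sin(2 * np.pi * 0.25 * t)) * np.random.randn(t.size)
print(round(respiratory_rate_bpm(sound, fs), 1))   # ~15.0
```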

80 citations


Journal ArticleDOI
TL;DR: It is shown that airborne signals can be measured consistently and that healthy left and right knees often produce a similar pattern in acoustic emissions; the use of air microphones is recommended for wearable joint sound sensing.
Abstract: Objective: We present the framework for wearable joint rehabilitation assessment following musculoskeletal injury. We propose a multimodal sensing (i.e., contact based and airborne measurement of joint acoustic emission) system for at-home monitoring. Methods: We used three types of microphones—electret, MEMS, and piezoelectric film microphones—to obtain joint sounds in healthy collegiate athletes during unloaded flexion/extension, and we evaluated the robustness of each microphone's measurements via: 1) signal quality and 2) within-day consistency. Results: First, air microphones acquired higher quality signals than contact microphones (signal-to-noise-and-interference ratio of 11.7 and 12.4 dB for electret and MEMS, respectively, versus 8.4 dB for piezoelectric). Furthermore, air microphones measured similar acoustic signatures on the skin and 5 cm off the skin (∼4.5× smaller amplitude). Second, the main acoustic event during repetitive motions occurred at consistent joint angles (intra-class correlation coefficient ICC(1, 1) = 0.94 and ICC(1, k) = 0.99). Additionally, we found that this angular location was similar between right and left legs, with asymmetry observed in only a few individuals. Conclusion: We recommend using air microphones for wearable joint sound sensing; for practical implementation of contact microphones in a wearable device, interface noise must be reduced. Importantly, we show that airborne signals can be measured consistently and that healthy left and right knees often produce a similar pattern in acoustic emissions. Significance: These proposed methods have the potential for enabling knee joint acoustics measurement outside the clinic/lab and permitting long-term monitoring of knee health for patients rehabilitating an acute knee joint injury.

Patent
04 Mar 2016
TL;DR: In this paper, a first electronic device with a display and a microphone receives audio input comprising a request via the microphone, sends data representing the request to a service, receives a token from the service that permits lookup of the request and/or a result responsive to the request, and sends the token to a second electronic device external to the first device.
Abstract: At a first electronic device with a display and a microphone, receiving audio input via the microphone, wherein the audio input comprises a request; sending data representing the request to a service; receiving a token from the service, wherein the token permits lookup, from the service, of at least one of: the request, and result responsive to the request; and sending the token to a second electronic device external to the first electronic device.

Patent
15 Aug 2016
TL;DR: In this paper, a computing device is configured to receive, via a network microphone device of a media playback system, a voice command detected by at least one microphone of the network microphone device, wherein the media playback system comprises a plurality of zones and the network microphone device may be a member of a default playback zone.
Abstract: A computing device is configured to perform functions comprising: receiving via a network microphone device of a media playback system, a voice command detected by at least one microphone of the network microphone device, wherein the media playback system comprises a plurality of zones, and the network microphone device may be a member of a default playback zone. The computing device may be further configured to perform functions comprising: dynamically selecting an audio response zone from the plurality of zones to play an audio response to the voice input and foregoing selection of the default playback zone. The selected zone may comprise a playback device, and the dynamically selecting may comprise determining that the network microphone device is paired with the playback device. The computing device may cause the playback device of the selected zone to play the audio response.

Proceedings ArticleDOI
01 Sep 2016
TL;DR: A Speaker Localization algorithm based on Neural Networks for multi-room domestic scenarios is proposed and outperforms the reference one, providing an average localization error, expressed in terms of RMSE, equal to 525 mm against 1465 mm.
Abstract: A Speaker Localization algorithm based on Neural Networks for multi-room domestic scenarios is proposed in this paper. The approach is fully data-driven and employs a Neural Network fed by GCC-PHAT (Generalized Cross Correlation Phase Transform) Patterns, calculated by means of the microphone signals, to determine the speaker position in the room under analysis. In particular, we deal with a multi-room case study, in which the acoustic scene of each room is influenced by sounds emitted in the other rooms. The algorithm is tested against the home recorded DIRHA dataset, characterized by multiple wall and ceiling microphone signals for each room. In particular, we focused on the speaker localization problem in two distinct neighbouring rooms. We assumed the presence of an Oracle multi-room Voice Activity Detector (VAD) in our experiments. A three-stage optimization procedure has been adopted to find the best network configuration and GCC-PHAT Patterns combination. Moreover, an algorithm based on Time Difference of Arrival (TDOA), recently proposed in literature for the addressed applicative context, has been considered as a term of comparison. As a result, the proposed algorithm outperforms the reference one, providing an average localization error, expressed in terms of RMSE, equal to 525 mm against 1465 mm. In conclusion, we also assessed the algorithm performance when a real VAD, recently proposed by some of the authors, is used. Even though a degradation of localization capability is registered (an average RMSE equal to 770 mm), a remarkable improvement over the state-of-the-art performance is still obtained.
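For reference, GCC-PHAT, the feature feeding the network above, can be computed as in the generic sketch below (not the paper's code): whiten the cross-power spectrum of two microphone signals and locate the peak of its inverse FFT to obtain the time difference of arrival.

```python
import numpy as np

def gcc_phat(sig, refsig, fs):
    """GCC-PHAT: whiten the cross-power spectrum of two microphone signals and
    take the inverse FFT; the peak location gives the time difference of arrival
    of `sig` relative to `refsig`, in seconds."""
    n = len(sig) + len(refsig)
    S = np.fft.rfft(sig, n)
    R = np.fft.rfft(refsig, n)
    cross = S * np.conj(R)
    cc = np.fft.irfft(cross / (np.abs(cross) + 1e-12), n)   # phase transform weighting
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs

# Toy usage: the second channel lags the first by 25 samples.
fs = 16000
x = np.random.randn(fs)
y = np.concatenate((np.zeros(25), x[:-25]))
print(round(gcc_phat(y, x, fs) * fs))    # ~25
```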

Journal ArticleDOI
TL;DR: This article provides an application-oriented, comprehensive survey of existing methods for microphone position self-calibration, which will be categorized by the measurements they use and the scenarios they can calibrate.
Abstract: Today, we are often surrounded by devices with one or more microphones, such as smartphones, laptops, and wireless microphones. If they are part of an acoustic sensor network, their distribution in the environment can be beneficially exploited for various speech processing tasks. However, applications like speaker localization, speaker tracking, and speech enhancement by beamforming avail themselves of the geometrical configuration of the sensors. Therefore, acoustic microphone geometry calibration has recently become a very active field of research. This article provides an application-oriented, comprehensive survey of existing methods for microphone position self-calibration, which will be categorized by the measurements they use and the scenarios they can calibrate. Selected methods will be evaluated comparatively with real-world recordings.

Proceedings ArticleDOI
20 Mar 2016
TL;DR: This work introduces a simple, easy-to-analyze estimator, proves that the sequence of room and trajectory estimates converges to the true values, and proposes a more general solution which does not require any assumptions on the motion and measurement model of the robot.
Abstract: We address the problem of jointly localizing a robot in an unknown room and estimating the room geometry from echoes. Unlike earlier work using echoes, we assume a completely autonomous setup with a (nearly) collocated microphone and acoustic source. We first introduce a simple, easy-to-analyze estimator, and prove that the sequence of room and trajectory estimates converges to the true values. Next, we approach the problem from a Bayesian point of view, and propose a more general solution which does not require any assumptions on the motion and measurement model of the robot. In addition to theoretical analysis, we validate both estimators numerically.

Patent
19 Dec 2016
TL;DR: In this paper, a system for voice dictation includes an earpiece with a speaker, a microphone, and a processor, together with a software application executing on a computing device that receives the first voice audio stream into a first position of a record and the second voice audio stream into a second position of the record.
Abstract: A system for voice dictation includes an earpiece; the earpiece may include an earpiece housing sized to fit into an external auditory canal of a user and block the external auditory canal, a first microphone operatively connected to the earpiece housing and positioned to be isolated from ambient sound when the earpiece housing is fitted into the external auditory canal, a second microphone operatively connected to the earpiece housing and positioned to detect sound external to the user, and a processor disposed within the earpiece housing and operatively connected to the first microphone and the second microphone. The system may further include a software application executing on a computing device which provides for receiving the first voice audio stream into a first position of a record and receiving the second voice audio stream into a second position of the record.

Patent
28 Nov 2016
TL;DR: In this paper, the authors present a system, method, and wireless earpiece that consists of a graphene speaker and a graphene microphone, with a sleeve portion of the frame configured to fit into an ear canal of a user.
Abstract: A system, method, and wireless earpiece. The wireless earpiece includes a frame supporting circuitry of the wireless earpiece. The frame includes a graphene speaker and a graphene microphone with a sleeve portion of the frame configured to fit into an ear canal of a user. The wireless earpiece may further include a processor for executing a set of instructions and a memory for storing the set of instructions, wherein the set of instructions is executed to process the first electronic signals for playback by the graphene speaker; and process the second electronic signals received from the graphene microphone.

Patent
12 Aug 2016
TL;DR: In this paper, a voice-controlled system determines that a first user's location correlates with the source location of an audio signal, i.e., the two are within a predetermined distance of each other, and in response performs at least one security action associated with the first user providing the audio signal.
Abstract: Systems and methods for associating audio signals in an environment surrounding a voice-controlled system include receiving by a voice-controlled system through a microphone, an audio signal from a user of a plurality of users within an environment surrounding the microphone. The voice-controlled system determines a source location of the audio signal. The voice-controlled system determines a first user location of a first user and a second user location of a second user. The voice-controlled system then determines that the first user location correlates with the source location such that the source location and the first user location are within a predetermined distance of each other. In response, the voice-controlled system performs at least one security action associated with the first user providing the audio signal.

Patent
25 Nov 2016
TL;DR: In this article, an earpiece is configured to connect with a vehicle using a wireless transceiver and, after connection with the vehicle, automatically enter a driving mode in which the earpiece senses ambient sound with the microphone and reproduces it at the speaker.
Abstract: An earpiece includes an earpiece housing, a speaker associated with the earpiece housing, a microphone associated with the earpiece housing, a wireless transceiver disposed within the earpiece housing, and a processor disposed within the earpiece housing. The earpiece is configured to connect with a vehicle using the wireless transceiver and, after connection with the vehicle, automatically enter a driving mode. In the driving mode, the earpiece senses ambient sound with the microphone and reproduces the ambient sound at the speaker.

Patent
19 Dec 2016
Abstract: An earpiece for use in voice dictation includes a speaker disposed within the earpiece housing, a microphone, and a processor disposed within the earpiece housing and operatively connected to the microphone and the speaker, wherein the processor is adapted to capture a voice stream from the microphone. The earpiece may further include a wireless transceiver disposed within the earpiece housing, the wireless transceiver operatively connected to the processor. The earpiece is configured to be controlled by a user through a plurality of different user interfaces to perform voice dictation.

Journal ArticleDOI
TL;DR: In this article, pressure fluctuations on a flat plate behind a cylinder in a low-speed wind tunnel (flow speed from 10 to 17 m/s) were recorded continuously by fast pressure-sensitive paint (PSP) and a microphone array.
Abstract: Fast pressure-sensitive paint (PSP) is very useful in flow diagnostics due to its fast response and high spatial resolution, but its applications in low-speed flows are usually challenging due to limitations of paint’s pressure sensitivity and the capability of high-speed imagers. The poor signal-to-noise ratio in low-speed cases makes it very difficult to extract useful information from the PSP data. In this study, unsteady PSP measurements were made on a flat plate behind a cylinder in a low-speed wind tunnel (flow speed from 10 to 17 m/s). Pressure fluctuations (ΔP) on the plate caused by vortex–plate interaction were recorded continuously by fast PSP (using a high-speed camera) and a microphone array. Power spectrum of pressure fluctuations and phase-averaged ΔP obtained from PSP and microphone were compared, showing good agreement in general. Proper orthogonal decomposition (POD) was used to reduce noise in PSP data and extract the dominant pressure features. The PSP results reconstructed from selected POD modes were then compared to the pressure data obtained simultaneously with microphone sensors. Based on the comparison of both instantaneous ΔP and root-mean-square of ΔP, it was confirmed that POD analysis could effectively remove noise while preserving the instantaneous pressure information with good fidelity, especially for flows with strong periodicity. This technique extends the application range of fast PSP and can be a powerful tool for fundamental fluid mechanics research at low speed.
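As a generic sketch of POD-based denoising (computed here via the SVD, one standard way to obtain POD modes; not the authors' code), keep only the leading modes of the mean-removed snapshot matrix and reconstruct:

```python
import numpy as np

def pod_denoise(snapshots, n_modes):
    """Proper orthogonal decomposition by SVD: keep the leading modes and
    reconstruct. `snapshots` has shape (n_pixels, n_times) and should contain
    mean-removed pressure fluctuations."""
    U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :n_modes] @ np.diag(s[:n_modes]) @ Vt[:n_modes, :]

# Toy usage: a periodic spatial pattern buried in noise, recovered from 2 modes.
rng = np.random.default_rng(1)
x = np.linspace(0, 2 * np.pi, 200)[:, None]       # "pixels"
t = np.linspace(0, 10, 400)[None, :]              # "frames"
clean = np.sin(x) * np.cos(2 * np.pi * t)
noisy = clean + 0.5 * rng.standard_normal((200, 400))
denoised = pod_denoise(noisy - noisy.mean(axis=1, keepdims=True), n_modes=2)
centered = clean - clean.mean(axis=1, keepdims=True)
# The POD reconstruction error is far below the raw noise level.
print(np.round(np.std(noisy - clean), 2), np.round(np.std(denoised - centered), 2))
```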

Patent
09 Feb 2016
TL;DR: In this article, a beamforming system performs audio beamforming to separate audio input into multiple directions (e.g., target signals) and generates multiple audio outputs using two acoustic echo cancellation (AEC) circuits.
Abstract: An echo cancellation system performs audio beamforming to separate audio input into multiple directions (e.g., target signals) and generates multiple audio outputs using two acoustic echo cancellation (AEC) circuits. A first AEC removes a playback reference signal (generated from a signal sent to a loudspeaker) to isolate speech included in the target signals. A second AEC removes an adaptive reference signal (generated from microphone inputs corresponding to audio received from the loudspeaker) to isolate speech included in the target signals. A beam selector receives the multiple audio outputs and selects the first AEC or the second AEC based on a linearity of the system. When linear (e.g., no distortion or variable delay between microphone input and playback signal), the beam selector selects an output from the first AEC based on signal-to-noise ratios (SNRs). When nonlinear, the beam selector selects an output from the second AEC.
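A hypothetical sketch of the beam-selection step only (the AECs themselves are omitted): when the playback path is judged linear, pick among the first AEC's beam outputs by SNR; otherwise fall back to the second AEC. The SNR estimate and noise-floor handling are illustrative assumptions.

```python
import numpy as np

def select_output(aec1_beams, aec2_beams, noise_floor, linear):
    """Choose a beam output from one of two AEC stages.
    aec1_beams / aec2_beams: arrays of shape (n_beams, n_samples)."""
    beams = aec1_beams if linear else aec2_beams
    snr = 10 * np.log10(np.mean(beams ** 2, axis=1) / noise_floor)
    best = int(np.argmax(snr))
    return beams[best], best

# Toy usage with three beams; beam 1 carries the strongest residual speech.
rng = np.random.default_rng(2)
aec1 = rng.standard_normal((3, 1600)) * np.array([[0.1], [1.0], [0.2]])
aec2 = rng.standard_normal((3, 1600)) * 0.3
out, idx = select_output(aec1, aec2, noise_floor=0.01, linear=True)
print(idx)   # 1
```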

Patent
09 Feb 2016
TL;DR: An echo cancellation system that combines a playback reference signal and an adaptive reference signal into a single reference is proposed; the adaptive reference signal is generated using beamforming on microphone inputs corresponding to audio received from the loudspeaker.
Abstract: An echo cancellation system that uses a combined reference signal formed from a playback reference signal and an adaptive reference signal. The playback reference signal is generated from a playback signal sent to a loudspeaker, and the adaptive reference signal is generated using beamforming on microphone inputs corresponding to audio received from the loudspeaker. The system applies a low-pass filter to the playback reference signal and a high-pass filter to the adaptive reference signal to generate the combined reference signal. The system may remove the combined reference signal from target signals associated with the microphone inputs to isolate speech included in the target signals.
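A minimal sketch of forming the combined reference: low-pass the playback reference, high-pass the adaptive (beamformed) reference, and sum. The crossover frequency and filter order are assumptions; the patent does not specify the values used here.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def combined_reference(playback_ref, adaptive_ref, fs, crossover_hz=1000.0):
    """Combine the low band of the playback reference with the high band of the
    beamformed (adaptive) reference into a single echo-cancellation reference."""
    sos_lp = butter(4, crossover_hz, btype="low", fs=fs, output="sos")
    sos_hp = butter(4, crossover_hz, btype="high", fs=fs, output="sos")
    return sosfilt(sos_lp, playback_ref) + sosfilt(sos_hp, adaptive_ref)

# Toy usage: low-frequency content kept from playback, high-frequency from the beamformer.
fs = 16000
t = np.arange(fs) / fs
playback = np.sin(2 * np.pi * 300 * t)
adaptive = np.sin(2 * np.pi * 3000 * t)
ref = combined_reference(playback, adaptive, fs)
print(ref.shape)
```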

Journal ArticleDOI
TL;DR: Two sensors were assembled at a canted angle similar to that employed in radar bearing locators, allowing direction finding with no requirement of knowing the incident sound pressure level; the results indicate the great potential of dual MEMS direction-finding sensor assemblies to locate sound sources with high accuracy.
Abstract: A narrowband MEMS direction finding sensor has been developed based on the mechanically coupled ears of the Ormia ochracea fly. The sensor consists of two wings coupled at the middle and attached to a substrate using two legs. The sensor operates at its bending resonance frequency and has cosine directional characteristics similar to that of a pressure gradient microphone. Thus, the directional response of the sensor is symmetric about the normal axis making the determination of the direction ambiguous. To overcome this shortcoming, two sensors were assembled at a canted angle similar to that employed in radar bearing locators. The outputs of two sensors were processed together allowing direction finding with no requirement of knowing the incident sound pressure level. At the bending resonant frequency of the sensors (1.69 kHz) an output voltage of about 25 V/Pa was measured. The angle uncertainty of the bearing of sound ranged from less than 0.3° close to the normal axis (0°) to 3.4° at the limits of coverage (±60°) based on the 30° canted angle used. These findings indicate the great potential to use dual MEMS direction finding sensor assemblies to locate sound sources with high accuracy.
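The amplitude-independent bearing follows from the cosine directivity of the two canted sensors: with v1 = A·cos(θ − φ) and v2 = A·cos(θ + φ), where φ is half the canted angle, combining the two outputs cancels the unknown amplitude A. The sketch below is a simplification of the processing described, using idealized noiseless sensor outputs.

```python
import numpy as np

def bearing_from_canted_pair(v1, v2, cant_deg=30.0):
    """Estimate the sound bearing (degrees) from two cosine-directivity sensors
    canted by `cant_deg` in total (each tilted by half that angle).
    From v1 = A*cos(theta - phi) and v2 = A*cos(theta + phi):
    v1 - v2 = 2A*sin(theta)*sin(phi), v1 + v2 = 2A*cos(theta)*cos(phi),
    so tan(theta) = (v1 - v2) / ((v1 + v2) * tan(phi)), independent of A."""
    phi = np.radians(cant_deg / 2.0)
    return np.degrees(np.arctan2(v1 - v2, (v1 + v2) * np.tan(phi)))

# Toy check: a source at 20 degrees with arbitrary amplitude is recovered.
theta, amp, phi = np.radians(20.0), 0.7, np.radians(15.0)
v1, v2 = amp * np.cos(theta - phi), amp * np.cos(theta + phi)
print(round(bearing_from_canted_pair(v1, v2), 1))   # 20.0
```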

Proceedings ArticleDOI
23 Oct 2016
TL;DR: A mobile phone app that alerts deaf and hard-of-hearing people to sounds they care about is designed, and the viability of a basic machine learning algorithm for sound detection is explored.
Abstract: Sounds provide informative signals about the world around us. In situations where non-auditory cues are inaccessible, it can be useful for deaf and hard-of-hearing people to be notified about sounds. Through a survey, we explored which sounds are of interest to deaf and hard-of-hearing people, and which means of notification are appropriate. Motivated by these findings, we designed a mobile phone app that alerts deaf and hard-of-hearing people to sounds they care about. The app uses training examples of personally relevant sounds recorded by the user to learn a model of those sounds. It then screens the incoming audio stream from the phone's microphone for those sounds. When it detects a sound, it alerts the user by vibrating and providing a pop-up notification. To evaluate the interface design independent of sound detection errors, we ran a Wizard-of-Oz user study, and found that the app design successfully facilitated deaf and hard-of-hearing users recording training examples. We also explored the viability of a basic machine learning algorithm for sound detection.

Journal ArticleDOI
TL;DR: This paper approximates the sampling rate offset with a linear-phase drift model in the short-time Fourier transform (STFT) domain and shows that the correlation coefficient between two microphone signals tends to present the highest value when the sampling of the two microphone signals is synchronized.
Abstract: In this paper, we investigate the sampling rate mismatch problem in distributed microphone arrays and propose a correlation maximization algorithm to blindly estimate the sampling rate offset between two asynchronously sampled microphone signals. We approximate the sampling rate offset with a linear-phase drift model in the short-time Fourier transform (STFT) domain and show that the correlation coefficient between two microphone signals tends to present the highest value when the sampling of the two microphone signals is synchronized. Based on this finding we propose the correlation maximization algorithm, which performs sampling rate compensation on two microphone signals with different possible offset values and calculates their correlation coefficient after compensation. The offset value that leads to the largest correlation coefficient is chosen as the optimal estimate. Since the precision of the STFT linear-phase drift model used in the algorithm degrades as the sampling rate offset or the signal length is increased, we further propose a two-stage exhaustive search scheme to detect the optimal sampling rate offset. This scheme is able to minimize the influence of the linear-phase drift model error in order to improve the sampling rate offset estimation accuracy. Both simulated as well as real-world experiments confirm the effectiveness of the proposed algorithm.
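A simplified sketch of the correlation-maximization idea, done here by direct time-domain resampling over a grid of candidate offsets rather than via the STFT linear-phase-drift model used in the paper; the candidate grid and signal lengths are arbitrary.

```python
import numpy as np

def estimate_offset_ppm(ref, other, candidates_ppm):
    """Resample `other` for each candidate sampling-rate offset (in ppm) and keep
    the offset that maximizes the correlation coefficient with `ref`."""
    n = np.arange(len(other), dtype=float)
    best, best_rho = 0.0, -np.inf
    for ppm in candidates_ppm:
        resampled = np.interp(n * (1.0 + ppm * 1e-6), n, other)
        rho = np.corrcoef(ref, resampled)[0, 1]
        if rho > best_rho:
            best, best_rho = ppm, rho
    return best

# Toy usage: the second "microphone" has a simulated 50 ppm clock skew.
rng = np.random.default_rng(3)
x = rng.standard_normal(200_000)
n = np.arange(len(x), dtype=float)
y = np.interp(n * (1.0 - 50e-6), n, x)
print(estimate_offset_ppm(x, y, candidates_ppm=np.arange(-100, 101, 10)))   # recovers ~50 ppm
```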

Patent
08 Aug 2016
TL;DR: In this article, the authors describe an approach for calibrating a speaker in a device having a microphone by inputting an original audio signal to a processing component and an output stage of the device for playback through the speakers, receiving playback sound output from the speakers through the microphone to generate a microphone signal, and inputting the microphone signal into the processing component to calibrate the speakers for optimal playback of the original audio signal.
Abstract: Embodiments are described for calibrating a speaker in a device having a microphone by inputting an original audio signal to a processing component and an output stage of the device for playback through the speakers, receiving playback sound output from the speakers through the microphone to generate a microphone signal, and inputting the microphone signal into the processing component to calibrate the speakers for optimal playback of the original audio signal, wherein the processing component is configured to compare the original audio signal to the microphone signal and correct the microphone signal by one or more audio processing functions in accordance with a refresh schedule.

Proceedings ArticleDOI
20 Jun 2016
TL;DR: It is shown that the vibrating mass inside the motor -- designed to oscillate to changing magnetic fields -- also responds to air vibrations from nearby sounds, to the extent that humans can understand the vibra-motor recorded words with greater than 80% average accuracy.
Abstract: This paper demonstrates the feasibility of using the vibration motor in mobile devices as a sound sensor, almost like a microphone. We show that the vibrating mass inside the motor -- designed to oscillate to changing magnetic fields -- also responds to air vibrations from nearby sounds. With appropriate processing, the responses become intelligible, to the extent that humans can understand the vibra-motor recorded words with greater than 80% average accuracy. Even off-the-shelf speech recognition software is able to decode at 60% accuracy, without any training or machine learning. We present our overall techniques and results through a system called VibraPhone, and discuss implications to both sensing and security.

Patent
04 May 2016
TL;DR: In this paper, a full-screen mobile phone is described that consists of a mobile phone body comprising a shell, a touch screen, and a controller; a chute is arranged behind the upper part of the touch screen, and a lifting panel slides up and down within the chute.
Abstract: The invention relates to the technical field of smartphones, and particularly to a full-screen mobile phone. The full-screen mobile phone comprises a mobile phone body; the mobile phone body comprises a shell, a touch screen and a controller, and the shell and the touch screen together form a totally enclosed structure. A chute is arranged in the shell behind the upper part of the touch screen, and a lifting panel is arranged in the chute so that it can slide up and down. A distance sensor, a telephone receiver and a camera device are arranged on the lifting panel from left to right in sequence; the camera device comprises a camera and a flashlight, and a light sensor is arranged at the top of the lifting panel above the distance sensor. A microphone, a loudspeaker and an interface are arranged at the bottom of the shell from left to right in sequence, and side keys are arranged at the upper part of one side of the shell. The distance sensor, the telephone receiver, the camera device, the light sensor, the microphone, the loudspeaker, the interface and the side keys are electrically connected with the controller. The full-screen mobile phone is reasonable in structure, convenient, practical and flexible to operate, and the screen occupation ratio is higher than 95%.