Topic

Throat microphone

About: Throat microphone is a research topic. Over its lifetime, 131 publications have appeared within this topic, receiving 1,190 citations.


Papers
Journal Article
TL;DR: In continuous-speech recognition experiments using SRI International's DECIPHER recognition system, both using artificially added noise and using recorded noisy speech, the combined-microphone approach significantly outperforms the single-microphone approach.
Abstract: We present a method to combine the standard and throat microphone signals for robust speech recognition in noisy environments. Our approach is to use the probabilistic optimum filter (POF) mapping algorithm to estimate the standard microphone clean-speech feature vectors, used by standard speech recognizers, from both microphones' noisy-speech feature vectors. A small untranscribed "stereo" database (noisy and clean simultaneous recordings) is required to train the POF mappings. In continuous-speech recognition experiments using SRI International's DECIPHER recognition system, both using artificially added noise and using recorded noisy speech, the combined-microphone approach significantly outperforms the single-microphone approach.

111 citations

Proceedings Article
30 Nov 2003
TL;DR: A novel hardware device that combines a regular microphone with a bone-conductive microphone that is able to detect very robustly whether the speaker is talking and remove background speech significantly, even when the background speaker speaks at the same time as the speaker wearing the headset.
Abstract: We present a novel hardware device that combines a regular microphone with a bone-conductive microphone. The device looks like a regular headset and it can be plugged into any machine with a USB port. The bone-conductive microphone has an interesting property: it is insensitive to ambient noise and captures the low frequency portion of the speech signals. Thanks to the signals from the bone-conductive microphone, we are able to detect very robustly whether the speaker is talking, eliminating more than 90% of background speech. Furthermore, by combining both channels, we are able to remove background speech significantly, even when the background speaker speaks at the same time as the speaker wearing the headset.
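
The abstract does not say how speech activity is detected from the bone-conductive channel. As a rough illustration only, the sketch below assumes a plain frame-energy threshold, which is plausible when the channel picks up little ambient noise. The function names, the 160-sample frame length, and the threshold value are all hypothetical choices, not taken from the paper.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def detect_speech(bone_samples, frame_len=160, threshold=1e-3):
    """Flag a frame as speech when its energy exceeds the threshold.

    Assumption: the bone-conductive channel is nearly silent unless
    the wearer is talking, so a fixed threshold is enough for a sketch.
    """
    return [e > threshold for e in frame_energies(bone_samples, frame_len)]

# Synthetic input: two silent frames, then two frames of a 200 Hz tone
# sampled at 16 kHz (both values are illustrative).
silence = [0.0] * 320
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 16000) for n in range(320)]
flags = detect_speech(silence + tone)
print(flags)
```

A real detector would likely need smoothing across frames and a threshold calibrated per device; this sketch only shows why a channel that ignores ambient sound makes the decision simple.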

107 citations

Proceedings Article
04 Oct 2004
TL;DR: Various adaptation methods applied to recognizing soft whisper recorded with a throat microphone include: maximum likelihood linear regression, feature-space adaptation, and re-training with downsampling, sigmoidal low-pass filter, or linear multivariate regression.
Abstract: This paper describes various adaptation methods applied to recognizing soft whisper recorded with a throat microphone. Since the amount of adaptation data is small and the testing data is very different from the training data, a series of adaptation methods is necessary. The adaptation methods include: maximum likelihood linear regression, feature-space adaptation, and re-training with downsampling, sigmoidal low-pass filter, or linear multivariate regression. With these adaptation methods, the word error rate improves from 99.3% to 32.9%.
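
To put the reported word error rates in perspective, the relative reduction can be computed directly from the two figures in the abstract. The helper name below is hypothetical; only the two percentages come from the source.

```python
def relative_reduction(before, after):
    """Fraction of the original error rate that was eliminated."""
    return (before - after) / before

# Word error rates reported in the abstract, in percent.
wer_reduction = relative_reduction(99.3, 32.9)
print(f"{wer_reduction:.1%}")  # prints "66.9%"
```

In other words, the adaptation methods remove roughly two thirds of the errors made by the unadapted system.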

59 citations

Proceedings Article
18 Mar 2005
TL;DR: With these adaptation methods, articulatory feature detection accuracy improves from 87.82% to 90.52%, with the corresponding F-measure rising from 0.504 to 0.617, while the final word error rate improves from 33.8% to 31.2%.
Abstract: This paper describes our research on adaptation methods applied to articulatory feature detection on soft whispery speech recorded with a throat microphone. Since the amount of adaptation data is small and the testing data is very different from the training data, a series of adaptation methods is necessary. The adaptation methods include: maximum likelihood linear regression, feature-space adaptation, and re-training with downsampling, sigmoidal low-pass filter, and linear multivariate regression. Adapted articulatory feature detectors are used in parallel to standard senone-based HMM models in a stream architecture for decoding. With these adaptation methods, articulatory feature detection accuracy improves from 87.82% to 90.52% with corresponding F-measure from 0.504 to 0.617, while the final word error rate improves from 33.8% to 31.2%.

52 citations

Journal Article
Engin Erzin
TL;DR: The proposed throat-driven multimodal speech recognition system improves phoneme recognition rate to 52.58%, a significant relative improvement with respect to the throat-only speech recognition benchmark system.
Abstract: We present a new framework for joint analysis of throat and acoustic microphone (TAM) recordings to improve throat microphone only speech recognition. The proposed analysis framework aims to learn joint sub-phone patterns of throat and acoustic microphone recordings through a parallel branch HMM structure. The joint sub-phone patterns define temporally correlated neighborhoods, in which a linear prediction filter estimates a spectrally rich acoustic feature vector from throat feature vectors. Multimodal speech recognition with throat and throat-driven acoustic features significantly improves throat-only speech recognition performance. Experimental evaluations on a parallel TAM database yield benchmark phoneme recognition rates for throat-only and multimodal TAM speech recognition systems as 46.81% and 60.69%, respectively. The proposed throat-driven multimodal speech recognition system improves phoneme recognition rate to 52.58%, a significant relative improvement with respect to the throat-only speech recognition benchmark system.
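
The idea of estimating an acoustic feature vector from a temporal neighborhood of throat feature vectors can be sketched as a windowed linear map. This is a simplified illustration, not the paper's method: it applies one fixed filter everywhere, whereas the abstract describes filters tied to learned joint sub-phone patterns. The function name, the weight values, and the edge-clamping behavior are all assumptions.

```python
def predict_acoustic(throat_frames, weights, bias):
    """Estimate one acoustic vector per frame from neighboring throat frames.

    throat_frames: list of T input vectors (each of dimension D_in)
    weights: dict mapping a frame offset to a D_out x D_in matrix
             (hypothetical values; in practice these would be trained)
    bias: vector of dimension D_out
    """
    n_frames = len(throat_frames)
    estimates = []
    for t in range(n_frames):
        est = list(bias)
        for offset, matrix in weights.items():
            # Clamp the neighbor index at the sequence edges.
            s = min(max(t + offset, 0), n_frames - 1)
            x = throat_frames[s]
            for i, row in enumerate(matrix):
                est[i] += sum(w * v for w, v in zip(row, x))
        estimates.append(est)
    return estimates

# Toy example: 2-dimensional throat features, 1-dimensional output,
# with a window covering the previous, current, and next frame.
throat = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = {-1: [[0.5, 0.0]], 0: [[1.0, 1.0]], 1: [[0.0, 0.5]]}
bias = [0.1]
estimates = predict_acoustic(throat, weights, bias)
```

In a pattern-dependent version, the `weights` and `bias` would be looked up per frame according to the sub-phone pattern assigned to that frame.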

44 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 65% related
Feature vector: 48.8K papers, 954.4K citations, 65% related
Signal processing: 73.4K papers, 983.5K citations, 65% related
Noise: 110.4K papers, 1.3M citations, 64% related
Feature (machine learning): 33.9K papers, 798.7K citations, 63% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2021    4
2020    4
2019    6
2018    10
2017    7
2016    11