Patent
Acoustic signature building for a speaker from multiple sessions
TL;DR: Methods of diarizing audio data using first-pass blind diarization and second-pass blind diarization that generate speaker statistical models.
Abstract:
Disclosed herein are methods of diarizing audio data using first-pass blind diarization and second-pass blind diarization that generate speaker statistical models, wherein the first-pass blind diarization is on a per-frame basis and the second-pass blind diarization is on a per-word basis, and methods of creating acoustic signatures for a common speaker based only on the statistical models of the speakers in each audio session.
Citations
Proceedings ArticleDOI
Fully Supervised Speaker Diarization
TL;DR: A fully supervised speaker diarization approach, named unbounded interleaved-state recurrent neural networks (UIS-RNN), that takes extracted speaker-discriminative embeddings as input and decodes in an online fashion, whereas most state-of-the-art systems rely on offline clustering.
Patent
Language model speech endpointing
TL;DR: An automatic speech recognition (ASR) system detects an endpoint of an utterance using the active hypotheses under consideration by a decoder; the ASR system calculates the amount of non-speech detected by a plurality of hypotheses and weights the non-speech duration by the probability of each hypothesis.
Patent
Dimensionality reduction of baum-welch statistics for speaker recognition
Elie Khoury, Matthew Garland +1 more
TL;DR: A deep neural network reduces the dimensionality of the normalized first-order Gaussian mixture model statistics and outputs a voiceprint corresponding to the recognition speech signal, which is used for speaker recognition.
Patent
Direction-based speech endpointing
TL;DR: A system for determining an endpoint of an utterance during automatic speech recognition (ASR) processing that accounts for the direction and duration of the incoming speech.
Patent
Channel-Compensated Low-Level Features For Speaker Recognition
Elie Khoury, Matthew Garland +1 more
TL;DR: A system for generating channel-compensated features of a speech signal includes a channel noise simulator that degrades the speech signal, a feed-forward convolutional neural network (CNN), and a loss function that computes a difference between the channel-compensated features and handcrafted features for the same raw speech signal.
References
Journal ArticleDOI
Error bounds for convolutional codes and an asymptotically optimum decoding algorithm
TL;DR: The upper bound is obtained for a specific probabilistic nonsequential decoding algorithm which is shown to be asymptotically optimum for rates above R_{0} and whose performance bears certain similarities to that of sequential decoding algorithms.
Journal ArticleDOI
A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains
Journal ArticleDOI
Perceptual linear predictive (PLP) analysis of speech
TL;DR: A new technique for the analysis of speech, the perceptual linear predictive (PLP) technique, which uses three concepts from the psychophysics of hearing to derive an estimate of the auditory spectrum, and yields a low-dimensional representation of speech.
Book
Statistical Digital Signal Processing and Modeling
TL;DR: The main thrust is to provide students with a solid understanding of a number of important and related advanced topics in digital signal processing such as Wiener filters, power spectrum estimation, signal modeling and adaptive filtering.
Patent
Secure data interchange
TL;DR: A secure data interchange system enables information about bilateral and multilateral interactions between multiple persistent parties to be exchanged and leveraged within an environment that uses a combination of techniques to control access to information, release of information, and matching of information back to parties.