Patent

Acoustic signature building for a speaker from multiple sessions

TLDR
This patent discloses methods of diarizing audio data using first-pass blind diarization and second-pass blind diarization that generate speaker statistical models.
Abstract
Disclosed herein are methods of diarizing audio data using first-pass blind diarization and second-pass blind diarization that generate speaker statistical models, wherein the first-pass blind diarization is on a per-frame basis and the second-pass blind diarization is on a per-word basis, and methods of creating acoustic signatures for a common speaker based only on the statistical models of the speakers in each audio session.

Citations
Proceedings Article

Fully Supervised Speaker Diarization

TL;DR: A fully supervised speaker diarization approach, named unbounded interleaved-state recurrent neural network (UIS-RNN), that operates on extracted speaker-discriminative embeddings and decodes in an online fashion, whereas most state-of-the-art systems rely on offline clustering.
Patent

Language model speech endpointing

TL;DR: An automatic speech recognition (ASR) system detects an endpoint of an utterance using the active hypotheses under consideration by a decoder. The ASR system calculates the amount of non-speech detected by a plurality of hypotheses and weights the non-speech duration by the probability of each hypothesis.
Patent

Dimensionality reduction of baum-welch statistics for speaker recognition

TL;DR: A deep neural network reduces the dimensionality of the normalized first-order Gaussian mixture model statistics and outputs a voiceprint corresponding to the recognition speech signal, which is used for speaker recognition.
Patent

Direction-based speech endpointing

TL;DR: A system for determining an endpoint of an utterance during automatic speech recognition (ASR) processing that accounts for the direction and duration of the incoming speech.
Patent

Channel-Compensated Low-Level Features For Speaker Recognition

TL;DR: A system for generating channel-compensated features of a speech signal includes a channel noise simulator that degrades the speech signal, a feed-forward convolutional neural network (CNN), and a loss function that computes a difference between the channel-compensated features and handcrafted features for the same raw speech signal.
References
Journal Article

Error bounds for convolutional codes and an asymptotically optimum decoding algorithm

TL;DR: The upper bound is obtained for a specific probabilistic nonsequential decoding algorithm which is shown to be asymptotically optimum for rates above R_{0} and whose performance bears certain similarities to that of sequential decoding algorithms.
Journal Article

Perceptual linear predictive (PLP) analysis of speech

TL;DR: A new technique for the analysis of speech, the perceptual linear predictive (PLP) technique, uses three concepts from the psychophysics of hearing to derive an estimate of the auditory spectrum and yields a low-dimensional representation of speech.
Book

Statistical Digital Signal Processing and Modeling

TL;DR: The main thrust is to provide students with a solid understanding of a number of important and related advanced topics in digital signal processing such as Wiener filters, power spectrum estimation, signal modeling and adaptive filtering.
Patent

Secure data interchange

TL;DR: A secure data interchange system enables information about bilateral and multilateral interactions between multiple persistent parties to be exchanged and leveraged within an environment that uses a combination of techniques to control access to information, release of information, and matching of information back to parties.