Patent

Method and apparatus for large population speaker identification in telephone interactions

TLDR
This patent describes a method and apparatus for determining whether a speaker uttering an utterance belongs to a predetermined set of known speakers, where a training utterance is available for each known speaker.
Abstract
A method and apparatus for determining whether a speaker uttering an utterance belongs to a predetermined set comprising known speakers, wherein a training utterance is available for each known speaker. The method and apparatus test whether features extracted from the tested utterance provide a score exceeding a threshold when matched against one or more models constructed from voice samples of each known speaker. The method and apparatus further provide optional enhancements such as determining, using, and updating model normalization parameters, a fast scoring algorithm, summed calls handling, or quality evaluation for the tested utterance.
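
The threshold test described in the abstract can be illustrated with a minimal sketch. The abstract does not say what form the speaker models take, so this example assumes a single diagonal Gaussian per speaker as a stand-in. The names train_model, score, and identify, and the threshold value, are hypothetical and do not come from the patent.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Frame = Sequence[float]
# Assumption: one diagonal Gaussian per speaker (per-dimension mean and variance).
Model = Tuple[List[float], List[float]]


def train_model(frames: Sequence[Frame], var_floor: float = 1e-3) -> Model:
    """Fit a diagonal Gaussian to one known speaker's training features."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance so a tiny training set cannot produce a zero variance.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, var_floor)
           for d in range(dim)]
    return mean, var


def score(frames: Sequence[Frame], model: Model) -> float:
    """Average per-frame log-likelihood of the tested features under a model."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def identify(frames: Sequence[Frame], models: Dict[str, Model],
             threshold: float) -> Optional[str]:
    """Return the best-scoring known speaker, or None if no score exceeds the threshold."""
    best_id, best_score = None, -math.inf
    for speaker_id, model in models.items():
        s = score(frames, model)
        if s > best_score:
            best_id, best_score = speaker_id, s
    return best_id if best_score > threshold else None


# Toy two-dimensional features; real systems would use spectral features.
models = {
    "alice": train_model([[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]]),
    "bob": train_model([[5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]),
}
print(identify([[0.1, 0.0], [0.0, 0.05]], models, threshold=-5.0))
print(identify([[20.0, -20.0]], models, threshold=-5.0))
```

Returning None when no model scores above the threshold is what makes the decision open-set: an utterance from a speaker outside the known set is rejected rather than assigned to the nearest model. The optional enhancements the abstract lists (normalization parameters, fast scoring, summed calls handling, quality evaluation) are not modeled here.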


Citations
Patent

Fraud detection in interactive voice response systems

TL;DR: Call detail record (CDR) analysis is used to determine a risk score for a call and to identify fraudulent activity in Interactive Voice Response (IVR) systems.
Patent

User registration for intelligent assistant computer

TL;DR: In this article, a spoken command to register an initially unregistered person is received via one or more microphones, the spoken command having originated from a registered person with a pre-established registration privilege.
Patent

Voice Activity Detection Using A Soft Decision Mechanism

Ron Wein
TL;DR: In this article, a robust VAD algorithm that is also language independent is presented, where instead of classifying short segments of the audio as either speech or silence, the VAD as disclosed herein employs a soft-decision mechanism.
Patent

Vehicular threat detection based on image analysis

TL;DR: In this article, an ability enhancement facilitator system (AEFS) is configured to enhance a user's ability to operate or function in a transportation-related context as a pedestrian or a vehicle operator.
Patent

Intelligent digital assistant system

TL;DR: In this paper, an intelligent digital assistant system is provided to handle conversations with multiple users, which includes at least one microphone configured to receive an audio input, a speaker configured to emit an audio output, and a processor.
References
Journal ArticleDOI

Speaker Verification Using Adapted Gaussian Mixture Models

TL;DR: The major elements of MIT Lincoln Laboratory's Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs) are described.
Journal ArticleDOI

Robust text-independent speaker identification using Gaussian mixture speaker models

TL;DR: The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are effective for modeling speaker identity, and the GMM is shown to outperform the other speaker modeling techniques on an identical 16-speaker telephone speech task.
Journal ArticleDOI

Comparison of four approaches to automatic language identification of telephone speech

TL;DR: Four approaches for automatic language identification of speech utterances are compared: Gaussian mixture model (GMM) classification; single-language phone recognition followed by language-dependent, interpolated n-gram language modeling (PRLM); parallel PRLM, which uses multiple single-language phone recognizers, each trained in a different language; and language-dependent parallel phone recognition (PPR).
Patent

Method, apparatus and system for capturing and analyzing interaction based content

TL;DR: In this article, a method and apparatus for capturing and analyzing customer interactions is described, the apparatus comprising a multi-segment interaction capture device (324), an initial set up and calibration device (326), a pre-processing and context extraction device (328) and a rule-based analysis engine (300).