Proceedings Article

Voice Recognition and Voice Comparison using Machine Learning Techniques: A Survey

TLDR
This paper surveys both traditional and deep learning-based methods of speaker recognition and voice comparison, providing substantial input to beginners and researchers seeking to understand the domain of voice recognition and voice comparison.
Abstract
Voice comparison is a variant of speaker recognition, also called voice recognition. It plays a significant role in forensic science and in security systems, yet precise voice comparison remains a challenging problem. Traditionally, researchers used a range of classification models for speaker recognition and comparison models for voice comparison, but deep learning is gaining popularity because of its accuracy when trained on large amounts of data. This paper presents a detailed literature survey of both traditional and deep learning-based methods of speaker recognition and voice comparison. It also discusses the publicly available datasets that researchers use for these tasks. The survey is intended to provide substantial input to beginners and researchers seeking to understand the domain of voice recognition and voice comparison.
