Voice Recognition and Voice Comparison using Machine Learning Techniques: A Survey

doi:10.1109/ICACCS48705.2020.9074184

Proceedings Article

pp. 459-465

TLDR

This paper presents a detailed literature survey of both traditional and deep learning-based methods for speaker recognition and voice comparison, intended to help beginners and researchers understand the domain of voice recognition and voice comparison.

Abstract:

Voice comparison is a variant of speaker recognition, also called voice recognition. It plays a significant role in forensic science and in security systems, and precise voice comparison remains a challenging problem. Traditionally, researchers used different classification and comparison models to solve speaker recognition and voice comparison, respectively, but deep learning is gaining popularity because of its accuracy when trained on large amounts of data. This paper presents a detailed literature survey of both traditional and deep learning-based methods for speaker recognition and voice comparison. It also discusses the publicly available datasets that researchers use for these tasks. This concise paper is intended to give beginners and researchers substantial input for understanding the domain of voice recognition and voice comparison.

Citations

Journal Article

Dendritic Computing: Branching Deeper into Machine Learning

Jyotibdha Acharya, +5 more

- 14 Oct 2021 -

Neuroscience

TL;DR: In this article, the authors discuss the nonlinear computational power that dendrites provide in biological and artificial neurons and give examples of how dendritic computations have been used to solve real-world classification problems, with performance reported on well-known data sets used in machine learning.

Journal Article

Improved Frame-Wise Segmentation of Audio Signals for Smart Hearing Aid Using Particle Swarm Optimization-Based Clustering

Tushar Mehrotra, +5 more

- 05 May 2022 -

Mathematical Problems in Engineering

TL;DR: An efficient particle swarm optimization (PSO)-based clustering algorithm is proposed to classify the speech classes, i.e., voiced, unvoiced, and silence, and it is revealed that the proposed algorithm outperforms the competitive algorithms.

Journal Article

Tracking droplets in soft granular flows with deep learning techniques.

Mihir Durve, +7 more

- 01 Jan 2021 -

European Physical Journal Plus

TL;DR: In this paper, the state-of-the-art deep learning-based object recognition YOLO algorithm and object tracking DeepSORT algorithm are combined to analyze digital images from fluid dynamic simulations of multi-core emulsions and soft flowing crystals and to track moving droplets within these complex flows.

Journal Article

A fast and efficient deep learning procedure for tracking droplet motion in dense microfluidic emulsions

Mihir Durve, +8 more

- 02 Mar 2021 -

arXiv: Soft Condensed Matter

TL;DR: In this paper, a deep learning-based object detection and object tracking algorithm was proposed to study droplet motion in dense microfluidic emulsions, which can correctly predict the droplets' shape and track their motion at competitive rates as compared to standard clustering algorithms, even in the presence of significant deformations.

Polycost: A telephone-speech database for speaker recognition

D. Petrovska, +3 more

TL;DR: An overview of the POLYCOST database dedicated to speaker recognition applications over the telephone network, with main characteristics of medium mixed speech corpus size, English spoken by foreigners, mainly digits with some free speech, collected through international telephone lines.

References

Journal Article

Robust text-independent speaker identification using Gaussian mixture speaker models

Douglas A. Reynolds, +1 more

- 01 Jan 1995 -

IEEE Transactions on Speech and Audio Pr...

TL;DR: The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are effective for modeling speaker identity, and the GMM is shown to outperform the other speaker modeling techniques on an identical 16-speaker telephone speech task.

Proceedings Article

Signature Verification using a "Siamese" Time Delay Neural Network

Jane Bromley, +4 more

TL;DR: An algorithm for verification of signatures written on a pen-input tablet based on a novel, artificial neural network called a "Siamese" neural network, which consists of two identical sub-networks joined at their outputs.

Siamese Neural Networks for One-shot Image Recognition

Gregory Koch, +2 more

TL;DR: A method for learning siamese neural networks which employ a unique structure to naturally rank similarity between inputs and is able to achieve strong results which exceed those of other deep learning models with near state-of-the-art performance on one-shot classification tasks.

Journal Article

Speaker identification and verification using Gaussian mixture speaker models

Douglas A. Reynolds

- 01 Aug 1995 -

Speech Communication

TL;DR: High-performance speaker identification and verification systems based on Gaussian mixture speaker models: robust, statistically based representations of speaker identity, evaluated on four publicly available speech databases.

Proceedings Article

Unsupervised feature learning for audio classification using convolutional deep belief networks

Honglak Lee, +3 more

TL;DR: In this paper, the authors apply convolutional deep belief networks to audio data and empirically evaluate them on various audio classification tasks and show that the learned features correspond to phones/phonemes.

Related Papers (5)

On automatic voice casting for expressive speech: Speaker recognition vs. speech classification

Voice-Based Gender Recognition Using Neural Network

TechWare: Speaker and Spoken Language Recognition Resources [Best of the Web]

Fifty years of progress in speech and speaker recognition

Comparing Audio and Visual Information for Speech Processing