Speaker Identification Using a Hybrid CNN-MFCC Approach

doi:10.1109/ICETST49965.2020.9080730

Proceedings Article

- pp 1-4

TLDR

A novel architecture combining a convolutional neural network (CNN) with mel-frequency cepstral coefficients (MFCC) is proposed to identify the speaker in a noisy environment, and its effectiveness is verified using accuracy and precision.

Abstract:

In this paper, a novel architecture is proposed that uses a convolutional neural network (CNN) and mel-frequency cepstral coefficients (MFCC) to identify the speaker in a noisy environment. The architecture is used in a text-independent setting. The most important requirement in any text-independent speaker identification system is the ability to learn features that are useful for classification. We use a hybrid feature extraction technique in which features extracted by a CNN are combined with MFCC features into a single feature set. For classification, we use a deep neural network, which shows very promising results in classifying speakers. We built our own dataset of 60 speakers, with 4 voice samples per speaker. Our best hybrid model achieved an accuracy of 87.5%. To verify the effectiveness of this hybrid architecture, we use metrics such as accuracy and precision.
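The fusion and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy feature values and the toy speaker labels are all hypothetical, and the abstract does not say how precision is averaged across speakers, so macro averaging is assumed here.

```python
from collections import defaultdict


def fuse_features(cnn_features, mfcc_features):
    """Concatenate CNN-derived and MFCC features into one feature vector.

    Assumption: "a single set" in the abstract means plain concatenation.
    """
    return list(cnn_features) + list(mfcc_features)


def accuracy(y_true, y_pred):
    """Fraction of utterances whose predicted speaker matches the true one."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_precision(y_true, y_pred):
    """Mean per-speaker precision over speakers predicted at least once.

    Assumption: macro averaging; the abstract does not specify the scheme.
    """
    true_positives = defaultdict(int)
    predicted_count = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        predicted_count[p] += 1
        if t == p:
            true_positives[p] += 1
    per_speaker = [true_positives[s] / n for s, n in predicted_count.items()]
    return sum(per_speaker) / len(per_speaker)


# Toy data only (hypothetical values, not taken from the paper).
fused = fuse_features([0.2, 0.7, 0.1], [12.5, -3.1])

y_true = ["spk1", "spk1", "spk2", "spk2", "spk3", "spk3", "spk4", "spk4"]
y_pred = ["spk1", "spk1", "spk2", "spk3", "spk3", "spk3", "spk4", "spk1"]

acc = accuracy(y_true, y_pred)
prec = macro_precision(y_true, y_pred)
```

In a real pipeline the fused vector would be the input to the deep neural network classifier, and the metrics would be computed on held-out utterances rather than on toy labels.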

Citations

Proceedings Article

Speech Emotion Recognition Using Quaternion Convolutional Neural Networks

Aneesh Muppidi, +1 more

TL;DR: This paper proposed a quaternion convolutional neural network (QCNN) based speech emotion recognition (SER) model in which Mel-spectrogram features of speech signals are encoded in an RGB quaternions domain.

Journal Article

Speaker identification based on Radon transform and CNNs in the presence of different types of interference for Robotic Applications

Amira Shafik, +11 more

- 01 Jun 2021 -

Applied Acoustics

TL;DR: A new approach to improve the accuracy of speaker identification in the presence of interference for robot control applications with a convolutional neural network (CNN) that achieves a high classification accuracy up to 97.5%, which is more than double the performance reported for some traditional methods that are used for speaker identification.

Journal Article

Enhanced Indonesian Ethnic Speaker Recognition using Data Augmentation Deep Neural Network

Kristiawan Nugroho, +4 more

- 23 Apr 2021 -

Journal of King Saud University - Comput...

TL;DR: After seeing the performance of this model, it can be concluded that Data Augmentation Deep Neural Network can improve the speaker's recognition performance using the Indonesian ethnic dataset.

Journal Article

An optimum end-to-end text-independent speaker identification system using convolutional neural network

Shabnam Farsiani, +2 more

- 01 May 2022 -

Computers & Electrical Engineering

TL;DR: In this article, the authors propose a new CNN for text-independent speaker identification, inspired by the VGG-13 architecture, with fewer parameters but acceptable accuracy; it reduces the time complexity and memory cost of network training by using a short segment of each audio sample.

Journal Article

A strong hybrid AdaBoost classification algorithm for speaker recognition

Karthikeyan, +1 more

- 01 Sep 2021 -

Sadhana-academy Proceedings in Engineeri...

TL;DR: In this article, a hybrid adaptive boosting (AdaBoost) combined with a powerful ML classifier (Random Forest) is proposed to handle multi-class imbalanced speaker data classification.

References

Proceedings Article

Rectified Linear Units Improve Restricted Boltzmann Machines

Vinod Nair, +1 more

TL;DR: Restricted Boltzmann machines were originally developed using binary stochastic hidden units; replacing these with noisy rectified linear units yields features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.

Journal Article

Front-End Factor Analysis for Speaker Verification

Najim Dehak, +4 more

- 01 May 2011 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: Extending previous work, this paper proposes a new speaker representation for speaker verification: a low-dimensional speaker- and channel-dependent space is defined using a simple factor analysis and named the total variability space because it models both speaker and channel variabilities.

Journal Article

Convolutional neural networks for speech recognition

Ossama Abdel-Hamid, +5 more

- 01 Oct 2014 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: It is shown that further error rate reduction can be obtained by using convolutional neural networks (CNNs), and a limited-weight-sharing scheme is proposed that can better model speech features.

Proceedings Article

A vector quantization approach to speaker recognition

F.K. Soong, +3 more

TL;DR: A vector quantization (VQ) codebook was used as an efficient means of characterizing the short-time spectral features of a speaker and was used to recognize the identity of an unknown speaker from his/her unlabelled spoken utterances based on a minimum distance (distortion) classification rule.

Proceedings Article

Deep Convolutional Neural Network Textual Features and Multiple Kernel Learning for Utterance-level Multimodal Sentiment Analysis

Soujanya Poria, +2 more

TL;DR: A novel way of extracting features from short texts, based on the activation values of an inner layer of a deep convolutional neural network, is presented, along with a parallelizable decision-level data fusion method that is much faster, though slightly less accurate.

Related Papers (5)

Kurdish speaker identification based on one dimensional convolutional neural network

Zrar Khalid Abdul

- 01 Aug 2019 -

Computational Methods for Differential E...

A unique approach in text independent speaker recognition using MFCC feature sets and probabilistic neural network

Text independent speaker recognition using the Mel frequency cepstral coefficients and a neural network classifier

Speaker Recognition using fusion of features with Feedforward Artificial Neural Network and Support Vector Machine

Comparative study of different classifiers based speaker recognition system using modified MFCC for noisy environment