Age and gender recognition for telephone applications based on GMM supervectors and support vector machines
Citations
Fuzzy support vector machines for age and gender classification
Dimension reduction approaches for SVM based speaker age estimation.
An i-Vector PLDA based gender identification approach for severely distorted and multilingual DARPA RATS data
Gender identification and performance analysis of speech signals
QMOS - A Robust Visualization Method for Speaker Dependencies with Different Microphones
References
Maximum likelihood from incomplete data via the EM algorithm
A Tutorial on Support Vector Machines for Pattern Recognition
Speaker Verification Using Adapted Gaussian Mixture Models
Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
Emotion recognition in human-computer interaction
Related Papers (5)
Automatic speaker age and gender recognition using acoustic and prosodic level information fusion
Speaker Verification Using Adapted Gaussian Mixture Models
Frequently Asked Questions (9)
Q2. What is the scenario of the corpus?
The scenario of the corpus is telephone speech, where the speakers called an automatic recording system and read a set of words, sentences and digits.
Q3. What is the result of the GMM-UBM system?
With 512 Gaussian densities, MAP adaptation, full covariance matrices, and a linear kernel, the authors achieved a recall of 74% and a precision of 77%.
Q4. What is the successful system for identifying a speaker?
The most successful systems used Mel Frequency Cepstral Coefficients (MFCCs) and either performed multiple phoneme recognition or modeled the different age classes with Gaussian Mixture Models (GMMs).
Q5. What is the simplest way to extract the MFCCs?
After extraction of the MFCCs, a Universal Background Model (UBM) is created by employing all the available training data, using the Expectation-Maximization (EM) algorithm [8].
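The UBM training step can be illustrated with a minimal sketch of EM for a Gaussian mixture. This is not the authors' implementation: it assumes diagonal covariances for brevity, uses synthetic 13-dimensional frames in place of real MFCCs, and the function name `train_ubm` and all parameter values are hypothetical.

```python
import numpy as np

def train_ubm(frames, n_components=4, n_iter=20, seed=0):
    """Fit a diagonal-covariance GMM to pooled feature frames with EM."""
    rng = np.random.default_rng(seed)
    n, _ = frames.shape
    # Initialise means from random frames, variances from the global variance.
    means = frames[rng.choice(n, n_components, replace=False)].copy()
    variances = np.tile(frames.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each frame.
        diff = frames[:, None, :] - means[None, :, :]
        log_prob = -0.5 * (np.sum(diff**2 / variances, axis=2)
                           + np.sum(np.log(2 * np.pi * variances), axis=1))
        log_joint = log_prob + np.log(weights)
        log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
        resp = np.exp(log_joint - log_norm)
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = resp.T @ frames / nk[:, None]
        second = resp.T @ frames**2 / nk[:, None]
        variances = np.maximum(second - means**2, 1e-6)
    return weights, means, variances

# Synthetic stand-in for MFCC frames pooled over all training speakers.
rng = np.random.default_rng(1)
frames = np.vstack([rng.normal(-2, 1, (200, 13)), rng.normal(2, 1, (200, 13))])
weights, means, variances = train_ubm(frames, n_components=2)
```

In practice a library implementation of Gaussian mixtures would likely be preferred; the loop is written out here only to make the E-step and M-step visible.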
Q6. How did the authors use the GMM supervector-based SVM approach?
The authors applied the GMM supervector-based SVM approach to the field of automatic age recognition in combination with gender recognition.
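As a rough illustration of classifying supervectors with a linear SVM, the sketch below trains a binary linear classifier by full-batch subgradient descent on the regularised hinge loss. This is a stand-in, not the solver or setup used in the paper: the data are synthetic 6-dimensional vectors, the two-class setting simplifies the multi-class age and gender task, and `train_linear_svm` and its hyperparameters are hypothetical.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, n_epochs=200):
    """Minimise lam/2 * ||w||^2 + mean hinge loss; labels y must be -1 or +1."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_epochs):
        margins = y * (X @ w + b)
        active = margins < 1  # samples that violate the margin
        grad_w = lam * w - (y[active][:, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic "supervectors" for two classes (assumed data, for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.3, (20, 6)), rng.normal(1, 0.3, (20, 6))])
y = np.concatenate([-np.ones(20), np.ones(20)])
w, b = train_linear_svm(X, y)
predictions = np.sign(X @ w + b)
accuracy = (predictions == y).mean()
```

A multi-class task could be handled by training one such classifier per class against the rest, which is one common option and not necessarily the scheme the authors chose.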
Q7. What was the purpose of the study?
To simulate a mismatch between training and test data, the authors also evaluated the system on a 23-speaker subset of the VoiceClass corpus.
Q8. What other examples are used to adapt the ASR system to a certain customer?
Other examples are adapting the waiting queue music, offering age-dependent advertisements to callers in the waiting queue, or changing the speaking habits of the text-to-speech module of the ASR system.
Q9. What is the simplest way to classify a speaker?
The GMM supervectors can be regarded as a mapping from the utterance of a speaker (in their case the MFCCs) to a high-dimensional feature vector.
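The mapping from an utterance to a supervector can be sketched as follows, assuming the common recipe of MAP-adapting only the UBM means to the utterance and stacking them into one vector. The relevance factor of 16, the toy two-component UBM, and the name `map_supervector` are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def map_supervector(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt diagonal-GMM means to one utterance and stack them."""
    diff = frames[:, None, :] - means[None, :, :]
    log_joint = np.log(weights) - 0.5 * np.sum(
        diff**2 / variances + np.log(2 * np.pi * variances), axis=2)
    log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
    resp = np.exp(log_joint - log_norm)           # (n_frames, n_components)
    nk = resp.sum(axis=0)                         # soft frame counts
    ex = resp.T @ frames / np.maximum(nk, 1e-10)[:, None]  # per-component mean
    alpha = (nk / (nk + relevance))[:, None]      # data-dependent mixing weight
    adapted = alpha * ex + (1.0 - alpha) * means
    return adapted.reshape(-1)                    # length n_components * dim

# Toy UBM with two components in three dimensions (assumed values).
ubm_weights = np.array([0.5, 0.5])
ubm_means = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
ubm_variances = np.ones((2, 3))
utterance = np.ones((16, 3))  # 16 identical frames close to the first component
supervector = map_supervector(utterance, ubm_weights, ubm_means, ubm_variances)
```

Components that see many frames move toward the utterance statistics, while components that see almost none stay at the UBM means, so every utterance yields a vector of the same fixed length regardless of its duration.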