Age and gender recognition for telephone applications based on GMM supervectors and support vector machines
Citations
Speech Recognition Using Deep Neural Networks: A Systematic Review
Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces
Paralinguistics in speech and language-State-of-the-art and the challenge
Automatic speaker age and gender recognition using acoustic and prosodic level information fusion
Writer Identification Using GMM Supervectors and Exemplar-SVMs
References
Support vector machines using GMM supervectors for speaker verification
An overview of automatic speaker recognition technology
Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications
Exploiting speech for recognizing elderly users to respond to their special needs.
Related Papers (5)
Automatic speaker age and gender recognition using acoustic and prosodic level information fusion
Speaker Verification Using Adapted Gaussian Mixture Models
Frequently Asked Questions (9)
Q2. What is the scenario of the corpus?
The scenario of the corpus is telephone speech, where the speakers called an automatic recording system and read a set of words, sentences and digits.
Q3. What is the result of the GMM-UBM system?
With 512 Gaussian densities, MAP adaptation, full covariance matrices and a linear kernel, the authors achieved a recall of 74% and a precision of 77%.
Q4. What is the most successful system for identifying a speaker?
The most successful systems used Mel Frequency Cepstral Coefficients (MFCCs) and either performed multiple phoneme recognition or modeled the different age classes with Gaussian Mixture Models (GMMs).
Q5. What is the simplest way to extract the MFCCs?
After extraction of the MFCCs, a Universal Background Model (UBM) is created by employing all the available training data, using the Expectation-Maximization (EM) algorithm [8].
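The EM training step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-dimensional features instead of multi-dimensional MFCC frames, uses a quantile-based initialisation, and the names fit_gmm_em, n_iter and var_floor are hypothetical, chosen only for this illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, n_components, n_iter=50, var_floor=1e-3):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances).

    Assumption for this sketch: scalar features. Real MFCC frames are vectors.
    """
    n = len(data)
    ordered = sorted(data)
    # Deterministic initialisation: place the means at evenly spaced quantiles.
    means = [ordered[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    grand_mean = sum(data) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in data) / n, var_floor)
    variances = [grand_var] * n_components
    weights = [1.0 / n_components] * n_components

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            if total > 0.0:
                resp.append([j / total for j in joint])
            else:
                # Numerical underflow: fall back to uniform responsibilities.
                resp.append([1.0 / n_components] * n_components)

        # M-step: re-estimate weights, means and variances from the posteriors.
        for k in range(n_components):
            n_k = sum(r[k] for r in resp)
            weights[k] = n_k / n
            if n_k < 1e-10:
                continue  # keep the old mean and variance of an empty component
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / n_k
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / n_k
            variances[k] = max(var_k, var_floor)

    return weights, means, variances


# Usage on synthetic data (made up for illustration, not from the corpus).
rng = random.Random(0)
samples = ([rng.gauss(-2.0, 0.5) for _ in range(200)]
           + [rng.gauss(3.0, 0.5) for _ in range(200)])
ubm_weights, ubm_means, ubm_variances = fit_gmm_em(samples, n_components=2)
print(ubm_weights, ubm_means, ubm_variances)
```

In practice a UBM would be trained on vector-valued frames pooled over all training speakers, typically with an existing GMM library rather than hand-written EM.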
Q6. How did the authors use the GMM supervector-based SVM approach?
The authors applied the GMM supervector-based SVM approach to the field of automatic age recognition in combination with gender recognition.
Q7. What was the purpose of the study?
To simulate a mismatch between training and test conditions, the authors also evaluated the system on a 23-speaker subset of the VoiceClass corpus.
Q8. What other examples are used to adapt the ASR system to a certain customer?
Other examples are the adaptation of the waiting queue music, the offer of age-dependent advertisements to callers in the waiting queue, or a change in the speaking habits of the text-to-speech module of the ASR system.
Q9. What is the simplest way to classify a speaker?
The GMM supervectors can be regarded as a mapping from the utterance of a speaker (in their case the MFCCs) to a high-dimensional feature vector.
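A minimal sketch of such a mapping follows. It assumes the common construction in which the UBM means are MAP-adapted to one utterance and then stacked into a single vector. The diagonal covariances, the relevance factor of 16, the toy UBM parameters and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import math


def log_diag_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Posterior probability of each UBM component given frame x."""
    logs = [math.log(w) + log_diag_gauss(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt only the UBM means to the frames of one utterance."""
    n_comp, dim = len(weights), len(means[0])
    counts = [0.0] * n_comp                       # soft frame counts per component
    sums = [[0.0] * dim for _ in range(n_comp)]   # posterior-weighted frame sums
    for x in frames:
        for k, p in enumerate(posteriors(x, weights, means, variances)):
            counts[k] += p
            for d in range(dim):
                sums[k][d] += p * x[d]
    # alpha * E[x] + (1 - alpha) * mu with alpha = n / (n + r)
    # simplifies to (sum + r * mu) / (n + r), which is also safe when n == 0.
    return [
        [(sums[k][d] + relevance * means[k][d]) / (counts[k] + relevance)
         for d in range(dim)]
        for k in range(n_comp)
    ]


def supervector(frames, weights, means, variances, relevance=16.0):
    """Stack the adapted component means into one fixed-length vector."""
    adapted = map_adapt_means(frames, weights, means, variances, relevance)
    return [value for mean in adapted for value in mean]


# Toy UBM with 2 components in 2 dimensions (hypothetical values).
ubm_w = [0.5, 0.5]
ubm_mu = [[0.0, 0.0], [10.0, 10.0]]
ubm_var = [[1.0, 1.0], [1.0, 1.0]]
utterance = [[1.0, 1.0]] * 16  # 16 identical frames, for illustration only
print(supervector(utterance, ubm_w, ubm_mu, ubm_var))
```

Whatever the utterance length, the result has (number of components) x (feature dimension) entries, which is what allows a fixed-input classifier such as an SVM to consume it.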