Speech Features Analysis for Tone Language Speaker Discrimination Systems

doi:10.1007/978-3-319-77028-4_57

Book Chapter

Mercy E. Edoho, +2 more

- pp 433-442

TLDR

A speech pattern analysis framework for tone language speaker discrimination systems is proposed, based on the hypothesis that speech feature variability is an efficient means of discriminating speakers. The results confirm high variability between speakers and low variability within speakers.

Abstract:

In this paper, a speech pattern analysis framework for tone language speaker discrimination systems is proposed. We hypothesize that speech feature variability is an efficient means of discriminating speakers. To test this, we exploit prosody-related acoustic features (pitch, intensity and glottal pulse) of corpus recordings obtained from male and female speakers in four age categories: children (0–15), youths (16–30), adults (31–50) and seniors (above 50). The recordings were captured under suboptimal conditions. The speaker dataset was split into training, validation and test sets in the ratio 70%, 15% and 15%, respectively. A 41 × 14 self-organizing map (SOM) architecture was then used to model the speech features, thereby determining the relationship between the speech features, segments and patterns. Results of the speech pattern analysis indicated wide F0 variability among child speakers compared with other speakers; this gap closes as speakers age. Further, intensity variability was similar across all speaker categories, while glottal pulse exhibited significant variation among the different speaker classes. Results of SOM feature visualization confirmed high inter-speaker variability (between speakers) and low intra-speaker variability (within speakers).
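The two processing steps named in the abstract, the 70/15/15 split and the 41 × 14 SOM, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the learning-rate and neighbourhood schedules, and the assumption that each sample is a three-value vector (pitch, intensity, glottal pulse) scaled to [0, 1] are all hypothetical choices made for illustration.

```python
import math
import random


def split_dataset(items, ratios=(0.70, 0.15, 0.15), seed=0):
    """Shuffle items and split them into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(ratios[0] * len(shuffled))
    n_val = round(ratios[1] * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weights are closest to x."""
    return min(
        weights,
        key=lambda rc: sum((w - v) ** 2 for w, v in zip(weights[rc], x)),
    )


def train_som(data, rows=41, cols=14, epochs=5, lr0=0.5, seed=0):
    """Train a rows x cols SOM; returns {(row, col): weight_vector}.

    Assumed (not from the paper): linear decay of the learning rate and
    of the Gaussian neighbourhood radius over all training steps.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in data:
            frac = t / steps
            lr = lr0 * (1 - frac)
            sigma = max(sigma0 * (1 - frac), 0.5)
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood around the best matching unit.
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            t += 1
    return weights
```

In use, one would call `split_dataset` on the list of feature vectors, pass the training portion to `train_som`, and then map validation or test samples to grid positions with `best_matching_unit` to inspect how speakers cluster on the map.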
