Spectral and textural features for automatic classification of fricatives

doi:10.1109/PVC.2014.6845422

Proceedings Article

Alex Frid, +1 more

- pp 1-4

TLDR

Two dimensionality reduction algorithms, t-distributed Stochastic Neighbor Embedding and Sequential Forward Floating Selection, were used to obtain a compact representation of the data. Representing the data by a feature vector with as few as 3 dimensions yields a classification rate of almost 90%, which outperforms most of the results obtained in previous studies.

Abstract:

Classification of unvoiced fricatives is an important stage in applications such as spoken term detection and audio-video synchronization, and in technologies for the hearing impaired. Due to their acoustic similarity, extraction of multiple features and construction of high-dimensional feature vectors are required for successful classification of these phonemes. In this study, two dimensionality reduction algorithms, namely t-distributed Stochastic Neighbor Embedding (t-SNE) and Sequential Forward Floating Selection (SFFS), were used to obtain a compact representation of the data. A classification stage (kNN or SVM) was then applied, in which we compared the identification rates of the original feature vector and the low-dimensional representation. A total of 1000 unvoiced fricatives (/s/, /sh/, /f/, and /th/) derived from the TIMIT speech database, containing 25000 short frames of 8 ms each, were used for the evaluation. We show that representing the data by a feature vector with as few as 3 dimensions yields a classification rate of almost 90%, which outperforms most of the results obtained in previous studies.
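The feature-selection-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data below are synthetic (two informative features plus four noise features, four classes), the selection criterion is assumed to be leave-one-out kNN accuracy, and the names `knn_accuracy` and `sffs` are hypothetical. The t-SNE branch and the SVM classifier are not shown.

```python
import random

def knn_accuracy(X, y, feats, k=3):
    """Leave-one-out kNN accuracy using only the feature columns in `feats`."""
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample, paired with its label.
        neighbours = sorted(
            (sum((xi[f] - xj[f]) ** 2 for f in feats), y[j])
            for j, xj in enumerate(X) if j != i
        )
        votes = [label for _, label in neighbours[:k]]
        if max(set(votes), key=votes.count) == y[i]:
            correct += 1
    return correct / len(X)

def sffs(criterion, n_features, target_dim):
    """Simplified Sequential Forward Floating Selection (illustrative sketch)."""
    selected = []
    best = {}  # best criterion value seen so far for each subset size
    while len(selected) < target_dim:
        # Forward step: add the single feature that maximises the criterion.
        candidates = [f for f in range(n_features) if f not in selected]
        scores = {f: criterion(selected + [f]) for f in candidates}
        f_best = max(scores, key=scores.get)
        selected = selected + [f_best]
        size = len(selected)
        best[size] = max(best.get(size, 0.0), scores[f_best])
        # Floating step: drop features while that strictly beats the best
        # score recorded for the smaller subset size.
        while len(selected) > 2:
            reduced = {f: criterion([g for g in selected if g != f])
                       for f in selected}
            f_drop = max(reduced, key=reduced.get)
            if reduced[f_drop] <= best.get(len(selected) - 1, 0.0):
                break
            selected = [g for g in selected if g != f_drop]
            best[len(selected)] = reduced[f_drop]
    return selected, criterion(selected)

# Synthetic stand-in for frame-level features (assumption, not TIMIT data):
# columns 0 and 1 separate the four classes, columns 2..5 are pure noise.
rng = random.Random(0)
X, y = [], []
for label in range(4):
    for _ in range(15):
        informative = [5.0 * (label % 2) + rng.gauss(0, 0.5),
                       5.0 * (label // 2) + rng.gauss(0, 0.5)]
        noise = [rng.gauss(0, 1) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)

selected, acc = sffs(lambda feats: knn_accuracy(X, y, feats),
                     n_features=6, target_dim=3)
print(sorted(selected), round(acc, 3))
```

Using the classifier's own accuracy as the selection criterion makes this a wrapper-style selection; whether the paper used a wrapper or a filter criterion is not stated in the abstract.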

Citations

Baby Cry Detection: Deep Learning and Classical Approaches

Clustering classification and human perception of automotive steering wheel transient vibrations

Analyzing cognitive processes from complex neuro-physiologically based data: some lessons

Analysis of the Influence of the Arabic Fricatives Vocalic Context on Their Spectral Parameters

References

Visualizing Data using t-SNE

Floating search methods in feature selection

The Sounds of the World's Languages

Stochastic Neighbor Embedding

Speech database development at MIT: Timit and beyond

Related Papers (5)

Partitioned Feature-based Classifier model

Audio classification in a weighted SVM

Very short feature vector for music genre classification based on distance metric learning

A partitioned neural network approach for vowel classification using smoothed time/frequency features

An optimal two stage feature selection for speech emotion recognition using acoustic features