Statistics based features for unvoiced sound classification

doi:10.1109/MLSP.2013.6661986

Proceedings Article

Sunit Sivasankaran, +1 more

pp. 1-6

TLDR

This work investigates whether statistics obtained by decomposing sounds with a set of filter banks and computing the moments of the filter responses, along with their correlation values, can be used as features for classifying unvoiced sounds.

Abstract:

Unvoiced phonemes have a significant presence in spoken English. These phonemes are hard to classify because of their weak energy and lack of periodicity. Sound textures, such as the sound of a flowing stream or of falling raindrops, have aperiodic properties in the temporal domain similar to those of unvoiced phonemes. The human ear differentiates these sounds easily. Recent studies on sound texture analysis and synthesis have shown that the human auditory system perceives sound textures using simple statistics. These statistics are obtained by decomposing sounds using a set of filter banks and computing the moments of the filter responses, along with their correlation values. In this work we investigate whether these statistics, which are easy to extract, can also be used as features for classifying unvoiced sounds. To incorporate the moments and correlation values as features, a framework containing multiple classifiers is proposed. Experiments conducted on the TIMIT dataset gave an accuracy on par with the latest reported in the literature, at a lower computational cost.
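
The feature extraction described in the abstract (filter-bank decomposition, moments of the filter responses, and correlations between them) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The abstract does not specify the filter bank, the envelope extraction, or the multi-classifier framework, so several choices here are assumptions: simple second-order band-pass filters, rectification as a crude envelope, the first four moments per band, and pairwise Pearson correlations between bands. The names `bandpass`, `moments`, `correlation`, and `texture_features` are hypothetical.

```python
import math
import random


def bandpass(signal, center_hz, sample_rate, q=4.0):
    """Second-order (biquad) band-pass filter; an assumed stand-in for one filter-bank channel."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0  # the middle feed-forward coefficient is zero
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x1, x2, y1, y2 = x, x1, y, y1
    return out


def moments(values):
    """Return mean, variance, skewness, and kurtosis (population versions)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    var = sum(c * c for c in centered) / n
    std = math.sqrt(var)
    skew = sum(c ** 3 for c in centered) / (n * std ** 3)
    kurt = sum(c ** 4 for c in centered) / (n * std ** 4)
    return mean, var, skew, kurt


def correlation(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def texture_features(signal, sample_rate, centers_hz):
    """Per-band moments followed by pairwise band correlations, as one flat list."""
    # Rectification is a crude envelope estimate (an assumption, not from the source).
    envelopes = [[abs(y) for y in bandpass(signal, fc, sample_rate)]
                 for fc in centers_hz]
    features = []
    for env in envelopes:
        features.extend(moments(env))
    for i in range(len(envelopes)):
        for j in range(i + 1, len(envelopes)):
            features.append(correlation(envelopes[i], envelopes[j]))
    return features


if __name__ == "__main__":
    rng = random.Random(0)
    # White noise as a stand-in for an aperiodic, unvoiced-like segment.
    noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    feats = texture_features(noise, 16000, [500.0, 1000.0, 2000.0, 4000.0])
    print(len(feats), "features")
```

With four bands this yields 4 moments per band plus 6 pairwise correlations. How such features would be split across the multiple classifiers mentioned in the abstract is not stated there, so the sketch stops at feature extraction.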

Citations

Spectral and textural features for automatic classification of fricatives

Online forecasting method for high-frequency mechanical noise of structure

Piano Multi-Pitch Estimator Using CNN-Stacked LSTM

References

Analysis and synthesis of sound textures

Classifying soundtracks with audio texture features

Selective Phoneme Spotting for Realization of an /s, z, C, t/ Transposer
