Showing papers by "Samarendra Dandapat" published in 2009


01 Jan 2009
TL;DR: In this article, a speech recognition study is conducted and its performance under stressed conditions is evaluated using isolated word recognition and keyword spotting approaches. The authors quantify the degradation caused by each stress condition, so that compensation methods can later be developed for each condition.
Abstract: The objective of this work is to conduct a speech recognition study and evaluate its performance under stressed conditions. The study is conducted using both isolated word recognition and keyword spotting approaches. The word models are built during training using speech collected under the neutral condition. During testing, these models are tested with speech signals collected under different stressed conditions to quantify the amount of degradation in each stress condition. It is observed that the performance of the speech recognition system decreases significantly under stressed conditions.
I. INTRODUCTION
Speech is a complex signal which encodes the message as well as paralinguistic information like speaker, emotion, acoustic environment, the person's intention, language, accent and dialect (1). Stress refers to the psychological state of the person due to internally induced factors like emotions or externally induced factors like the Lombard effect. In human-human interaction, the listener can recognize the message as well as the paralinguistic aspects present in the speech. At the same time, the listener can effortlessly extract only the wanted information from the speech and neglect the rest through what is called selective attention. This is not understood well enough to mimic in human-computer interaction. Hence, in human-computer interaction, the performance of the system degrades because of the inability of the system to deemphasize the paralinguistic information. For instance, under stress, speech production varies with respect to the neutral condition due to the constriction of various muscle structures in the speech production system. This changes the characteristics of the speech signal compared to the neutral condition. Identifying stress and properly compensating for it will give significant improvement in the performance of speech or speaker recognition systems. For this, it is better to quantify the amount of degradation caused by the stressed condition. The present work deals with the quantification of degradation in the performance of a speech recognition system under stressed conditions.
Most of the earlier attempts in the stressed speech processing area focused on how to classify and compensate for different stress conditions. To find the effect of stress in speech, researchers have studied the effects of stress at the sentence, word and sound unit levels (1). In these studies, they analyzed the percentage deviation of duration, intensity, glottal pulse shaping and vocal tract spectrum (1). In some of the studies, the speech recognizer was trained with neutral speech and, during testing, the effect of stress was compensated (2). The compensation techniques used in such analyses are formant location and bandwidth stress equalization (3), (4), (5), whole word cepstral compensation (6), slope-dependent weighting (7), formant shifting (8), source-generator based codebook stress compensation (9), (10), and source-generator based adaptive cepstral compensation (11), (10). The purpose of these studies is to improve the performance of the speech recognition system. All these studies rest on the fact that the performance of the speech recognition system degrades under stressed conditions, but they do not quantify exactly how much degradation takes place. Even though this degradation is well known, it may be better to have a first-hand quantification of its amount. Such quantification will help in the following ways: we will understand the exact amount of degradation caused by each stress condition, and methods may accordingly be developed to compensate for the stress in each condition. This is the motivation for the present work. In this study, we quantify the effect of stress in an automatic

7 citations


Proceedings Article
01 Nov 2009
TL;DR: This work proposes to evaluate the HOS (Kurtosis) in each Wavelet band to denoise an MECG signal, and shows significant improvement in denoising the MECG signals.
Abstract: Multichannel Electrocardiogram (MECG) signal de-noising can be described as a process of removing the clinically unimportant contents present in the signal. Higher Order Statistics (HOS) can help to retain finer details of an Electrocardiogram (ECG) signal, which can effectively reduce the noise levels in an MECG signal. In this work, it is proposed to evaluate the HOS (Kurtosis) in each Wavelet band to denoise an MECG signal. Thresholding levels are derived based on the values of the fourth order cumulant, 'Kurtosis', of the Wavelet coefficients and the Energy Contribution Efficiency (ECE) of the Wavelet sub-bands. The performance of this method for compressed signals is evaluated using Percentage Root Mean Square Difference (PRD), Weighted PRD (WPRD), and Wavelet Weighted Percentage Root Mean Square Difference (WWPRD). The proposed algorithm is tested with the database of the CSE Multilead Measurement Library. The results show significant improvement in denoising the MECG signals.
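Two of the quantities named in this abstract, the kurtosis of a band of coefficients and the PRD between an original and a processed signal, can be illustrated with a minimal Python sketch. It assumes the common textbook definitions (excess kurtosis as the normalized fourth-order cumulant, and PRD without mean removal). The function names are hypothetical, and this is not the authors' thresholding rule, which the abstract does not specify.

```python
import math


def excess_kurtosis(values):
    """Normalized fourth-order cumulant: m4 / m2**2 - 3 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant sequence")
    return m4 / (m2 ** 2) - 3.0


def prd(original, processed):
    """Percentage Root Mean Square Difference (assumed form, no mean removal)."""
    if len(original) != len(processed):
        raise ValueError("signals must have the same length")
    error_energy = sum((o - p) ** 2 for o, p in zip(original, processed))
    signal_energy = sum(o ** 2 for o in original)
    if signal_energy == 0:
        raise ValueError("PRD is undefined for an all-zero original signal")
    return 100.0 * math.sqrt(error_energy / signal_energy)
```

A lower PRD means the processed signal is closer to the original. WPRD and WWPRD additionally weight the error per sub-band, and those weights are not given in the abstract, so they are left out of the sketch.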

5 citations


01 Jan 2009
TL;DR: The optimum decomposition level and a best suited wavelet filter for the compression of a set of retinal images can be chosen from the results presented in the paper.
Abstract: In this paper, a comparative study is made of a set of wavelet filters used in a wavelet-based retinal image compression system. The performance of different wavelet filters is observed by decomposing the retinal image to various levels for a given compression ratio. The visual quality of the reconstructed retinal image is observed at each decomposition level. Statistical measures such as the peak signal to noise ratio (PSNR), Laplacian mean squared error (LMSE) and structural similarity (SSIM) index are used to quantify the effect of the wavelet filters. Subjective evaluation is also done by examining the quality of the reconstructed image. The optimum decomposition level and the best-suited wavelet filter for the compression of a set of retinal images can be chosen from the results presented in the paper.
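Of the objective measures listed in this abstract, PSNR is the simplest to state, and a minimal Python sketch of it follows. The sketch assumes 8-bit pixel values (peak 255) and images passed as flat sequences of equal length. The function name and the `peak` parameter are hypothetical, and the paper's actual evaluation setup is not described here.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no reconstruction error
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher PSNR indicates a reconstruction closer to the original. Comparing filters at a fixed compression ratio, as the abstract describes, would mean calling such a function once per filter and decomposition level.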

3 citations