Proceedings Article

A Kepstrum based approach for enhancement of dysarthric speech

29 Nov 2010-Vol. 7, pp 3474-3478
TL;DR: This paper proposes a novel speech processing algorithm, based on the kepstrum analysis procedure, that provides very good enhancement of dysarthric speech.
Abstract: This paper proposes a novel speech processing algorithm, based on the kepstrum analysis procedure, that provides very good enhancement of dysarthric speech. The kepstrum approach has so far been used in communication applications such as two-microphone noise cancellation, and in the derivation of Kalman filter and Wiener filter equations. This paper attempts to apply the kepstrum approach to the enhancement of dysarthric speech. The algorithm was tested on various monosyllabic and bisyllabic (consonant-vowel and consonant-vowel-consonant-vowel patterns) dysarthric speech samples from cerebral palsy patients aged 40–60 years, and the output signal showed considerable formant shift and modification in energy. The results obtained with the kepstrum approach were also compared with those obtained with the Linear Prediction Coefficients (LPC) method, and the kepstrum approach was found to give better results.
Citations
Journal Article
TL;DR: This study reviews current trends in access solution development for children with cerebral palsy, with particular emphasis on the access technology that harnesses a control signal from the user and the output device whose behavior is modulated by the user’s control signal.
Abstract: Access solutions may facilitate communication in children with limited functional speech and motor control. This study reviews current trends in access solution development for children with cerebral palsy, with particular emphasis on the access technology that harnesses a control signal from the user (eg, movement or physiological change) and the output device (eg, augmentative and alternative communication system) whose behavior is modulated by the user’s control signal. Access technologies have advanced from simple mechanical switches to machine vision (eg, eye-gaze trackers), inertial sensing, and emerging physiological interfaces that require minimal physical effort. Similarly, output devices have evolved from bulky, dedicated hardware with limited configurability, to platform-agnostic, highly personalized mobile applications. Emerging case studies encourage the consideration of access technology for all nonverbal children with cerebral palsy with at least nascent contingency awareness. However, esta...

34 citations

01 Jan 2013
TL;DR: This work refines the formant trajectories of dysarthric speech to improve its intelligibility and observes a significant improvement in speech quality, which may encourage people with dysarthria to communicate more effectively and improve the pace of their rehabilitation.
Abstract: Dysarthria is a neuromotor disorder that affects the quality of articulation required to produce speech. There are also temporal inconsistencies in the speech produced by people with dysarthria, leading to inconsistent formant trajectories. The trajectories change slowly in dysarthric speech compared with normal speech. In this work, we refine the formant trajectories of dysarthric speech to improve its intelligibility. We use the P.563 and P.862 standards, along with composite measures, to evaluate the quality of speech before and after the refinement; we used the NEMOURS database for the experiments involving mild dysarthria. In the proposed work, we attempt to emphasize the fast variations in the formant trajectories to enhance speech quality. It was observed that the quality of speech improved significantly. Our method will therefore encourage people with dysarthria to communicate more effectively and improve the pace of their rehabilitation. To the best of our knowledge, this type of work has not been reported elsewhere.

3 citations


Cites background from "A Kepstrum based approach for enhan..."

  • ...The intelligibility can be improved at the signal level by considering the speech of the dysarthric people[7]....

    [...]

Proceedings Article
01 Oct 2013
TL;DR: By investigating the difference of frequency characteristics between pathological and normal voices, this paper proposes an enhancement algorithm which can efficiently reduce the breathiness of the pathological voice while maintaining the identity of the speaker.
Abstract: This paper proposes a speech enhancement algorithm for pathological voices using a time-frequency trajectory excitation (TFTE) modeling. The TFTE model has a capability of delicately controlling the periodic and non-periodic excitation components by taking a single pitch based decomposition process. By investigating the difference of frequency characteristics between pathological and normal voices, this paper proposes an enhancement algorithm which can efficiently reduce the breathiness of the pathological voice while maintaining the identity of the speaker. Subjective test results are presented to verify the effectiveness of the proposed algorithm.

2 citations


Cites methods from "A Kepstrum based approach for enhan..."

  • ...In order to enhance the voice of dysarthria, a formant synthesizer based and a cepstrum based approach are introduced....

    [...]

Proceedings Article
01 Dec 2019
TL;DR: In this work, the formants of dysarthric vowels are modified using three different formant transformation techniques, and probabilistic least-squares regression is found to give a smaller RMSE than formant transformation using target features or joint density estimation.
Abstract: Speech communication is fundamental to an individual's participation in society, but it is often disrupted by various physical disorders. Dysarthria is one such disorder; it encompasses various neuro-motor disorders that disrupt or impair the physical production of speech. The speech of a dysarthric speaker is less intelligible than that of a normal speaker, which creates difficulty during communication. One way to improve the intelligibility of dysarthric speech is to modify its formants and resynthesize the speech using the modified formants. In this work, we modify the formants of dysarthric vowels using three different existing formant transformation techniques. When tested on the same data with root mean square error (RMSE) as the evaluation parameter, probabilistic least-squares regression yields a smaller RMSE than formant transformation using target features or joint density estimation.
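The evaluation described above relies on the root mean square error between transformed and target formant values. A minimal sketch of that metric follows, assuming NumPy; the function name `formant_rmse` and the formant frequencies in the usage lines are hypothetical illustrations, not values from the paper.

```python
import numpy as np

def formant_rmse(transformed, target):
    """Root mean square error between two equal-length formant sequences (Hz)."""
    transformed = np.asarray(transformed, dtype=float)
    target = np.asarray(target, dtype=float)
    if transformed.shape != target.shape:
        raise ValueError("formant arrays must have the same shape")
    # Square the per-formant errors, average them, then take the root.
    return float(np.sqrt(np.mean((transformed - target) ** 2)))

# Hypothetical F1-F3 values for one vowel, for illustration only.
target = [730.0, 1090.0, 2440.0]
transformed = [700.0, 1150.0, 2400.0]
print(formant_rmse(transformed, target))
```

A lower value means the transformed formants sit closer to the target formants, which is the sense in which the abstract ranks the three techniques.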

Cites methods from "A Kepstrum based approach for enhan..."

  • ...The authors in [10] used Kepstrum analysis to improve the dysarthric speech quality in cerebral palsy patients....

    [...]

References
Book
01 Jan 1992
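TL;DR: A digital signal processing textbook covering discrete-time signals and systems, the z-transform and its application to the analysis of LTI systems, frequency analysis, the discrete Fourier transform and FFT algorithms, digital filter design, sampling, multirate processing, linear prediction, and power spectrum estimation.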
Abstract: 1. Introduction. 2. Discrete-Time Signals and Systems. 3. The Z-Transform and Its Application to the Analysis of LTI Systems. 4. Frequency Analysis of Signals and Systems. 5. The Discrete Fourier Transform: Its Properties and Applications. 6. Efficient Computation of the DFT: Fast Fourier Transform Algorithms. 7. Implementation of Discrete-Time Systems. 8. Design of Digital Filters. 9. Sampling and Reconstruction of Signals. 10. Multirate Digital Signal Processing. 11. Linear Prediction and Optimum Linear Filters. 12. Power Spectrum Estimation. Appendix A. Random Signals, Correlation Functions, and Power Spectra. Appendix B. Random Numbers Generators. Appendix C. Tables of Transition Coefficients for the Design of Linear-Phase FIR Filters. Appendix D. List of MATLAB Functions. References and Bibliography. Index.

3,911 citations

Proceedings Article
01 Jan 2006
TL;DR: The main result is that the widely used subset of the MFCCs is robust at bit rates equal to or higher than 128 kbits/s, for the implementations the authors have investigated.
Abstract: In large MP3 databases, files are typically generated with different parameter settings, i.e., bit rates and sampling rates. This is of concern for MIR applications, as encoding differences can potentially confound meta-data estimation and similarity evaluation. In this paper we discuss the influence of MP3 coding on the Mel frequency cepstral coefficients (MFCCs). The main result is that the widely used subset of the MFCCs is robust at bit rates equal to or higher than 128 kbits/s, for the implementations we have investigated. However, for lower bit rates, e.g., 64 kbits/s, the implementation of the Mel filter bank becomes an issue.

307 citations


"A Kepstrum based approach for enhan..." refers background or methods in this paper

  • ...Fried-Oken, J Staehely, “Improving the intelligibility of dysarthric speech”, Journal of speech communication, April 2007 [4] S....

    [...]

  • ...The filter equation is given by the following set of equations [4]...

    [...]

Journal Article
TL;DR: This study significantly improved the intelligibility of dysarthric vowels of one speaker from 48% to 54%, as evaluated by a vowel identification task using 64 CVC stimuli judged by 24 listeners.

161 citations

Journal Article
TL;DR: A kepstrum approach to minimum-phase Wiener filtering of stationary scalar processes is proposed and solved for the case of signal plus coloured noise, where the noise possibly includes a white-noise component.
Abstract: A kepstrum (or complex-cepstrum) approach to minimum-phase Wiener filtering of stationary scalar processes is proposed and solved for the case of signal plus coloured noise, where the noise possibly includes a white-noise component. A general solution is found in an innovations form. The spectral factorization of the noise model and of the signal-plus-noise model required for the solution are determined from data using the kepstrum technique with the fast Fourier transform. This approach avoids dependence on any form of multidimensional state-space or polynomial-based model and so avoids use of recursive parameter estimation or of Diophantine equations.
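The spectral factorization step mentioned in this abstract can be illustrated with the standard homomorphic (cepstral) method: take the inverse FFT of the log power spectrum, keep the causal half of the resulting kepstrum, and exponentiate. The sketch below assumes NumPy and an even FFT length; the function name `min_phase_factor` and the test filter are hypothetical, and this is the textbook construction rather than necessarily the exact procedure of the cited paper.

```python
import numpy as np

def min_phase_factor(power_spectrum):
    """Return H on the FFT grid with |H|**2 equal to the given power spectrum.

    power_spectrum: strictly positive, two-sided samples of the spectral
    density on an FFT grid of even length (an assumption of this sketch).
    """
    s = np.asarray(power_spectrum, dtype=float)
    n = len(s)
    if n % 2:
        raise ValueError("this sketch assumes an even FFT length")
    # Kepstrum coefficients: inverse FFT of the log spectrum (real and even).
    kep = np.fft.ifft(np.log(s)).real
    # Fold onto the causal side: halve the shared terms, keep positive lags.
    causal = np.zeros(n)
    causal[0] = 0.5 * kep[0]
    causal[1:n // 2] = kep[1:n // 2]
    causal[n // 2] = 0.5 * kep[n // 2]
    # Exponentiating a causal log-spectrum gives a minimum-phase response.
    return np.exp(np.fft.fft(causal))

# Usage: recover a known minimum-phase FIR filter from its power spectrum.
n = 256
h_true = np.array([1.0, 0.5])            # zero at z = -0.5, inside the unit circle
spectrum = np.abs(np.fft.fft(h_true, n)) ** 2
h_est = np.fft.ifft(min_phase_factor(spectrum)).real
print(np.round(h_est[:3], 6))
```

Because only FFTs, a logarithm, and an exponential are involved, no state-space or polynomial model has to be fitted, which is the advantage the abstract points to.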

19 citations


"A Kepstrum based approach for enhan..." refers background or methods in this paper

  • ...Lehn-Schiøler, “Mel frequency cepstral coefficients: An evaluation of robustness of mp3 encoded music”, ISMIR 2006, Victoria, Canada, 2006 [5] T....

    [...]

  • ...(7) Where, σV and σE are the noise variance and the innovations variance respectively [5][6]....

    [...]

  • ...Then the innovations of the Wiener filter are estimated [5]....

    [...]

Journal Article
TL;DR: The Fourier transform of the logarithm of the spectral density is shown to be a useful tool for spectral analysis of highly resonant random signals, and a smooth frequency response (Bode plot) can be obtained from it to identify the signal-generating process.
Abstract: The Fourier transform of the logarithm of spectral density is a useful tool for spectral analysis of random signals which are highly resonant. This is because the logarithm compresses the large peaks of the spectrum and a resulting power series expansion (kepstrum) can be truncated at a suitable length to suppress the higher frequencies. This paper utilizes the FFT in a similar form in order to obtain spectral smoothing. Several examples show the advantages of the method, including an analysis of the pitch and roll data of a container ship. It is also shown how a smooth frequency response (Bode plot) can be found to identify the signal-generating process. This technique is extended to systems with signal plus noise, and the identification then becomes equivalent to spectral factorization, a technique particularly useful in the determination of Kalman filters.
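The smoothing idea in this abstract (take the logarithm of the spectrum, transform it, and truncate the resulting kepstrum series) can be sketched in a few lines. The example assumes NumPy and a strictly positive two-sided power spectrum on an FFT grid; the function name `kepstrum_smooth`, the parameter `n_keep`, and the synthetic spectrum are hypothetical choices for illustration, not the paper's own code or data.

```python
import numpy as np

def kepstrum_smooth(power_spectrum, n_keep):
    """Smooth a power spectrum by keeping only low-order kepstrum terms.

    power_spectrum: strictly positive, two-sided samples on an FFT grid.
    n_keep: number of kepstrum coefficients k_0 .. k_{n_keep-1} retained.
    """
    s = np.asarray(power_spectrum, dtype=float)
    n = len(s)
    if not 1 <= n_keep <= n // 2:
        raise ValueError("n_keep must lie between 1 and len(spectrum) // 2")
    # The logarithm compresses large spectral peaks before the transform.
    kep = np.fft.ifft(np.log(s)).real
    # Keep the low-order terms and their mirror images at negative lags.
    window = np.zeros(n)
    window[:n_keep] = 1.0
    window[n - n_keep + 1:] = 1.0
    # Transform back and undo the logarithm.
    return np.exp(np.fft.fft(kep * window).real)

# Usage: a smooth synthetic spectrum with a fast ripple superimposed.
n = 256
w = 2 * np.pi * np.arange(n) / n
clean = np.exp(1.0 + 0.5 * np.cos(w))
rippled = clean * np.exp(0.3 * np.cos(20 * w))
smoothed = kepstrum_smooth(rippled, n_keep=5)
print(np.max(np.abs(smoothed - clean)))
```

The truncation length plays the role of a smoothing parameter: a smaller `n_keep` removes more fine structure, at the risk of flattening genuine resonances.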

16 citations


"A Kepstrum based approach for enhan..." refers background in this paper

  • ...(7) Where, σV and σE are the noise variance and the innovations variance respectively [5][6]....

    [...]

  • ...Barrett, “A kepstrum approach for filtering, smoothing and prediction”, Research letters information for maths and science, volume 3, pp-135-147, April 2002 [6] J....

    [...]