A Quantitative Assessment of Group Delay Methods for Identifying Glottal Closures in Voiced Speech
Citations
Epoch Extraction From Speech Signals
Estimation of Glottal Closure Instants in Voiced Speech Using the DYPSA Algorithm
Detection of Glottal Closure Instants From Speech Signals: A Quantitative Review
Inference of Room Geometry From Acoustic Impulse Responses
References
Digital Processing of Speech Signals
The sliding DFT
Least squares glottal inverse filtering from the acoustic speech waveform
Modeling of the glottal flow derivative waveform with application to speaker identification
Frequently Asked Questions (12)
Q2. What is the effect of noise on the median of a window?
For impulses near the centre of the window, the summation in (9) lies on or near the negative real axis and so for positive SNR values, the noise has little effect on the median of .
Q3. What is the definition of the identification rate of a measure?
The authors define the identification rate of a measure to be the fraction of larynx cycles that contain exactly one NZC and the detection rate to be the fraction that contain either one or two NZCs.
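The two rates defined above can be computed directly from per-cycle NZC counts. The following is a minimal sketch, assuming the number of NZCs falling in each larynx cycle has already been counted; the function name and the input list are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def identification_and_detection_rates(nzc_counts: Sequence[int]) -> Tuple[float, float]:
    """Return (identification rate, detection rate).

    nzc_counts[i] is the number of NZCs found in larynx cycle i
    (a hypothetical, pre-computed input).
    """
    if not nzc_counts:
        raise ValueError("need at least one larynx cycle")
    total = len(nzc_counts)
    # Identification rate: fraction of cycles with exactly one NZC.
    exactly_one = sum(1 for c in nzc_counts if c == 1)
    # Detection rate: fraction of cycles with either one or two NZCs.
    one_or_two = sum(1 for c in nzc_counts if c in (1, 2))
    return exactly_one / total, one_or_two / total


# Illustrative counts for ten cycles (made-up values, not data from the paper).
counts = [1, 1, 2, 0, 1, 3, 1, 2, 1, 1]
ident_rate, detect_rate = identification_and_detection_rates(counts)
```

By construction the detection rate can never be lower than the identification rate, since every cycle with exactly one NZC also counts towards detection.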
Q4. What is the effect of noise on the median value of a measure?
It follows that the noise will not affect the median value of unless the noise amplitude is large enough to cause the value of the summation to cross the positive real axis where there is a discontinuity in the function.
Q5. How do the accuracy and identification rate of a measure change as the window length increases?
As the window length is increased, the accuracy steadily worsens but the identification rate improves and reaches a peak of over 90% at a window length of 10 ms.
Q6. How many repetitions of the following sentences were recorded?
The database includes ten repetitions from each of ten British English speakers (five male, five female) of the following sentences:
Q7. How does a measure perform with a 4 ms window?
To take a specific example, the measure is identified by circles and the authors see from the first point on the graph that for a 4 ms window, its identification accuracy is 0.34 ms but its identification rate is only 36%.
Q8. How often can the GCI be detected when the closest NZCs are combined with the single-NZC cycles?
For this example, the standard deviation of these “closest” NZCs is 0.97 ms, and combining them with the single-NZC cycles allows the GCI to be detected in over 97% of larynx cycles with a standard deviation of 0.6 ms.
Q9. Why is it possible to have multiple impulses in the analysis window?
It is possible for the analysis window to contain multiple impulses either because the window is longer than the pulse period or because, as is often the case with the LPC residual, the signal includes additional pulses or other impulsive features.
Q10. How can the authors reduce the computational cost of the measures?
The authors have shown how the computational cost of all the measures can be reduced greatly by calculating them recursively provided that a suitable window function is used.
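To illustrate the general idea of recursive computation over a sliding window, the sketch below updates a single DFT bin from one sample position to the next using the well-known sliding DFT recurrence, instead of recomputing the full sum each time. It assumes a rectangular window and is not necessarily the recursion or window function used by the authors; the function names are hypothetical.

```python
import cmath
from collections import deque
from typing import Iterable, Iterator, Sequence


def sliding_dft_bin(samples: Iterable[float], window_len: int, k: int) -> Iterator[complex]:
    """Yield bin k of the DFT of the most recent window_len samples.

    Each update costs O(1): remove the oldest sample, add the newest,
    then rotate by one twiddle factor. The window is rectangular and
    is treated as zero-filled before the first sample arrives.
    """
    twiddle = cmath.exp(2j * cmath.pi * k / window_len)
    buf = deque([0.0] * window_len, maxlen=window_len)
    value = 0j
    for x in samples:
        oldest = buf[0]      # sample about to leave the window
        buf.append(x)        # maxlen drops the oldest automatically
        value = (value - oldest + x) * twiddle
        yield value


def direct_dft_bin(window: Sequence[float], k: int) -> complex:
    """Reference O(N) computation of DFT bin k, used only for checking."""
    n = len(window)
    return sum(x * cmath.exp(-2j * cmath.pi * k * m / n) for m, x in enumerate(window))


# Made-up test signal; compare the recursive result with the direct sum.
signal = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 1.0]
recursive = list(sliding_dft_bin(signal, window_len=4, k=1))
max_error = max(
    abs(recursive[i] - direct_dft_bin(signal[i - 3:i + 1], 1))
    for i in range(3, len(signal))
)
```

Once the window is full, each recursive value agrees with the direct sum to within floating-point rounding, while the per-sample cost no longer depends on the window length.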
Q11. How many NZCs do the remaining larynx cycles contain?
Of the remaining 12% of larynx cycles, over three quarters contain exactly two NZCs; in most cases these occur at glottal opening and closure, respectively, giving rise to the histogram shown in Fig. 8(b).
Q12. What is the detection rate for the and measures?
The and measures again show the best performance and reach a detection rate of 97.1% for window lengths of 8 ms and 7 ms, respectively.