Syllable nucleus Durations Estimation using Linear Regression based ensemble model

doi:10.1109/ICASSP.2009.4960717

Proceedings Article

Jingli Lu, +3 more

pp. 4849-4852

TLDR

An interval-data-based Linear Regression Model for syllable nucleus Durations Estimation (LRM-DE) treats syllable boundary time-marks in pairs, which makes it better suited to estimating syllable durations in English sentences. These durations can be used for sentence stress detection.

Abstract:

Unlike conventional automatic continuous speech segmentation models, which deal with each boundary time-mark individually, this paper proposes an interval-data-based Linear Regression Model for syllable nucleus Durations Estimation (LRM-DE), which treats syllable boundary time-marks in pairs. This characteristic makes LRM-DE better suited to estimating syllable durations in English sentences, which can be used for sentence stress detection. LRM-DE combines the outputs of multiple base automatic speech segmentation machines (ASMs) to generate final boundary time-marks that minimize the average distance between the predicted and reference boundary pairs of syllable nuclei. Experimental results on the TIMIT dataset show that LRM-DE reduces the average difference between the predicted and reference syllable nucleus durations from 13.64 ms (the best result of a single ASM) to 11.81 ms. LRM-DE also improves syllable nucleus segmentation accuracy from 81.59% to 83.98% within a tolerance of 20 ms.
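
The idea of combining several ASMs by linear regression over boundary pairs can be illustrated with a minimal sketch. It is not the paper's formulation, which the abstract does not spell out. The assumptions here are: one weight per ASM, shared by the start and end boundary of each nucleus so that the pair is fitted jointly; ordinary least squares as the fitting criterion; no bias term; and accuracy counted per boundary. All function names and the toy time-marks (in ms) are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data is left untouched.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_weights(asm_pairs, ref_pairs):
    """Least-squares weights, one per ASM (assumed shared by start and end).

    asm_pairs[i][k] is the (start, end) pair from ASM k for nucleus i;
    ref_pairs[i] is the reference (start, end) pair for nucleus i.
    """
    rows, targets = [], []
    for preds, (ref_start, ref_end) in zip(asm_pairs, ref_pairs):
        rows.append([p[0] for p in preds])  # start boundaries of all ASMs
        targets.append(ref_start)
        rows.append([p[1] for p in preds])  # end boundaries of all ASMs
        targets.append(ref_end)
    k = len(rows[0])
    # Normal equations: (A^T A) w = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(ata, aty)


def combine(preds, weights):
    """Weighted combination of one nucleus's (start, end) pairs."""
    start = sum(w * p[0] for w, p in zip(weights, preds))
    end = sum(w * p[1] for w, p in zip(weights, preds))
    return start, end


def mean_duration_error(pred_pairs, ref_pairs):
    """Average absolute difference between predicted and reference durations."""
    diffs = [abs((pe - ps) - (re - rs))
             for (ps, pe), (rs, re) in zip(pred_pairs, ref_pairs)]
    return sum(diffs) / len(diffs)


def accuracy_within(pred_pairs, ref_pairs, tol_ms=20.0):
    """Fraction of boundaries within tol_ms of the reference (assumed metric)."""
    hits = total = 0
    for pred, ref in zip(pred_pairs, ref_pairs):
        for p, r in zip(pred, ref):
            hits += abs(p - r) <= tol_ms
            total += 1
    return hits / total


# Toy data in ms: three nuclei, two hypothetical ASMs (A and B).
asm_pairs = [
    [(100.0, 180.0), (110.0, 200.0)],
    [(300.0, 390.0), (290.0, 370.0)],
    [(520.0, 600.0), (540.0, 610.0)],
]
ref_pairs = [(105.0, 190.0), (295.0, 380.0), (530.0, 605.0)]

weights = fit_weights(asm_pairs, ref_pairs)
ensemble_pairs = [combine(preds, weights) for preds in asm_pairs]
asm_a_pairs = [preds[0] for preds in asm_pairs]

print("weights:", weights)
print("ASM A duration error (ms):", mean_duration_error(asm_a_pairs, ref_pairs))
print("ensemble duration error (ms):", mean_duration_error(ensemble_pairs, ref_pairs))
print("ensemble accuracy @20 ms:", accuracy_within(ensemble_pairs, ref_pairs))
```

In practice the weights would be fitted on training utterances and evaluated on held-out ones; the toy data above is fitted and scored on the same three nuclei only to keep the sketch short.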
