Proceedings Article

Syllable nucleus Durations Estimation using Linear Regression based ensemble model

TLDR
LRM-DE, an interval-data-based Linear Regression Model for syllable nucleus Durations Estimation, treats syllable boundary time-marks in pairs, which makes it better suited to estimating syllable durations for English sentences; these durations can be used for sentence stress detection.
Abstract
Unlike conventional automatic continuous speech segmentation models, which deal with each boundary time-mark individually, in this paper we propose an interval-data-based Linear Regression Model for syllable nucleus Durations Estimation (LRM-DE), which treats syllable boundary time-marks in pairs. This characteristic makes LRM-DE more suitable for estimating syllable durations for English sentences, which can be used for sentence stress detection. LRM-DE combines the outcomes of multiple base automatic speech segmentation machines (ASMs) to generate final boundary time-marks that minimize the average distance between the predicted and reference boundary-pairs of syllable nuclei. Experimental results show that, on the TIMIT dataset, LRM-DE reduces the average difference between the predicted syllable nucleus durations and the reference durations from 13.64 ms (the best result of a single ASM) to 11.81 ms. LRM-DE also improves the syllable nucleus segmentation accuracy from 81.59% to 83.98% within a tolerance of 20 ms.
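The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes an ordinary least-squares fit that maps the start and end time-marks from all base ASMs jointly onto the reference (start, end) pair, and it evaluates the two quantities the abstract reports: mean absolute duration difference and the share of boundaries within a 20 ms tolerance. The function names, the synthetic data, and the choice of plain least squares are all assumptions made for illustration.

```python
import numpy as np

def _design(base_pairs):
    # Flatten (N, K, 2) time-marks to (N, 2K) and append a bias column.
    n = base_pairs.shape[0]
    return np.hstack([base_pairs.reshape(n, -1), np.ones((n, 1))])

def fit_pair_regression(base_pairs, ref_pairs):
    """Least-squares weights mapping all base ASM (start, end) marks
    to the reference (start, end) pair. Times are in milliseconds."""
    W, *_ = np.linalg.lstsq(_design(base_pairs), ref_pairs, rcond=None)
    return W  # shape (2K + 1, 2)

def predict_pairs(W, base_pairs):
    return _design(base_pairs) @ W

def mean_duration_error(pred, ref):
    # Mean absolute difference between predicted and reference durations.
    pred_dur = pred[:, 1] - pred[:, 0]
    ref_dur = ref[:, 1] - ref[:, 0]
    return float(np.mean(np.abs(pred_dur - ref_dur)))

def boundary_accuracy(pred, ref, tol_ms=20.0):
    # Fraction of boundaries within tol_ms of the reference boundary.
    return float(np.mean(np.abs(pred - ref) <= tol_ms))

# Synthetic stand-in data (hypothetical, not TIMIT): 200 nuclei, 3 base ASMs.
rng = np.random.default_rng(0)
n, k = 200, 3
ref_start = np.cumsum(rng.uniform(150, 300, n))
ref = np.stack([ref_start, ref_start + rng.uniform(40, 120, n)], axis=1)
bias = np.array([5.0, -8.0, 3.0]).reshape(1, k, 1)  # per-ASM systematic offset
base = ref[:, None, :] + bias + rng.normal(0, 10, (n, k, 2))

W = fit_pair_regression(base, ref)
pred = predict_pairs(W, base)

for j in range(k):
    print(f"ASM {j}: duration error {mean_duration_error(base[:, j, :], ref):.2f} ms, "
          f"accuracy@20ms {boundary_accuracy(base[:, j, :], ref):.3f}")
print(f"ensemble: duration error {mean_duration_error(pred, ref):.2f} ms, "
      f"accuracy@20ms {boundary_accuracy(pred, ref):.3f}")
```

The sketch fits and scores on the same synthetic data for brevity; a fair comparison would fit the weights on one split and report the metrics on a held-out split.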


Citations
Proceedings Article

CASTLE: a computer-assisted stress teaching and learning environment for learners of English as a second language.

TL;DR: Describes the principle and functionality of the Computer-Assisted Stress Teaching and Learning Environment (CASTLE), which was proposed and developed to help learners of English as a Second Language (ESL) learn the stress patterns of English.
Proceedings Article

Computer Assisted Language Learning for Syllable-time Language Exposed Adults who are Learning a new Stress-Time Language

TL;DR: A preliminary study is discussed, one of a series of studies conducted to design a computer software system that helps learners of spoken English teach themselves; it indicates that students of spoken English have a considerable number of sentence stress problems.
References
Journal Article

Speech emotion recognition using hidden Markov models

TL;DR: This paper proposes a text-independent method of emotion classification of speech that uses short-time log frequency power coefficients (LFPC) to represent the speech signals and a discrete hidden Markov model (HMM) as the classifier.
Journal Article

Automatic phonetic segmentation

TL;DR: The most frequently used approach, based on a modified Hidden Markov Model (HMM) phonetic recognizer, is analyzed; a general framework for the local refinement of boundaries is proposed; and the performance of several pattern classification approaches is compared within this framework.
Proceedings Article

Refining segmental boundaries for TTS database using fine contextual-dependent boundary models

TL;DR: This paper proposes a post-refining method with fine contextual-dependent GMMs for the auto-segmentation task, reaching an accuracy of 90% when only 250 manually labeled sentences are provided to train the refining models.
Journal Article

MLP-based phone boundary refining for a TTS database

TL;DR: Presents a method for the automatic labeling of speech signals, which mainly involves the construction of a large database for a TTS synthesis system, and shows, based on subjective listening tests, that the database constructed using the proposed method produced results perceptually comparable to those of a hand-labeled database.
Journal Article

On Using Multiple Models for Automatic Speech Segmentation

TL;DR: A novel approach to automatic speech segmentation for unit-selection-based text-to-speech systems that uses multiple independent ASMs to produce a final boundary time-mark and increases the percentage of boundaries that deviate by less than 20 ms from the reference boundary.