Showing papers in "Computer Speech & Language in 1998"


Journal Article
TL;DR: The paper compares the two possible forms of model-based transforms: unconstrained, where any combination of mean and variance transform may be used, and constrained, which requires the variance transform to have the same form as the mean transform.

1,755 citations


Journal Article
TL;DR: In this article, a Markov model is used to give the most likely sequence of phrase breaks for the input part-of-speech tags for a text-to-speech synthesizer.

186 citations


Journal Article
TL;DR: PARADISE (PARAdigm for DIalogue System Evaluation), a general framework for evaluating and comparing the performance of spoken dialogue agents, is presented. It can be used both to make predictions about future versions of an agent and as feedback to the agent, so that the agent can learn to optimize its behaviour based on its experiences with users over time.

173 citations


Journal Article
TL;DR: The background, challenges and strategies are discussed, and a detailed methodology for ensuring that the gold standard is not fool's gold is presented.

102 citations


Journal Article
TL;DR: Looking across the history of MUC in the context of related evaluations, important lessons are drawn about the need for evaluation to evolve with the technology it evaluates, to balance costs against benefits and to weigh the divergent needs of the multiple stakeholders: developers, funders and users.

91 citations


Journal Article
TL;DR: The paper describes what is involved in natural language generation and how evaluation has figured in work in this area to date; it then examines a particular text generation application and the issues raised in assessing its performance on a variety of dimensions.

77 citations


Journal Article
TL;DR: This paper argues for a generalized, systematic, and fully automated testing and diagnosis facility as an integral part of the linguistic engineering cycle and gives a practical assessment of existing resources; both a flexible methodology and tools for competence and performance profiling are presented.

75 citations


Journal Article
TL;DR: In this article, a cooperative international evaluation of grapheme-to-phoneme (GP) conversion for text-to-speech synthesis in French is presented, and the results for eight systems are provided and analysed in some detail.

45 citations


Journal Article
TL;DR: A data-driven technique for extracting context-dependent grapheme-to-phoneme rules with dynamically minimized context lengths from a training lexicon is proposed; it can produce transcriptions rapidly enough to maintain real-time processing in a text-to-speech system.

43 citations


Journal Article
TL;DR: This paper describes how the best signal manipulation method was determined using perception tests and reports on further validations of the proposed PURR (Prosody Unveiling through Restricted Representation), which has proven to be suitable for test designs with naive listeners.

42 citations


Journal Article
TL;DR: An algorithm is detailed that transforms any higher-order hidden Markov model (HMM) to an equivalent first-order HMM, thereby avoiding the training of redundant parameters and making training of high-order HMMs practical for many applications.

Journal Article
TL;DR: The ARPA CSR programme was initiated in 1984 and the first full-scale evaluation began in 1989 and subsequently developed into two parallel strands in the form of the CSR and LVCSR programmes.

Journal Article
TL;DR: This paper aims to clarify the meaning of Kelvin's phrase "when you can measure what you are speaking about, and express it in numbers, you know something about it" and to provide an example of the kind of knowledge that can be expressed in numbers.

Journal Article
TL;DR: The experimental results showed that modelling both the mean and variance trajectories is consistently superior to modelling only the mean trajectory and results in significant improvements over the conventional HMM.

Journal Article
TL;DR: An instantaneous speaker adaptation method is presented that uses N-best decoding for continuous mixture-density hidden-Markov-model-based speech-recognition systems; it shows a 36.4% reduction in the error rates of speakers whose decoding using speaker-independent (SI) models is error-prone.

Journal Article
TL;DR: The basic version of OSTIA is reviewed and a new version is presented in which syntactic restrictions of the domain and/or range of the target transduction can effectively be taken into account.

Journal Article
TL;DR: The aim of the work is to maximize the use of data available in English for the ATIS task in the construction of the Italian system, by adapting acoustic models to speakers and by training a language model on translations of transcriptions.

Journal Article
TL;DR: A model of accenting based on a combination of syntactic/metrical and semantic/pragmatic considerations is outlined, showing how the interaction of various factors determines the locations of sentence accents in speech.

Journal Article
TL;DR: Preliminary work to explore the viability of using a human reference standard to assess the performance of speech recognizers has consisted of recording a suitable database, devising a method of degrading the speech in a controlled way, and conducting two sets of experiments on listeners to measure their responses to degraded speech and so establish a reference.

Journal Article
TL;DR: A proportional alignment decoding (PAD) algorithm is proposed for retraining; the resulting VDHMM/PAD method outperforms widely used state duration modelling methods, such as those using Poisson, gamma, Gaussian, bounded and non-parametric probability density functions.

Journal Article
TL;DR: An overview of the complete programme, and a brief summary of the content and results of the first campaign (1995–97) for each topic, is given.

Journal Article
TL;DR: This paper presents the concept of a scaled random trajectory segment model, which aims to overcome the modelling problem created by the fact that segment realizations of the same phonetic unit differ in length.