
Showing papers by "Sebastian Möller" published in 2006


Journal ArticleDOI
TL;DR: A new method is described for quantifying the quality degradation introduced by wide-band speech codecs via a one-dimensional impairment factor, based on auditory listening-only tests, which may be used for predicting speech quality in an instrumental way.
Abstract: A new method is described for quantifying the quality degradation introduced by wide-band speech codecs via a one-dimensional impairment factor. The method is based on auditory listening-only tests, but the resulting impairment factors may be used for predicting speech quality in an instrumental way, e.g., for network planning purposes. Following the method, auditory test results are first transformed to an overall quality rating scale, and then adjusted to rule out test-specific effects. The derived impairment factors fit into the common framework which is defined by the E-model for narrow-band telephone networks, and which is hereby extended towards wide-band speech transmission. This paper presents the necessary auditory test data, describes the derivation and adjustment methodology, and provides numerical values for a range of wide-band speech codecs. The values are tested for their robustness in case of codec tandems and adjusted to represent the effects of packet loss.

77 citations
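
The transformation step described in the abstract above (auditory scores mapped to an overall rating scale, from which an impairment factor is read off) can be sketched as follows. This is a minimal illustration, not the paper's procedure: it uses only the narrow-band E-model relation between the rating R and MOS from ITU-T G.107, inverts it numerically, and treats the impairment factor as the drop in R relative to a clean reference. The wide-band scale extension and the adjustment for test-specific effects are not reproduced, and the MOS values in the example are made up.

```python
def r_to_mos(r: float) -> float:
    """Narrow-band E-model mapping from rating R to estimated MOS (ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def mos_to_r(mos: float, lo: float = 6.5, hi: float = 100.0, tol: float = 1e-9) -> float:
    """Invert r_to_mos by bisection on [6.5, 100], where the mapping is increasing."""
    # Clamp to the MOS range that the search interval can actually produce.
    mos = min(max(mos, r_to_mos(lo)), r_to_mos(hi))
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if r_to_mos(mid) < mos:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def impairment_factor(mos_reference: float, mos_codec: float) -> float:
    """Assumed simplification: impairment = R(reference) - R(codec)."""
    return mos_to_r(mos_reference) - mos_to_r(mos_codec)


# Hypothetical listening-test means, for illustration only.
print(round(impairment_factor(mos_reference=4.2, mos_codec=3.6), 1))
```

Working on the R scale rather than directly on MOS matters because impairments are assumed to be additive on R in the E-model framework, which is what lets a single per-codec value be reused in planning calculations.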


Proceedings Article
01 Jan 2006
TL;DR: A new approach for facilitating usability evaluations is presented in which real users are replaced by error simulations derived from empirical observations of users’ erroneous behavior; if successful, it will help designers choose between system versions and lower testing costs at early phases of development.
Abstract: Proper usability evaluations of spoken dialogue systems are costly and cumbersome to carry out. In this paper, we present a new approach for facilitating usability evaluations which is based on user error simulations. The idea is to replace real users with simulations derived from empirical observations of users’ erroneous behavior. The simulated errors must cover both system-driven errors (e.g., due to poor speech recognition) and conceptual errors and slips of the user, because neither alone is predictive of perceived usability. The simulation is integrated into a workbench which produces reports of typical and rare errors, and which allows usability ratings to be predicted. If successful, this workbench will help designers in making choices between system versions and lower testing costs at early phases of development. Challenges to the approach are discussed and solutions proposed.

73 citations


Proceedings Article
01 Jan 2006
TL;DR: A mapping of the obtained dimensions onto the overall listening quality scores by means of a linear model revealed that “continuity” appears to be the most important dimension in terms of overall listening quality.
Abstract: It is the aim of the present paper to analyze the perceptual quality dimensions of modern telephone connections. Such connections differ from standard connections in their time-variant characteristics (e.g., due to Voice-over-IP transmission or due to noise reduction algorithms) and their user interfaces (e.g., hands-free terminals). With the help of two independent auditory experiments with subsequent multidimensional analyses, three perceptual dimensions were identified for a diverse set of stimuli. These dimensions were labeled “directness/frequency content”, “continuity”, and “noisiness”. Overall listening quality scores were collected in a separate experiment. A mapping of the obtained dimensions onto the overall listening quality scores by means of a linear model revealed that “continuity” appears to be the most important dimension in terms of overall listening quality. Index Terms: assessment and modeling of speech quality, quality dimensions, multidimensional analyses

34 citations
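
The mapping step in the abstract above, a linear model from stimulus coordinates on the three dimensions to overall quality scores, can be illustrated with an ordinary least-squares fit. This is only a sketch under stated assumptions: the coordinates, the scores, and the weights used to build them are synthetic and arbitrary, the helper names are hypothetical, and nothing here reproduces the paper's data or results.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear_model(features, targets):
    """Least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)  # [intercept, w1, w2, w3]


NAMES = ["directness/frequency content", "continuity", "noisiness"]

# Synthetic coordinates of six stimuli on the three dimensions (made up).
coords = [
    (0.9, -1.2, 0.3),
    (-0.4, 0.8, -1.0),
    (1.1, 0.5, 0.7),
    (-1.0, -0.6, -0.2),
    (0.2, 1.3, 1.0),
    (-0.8, -0.8, -0.8),
]
# Synthetic scores built from arbitrary weights so the fit can be checked.
true_weights = [3.0, 0.2, 0.6, -0.3]
scores = [
    true_weights[0] + sum(w * c for w, c in zip(true_weights[1:], row))
    for row in coords
]

weights = fit_linear_model(coords, scores)
for name, w in zip(NAMES, weights[1:]):
    print(f"{name}: {w:+.3f}")
```

Comparing the magnitudes of the fitted weights is one way to rank dimensions by importance, but it is only meaningful when the dimensions are on comparable scales.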


Journal ArticleDOI
TL;DR: Four experiments evaluating the speech output component of the INSPIRE spoken dialogue system, which provides speech control for devices located in a “smart” home environment, show a significant impact of agent and environmental factors, but not of task factors.

30 citations



Proceedings Article
01 May 2006
TL;DR: The problems which occurred during the database set-up, the invested effort, as well as the quality level which can be reached by the unit-selection speech synthesizer are discussed.
Abstract: In this paper, we describe the set-up process and an initial evaluation of a unit-selection speech synthesizer. The synthesizer is specific in that it is intended to speak with a prominent voice. As a consequence, only very limited resources were available for setting up the unit database. These resources have been extracted from an audio book, segmented with the help of an HMM-based wrapper, and then used with the non-uniform unit-selection approach implemented in the Bonn Open Synthesis System (BOSS). In order to adapt the database to the BOSS implementation, the label files were amended by phrase boundaries, converted to XML, amended by prosodic and spectral information, and then further converted to a MySQL relational database structure. The BOSS system selects units on the basis of this information, adding individual unit costs to the concatenation costs given by MFCC and F0 distances. The paper discusses the problems which occurred during the database set-up, the invested effort, as well as the quality level which can be reached by this approach.

6 citations
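
The selection principle mentioned in the abstract above, individual unit costs added to concatenation costs, is commonly solved as a shortest-path search over candidate units. The sketch below is a generic dynamic-programming version, not the BOSS implementation: the function name, the `(label, f0)` unit representation, and the toy F0-only costs are all assumptions made for illustration (the abstract mentions MFCC as well as F0 distances).

```python
def select_units(candidates, unit_cost, concat_cost):
    """Pick one candidate per target position, minimising total cost.

    candidates: one list of candidate units per target position.
    unit_cost(position, unit): cost of using `unit` at `position`.
    concat_cost(prev_unit, unit): cost of joining two consecutive units.
    """
    # best[i][j] = (cheapest cost of a path ending in candidate j at i, back-pointer)
    best = [[(unit_cost(0, u), None) for u in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for u in candidates[i]:
            options = [
                (best[i - 1][k][0] + concat_cost(p, u), k)
                for k, p in enumerate(candidates[i - 1])
            ]
            cost, back = min(options)
            row.append((cost + unit_cost(i, u), back))
        best.append(row)
    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return list(reversed(path)), total


# Toy data: each unit is a hypothetical (label, F0 in Hz) pair.
targets = [110.0, 120.0, 115.0]
candidates = [
    [("a1", 100.0), ("a2", 112.0)],
    [("b1", 121.0), ("b2", 140.0)],
    [("c1", 90.0), ("c2", 116.0)],
]

path, total = select_units(
    candidates,
    unit_cost=lambda i, u: abs(u[1] - targets[i]),    # distance to the target F0
    concat_cost=lambda prev, u: abs(prev[1] - u[1]),  # F0 jump at the join
)
print([label for label, _ in path], total)
```

Because each position only needs the cheapest path into every candidate of the previous position, the search stays linear in the number of positions rather than enumerating every combination.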