Showing papers by "Sebastian Möller" published in 2007


Journal ArticleDOI
TL;DR: The results suggest that both questionnaire methods provide valid measurements of a large number of different quality aspects; most of the perceptive dimensions underlying the subjective judgments can also be measured with a high reliability.

68 citations


Proceedings ArticleDOI
27 Aug 2007
TL;DR: A new classification scheme for communication failures and their consequences is applied to two spoken dialogue systems; the failure classification may uncover the causes of interaction problems between user and system, irrespective of system complexity, and failure consequences can serve as a predictor of user satisfaction.
Abstract: Communication failures are typical for interactions with spoken dialogue systems, in particular when dialogues get less structured and less foreseeable. In this paper, we adopt a new classification scheme of communication failures and their consequences and show its usefulness in three respects: (1) For the systematic analysis of data collected in user testing, (2) for the prediction of user-perceived quality and usability, and (3) for the automatic testing of usability in a simulation testbed. Experimental results are presented for two spoken dialogue systems which differ in their dialogue structure and complexity. They show that the failure classification may uncover the causes of interaction problems between user and system, irrespective of system complexity, and that failure consequences can serve as a predictor of user satisfaction.

14 citations



Proceedings Article
01 Jan 2007
TL;DR: It is shown that predictions according to PARADISE can lead to accurate test results despite the usually low R².
Abstract: Automatic evaluation of spoken dialog systems has gained interest among researchers in the past years. In the PARADISE framework (Walker et al. 1997), a linear regression function is trained on a dialog corpus to predict user ratings of satisfaction from interaction parameters. The accuracy of such predictions is generally measured with R², which usually is rather low. In this paper, it is shown that predictions according to PARADISE can lead to accurate test results despite the low R².
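The regression setup described in this abstract can be illustrated with a minimal sketch: an ordinary least-squares fit that predicts a satisfaction rating from interaction parameters, followed by the R² computation. This is not the authors' implementation. The feature names (task success, number of turns, recognition error rate), the toy values, and the function names `fit_linear`, `predict`, and `r_squared` are all assumptions made for illustration.

```python
from typing import List, Sequence


def fit_linear(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares with an intercept, solved via the normal equations.

    Returns [intercept, w1, w2, ...].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


def predict(w: Sequence[float], X: Sequence[Sequence[float]]) -> List[float]:
    """Apply the fitted linear function to each row of X."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]


def r_squared(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy corpus: one row per dialog.
# Columns (assumed): task success (0/1), number of turns, recognition error rate.
X = [
    [1, 8, 0.10],
    [1, 12, 0.25],
    [0, 20, 0.40],
    [1, 10, 0.05],
    [0, 15, 0.35],
    [1, 9, 0.20],
]
y = [4.5, 3.5, 1.5, 5.0, 2.5, 3.0]  # hypothetical satisfaction ratings (1-5)

weights = fit_linear(X, y)
r2 = r_squared(y, predict(weights, X))
print("weights:", weights)
print("R^2:", r2)
```

The R² here is computed on the same data used for fitting, which is the simplest case; evaluating on held-out dialogs would require splitting the corpus first.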

13 citations


Proceedings ArticleDOI
01 Jan 2007
TL;DR: Despite efforts devoted to supporting natural, mixed-initiative dialog and to the prevention of communication failures, over one fourth of user utterances were problematic, often leading to stagnation or regression.
Abstract: Despite their basic attractiveness as an interaction paradigm for controlling intelligent environments, the design of spoken dialog systems for this purpose raises some usability challenges that require careful attention. This paper examines closely the communication failures that can occur in the control of one particular type of intelligent environment: a smart home system that provides control for multiple domestic devices through a state-of-the-art mixed-initiative spoken-dialog interface. The 24 participants completed several tasks with the INSPIRE system in a controlled experiment, and interaction failures were categorized with an error taxonomy that is related to more general error taxonomies but specialized to this class of systems. Despite efforts devoted to supporting natural, mixed-initiative dialog and to the prevention of communication failures, over one fourth of user utterances were problematic, often leading to stagnation or regression. The causes and consequences of these problems are discussed, along with their implications for the design of spoken dialog systems for intelligent environments.

12 citations


Patent
11 Sep 2007
TL;DR: A method is presented for determining a speech quality measure of an output speech signal with respect to an input speech signal, where the input signal passes through a signal path of a data transmission system to yield the output signal.
Abstract: A method for determining a speech quality measure of an output speech signal with respect to an input speech signal, wherein the input signal passes through a signal path of a data transmission system resulting in the output signal, includes the steps of pre-processing the output signal; determining at least one of an interruption rate of the pre-processed output signal and a measure for an intensity of musical tones present in the pre-processed output signal; and determining the speech quality measure from at least one of the interruption rate and the measure for the intensity of the musical tones.

4 citations


Proceedings ArticleDOI
01 Oct 2007
TL;DR: An instrumental method for estimating the quality-relevant dimension "directness / frequency content" (DF) is presented and is part of a framework for a signal-based approach to measure the quality of transmitted speech on the basis of perceptual dimensions.
Abstract: In this paper, an instrumental method for estimating the quality-relevant dimension "directness / frequency content" (DF) is presented. Apart from the perceptual dimensions "continuity" and "noisiness", DF has been identified to be of crucial importance for the listening quality of today's telecommunication networks [1], especially when it comes to linear distortions. The presented work is part of a framework for a signal-based approach to measure the quality of transmitted speech on the basis of perceptual dimensions [2].

3 citations


01 Jan 2007
TL;DR: A novel method for modeling and predicting the perceived quality of telephone conversations is evaluated and confirmed.
Abstract: Based on auditory and instrumental assessment of speech samples 5 to 6 seconds long, a novel method for modeling and predicting the perceived quality of telephone conversations was evaluated and confirmed. With this method, the quality attributed to 1- and 2-minute conversations can be captured considerably better than by the arithmetic mean of the individual ratings.

1 citation


01 Jan 2007
TL;DR: An adequate evaluation comprises two aspects: verifying the performance of the system components involved (e.g., speech recognition, speech understanding, dialog management, and speech output), and quantifying various quality aspects from the user's perspective, which generally requires controlled (laboratory) experiments with test participants.
Abstract: The better the possibilities for making environments "intelligent" through the use of speech technology, the greater the need for fast and economical evaluation. An adequate evaluation usually comprises two aspects: verifying the performance of the system components involved (e.g., speech recognition, speech understanding, dialog management, and speech output), and quantifying various quality aspects from the user's perspective, such as efficiency, comfort, usability, and acceptance. Because quality results from a process of perception and judgment, measuring these quality aspects generally requires controlled (laboratory) experiments with test participants.