Proceedings Article

Experiences of an In-Service Wizard-of-Oz Data Collection for the Deployment of a Call-Routing Application

Mats Wirén, Robert Eklund
26 Apr 2007, pp. 56-63

TL;DR: This paper describes the experiences of collecting a corpus of 42,000 dialogues for a call-routing application using a Wizard-of-Oz approach, provides a detailed exposition of the data collection and the application used, and compares the approach to previously used methods.

Abstract: This paper describes our experiences of collecting a corpus of 42,000 dialogues for a call-routing application using a Wizard-of-Oz approach. Contrary to common practice in the industry, we did not use the kind of automated application that elicits some speech from the customers and then sends all of them to the same destination, such as the existing touch-tone menu, without paying attention to what they have said. Contrary to the traditional Wizard-of-Oz paradigm, our data-collection application was fully integrated within an existing service, replacing the existing touch-tone navigation system with a simulated call-routing system. Thus, the subjects were real customers calling about real tasks, and the wizards were service agents from our customer care. We provide a detailed exposition of the data collection as such and the application used, and compare our approach to methods previously used.

Topics: Service (business) (52%)

Citations

Journal Article
TL;DR: The two-way mimicry target is presented, a model for measuring how well a human-computer dialogue mimics or replicates some aspect of human-human dialogue, including human flaws and inconsistencies.
Abstract: This paper presents an overview of methods that can be used to collect and analyse data on user responses to spoken dialogue system components intended to increase human-likeness, and to evaluate how well the components succeed in reaching that goal. Wizard-of-Oz variations, human-human data manipulation, and micro-domains are discussed in this context, as is the use of third-party reviewers to get a measure of the degree of human-likeness. We also present the two-way mimicry target, a model for measuring how well a human-computer dialogue mimics or replicates some aspect of human-human dialogue, including human flaws and inconsistencies. Although we have added a measure of innovation, none of the techniques is new in its entirety. Taken together and described from a human-likeness perspective, however, they form a set of tools that may widen the path towards human-like spoken dialogue systems.

133 citations


Cites background from "Experiences of an In-Service Wizard..."

  • ...Whittaker and Walker, 2006 ), where a reviewer is asked to listen in on a recorded dialogue and pass judgement on some aspect....



Book Chapter
16 Jun 2008
TL;DR: An experiment in the call routing domain that took place during the development of a call routing system for the TeliaSonera residential customer care in Sweden, using a corpus of 42,000 calls as a basis for identifying problematic dialogues and the strategies used by operators to overcome the problems.
Abstract: This paper presents a Wizard-of-Oz (Woz) experiment in the call routing domain that took place during the development of a call routing system for the TeliaSonera residential customer care in Sweden. A corpus of 42,000 calls was used as a basis for identifying problematic dialogues and the strategies used by operators to overcome the problems. A new Woz recording was made, implementing some of these strategies. The collected data is described and discussed with a view to exploring the possible benefits of more human-like dialogue behaviour in call routing applications.

17 citations


Proceedings Article
Johan Boye, Mats Wirén
26 Apr 2007
TL;DR: This paper describes the experiences of using a two-level multi-slot semantics as a way of meeting the problem of maintaining consistency among manually tagged utterances, and explores the ramifications of the approach with respect to classification, evaluation and dialogue design for call routing systems.
Abstract: Statistical classification techniques for natural-language call routing systems have matured to the point where it is possible to distinguish between several hundreds of semantic categories with an accuracy that is sufficient for commercial deployments. For category sets of this size, the problem of maintaining consistency among manually tagged utterances becomes limiting, as lack of consistency in the training data will degrade performance of the classifier. It is thus essential that the set of categories be structured in a way that alleviates this problem, and enables consistency to be preserved as the domain keeps changing. In this paper, we describe our experiences of using a two-level multi-slot semantics as a way of meeting this problem. Furthermore, we explore the ramifications of the approach with respect to classification, evaluation and dialogue design for call routing systems.

13 citations


Cites methods from "Experiences of an In-Service Wizard..."

  • ...However, elsewhere we have applied it to dialogues from the initial Wizard-of-Oz data collection for the TeliaSonera call routing system (Wirén et al. 2007)....



01 Jan 2010
TL;DR: This paper describes an experiment where open and directed prompts were alternated when collecting speech data for the deployment of a call-routing application, which is interesting in the light of the “many-options” hypothesis of filled pause production.
Abstract: This paper describes an experiment where open and directed prompts were alternated when collecting speech data for the deployment of a call-routing application. The experiment tested whether open and directed prompts resulted in any differences with respect to the filled pauses exhibited by the callers, which is interesting in the light of the “many-options” hypothesis of filled pause production. The experiment also investigated the effects of the prompts on utterance form and meaning of the callers.

9 citations


Cites background from "Experiences of an In-Service Wizard..."

  • ...As was pointed out in Wirén et al. (2006), however, two other observed differences were that there were no instances following the directed prompt where an already instantiated...



01 Jan 2011
Abstract: In the group of people with whom I have worked most closely, we recently attempted to dress our visionary goal in words: “to learn enough about human face-to-face interaction that we are able to ...

9 citations


Cites background or methods from "Experiences of an In-Service Wizard..."

  • ...The method is exemplified by the development of a Swedish natural language call routing system at TeliaSonera (Wirén et al., 2007, Boye & Wirén, 2007)....


  • ...During the development of the TeliaSonera customer care call centre (90 200), the open prompt “Vad gäller ditt ärende?”...


  • ...…wizards to speak clearly and removing samples from the speech signal (Lathrop et al., 2004); providing the wizards with a limited selection of pre-recorded system prompts (Wirén et al., 2007); and limiting the wizard to make choices within a strictly defined dialogue model (Tenbrink & Hui, 2007)....



References

Proceedings Article
01 Feb 1993
TL;DR: It is concluded that empirical studies of the unique qualities of man-machine interaction as distinct from general human discourse are required for the development of user-friendly interactive systems.
Abstract: Current approaches to the development of natural language dialogue systems are discussed, and it is claimed that they do not sufficiently consider the unique qualities of man-machine interaction as distinct from general human discourse. It is concluded that empirical studies of this unique communication situation are required for the development of user-friendly interactive systems. One way of achieving this is through the use of so-called Wizard of Oz studies. The focus of the work described in the paper is on the practical execution of the studies and the methodological conclusions drawn on the basis of the authors' experience. While the focus is on natural language interfaces, the methods used and the conclusions drawn from the results obtained are of relevance also to other kinds of intelligent interfaces.

865 citations


Journal Article
TL;DR: This paper focuses on the task of automatically routing telephone calls based on a user's fluently spoken response to the open-ended prompt of “How may I help you?”.
Abstract: We are interested in providing automated services via natural spoken dialog systems. By natural, we mean that the machine understands and acts upon what people actually say, in contrast to what one would like them to say. There are many issues that arise when such systems are targeted for large populations of non-expert users. In this paper, we focus on the task of automatically routing telephone calls based on a user's fluently spoken response to the open-ended prompt of “How may I help you?”. We first describe a database generated from 10,000 spoken transactions between customers and human agents. We then describe methods for automatically acquiring language models for both recognition and understanding from such data. Experimental results evaluating call-classification from speech are reported for that database. These methods have been embedded within a spoken dialog system, with subsequent processing for information retrieval and form-filling.

647 citations


"Experiences of an In-Service Wizard..." refers background or methods in this paper

  • ...One exception from this is the data collection for the original AT&T “How May I Help You” system (Gorin et al. 1997; Ammicht et al. 1999), which comprised three batches of transactions with live customers, each involving up to 12,000 utterances....


  • ...The sole such data collection that we are aware of was made for the original AT&T “How May I Help you” system (Gorin et al. 1997; Ammicht et al. 1999)....



Journal Article
TL;DR: The focus of the work described in the paper is on the practical execution of the studies and the methodological conclusions drawn on the basis of the authors' experience, and the methods used and the conclusions drawn are of relevance also to other kinds of intelligent interfaces.
Abstract: Current approaches to the development of natural language dialogue systems are discussed, and it is claimed that they do not sufficiently consider the unique qualities of man-machine interaction as...

480 citations


Journal Article
TL;DR: The “Wizard of Oz” technique for simulating future interactive technology and a partial taxonomy of such simulations is reviewed and a general Wizard of Oz methodology is suggested.
Abstract: This paper reviews the “Wizard of Oz” technique for simulating future interactive technology and develops a partial taxonomy of such simulations. The issues of particular relevance to Wizard of Oz simulations of speech input/output computer systems are discussed and some experimental variables and confounding factors are reviewed. A general Wizard of Oz methodology is suggested.

408 citations


Proceedings Article
21 Mar 1993
TL;DR: This work focuses here on selection of training and test data, evaluation of language understanding, and the continuing search for evaluation methods that will correlate well with expected performance of the technology in applications.
Abstract: The Air Travel Information System (ATIS) domain serves as the common task for DARPA spoken language system research and development. The approaches and results possible in this rapidly growing area are structured by available corpora, annotations of that data, and evaluation methods. Coordination of this crucial infrastructure is the charter of the Multi-Site ATIS Data COllection Working group (MADCOW). We focus here on selection of training and test data, evaluation of language understanding, and the continuing search for evaluation methods that will correlate well with expected performance of the technology in applications.

203 citations


"Experiences of an In-Service Wizard..." refers background in this paper

  • ...Other well-known instances are “Voyager” (Zue et al. 1989) and the individual ATIS collections (Hirschman et al. 1993) which involved up to a hundred subjects or (again) up to 12,000 utterances....

