Journal Article

Towards human-like spoken dialogue systems

01 Aug 2008 - Speech Communication (Elsevier Science Publishers B. V.) - Vol. 50, Iss. 8, pp. 630-645
TL;DR: The two-way mimicry target is presented, a model for measuring how well a human-computer dialogue mimics or replicates some aspect of human-human dialogue, including human flaws and inconsistencies.
About: This article is published in Speech Communication. The article was published on 2008-08-01 and is currently open access. It has received 145 citations to date.

Summary (2 min read)

Introduction

  • This is the submitted version of a paper published in The Journal of Product Innovation Management.
  • N.B. When citing this work, cite the original published paper.
  • This study explores some of this complexity by theoretically detailing and empirically examining the critical role that synchronization plays in the process of leveraging resources to create innovation.
  • To do so requires that firms “efficiently orchestrate their resources to innovate and outcompete their competitors in the global market” (De Massis, et al., 2018: 136); more specifically, resources and capabilities need to be managed and leveraged in a way that supports innovation (Lichtenstein and Brush, 2001).

2.1. Innovation

  • Prior research has established three concepts key to innovation: risk, slack, and resource recombination.
  • One aspect of BTOF is that organizational goals, or aspirations, are set in the context of the firm’s perceived performance relative to its rivals in the market.
  • According to the resource-based view, a central task of managers is the optimal allocation of scarce and valuable resources (Barney, 1991, Wernerfelt, 1984).

2.2.1. The role of managers

  • RO research concerns managerial strategy, decisions, and actions.
  • RO identifies managerial roles in three broad categories: (1) structuring, (2) bundling, and (3) leveraging.
  • Structuring captures the subprocesses associated with acquiring, accumulating, and divesting the resources; bundling encompasses the subprocesses associated with combining, integrating, and transforming resources into capabilities; and leveraging involves the deployment of the capabilities to create value (Helfat and Peteraf, 2003, Sirmon, et al., 2007).

2.2.2. Leveraging strategies

  • Of the three managerial roles, leveraging is arguably the most crucial.
  • Innovations using this leveraging strategy are most likely to produce improvements in existing products and expand existing markets.
  • Steve Jobs’ well-known pivot from using touch-screen technology for the iPhone to the iPad reconfigured an existing capability to target a new market opportunity he identified.
  • Examples include the creation and proliferation of telemedicine by firms such as CareClix, virtual reality headsets by such firms as Oculus Rift, and even commercial space flight by firms such as SpaceX.
  • In summary, the resource advantage strategy leverages current capabilities in a current market, the market opportunity strategy leverages current capabilities in a new market, and the entrepreneurial strategy leverages new capabilities in a new market.

2.2.3. Synchronization

  • Synchronization refers to the managerial process of integrating and coordinating RO actions within the firm to support and implement a specific leveraging strategy (Sirmon, et al., 2007, Sirmon, et al., 2011).
  • If a firm decides to adopt a new ‘direct to consumer’ distribution model as part of a market opportunity strategy, its managers will have to coordinate changes to internal processes across the organization. This potentially involves changes in human capital, information technology, and/or financial resources, among others, to ensure they all support the firm’s overall strategy.
  • This bias potentially leaves certain intrafirm groups with unidentified slack, while other groups may lack the resources needed to fully support the implementation of a firm’s leveraging strategy.
  • With greater synchronization, managers can coordinate resources across processes and between units, allocating resources to meet each unit’s specific needs and thereby enabling greater utilization of resources across the firm (Bonaccorsi and Lipparini, 1994, Emden, et al., 2006).
  • The authors predict that synchronization positively moderates the relationship between each leveraging strategy and innovation.

3.1. Differential Impact of Performance

  • Next, the authors suggest that the relationship among synchronization, leveraging strategy, and innovation is also influenced by the firm’s relative performance context, which plays an important role in the choice of leveraging strategy.
  • By increasing efficiencies and reducing friction, synchronization enables a poorly performing firm to build the slack resources it needs, which can then be deployed to build new capabilities as part of an entrepreneurial leveraging strategy.
  • At the same time, the authors ensured that the variance of the levels of each item was balanced so that each item had approximately the same probability of influencing a respondent’s evaluations of the scenarios.

4.2. Sample

  • To invite study participants, the authors used a random subsample of 600 entrepreneurial firms that had been identified by the entrepreneurship center of a large research university located in the southwestern United States.
  • Alternatively, the authors employ subgroup analysis to test their high and low performance predictions, as these hypotheses suggest changes in the strength of the relationship.
  • The authors’ findings are the first to provide empirical support for the contention that synchronization of the firm’s resources, capabilities, and strategies plays a critical role in the development of innovation (Sirmon, et al., 2011).


Citations
Journal Article
TL;DR: This paper explores durational aspects of pauses, gaps and overlaps in three different conversational corpora with a view to challenging claims about precision timing in turn-taking.

390 citations

Proceedings Article
04 Sep 2017
TL;DR: Key challenges are discussed, including designing for interruptability, reconsideration of the human metaphor, and issues of trust and data ownership. Addressing these challenges may lead to more widespread IPA use.
Abstract: Intelligent Personal Assistants (IPAs) are widely available on devices such as smartphones. However, most people do not use them regularly. Previous research has studied the experiences of frequent IPA users. Using qualitative methods we explore the experience of infrequent users: people who have tried IPAs, but choose not to use them regularly. Unsurprisingly infrequent users share some of the experiences of frequent users, e.g. frustration at limitations on fully hands-free interaction. Significant points of contrast and previously unidentified concerns also emerge. Cultural norms and social embarrassment take on added significance for infrequent users. Humanness of IPAs sparked comparisons with human assistants, juxtaposing their limitations. Most importantly, significant concerns emerged around privacy, monetization, data permanency and transparency. Drawing on these findings we discuss key challenges, including: designing for interruptability; reconsideration of the human metaphor; issues of trust and data ownership. Addressing these challenges may lead to more widespread IPA use.

260 citations


Additional excerpts

  • ...This humanness may set unrealistic expectations [17,34]....

    [...]

Proceedings Article
30 Mar 2009
TL;DR: This model enables the precise specification of incremental systems and hence facilitates detailed comparisons between systems, as well as giving guidance on designing new systems.
Abstract: We present a general model and conceptual framework for specifying architectures for incremental processing in dialogue systems, in particular with respect to the topology of the network of modules that make up the system, the way information flows through this network, how information increments are 'packaged', and how these increments are processed by the modules. This model enables the precise specification of incremental systems and hence facilitates detailed comparisons between systems, as well as giving guidance on designing new systems.

206 citations


Cites background from "Towards human-like spoken dialogue ..."

  • ...…give some examples of differences in system architectures that we want to capture, with respect to the topology of the network of modules that make up the system, the way information flows through this network and how the modules process information, in particular how they deal with incrementality....

    [...]

Proceedings Article
30 Mar 2009
TL;DR: This paper describes a fully incremental dialogue system that can engage in dialogues in a simple domain, number dictation, and shows that naive users preferred this system over a non-incremental version, and that it was perceived as more human-like.
Abstract: This paper describes a fully incremental dialogue system that can engage in dialogues in a simple domain, number dictation. Because it uses incremental speech recognition and prosodic analysis, the system can give rapid feedback as the user is speaking, with a very short latency of around 200ms. Because it uses incremental speech synthesis and self-monitoring, the system can react to feedback from the user as the system is speaking. A comparative evaluation shows that naive users preferred this system over a non-incremental version, and that it was perceived as more human-like.

149 citations


Cites background from "Towards human-like spoken dialogue ..."

  • ...Therefore, in order to make the task more feasible, we have chosen a very limited domain – what might be called a micro-domain (cf. Edlund et al., 2008): the dictation of number sequences....

    [...]

  • ...We will here only briefly describe the parts of the general model that are relevant for the exposition of our system....

    [...]

Book Chapter
20 May 2009
TL;DR: This book throws light on the paradigm shift in the area of HCI: rather than humans interacting directly with machines, computers should observe and understand human interaction, and support humans during their work and interaction in an implicit and proactive manner.
Abstract: This book integrates a wide range of research topics related to and necessary for the development of proactive, smart computers in the human interaction loop, including the development of audio-visual perceptual components for such environments; the design, implementation and analysis of novel proactive perceptive services supporting humans; the development of software architectures, ontologies and tools necessary for building such environments and services; as well as approaches for the evaluation of such technologies and services. The book is based on a major European Integrated Project, CHIL (Computers in the Human Interaction Loop), and throws light on the paradigm shift in the area of HCI: rather than humans interacting directly with machines, computers should observe and understand human interaction, and support humans during their work and interaction in an implicit and proactive manner.

103 citations


Cites background from "Towards human-like spoken dialogue ..."

  • ...This, however, increases the demands on a number of conversational abilities [32]....

    [...]

References
Journal Article
TL;DR: Evidence is reviewed suggesting that people are sometimes unaware of the existence of a stimulus that influenced a response, unaware of the existence of the response, and unaware that the stimulus has affected the response.
Abstract: Evidence is reviewed which suggests that there may be little or no direct introspective access to higher order cognitive processes. Subjects are sometimes (a) unaware of the existence of a stimulus that importantly influenced a response, (b) unaware of the existence of the response, and (c) unaware that the stimulus has affected the response. It is proposed that when people attempt to report on their cognitive processes, that is, on the processes mediating the effects of a stimulus on a response, they do not do so on the basis of any true introspection. Instead, their reports are based on a priori, implicit causal theories, or judgments about the extent to which a particular stimulus is a plausible cause of a given response. This suggests that though people may not be able to observe directly their cognitive processes, they will sometimes be able to report accurately about them. Accurate reports will occur when influential stimuli are salient and are plausible causes of the responses they produce, and will not occur when stimuli are not salient or are not plausible causes.

10,186 citations

Journal Article
01 Oct 1950-Mind

7,266 citations

01 Jan 2007
TL;DR: This book reveals how smart design is the new competitive frontier and explains how and why some products satisfy customers while others only frustrate them.
Abstract: Revealing how smart design is the new competitive frontier, this innovative book is a powerful primer on how--and why--some products satisfy customers while others only frustrate them.

7,238 citations

Book
01 Jan 1950
TL;DR: If the meaning of the words “machine” and “think” are to be found by examining how they are commonly used it is difficult to escape the conclusion that the meaning and the answer to the question, “Can machines think?” is to be sought in a statistical survey such as a Gallup poll.
Abstract: I propose to consider the question, “Can machines think?” This should begin with definitions of the meaning of the terms “machine” and “think”. The definitions might be framed so as to reflect so far as possible the normal use of the words, but this attitude is dangerous. If the meaning of the words “machine” and “think” are to be found by examining how they are commonly used it is difficult to escape the conclusion that the meaning and the answer to the question, “Can machines think?” is to be sought in a statistical survey such as a Gallup poll.

6,137 citations

Book
01 Jan 1988
TL;DR: Revealing how smart design is the new competitive frontier, this innovative book is a powerful primer on how--and why--some products satisfy customers while others only frustrate them.
Abstract: Revealing how smart design is the new competitive frontier, this innovative book is a powerful primer on how--and why--some products satisfy customers while others only frustrate them.

6,027 citations


"Towards human-like spoken dialogue ..." refers background in this paper

  • ...Others have made similar claims, see for example the HMI concept of mental models in (Norman, 1983), popularised in (Norman, 1998), and the discussion on conversational dialogue in (Allen et al., 2001)....

    [...]

  • ...Discussing human-like spoken dialogue systems implicitly requires that we formulate what ‘‘human-like” means, and the next two sections provide background on the concept of human-likeness....

    [...]

Frequently Asked Questions (9)
Q1. What contributions have the authors mentioned in the paper "Towards human-like spoken dialogue systems"?

In this paper, the authors present an analysis of how users perceive spoken dialogue systems in terms of other, more familiar things. 

In other words, tests for response implicitly test perception and understanding as well, but as response tests are considerably more expensive, it is prudent to test candidates using perception and understanding tests first.

Examples of such features include different durations, such as the lengths of pauses, overlapping speech and utterances, which can be accessed using speech activity detection; prosodic features, such as intensity and pitch; and turn-taking patterns, which can be accessed from SAD decisions and an interaction model (e.g. Brady, 1968). 
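
As a rough illustration of how such duration features might be derived from speech activity detection output, the sketch below classifies silences and simultaneous speech in a two-party dialogue. It is not the authors' method: the function names (joint_states, durations), the input format (lists of start and end times in seconds per speaker), and the simplified definitions are all assumptions made for this example. Here a pause is taken to be silence with the same single speaker before and after it, a gap is any other silence between stretches of speech, and an overlap is any stretch where both speakers are active.

```python
from typing import Dict, List, Tuple

Segment = Tuple[float, float]  # hypothetical SAD output: (start, end) in seconds


def joint_states(a: List[Segment], b: List[Segment]) -> List[Tuple[float, float, str]]:
    """Split the timeline into intervals labelled 'A', 'B', 'both' or 'none'."""
    # Every segment boundary is a point where the joint state may change.
    bounds = sorted({t for seg in a + b for t in seg})

    def active(segs: List[Segment], lo: float, hi: float) -> bool:
        # A speaker is active in [lo, hi] if one of their segments covers it.
        return any(s <= lo and hi <= e for s, e in segs)

    labelled: List[Tuple[float, float, str]] = []
    for lo, hi in zip(bounds, bounds[1:]):
        in_a, in_b = active(a, lo, hi), active(b, lo, hi)
        label = "both" if in_a and in_b else "A" if in_a else "B" if in_b else "none"
        if labelled and labelled[-1][2] == label:
            # Merge adjacent intervals that share a label.
            labelled[-1] = (labelled[-1][0], hi, label)
        else:
            labelled.append((lo, hi, label))
    return labelled


def durations(a: List[Segment], b: List[Segment]) -> Dict[str, List[float]]:
    """Collect pause, gap and overlap durations under the assumed definitions."""
    states = joint_states(a, b)
    out: Dict[str, List[float]] = {"pause": [], "gap": [], "overlap": []}
    for i, (lo, hi, label) in enumerate(states):
        if label == "both":
            out["overlap"].append(hi - lo)
        elif label == "none" and 0 < i < len(states) - 1:
            before, after = states[i - 1][2], states[i + 1][2]
            # Same single speaker on both sides: pause. Anything else: gap.
            same_speaker = before == after and before in ("A", "B")
            out["pause" if same_speaker else "gap"].append(hi - lo)
    return out


if __name__ == "__main__":
    speaker_a = [(0.0, 1.0), (1.5, 2.5), (4.0, 5.0)]
    speaker_b = [(2.25, 3.0)]
    print(durations(speaker_a, speaker_b))
```

With the example input, speaker A pauses between 1.0 s and 1.5 s, the two speakers overlap between 2.25 s and 2.5 s, and a gap separates B's turn from A's next one between 3.0 s and 4.0 s. A real system would also need thresholds for minimum silence length and a fuller interaction model, which this sketch leaves out.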

Nigel Ward’s humming machine (Ward & Tsukahara, 2000) is an example of a machine that can potentially make a human believe she is talking to another human, if presented in an appropriate context – like a telephone conversation where one person does most of the talking. 

the lack of realism in traditional Wizard-of-Oz collections, a method which was coined in-service Wizard-of-Oz data collection was introduced, in which the wizards were real customer care operators and the callers were real customers with real problems. 

If the wizards are allowed to use whatever means they are given to the best of their ability and any restrictions imposed on them are encoded in the software, then the wizards’ actions represent the target the component designer should aim at – an idea akin to Paek (2001), who suggests using human wizards’ behaviour as a gold standard for dialogue components. 

Other more novel twists include manipulating human-human dialogues on-line, effectively treating both participants as subjects and recording continuous judgments of some parameter by a panel of reviewers equipped with different kinds of audience response systems. 

There is also a growing interest for human-likeness in spoken dialogue systems amongst researchers (e.g. Philips, 2006, keynote speech, Interspeech, Pittsburgh, PA, US; Zue, 2007), and many researchers have made a case for anthropomorphism in spoken dialogue systems. 

For one thing, no reports of uncanny valley effects from users actually interacting with human-like spoken dialogue systems are known to us, possibly because spoken dialogue systems aiming at human-likeness are yet too immature.