Spoken language interaction with robots: Recommendations for future research
Matthew Marge, Carol Y. Espy-Wilson, Nigel Ward, Abeer Alwan, Yoav Artzi, Mohit Bansal, G.L. Blankenship, Joyce Y. Chai, Hal Daumé, Debadeepta Dey, Mary P. Harper, Thomas M. Howard, Casey Kennington, Ivana Kruijff-Korbayová, Dinesh Manocha, Cynthia Matuszek, Ross Mead, Raymond J. Mooney, Roger K. Moore, Mari Ostendorf, Heather Pon-Barry, Alexander I. Rudnicky, Matthias Scheutz, Robert St. Amant, Tong Sun, Stefanie Tellex, David Traum, Zhou Yu
TLDR
This article identifies key scientific and engineering advances needed to enable effective spoken language interaction with robots, and makes 25 recommendations involving eight general themes, including putting human needs first, better modeling the social and interactive aspects of language, improving robustness, creating new methods for rapid adaptation, and improving research infrastructure and resources.
About:
This article is published in Computer Speech & Language. The article was published on 2022-01-01 and is currently open access. It has received 38 citations to date. The article focuses on the topics: Computer science & Spoken language.
Citations
Journal Article
Voice User Interface Design
TL;DR: The authors recommend Voice User Interface Design as a book to read for further solutions to this problem.
Proceedings Article
Duplex Conversation: Towards Human-like Interaction in Spoken Dialogue Systems
TL;DR: The concept of full-duplex in telecommunication is used to demonstrate what a human-like interactive experience should be and how to achieve smooth turn-taking through three subtasks: user state detection, backchannel selection, and barge-in detection.
Proceedings Article
On the Utility of Self-Supervised Models for Prosody-Related Tasks
TL;DR: A new evaluation framework, “SUPERB-prosody,” is presented, consisting of three prosody-related downstream tasks and two pseudo tasks; the study concludes that SSL speech models are highly effective for prosody-related tasks.
Book Chapter
Socially Interactive Agent Dialogue
TL;DR: In this chapter, Traum, from the University of Southern California, discusses the importance of questions and answers in dialogue, as well as elaborations, explanations, suggestions, and inferences.
References
Journal Article
Human-Robot Interaction: Status and Challenges.
TL;DR: The current status of human–robot interaction (HRI) is reviewed, and key current research challenges for the human factors community are described.
Journal Article
Social eye gaze in human-robot interaction: a review
Henny Admoni, Brian Scassellati
TL;DR: In this article, the state of the art in social eye gaze for human-robot interaction (HRI) is reviewed in three categories defined by differences in goals and methods: a human-centered approach, which focuses on people's responses to gaze; a design-centered approach, which addresses the features of robot gaze behavior and appearance that improve interaction; and a technology-centered approach, which concentrates on the computational tools for implementing social gaze in robots.
Journal Article
The Benefits of Interactions with Physically Present Robots over Video-Displayed Agents
TL;DR: Questionnaire data support these behavioral findings and show that participants had an overall more positive interaction with the physically present robot than when it was shown on live video.
Book
Voice User Interface Design
TL;DR: In this article, the authors present a comprehensive and authoritative guide to voice user interface (VUI) design for automated speech recognition (ASR) systems, which is based on linguistics, psychology, and language technology, and is illustrated by examples drawn from the authors' work at Nuance Communications.
Proceedings Article
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous
TL;DR: In this article, a bank of embeddings that are jointly trained within Tacotron, a state-of-the-art end-to-end speech synthesis system, is proposed.