Journal ArticleDOI

A stochastic model of human-machine interaction for learning dialog strategies

TLDR
The experimental results show that it is indeed possible to find a simple criterion, a state space representation, and a simulated user parameterization in order to automatically learn a relatively complex dialog behavior, similar to one that was heuristically designed by several research groups.
Abstract
We propose a quantitative model for dialog systems that can be used for learning the dialog strategy. We claim that the problem of dialog design can be formalized as an optimization problem with an objective function reflecting different dialog dimensions relevant for a given application. We also show that any dialog system can be formally described as a sequential decision process in terms of its state space, action set, and strategy. With additional assumptions about the state transition probabilities and cost assignment, a dialog system can be mapped to a stochastic model known as a Markov decision process (MDP). A variety of data-driven algorithms for finding the optimal strategy (i.e., the one that optimizes the criterion) is available within the MDP framework, based on reinforcement learning. For effective use of the available training data we propose a combination of supervised and reinforcement learning: supervised learning is used to estimate a model of the user, i.e., the MDP parameters that quantify the user's behavior. Then a reinforcement learning algorithm is used to estimate the optimal strategy while the system interacts with the simulated user. This approach is tested for learning the strategy in an air travel information system (ATIS) task. The experimental results we present in this paper show that it is indeed possible to find a simple criterion, a state space representation, and a simulated user parameterization in order to automatically learn a relatively complex dialog behavior, similar to one that was heuristically designed by several research groups.
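The abstract's recipe (a slot-filling dialog cast as an MDP, with a learned user simulator standing in for real users during reinforcement learning) can be sketched in a few lines. The following is a minimal illustration, not the paper's actual system: the three slots, the user's answer probability, and the per-turn and failure costs are all invented parameters, and tabular Q-learning is used as one representative MDP solver.

```python
import random

# Toy slot-filling dialog, loosely inspired by the paper's ATIS setting.
# SLOTS, P_ANSWER, and the cost values are illustrative assumptions,
# not parameters from the paper.
SLOTS = ["origin", "destination", "date"]
P_ANSWER = 0.8        # simulated user answers a question with this probability
TURN_COST = 1.0       # each system turn adds a fixed cost
MISS_PENALTY = 10.0   # penalty per unfilled slot when presenting results

# Actions: ask for slot i, or present the retrieved results.
ACTIONS = list(range(len(SLOTS))) + ["present"]

def step(state, action, rng):
    """Simulated user model: returns (next_state, cost, done).
    state is a tuple of booleans, one per slot (filled or not)."""
    if action == "present":
        missing = sum(1 for filled in state if not filled)
        return state, TURN_COST + MISS_PENALTY * missing, True
    slots = list(state)
    if rng.random() < P_ANSWER:   # user answers the question
        slots[action] = True
    return tuple(slots), TURN_COST, False

def q_learning(episodes=5000, alpha=0.1, gamma=1.0, eps=0.2, seed=0):
    """Learn Q(state, action) = expected cost-to-go by interacting
    with the simulated user; lower Q is better."""
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        state, done = (False,) * len(SLOTS), False
        while not done:
            if rng.random() < eps:                       # explore
                a = rng.choice(ACTIONS)
            else:                                        # exploit
                a = min(ACTIONS, key=lambda b: Q.get((state, b), 0.0))
            nxt, cost, done = step(state, a, rng)
            target = cost if done else cost + gamma * min(
                Q.get((nxt, b), 0.0) for b in ACTIONS)
            Q[(state, a)] = (1 - alpha) * Q.get((state, a), 0.0) + alpha * target
            state = nxt
    return Q

def greedy(Q, state):
    """The learned strategy: the action minimizing expected cost."""
    return min(ACTIONS, key=lambda a: Q.get((state, a), 0.0))

Q = q_learning()
```

After training, `greedy(Q, (False, False, False))` asks for a slot rather than presenting immediately, while `greedy(Q, (True, True, True))` presents the results: the learned strategy recovers the "collect all slots, then answer" behavior that a dialog designer would typically hand-craft.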



Citations
Book

Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition

Dan Jurafsky, +1 more
TL;DR: This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora, to demonstrate how the same algorithm can be used for speech recognition and word-sense disambiguation.
Proceedings ArticleDOI

A Diversity-Promoting Objective Function for Neural Conversation Models

TL;DR: The authors proposed using Maximum Mutual Information (MMI) as the objective function in neural models to generate more diverse, interesting, and appropriate responses, yielding substantive gains in BLEU scores on two conversational datasets.
Posted Content

Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models

TL;DR: The authors extend the hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models.
Journal ArticleDOI

Partially observable Markov decision processes for spoken dialog systems

TL;DR: This paper cast a spoken dialog system as a partially observable Markov decision process (POMDP) and shows how this formulation unifies and extends existing techniques to form a single principled framework.
Proceedings ArticleDOI

Deep Reinforcement Learning for Dialogue Generation

TL;DR: This work simulates dialogues between two virtual agents, using policy gradient methods to reward sequences that display three useful conversational properties: informativity (non-repetitive turns), coherence, and ease of answering.
References
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Book

Fundamentals of speech recognition

TL;DR: This book presents a modelling framework for speech recognition that automates the labor-intensive, and therefore expensive, process of manually modeling speech.
Journal ArticleDOI

Reinforcement learning: a survey

TL;DR: Central issues of reinforcement learning are discussed, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
Book

Statistical Language Learning

TL;DR: In this article, Charniak presents statistical language processing from an artificial intelligence point of view in a text for researchers and scientists with a traditional computer science background; the approach is grounded in real text and therefore promises to produce usable results.