scispace - formally typeset
Open AccessProceedings ArticleDOI

Neural Belief Tracker: Data-Driven Dialogue State Tracking

TLDR
This work proposes a novel Neural Belief Tracking (NBT) framework which overcomes past limitations, matching the performance of state-of-the-art models which rely on hand-crafted semantic lexicons and outperforming them when such lexicons are not provided.
Abstract
One of the core components of modern spoken dialogue systems is the belief tracker, which estimates the user’s goal at every step of the dialogue. However, most current approaches have difficulty scaling to larger, more complex dialogue domains. This is due to their dependency on either: a) Spoken Language Understanding models that require large amounts of annotated training data; or b) hand-crafted lexicons for capturing some of the linguistic variation in users’ language. We propose a novel Neural Belief Tracking (NBT) framework which overcomes these problems by building on recent advances in representation learning. NBT models reason over pre-trained word vectors, learning to compose them into distributed representations of user utterances and dialogue context. Our evaluation on two datasets shows that this approach surpasses past limitations, matching the performance of state-of-the-art models which rely on hand-crafted semantic lexicons and outperforming them when such lexicons are not provided.

read more

Citations
More filters
Journal ArticleDOI

A Survey on Dialogue Systems: Recent Advances and New Frontiers

TL;DR: The authors divide existing dialogue systems into task-oriented and nontask-oriented models, then detail how deep learning techniques help them with representative algorithms and finally discuss some appealing research directions that can bring the dialogue system research into a new frontier.
Proceedings ArticleDOI

Sequicity: Simplifying Task-oriented Dialogue Systems with Single Sequence-to-Sequence Architectures

TL;DR: A novel, holistic, extendable framework based on a single sequence-to-sequence (seq2seq) model which can be optimized with supervised or reinforcement learning is proposed which significantly outperforms state- of-the-art pipeline-based methods on large datasets and retains a satisfactory entity match rate on out-of-vocabulary (OOV) cases where pipeline-designed competitors totally fail.
Proceedings ArticleDOI

Neural Approaches to Conversational AI

TL;DR: This tutorial surveys neural approaches to conversational AI that were developed in the last few years, and presents a review of state-of-the-art neural approaches, drawing the connection between neural approaches and traditional symbolic approaches.
Posted Content

A Simple Language Model for Task-Oriented Dialogue

TL;DR: SimpleTOD is a simple approach to task-oriented dialogue that uses a single causal language model trained on all sub-tasks recast as a single sequence prediction problem, which allows it to fully leverage transfer learning from pre-trained, open domain, causal language models such as GPT-2.
Journal ArticleDOI

Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset

TL;DR: This work introduces the the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains, and presents a schema-guided paradigm for task-oriented dialogue, in which predictions are made over a dynamic set of intents and slots provided as input.
References
More filters
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Journal Article

Dropout: a simple way to prevent neural networks from overfitting

TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Proceedings ArticleDOI

Glove: Global Vectors for Word Representation

TL;DR: A new global logbilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods and produces a vector space with meaningful substructure.
Journal Article

Visualizing Data using t-SNE

TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
Proceedings Article

Rectified Linear Units Improve Restricted Boltzmann Machines

TL;DR: Restricted Boltzmann machines were developed using binary stochastic hidden units that learn features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.
Related Papers (5)