Using recurrent neural networks for slot filling in spoken language understanding

doi:10.1109/TASLP.2014.2383614

Journal Article

Grégoire Mesnil, +10 more

- 01 Mar 2015 -

IEEE Transactions on Audio, Speech, and ...

- Vol. 23, Iss: 3, pp 530-539

TLDR

This paper implements and compares several RNN architectures, including Elman, Jordan, and hybrid variants, using the publicly available Theano neural network toolkit, and evaluates them on the well-known Airline Travel Information System (ATIS) benchmark.

Abstract:

Semantic slot filling is one of the most challenging problems in spoken language understanding (SLU). In this paper, we propose to use recurrent neural networks (RNNs) for this task, and present several novel architectures designed to efficiently model past and future temporal dependencies. Specifically, we implemented and compared several important RNN architectures, including Elman, Jordan, and hybrid variants. To facilitate reproducibility, we implemented these networks with the publicly available Theano neural network toolkit and completed experiments on the well-known airline travel information system (ATIS) benchmark. In addition, we compared the approaches on two custom SLU data sets from the entertainment and movies domains. Our results show that the RNN-based models outperform the conditional random field (CRF) baseline by 2% in absolute error reduction on the ATIS benchmark. We improve the state-of-the-art by 0.5% in the Entertainment domain, and 6.7% for the movies domain.
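The difference between the Elman and Jordan variants named in the abstract can be illustrated with a minimal forward-pass sketch. It is written in plain NumPy rather than Theano, uses untrained random weights, and omits the training procedure and any other details of the paper's models; the function names, parameter names, and layer sizes are all assumptions made for illustration. An Elman network feeds the previous hidden state back into the hidden layer, while a Jordan network feeds back the previous output distribution.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def init_params(vocab_size, emb_dim, n_hidden, n_labels, seed=0):
    """Random, untrained weights (hypothetical names and sizes)."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "E": rng.normal(0, scale, (vocab_size, emb_dim)),        # word embeddings
        "U": rng.normal(0, scale, (n_hidden, emb_dim)),          # input -> hidden
        "V_elman": rng.normal(0, scale, (n_hidden, n_hidden)),   # hidden -> hidden
        "V_jordan": rng.normal(0, scale, (n_hidden, n_labels)),  # output -> hidden
        "W": rng.normal(0, scale, (n_labels, n_hidden)),         # hidden -> output
    }

def tag_sequence(word_ids, params, variant="elman"):
    """Return one slot-label probability vector per input word."""
    E, U, W = params["E"], params["U"], params["W"]
    V = params[f"V_{variant}"]
    n_hidden, n_labels = U.shape[0], W.shape[0]
    # Elman feeds back the previous hidden state; Jordan the previous output.
    feedback = np.zeros(n_hidden if variant == "elman" else n_labels)
    outputs = []
    for w in word_ids:
        h = sigmoid(U @ E[w] + V @ feedback)
        y = softmax(W @ h)
        outputs.append(y)
        feedback = h if variant == "elman" else y
    return np.stack(outputs)

params = init_params(vocab_size=10, emb_dim=4, n_hidden=6, n_labels=3)
sentence = [1, 5, 2, 7]  # toy word indices standing in for an utterance
elman_probs = tag_sequence(sentence, params, "elman")
jordan_probs = tag_sequence(sentence, params, "jordan")
print(elman_probs.shape, jordan_probs.shape)
```

In a slot-filling setting, each row of the returned array would be read as a distribution over slot labels for the corresponding word; taking the argmax per row gives a predicted tag sequence.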

Citations

Posted Content

Deep Reinforcement Learning: An Overview

Yuxi Li

- 25 Jan 2017 -

arXiv: Learning

TL;DR: This work discusses core RL elements, including value function, in particular, Deep Q-Network (DQN), policy, reward, model, planning, and exploration, and important mechanisms for RL, including attention and memory, unsupervised learning, transfer learning, multi-agent RL, hierarchical RL, and learning to learn.

Posted Content

MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling

Paweł Budzianowski, +6 more

- 29 Sep 2018 -

arXiv: Computation and Language

TL;DR: The Multi-Domain Wizard-of-Oz dataset (MultiWOZ) is a fully labeled collection of human-human written conversations spanning multiple domains and topics.

Posted Content

Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces

Alice Coucke, +11 more

- 25 May 2018 -

arXiv: Computation and Language

TL;DR: The machine learning architecture of the Snips Voice Platform is presented, a software solution to perform Spoken Language Understanding on microprocessors typical of IoT devices that is fast and accurate while enforcing privacy by design, as no personal user data is ever collected.

Proceedings Article

Neural Belief Tracker: Data-Driven Dialogue State Tracking

Nikola Mrkšić, +4 more

TL;DR: This work proposes a novel Neural Belief Tracking (NBT) framework which overcomes past limitations, matching the performance of state-of-the-art models which rely on hand-crafted semantic lexicons and outperforming them when such lexicons are not provided.

Proceedings Article

Multi-Domain Joint Semantic Frame Parsing Using Bi-Directional RNN-LSTM.

Dilek Hakkani-Tur, +6 more

TL;DR: Experimental results show the power of a holistic multi-domain, multi-task modeling approach to estimate complete semantic frames for all user utterances addressed to a conversational system over alternative methods based on single domain/task deep learning.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art performance on ImageNet classification.

Journal Article

A fast learning algorithm for deep belief nets

Geoffrey E. Hinton, +2 more

- 01 Jul 2006 -

Neural Computation

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.

Proceedings Article

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty, +2 more

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.

Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty, +3 more

Journal Article

Finding Structure in Time

Jeffrey L. Elman

- 01 Mar 1990 -

Cognitive Science

TL;DR: This work develops a proposal, first described by Jordan (1986), that uses recurrent links to provide networks with a dynamic memory, and suggests a method for representing lexical categories and the type/token distinction.

Related Papers (5)

Long short-term memory

Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling.

Glove: Global Vectors for Word Representation

The ATIS spoken language systems pilot corpus

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding