Open Access · Proceedings Article · DOI

Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots

TL;DR: Wu et al. propose a sequential matching network (SMN) that first matches a response with each utterance in the context on multiple levels of granularity, and distills important matching information from each pair as a vector with convolution and pooling operations.
Abstract
We study response selection for multi-turn conversation in retrieval-based chatbots. Existing work either concatenates utterances in the context or finally matches a response with a highly abstract context vector, which may lose relationships among the utterances or important information in the context. We propose a sequential matching network (SMN) to address both problems. SMN first matches a response with each utterance in the context on multiple levels of granularity, and distills important matching information from each pair as a vector with convolution and pooling operations. The vectors are then accumulated in chronological order through a recurrent neural network (RNN) that models relationships among the utterances. The final matching score is calculated from the hidden states of the RNN. An empirical study on two public data sets shows that SMN significantly outperforms state-of-the-art methods for response selection in multi-turn conversation.
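The pipeline the abstract describes can be outlined in a few steps: build a similarity matrix between the response and each utterance, distill each matrix into a vector with convolution and pooling, and accumulate the vectors chronologically with a gated RNN. Below is a minimal numpy sketch of that flow, matching only at the word-embedding level (the paper also matches at a sequence-representation level); all shapes, parameter names, and the toy data are illustrative, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_pool(sim, kernel, pool=2):
    """Valid 2-D convolution + ReLU + max-pooling, flattened to a vector."""
    kh, kw = kernel.shape
    H, W = sim.shape[0] - kh + 1, sim.shape[1] - kw + 1
    conv = np.array([[np.sum(sim[i:i + kh, j:j + kw] * kernel)
                      for j in range(W)] for i in range(H)])
    conv = np.maximum(conv, 0.0)                  # ReLU
    ph, pw = H // pool, W // pool
    return conv[:ph * pool, :pw * pool].reshape(ph, pool, pw, pool).max(axis=(1, 3)).ravel()

def gru_step(h, x, p):
    """One step of a standard GRU cell."""
    sig = lambda a: 1 / (1 + np.exp(-a))
    z = sig(p['Wz'] @ x + p['Uz'] @ h)            # update gate
    r = sig(p['Wr'] @ x + p['Ur'] @ h)            # reset gate
    h_tilde = np.tanh(p['Wh'] @ x + p['Uh'] @ (r * h))
    return (1 - z) * h + z * h_tilde

def smn_score(context, response, p):
    """Match the response against each utterance, then accumulate in order."""
    vecs = [conv_pool(utt @ response.T, p['kernel']) for utt in context]
    h = np.zeros(p['Uz'].shape[0])
    for v in vecs:                                # chronological accumulation
        h = gru_step(h, v, p)
    return 1 / (1 + np.exp(-(p['w'] @ h)))        # score from final hidden state

# Toy run: 3 utterances of 6 words each, 4-dim embeddings, hidden size 5.
d, hid = 4, 5
p = {'kernel': rng.standard_normal((3, 3)), 'w': rng.standard_normal(hid)}
for name in ('Wz', 'Wr', 'Wh'):
    p[name] = rng.standard_normal((hid, 4)) * 0.1
for name in ('Uz', 'Ur', 'Uh'):
    p[name] = rng.standard_normal((hid, hid)) * 0.1
context = [rng.standard_normal((6, d)) for _ in range(3)]
response = rng.standard_normal((6, d))
score = smn_score(context, response, p)
```

Because every utterance is matched against the response before any abstraction happens, the per-pair vectors retain fine-grained matching signal, and the GRU over them models inter-utterance relationships, addressing the two information-loss problems the abstract identifies.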



Citations
Journal ArticleDOI

A Survey on Dialogue Systems: Recent Advances and New Frontiers

TL;DR: The authors divide existing dialogue systems into task-oriented and non-task-oriented models, detail how deep learning techniques help them with representative algorithms, and finally discuss some appealing research directions that can bring dialogue system research to a new frontier.
Proceedings ArticleDOI

Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network

TL;DR: This paper investigates matching a response with its multi-turn context using dependency information based entirely on attention, inspired by the Transformer in machine translation, and extends the attention mechanism in two ways, jointly introducing the two kinds of attention in one uniform neural network.
Book ChapterDOI

The Second Conversational Intelligence Challenge (ConvAI2)

TL;DR: To improve performance on multi-turn conversations with humans, future systems must go beyond single word metrics like perplexity to measure the performance across sequences of utterances (conversations)—in terms of repetition, consistency and balance of dialogue acts.
Journal ArticleDOI

A Deep Look into Neural Ranking Models for Information Retrieval

TL;DR: A deep look into the neural ranking models from different dimensions is taken to analyze their underlying assumptions, major design principles, and learning strategies to obtain a comprehensive empirical understanding of the existing techniques.
Book ChapterDOI

An Overview of Chatbot Technology

TL;DR: A historical overview of the evolution of the international community’s interest in chatbots is presented, and the motivations that drive the use of chatbots are discussed, and chatbots’ usefulness in a variety of areas is clarified.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
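The update rule summarized above — adaptive estimates of lower-order moments with initialization-bias correction — fits in a few lines. Here is a hedged numpy sketch of one Adam step applied to a toy quadratic; the hyperparameter defaults follow the commonly cited ones, and the toy objective is purely illustrative.

```python
import numpy as np

def adam_update(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: adaptive first/second moment estimates with bias correction."""
    m = beta1 * m + (1 - beta1) * grad           # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # second-moment (uncentered) estimate
    m_hat = m / (1 - beta1 ** t)                 # correct initialization bias
    v_hat = v / (1 - beta2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Minimize f(x) = (x - 3)^2 starting from x = 0.
theta, m, v = np.array([0.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    grad = 2 * (theta - 3)
    theta, m, v = adam_update(theta, grad, m, v, t, lr=0.05)
```

Dividing by the square root of the second-moment estimate gives each parameter its own effective step size, which is why the method behaves well on sparse or noisily scaled gradients.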
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
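The negative-sampling alternative to the hierarchical softmax mentioned above scores the true context word against a handful of sampled noise words. A minimal sketch of that per-pair objective, assuming dense center and context vectors (the vectors and the `neg_sampling_loss` helper name are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def neg_sampling_loss(v_center, u_pos, u_negs):
    """Skip-gram negative-sampling loss for one training pair:
    -log sigma(u_pos . v) - sum_k log sigma(-u_k . v)."""
    loss = -np.log(sigmoid(u_pos @ v_center))
    for u_neg in u_negs:
        loss -= np.log(sigmoid(-u_neg @ v_center))
    return loss

# A center vector aligned with the true context word and opposed to the
# sampled noise words incurs a much lower loss than the reverse.
v = np.array([1.0, 0.0])
good = neg_sampling_loss(v, u_pos=np.array([2.0, 0.0]),
                         u_negs=[np.array([-2.0, 0.0])] * 5)
bad = neg_sampling_loss(v, u_pos=np.array([-2.0, 0.0]),
                        u_negs=[np.array([2.0, 0.0])] * 5)
```

Only the positive word and the k sampled negatives are touched per update, so the cost per training pair is independent of vocabulary size.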
Posted Content

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

TL;DR: Advanced recurrent units that implement a gating mechanism, such as the long short-term memory (LSTM) unit and the recently proposed gated recurrent unit (GRU), are evaluated on sequence modeling; the GRU is found to be comparable to the LSTM, and both outperform the traditional tanh unit.
Posted Content

Theano: A Python framework for fast computation of mathematical expressions

Rami Al-Rfou and 111 others
TL;DR: The performance of Theano is compared against Torch7 and TensorFlow on several machine learning models and recently-introduced functionalities and improvements are discussed.