Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots
Yu Wu, Wei Wu, Chen Xing, Ming Zhou, Zhoujun Li
Proceedings of ACL 2017, Vol. 1, pp. 496–505
TLDR
Wu et al. propose a sequential matching network (SMN) that first matches a response with each utterance in the context on multiple levels of granularity, and distills important matching information from each pair as a vector with convolution and pooling operations.
Abstract
We study response selection for multi-turn conversation in retrieval-based chatbots. Existing work either concatenates utterances in context or matches a response with a highly abstract context vector, which may lose relationships among the utterances or important information in the context. We propose a sequential matching network (SMN) to address both problems. SMN first matches a response with each utterance in the context on multiple levels of granularity, and distills important matching information from each pair as a vector with convolution and pooling operations. The vectors are then accumulated in chronological order through a recurrent neural network (RNN) which models relationships among the utterances. The final matching score is calculated with the hidden states of the RNN. Empirical study on two public data sets shows that SMN can significantly outperform state-of-the-art methods for response selection in multi-turn conversation.
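The pipeline the abstract describes — per-utterance matching via a similarity matrix, convolution and pooling to distill a matching vector, then chronological accumulation with an RNN — can be sketched in NumPy. This is a toy illustration with random stand-ins for learned embeddings and parameters, not the authors' implementation; the dimensions, 3×3 filters, and GRU-style update are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_words, n_turns, n_filters, h = 16, 6, 3, 4, 8

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def match_pair(utt, resp, filters):
    """Distill one utterance-response pair into a matching vector:
    word-word similarity matrix -> convolution -> ReLU -> max-pooling."""
    M = utt @ resp.T                                   # (n_words, n_words)
    feats = []
    for f in filters:                                  # 3x3 conv kernels
        conv = np.array([[(M[i:i + 3, j:j + 3] * f).sum()
                          for j in range(M.shape[1] - 2)]
                         for i in range(M.shape[0] - 2)])
        feats.append(np.maximum(conv, 0.0).max())      # ReLU + max-pool
    return np.array(feats)                             # v_i, shape (n_filters,)

def gru_step(h_prev, x, P):
    """One GRU update accumulating matching vectors in order."""
    z = sigmoid(P['Wz'] @ x + P['Uz'] @ h_prev)        # update gate
    r = sigmoid(P['Wr'] @ x + P['Ur'] @ h_prev)        # reset gate
    n = np.tanh(P['Wn'] @ x + P['Un'] @ (r * h_prev))  # candidate state
    return (1 - z) * h_prev + z * n

# random stand-ins for learned word embeddings and parameters
context = [rng.standard_normal((n_words, d)) for _ in range(n_turns)]
response = rng.standard_normal((n_words, d))
filters = rng.standard_normal((n_filters, 3, 3))
P = {k: rng.standard_normal((h, n_filters if k[0] == 'W' else h)) * 0.1
     for k in ['Wz', 'Uz', 'Wr', 'Ur', 'Wn', 'Un']}
w_out = rng.standard_normal(h)

# match each utterance with the response, accumulate chronologically
state = np.zeros(h)
for utt in context:
    v = match_pair(utt, response, filters)
    state = gru_step(state, v, P)

score = sigmoid(w_out @ state)         # final matching score in (0, 1)
print(round(float(score), 4))
```

Matching before accumulation is the key design choice: each utterance interacts with the response while its word-level detail is still available, instead of being compressed into one context vector first.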
Citations
Journal ArticleDOI
A Survey on Dialogue Systems: Recent Advances and New Frontiers
TL;DR: The authors divide existing dialogue systems into task-oriented and non-task-oriented models, detail how deep learning techniques help each with representative algorithms, and discuss appealing research directions that can bring dialogue system research to a new frontier.
Proceedings ArticleDOI
Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network
TL;DR: This paper investigates matching a response with its multi-turn context using dependency information based entirely on attention, inspired by the Transformer in machine translation, and extends the attention mechanism in two ways, jointly introducing both kinds of attention in one uniform neural network.
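The two kinds of attention the TL;DR refers to both reduce to scaled dot-product attention from the Transformer: self-attention (query, key, and value all come from the same utterance) and cross-attention (query from the utterance, key and value from the response). A minimal NumPy sketch, with illustrative random inputs and dimensions:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V

rng = np.random.default_rng(1)
utt = rng.standard_normal((5, 8))       # 5 utterance words, dim 8
resp = rng.standard_normal((7, 8))      # 7 response words, dim 8

self_att = attention(utt, utt, utt)     # intra-utterance dependencies
cross_att = attention(utt, resp, resp)  # utterance-response dependencies
print(self_att.shape, cross_att.shape)  # (5, 8) (5, 8)
```

Both outputs live in the utterance's positions, so they can be stacked as extra representation channels before matching.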
Book ChapterDOI
The Second Conversational Intelligence Challenge (ConvAI2)
Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander H. Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Vlad Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W. Black, Alexander I. Rudnicky, Jason D. Williams, Joelle Pineau, Mikhail S. Burtsev, Jason Weston +18 more
TL;DR: To improve performance on multi-turn conversations with humans, future systems must go beyond word-level metrics like perplexity and measure performance across sequences of utterances (conversations) in terms of repetition, consistency, and balance of dialogue acts.
Journal ArticleDOI
A Deep Look into Neural Ranking Models for Information Retrieval
Jiafeng Guo, Yixing Fan, Liang Pang, Liu Yang, Qingyao Ai, Hamed Zamani, Chen Wu, W. Bruce Croft, Xueqi Cheng +8 more
TL;DR: This survey takes a deep look into neural ranking models from different dimensions, analyzing their underlying assumptions, major design principles, and learning strategies to obtain a comprehensive empirical understanding of the existing techniques.
Book ChapterDOI
An Overview of Chatbot Technology
TL;DR: A historical overview of the evolution of the international community's interest in chatbots is presented, the motivations that drive the use of chatbots are discussed, and chatbots' usefulness in a variety of areas is clarified.
References
Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
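The update rule Adam implements is short enough to write out: exponential moving averages of the gradient and its square, bias-corrected for their zero initialization. Below is a minimal NumPy sketch on a toy quadratic, using the paper's default hyperparameters except a larger learning rate (an assumption so the toy problem converges quickly):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update with bias-corrected moment estimates."""
    m = b1 * m + (1 - b1) * grad           # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * grad ** 2      # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)              # bias correction, step count t >= 1
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# minimize f(x) = x^2 starting from x = 3
theta, m, v = np.array([3.0]), np.zeros(1), np.zeros(1)
for t in range(1, 5001):
    grad = 2.0 * theta                     # analytic gradient of x^2
    theta, m, v = adam_step(theta, grad, m, v, t)
print(round(float(theta[0]), 3))           # theta ends near the minimum at 0
```

Dividing by the square root of the second moment gives each parameter its own effective step size, which is why Adam tolerates sparse or badly scaled gradients.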
Proceedings Article
Distributed Representations of Words and Phrases and their Compositionality
TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
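The phrase-finding method the TL;DR mentions scores adjacent word pairs by how much more often they co-occur than chance, discounting rare pairs. A minimal sketch of that scoring heuristic — the tiny corpus, `delta`, and `threshold` values here are illustrative assumptions:

```python
from collections import Counter

def find_phrases(tokens, delta=2.0, threshold=1e-4):
    """Score adjacent word pairs with the heuristic
    score(a, b) = (count(ab) - delta) / (count(a) * count(b))
    and return the pairs whose score exceeds the threshold."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    phrases = {}
    for (a, b), c in bigrams.items():
        score = (c - delta) / (unigrams[a] * unigrams[b])
        if score > threshold:              # frequent-enough collocation
            phrases[(a, b)] = score
    return phrases

corpus = ("new york is big . i love new york . "
          "new york has parks . the city is big .").split()
print(sorted(find_phrases(corpus)))        # → [('new', 'york')]
```

In the full pipeline, pairs above the threshold are merged into single tokens (e.g. `new_york`) and the pass is repeated to pick up longer phrases; the discount `delta` prevents pairs of rare words from scoring high by accident.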
Posted Content
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
TL;DR: Recurrent units that implement a gating mechanism, such as the long short-term memory (LSTM) unit and the recently proposed gated recurrent unit (GRU), are found to outperform traditional tanh units on sequence modeling, with the GRU performing comparably to the LSTM.
Posted Content
Theano: A Python framework for fast computation of mathematical expressions
The Theano Development Team: Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, et al.
TL;DR: The performance of Theano is compared against Torch7 and TensorFlow on several machine learning models and recently-introduced functionalities and improvements are discussed.