Proceedings ArticleDOI

Learning symmetric collaborative dialogue agents with dynamic knowledge graph embeddings

01 Jul 2017 - Vol. 1, pp 1766-1776
TL;DR: This paper proposes a neural model with dynamic knowledge graph embeddings that evolve as the dialogue progresses, modeling both structured knowledge and unstructured language; the accompanying dataset of 11K human-human dialogues exhibits interesting lexical, semantic, and strategic elements.
Abstract: We study a symmetric collaborative dialogue setting in which two agents, each with private knowledge, must strategically communicate to achieve a common goal. The open-ended dialogue state in this setting poses new challenges for existing dialogue systems. We collected a dataset of 11K human-human dialogues, which exhibits interesting lexical, semantic, and strategic elements. To model both structured knowledge and unstructured language, we propose a neural model with dynamic knowledge graph embeddings that evolve as the dialogue progresses. Automatic and human evaluations show that our model is both more effective at achieving the goal and more human-like than baseline neural and rule-based models.
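
As a rough sketch of the core idea (an illustration, not the authors' implementation; all module names and shapes below are hypothetical), entity embeddings over the private knowledge graph can be re-computed after every utterance, mixing neighbor information into each node's state:

```python
# Hypothetical sketch: knowledge graph embeddings that evolve as the dialogue
# progresses. Assumes PyTorch; the update rule is illustrative only.
import torch
import torch.nn as nn

class DynamicGraphEmbedding(nn.Module):
    def __init__(self, num_entities, dim):
        super().__init__()
        self.base = nn.Embedding(num_entities, dim)  # static entity features
        self.update = nn.GRUCell(dim, dim)           # recurrent per-node update

    def forward(self, state, mentioned, adjacency):
        # state: (num_entities, dim) current node embeddings
        # mentioned: LongTensor of entity ids referenced in the last utterance
        # adjacency: (num_entities, num_entities) row-normalized graph matrix
        messages = adjacency @ state            # aggregate neighbor information
        state = self.update(messages, state)    # evolve every node's state
        # re-emphasize entities that were just mentioned in the dialogue
        return state.index_add(0, mentioned, self.base(mentioned))
```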


Citations
Proceedings ArticleDOI
21 Aug 2018
TL;DR: QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often open-ended, unanswerable, or only meaningful within the dialog context, as shown in a detailed qualitative evaluation.
Abstract: We present QuAC, a dataset for Question Answering in Context that contains 14K information-seeking QA dialogs (100K questions in total). The dialogs involve two crowd workers: (1) a student who poses a sequence of freeform questions to learn as much as possible about a hidden Wikipedia text, and (2) a teacher who answers the questions by providing short excerpts from the text. QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often more open-ended, unanswerable, or only meaningful within the dialog context, as we show in a detailed qualitative evaluation. We also report results for a number of reference models, including a recent state-of-the-art reading comprehension architecture extended to model dialog context. Our best model underperforms humans by 20 F1, suggesting that there is significant room for future work on this data. Dataset, baseline, and leaderboard available at http://quac.ai.

690 citations
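
For context on the F1 numbers above, a minimal word-overlap F1 between a predicted and a reference answer span can be computed as below (a simplification; the official QuAC evaluation additionally aggregates over multiple human references):

```python
# Simplified token-level F1, the metric behind the "20 F1" human gap above.
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    pred, ref = prediction.split(), reference.split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```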

Proceedings Article
09 Feb 2017
TL;DR: The authors generalize the Seq2Seq approach by conditioning responses on both conversation history and external "facts", allowing the model to be versatile and applicable in an open-domain setting.
Abstract: Neural network models are capable of generating extremely natural sounding conversational interactions. However, these models have been mostly applied to casual scenarios (e.g., as “chatbots”) and have yet to demonstrate they can serve in more useful conversational applications. This paper presents a novel, fully data-driven, and knowledge-grounded neural conversation model aimed at producing more contentful responses. We generalize the widely-used Sequence-to-Sequence (Seq2Seq) approach by conditioning responses on both conversation history and external “facts”, allowing the model to be versatile and applicable in an open-domain setting. Our approach yields significant improvements over a competitive Seq2Seq baseline. Human judges found that our outputs are significantly more informative.

376 citations
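
A minimal sketch of the grounding idea, conditioning the decoder's initial state on both the conversation history and encoded facts (architecture details here are assumptions, not the paper's exact model):

```python
# Illustrative knowledge-grounded Seq2Seq: decoder conditioned on history + facts.
import torch
import torch.nn as nn

class GroundedSeq2Seq(nn.Module):
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.history_enc = nn.GRU(dim, dim, batch_first=True)
        self.facts_enc = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, history, facts, target):
        # history, facts, target: (batch, seq_len) token id tensors
        _, h_hist = self.history_enc(self.embed(history))
        _, h_facts = self.facts_enc(self.embed(facts))
        # fuse both contexts into the decoder's initial hidden state
        dec_out, _ = self.decoder(self.embed(target), h_hist + h_facts)
        return self.out(dec_out)  # logits for each target position
```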

Proceedings ArticleDOI
16 Jun 2017
TL;DR: The authors show for the first time that it is possible to train end-to-end models for negotiation, which must learn both linguistic and reasoning skills with no annotated dialogue states, and introduce dialogue rollouts, a planning technique that dramatically improves performance.
Abstract: Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions. Negotiations require complex communication and reasoning skills, but success is easy to measure, making this an interesting task for AI. We gather a large dataset of human-human negotiations on a multi-issue bargaining task, where agents who cannot observe each other’s reward functions must reach an agreement (or a deal) via natural language dialogue. For the first time, we show it is possible to train end-to-end models for negotiation, which must learn both linguistic and reasoning skills with no annotated dialogue states. We also introduce dialogue rollouts, in which the model plans ahead by simulating possible complete continuations of the conversation, and find that this technique dramatically improves performance. Our code and dataset are publicly available.

299 citations
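
The dialogue-rollout idea sketched in code, with `simulate_completion` and `reward` as assumed stubs for the learned dialogue model and the task score:

```python
# Illustrative dialogue rollouts: pick the candidate reply whose simulated
# conversation completions earn the highest average reward.
def choose_reply(candidates, simulate_completion, reward, n_rollouts=10):
    def expected_reward(reply):
        outcomes = (reward(simulate_completion(reply)) for _ in range(n_rollouts))
        return sum(outcomes) / n_rollouts
    return max(candidates, key=expected_reward)
```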

Proceedings ArticleDOI
04 Jun 2019
TL;DR: The authors propose an attention-based feature embedding that captures both entity and relation features in any given entity's neighborhood, and additionally encapsulate relation clusters and multi-hop relations in their model.
Abstract: The recent proliferation of knowledge graphs (KGs) coupled with incomplete or partial information, in the form of missing relations (links) between entities, has fueled a lot of research on knowledge base completion (also known as relation prediction). Several recent works suggest that convolutional neural network (CNN) based models generate richer and more expressive feature embeddings and hence also perform well on relation prediction. However, we observe that these KG embeddings treat triples independently and thus fail to cover the complex and hidden information that is inherently implicit in the local neighborhood surrounding a triple. To this effect, our paper proposes a novel attention-based feature embedding that captures both entity and relation features in any given entity’s neighborhood. Additionally, we also encapsulate relation clusters and multi-hop relations in our model. Our empirical study offers insights into the efficacy of our attention-based model and we show marked performance gains in comparison to state-of-the-art methods on all datasets.

298 citations
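
In the spirit of the description above, a single attention step over an entity's neighboring triples might look like this (a sketch under assumed shapes, not the paper's code):

```python
# Illustrative triple-level graph attention: an entity aggregates its
# (relation, tail) neighborhood with learned attention weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TripleAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(3 * dim, dim)  # project concatenated triple
        self.attn = nn.Linear(dim, 1)        # scalar attention score per triple

    def forward(self, head, rels, tails):
        # head: (dim,); rels, tails: (num_neighbors, dim)
        h = head.unsqueeze(0).expand(rels.size(0), -1)
        triples = self.proj(torch.cat([h, rels, tails], dim=-1))
        alpha = F.softmax(self.attn(torch.tanh(triples)), dim=0)
        return (alpha * triples).sum(dim=0)  # updated head embedding
```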

Proceedings ArticleDOI
01 Jan 2018
TL;DR: This paper proposes multi-hop knowledge graph reasoning with reward shaping: the reasoning module receives a reward of one when it reaches a node corresponding to an observed answer, and a reward estimated by a pre-trained reward shaping network when it reaches a node that does not.
Abstract: Approaches for multi-hop knowledge graph reasoning with reward shaping include a system and method of training a system to search relational paths in a knowledge graph. The method includes identifying, using a reasoning module, a plurality of first outgoing links from a current node in a knowledge graph; masking, using the reasoning module, one or more links from the plurality of first outgoing links to form a plurality of second outgoing links; rewarding the reasoning module with a reward of one when a node corresponding to an observed answer is reached; and rewarding the reasoning module with a reward identified by a reward shaping network when a node not corresponding to an observed answer is reached. In some embodiments, the reward shaping network is pre-trained.

282 citations
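
The shaped reward itself reduces to a few lines; `shaping_network` stands in for the pre-trained scoring model mentioned above:

```python
# Reward as described in the abstract: 1 for reaching an observed answer node,
# otherwise a soft score from a pre-trained reward shaping network.
def shaped_reward(node, observed_answers, shaping_network, query):
    if node in observed_answers:
        return 1.0
    return shaping_network(query, node)  # estimated reward for a near miss
```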

References
Proceedings Article
01 Jan 2015
TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and the authors propose to extend it by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Abstract: Neural machine translation is a recently proposed approach to machine translation. Unlike traditional statistical machine translation, neural machine translation aims at building a single neural network that can be jointly tuned to maximize the translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and consist of an encoder that encodes a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.

20,027 citations
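
The (soft-)search described above is commonly implemented as additive attention; a minimal PyTorch rendering for a single decoding step (shapes are assumed):

```python
# Additive attention: score every source position against the decoder state
# and return a weighted sum, replacing the single fixed-length vector.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.w_dec = nn.Linear(dim, dim, bias=False)
        self.w_enc = nn.Linear(dim, dim, bias=False)
        self.v = nn.Linear(dim, 1, bias=False)

    def forward(self, dec_state, enc_states):
        # dec_state: (dim,); enc_states: (src_len, dim)
        scores = self.v(torch.tanh(self.w_dec(dec_state) + self.w_enc(enc_states)))
        alpha = F.softmax(scores, dim=0)        # alignment weights over source
        return (alpha * enc_states).sum(dim=0)  # context vector for this step
```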

Proceedings Article
01 Jan 2010
TL;DR: Adaptive subgradient methods dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning, allowing the learner to find needles in haystacks in the form of very predictive but rarely seen features.
Abstract: We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning. Metaphorically, the adaptation allows us to find needles in haystacks in the form of very predictive but rarely seen features. Our paradigm stems from recent advances in stochastic optimization and online learning which employ proximal functions to control the gradient steps of the algorithm. We describe and analyze an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight. We give several efficient algorithms for empirical risk minimization problems with common and important regularization functions and domain constraints. We experimentally study our theoretical analysis and show that adaptive subgradient methods outperform state-of-the-art, yet non-adaptive, subgradient algorithms.

7,244 citations
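
The per-coordinate update at the heart of these methods is compact; a minimal AdaGrad step (eta and eps are conventional hyperparameter choices, not values from the paper):

```python
# Minimal AdaGrad step: per-coordinate learning rates shrink with the
# accumulated squared gradients, so rare-but-predictive features keep moving.
import numpy as np

def adagrad_step(params, grad, accum, eta=0.1, eps=1e-8):
    accum += grad ** 2                             # running sum of g_t^2
    params -= eta * grad / (np.sqrt(accum) + eps)  # adaptive update
    return params, accum
```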

Proceedings ArticleDOI
01 Mar 2016
TL;DR: The authors proposed using Maximum Mutual Information (MMI) as the objective function in neural models to generate more diverse, interesting, and appropriate responses, yielding substantive gains in BLEU scores on two conversational datasets.
Abstract: Sequence-to-sequence neural network models for generation of conversational responses tend to generate safe, commonplace responses (e.g., I don’t know) regardless of the input. We suggest that the traditional objective function, i.e., the likelihood of output (response) given input (message) is unsuited to response generation tasks. Instead we propose using Maximum Mutual Information (MMI) as the objective function in neural models. Experimental results demonstrate that the proposed MMI models produce more diverse, interesting, and appropriate responses, yielding substantive gains in BLEU scores on two conversational datasets and in human evaluations.

1,812 citations
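
One common way to apply the MMI objective is at reranking time; the sketch below uses the anti-LM form, score = log p(T|S) - lambda * log p(T), with the two scoring functions as assumed stubs:

```python
# Illustrative MMI reranking: down-weight generic responses by subtracting a
# scaled language-model score from the conditional likelihood.
def mmi_rerank(candidates, log_p_t_given_s, log_p_t, lam=0.5):
    def mmi_score(response):
        return log_p_t_given_s(response) - lam * log_p_t(response)
    return max(candidates, key=mmi_score)
```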

Proceedings Article
12 Feb 2016
TL;DR: The authors extend the hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and backoff n-gram models.
Abstract: We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and backoff n-gram models. We investigate the limitations of this and similar approaches, and show how its performance can be improved by bootstrapping the learning from a larger question-answer pair corpus and from pretrained word embeddings.

1,533 citations
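
The hierarchy described above can be sketched as two stacked recurrences, one over words and one over utterances (an illustration with assumed shapes, not the paper's exact architecture):

```python
# Illustrative hierarchical recurrent encoder-decoder: a word-level RNN encodes
# each utterance, a second RNN tracks the dialogue across utterances.
import torch
import torch.nn as nn

class HRED(nn.Module):
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.utterance_enc = nn.GRU(dim, dim, batch_first=True)  # word level
        self.context_enc = nn.GRU(dim, dim, batch_first=True)    # utterance level
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, utterances, target):
        # utterances: list of (1, len_i) token tensors for one dialogue
        # target: (1, tgt_len) tokens of the response being generated
        utt_vecs = torch.stack(
            [self.utterance_enc(self.embed(u))[1][0] for u in utterances], dim=1
        )                                    # (1, num_utterances, dim)
        _, ctx = self.context_enc(utt_vecs)  # (1, 1, dim) dialogue-level state
        dec_out, _ = self.decoder(self.embed(target), ctx)
        return self.out(dec_out)             # next-token logits
```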