Open Access Proceedings Article

GLoMo: Unsupervised Learning of Transferable Relational Graphs

TLDR
This work explores the possibility of learning generic latent relational graphs that capture dependencies between pairs of data units from large-scale unlabeled data and transferring the graphs to downstream tasks, and shows that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained.
Abstract
Modern deep transfer learning approaches have mainly focused on learning generic feature vectors from one task that are transferable to other tasks, such as word embeddings in language and pretrained convolutional features in vision. However, these approaches usually transfer unary features and largely ignore more structured graphical representations. This work explores the possibility of learning generic latent relational graphs that capture dependencies between pairs of data units (e.g., words or pixels) from large-scale unlabeled data and transferring the graphs to downstream tasks. Our proposed transfer learning framework improves performance on various tasks including question answering, natural language inference, sentiment analysis, and image classification. We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.
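The framework's two moving parts are easy to state in code: a graph predictor that maps unit embeddings to a row-normalized pairwise affinity matrix, and a propagation step that mixes arbitrary downstream features through that matrix. The sketch below is a deliberate simplification, not the paper's exact parameterization (the linear key/query projections, the softmax normalization, and the names relational_graph and propagate are assumptions for illustration; the full model stacks several graph layers and uses a different affinity function).

```python
import torch
import torch.nn.functional as F

def relational_graph(h, key_proj, query_proj):
    """Predict a latent relational graph from unit embeddings.

    h: (batch, length, dim) embeddings of data units (words, pixels, ...).
    Returns a (batch, length, length) affinity matrix whose rows are
    distributions over the units, i.e. the transferable graph.
    """
    k = key_proj(h)
    q = query_proj(h)
    scores = torch.bmm(q, k.transpose(1, 2))   # pairwise affinities
    return F.softmax(scores, dim=-1)           # row-normalize

def propagate(graph, features):
    """Transfer step: mix downstream features using the pretrained graph.

    features need not be the embeddings the graph was trained on;
    e.g. GloVe or ELMo vectors, RNN hidden units, or raw pixels.
    """
    return torch.bmm(graph, features)

# Usage: learn a graph on one representation, apply it to another.
batch, length, dim = 2, 7, 32
h = torch.randn(batch, length, dim)            # pretraining embeddings
key_proj = torch.nn.Linear(dim, dim)
query_proj = torch.nn.Linear(dim, dim)
graph = relational_graph(h, key_proj, query_proj)

e = torch.randn(batch, length, dim)            # downstream features
mixed = propagate(graph, e)                    # graph-augmented features
```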

Citations
Proceedings Article

Learning Dynamic Belief Graphs to Generalize on Text-Based Games

TL;DR: This work proposes a novel graph-aided transformer agent (GATA) that infers and updates latent belief graphs during planning to enable effective action selection by capturing the underlying game dynamics.
Proceedings Article

ROMA: Multi-Agent Reinforcement Learning with Emergent Roles

TL;DR: In this paper, a role-oriented multi-agent reinforcement learning (ROMA) framework is proposed in which roles emerge during training and agents with similar roles share their learning and specialize in certain sub-tasks.
Journal ArticleDOI

Graph Interaction Networks for Relation Transfer in Human Activity Videos

TL;DR: This work proposes a graph interaction networks (GINs) model for transferring relation knowledge across two graphs; the model centers on a "self-learned" weight matrix that serves as a higher-level representation of the input data.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: This paper proposes a residual learning framework to ease the training of networks substantially deeper than those used previously; the resulting residual nets won 1st place in the ILSVRC 2015 classification task.
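The residual idea itself is compact. Below is a minimal PyTorch sketch of an identity-shortcut basic block, illustrative rather than the paper's full architecture (layer counts, downsampling shortcuts, and initialization are omitted).

```python
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """y = F(x) + x: learn the residual F(x) instead of a full mapping,
    which eases optimization as depth grows."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)   # identity shortcut carries x forward
```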
Journal ArticleDOI

Long short-term memory

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
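The constant error carousel is the additive cell-state update. The sketch below uses the modern LSTM formulation with a forget gate (introduced after the original paper), so treat it as an illustration of the mechanism rather than the 1997 equations.

```python
import torch

def lstm_cell(x, h, c, W, U, b):
    """One LSTM step. The additive update of the cell state c is the
    'carousel' that lets error flow back through time unattenuated."""
    z = x @ W + h @ U + b                 # (batch, 4 * hidden)
    i, f, g, o = z.chunk(4, dim=-1)       # input, forget, cell, output
    i, f, o = i.sigmoid(), f.sigmoid(), o.sigmoid()
    c_next = f * c + i * g.tanh()         # additive cell update
    h_next = o * c_next.tanh()            # gated exposure of the cell
    return h_next, c_next

# Usage over a single step.
batch, input_dim, hidden = 2, 8, 16
x = torch.randn(batch, input_dim)
h = torch.zeros(batch, hidden)
c = torch.zeros(batch, hidden)
W = torch.randn(input_dim, 4 * hidden)
U = torch.randn(hidden, 4 * hidden)
b = torch.zeros(4 * hidden)
h, c = lstm_cell(x, h, c, W, U, b)
```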
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting, using an architecture with very small (3×3) convolution filters, and shows that a significant improvement over prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings Article

Attention is All you Need

TL;DR: This paper proposes the Transformer, a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely, and achieving state-of-the-art performance on English-to-French translation.
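The paper's core operation, scaled dot-product attention, is a one-liner; the minimal sketch below omits the multi-head projections and masking of the full model.

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    return F.softmax(scores, dim=-1) @ v

# Self-attention over a 5-token sequence with d_k = 16.
x = torch.randn(1, 5, 16)
out = scaled_dot_product_attention(x, x, x)
```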