Zero-shot Entity Linking by Reading Entity Descriptions
Lajanugen Logeswaran, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, Jacob Devlin, Honglak Lee
pp. 3449–3460
TL;DR
It is shown that strong reading comprehension models pre-trained on large unlabeled data can generalize to unseen entities, and a domain-adaptive pre-training (DAP) strategy is proposed to address the domain shift associated with linking unseen entities in a new domain.
Abstract
We present the zero-shot entity linking task, where mentions must be linked to unseen entities without in-domain labeled data. The goal is to enable robust transfer to highly specialized domains, and so no metadata or alias tables are assumed. In this setting, entities are only identified by text descriptions, and models must rely strictly on language understanding to resolve the new entities. First, we show that strong reading comprehension models pre-trained on large unlabeled data can be used to generalize to unseen entities. Second, we propose a simple and effective adaptive pre-training strategy, which we term domain-adaptive pre-training (DAP), to address the domain shift problem associated with linking unseen entities in a new domain. We present experiments on a new dataset that we construct for this task and show that DAP improves over strong pre-training baselines, including BERT. The data and code are available at https://github.com/lajanugen/zeshel.
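To make the reading-comprehension framing concrete, here is a minimal sketch of scoring a mention in context against candidate entity descriptions with a BERT cross-encoder. The checkpoint, the [CLS] pooling, and the (untrained) scoring head are illustrative assumptions, not the authors' exact configuration; see the repository above for that.

```python
# Minimal sketch: read each (mention context, entity description) pair
# jointly with BERT and produce one relevance score per candidate.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
score_head = torch.nn.Linear(encoder.config.hidden_size, 1)  # trained jointly in practice

def score_candidates(mention_context, descriptions):
    # Joint encoding lets attention align mention words with description words.
    batch = tokenizer([mention_context] * len(descriptions), descriptions,
                      padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**batch)
    cls = out.last_hidden_state[:, 0]     # [CLS] summary of each pair
    return score_head(cls).squeeze(-1)    # one relevance score per candidate

scores = score_candidates(
    "He joined the [Night's Watch] after leaving Winterfell.",
    ["The Night's Watch is a sworn brotherhood ...", "Winterfell is a castle ..."],
)
print(scores.argmax().item())  # index of the top-scoring entity description
```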
Citations
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, Noah A. Smith
TL;DR: It is consistently found that multi-phase adaptive pretraining offers large gains in task performance, and that adapting to a task corpus augmented using simple data selection strategies is an effective alternative when resources for domain-adaptive pretraining are unavailable.
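The adaptive phase this summary describes, continued masked-LM pretraining on raw in-domain text before task fine-tuning, can be sketched with the Hugging Face Trainer; the corpus file name and hyperparameters below are placeholders.

```python
# Sketch of domain-adaptive pretraining: continue BERT's masked-LM
# objective on unlabeled target-domain text, then fine-tune on the task.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

corpus = load_dataset("text", data_files={"train": "domain.txt"})["train"]
corpus = corpus.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
                    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dapt-checkpoint",
                           num_train_epochs=1, per_device_train_batch_size=16),
    train_dataset=corpus,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()  # the adapted checkpoint is then fine-tuned on the end task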
Scalable Zero-shot Entity Linking with Dense Entity Retrieval
TL;DR: This paper introduces a simple and effective two-stage approach for zero-shot linking, based on fine-tuned BERT architectures, and shows that it performs well in the non-zero-shot setting, obtaining the state-of-the-art result on TACKBP-2010.
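The first stage of such a two-stage pipeline is typically a bi-encoder: mentions and entity descriptions are embedded independently, so entity vectors can be precomputed once and searched by dot product, with the top hits passed to a cross-encoder for re-ranking. A minimal sketch, with the checkpoint and pooling choices assumed:

```python
# Sketch of the dense-retrieval stage: independent encoding of mentions
# and entity descriptions enables precomputation and fast top-k search.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        return encoder(**batch).last_hidden_state[:, 0]  # [CLS] vectors

entity_index = embed(["The Night's Watch is a sworn brotherhood ...",
                      "Winterfell is a castle in the North ..."])  # built offline
query = embed(["He joined the [Night's Watch] after leaving Winterfell."])
candidates = (query @ entity_index.T).topk(k=2).indices  # passed to the re-ranker
```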
Autoregressive Entity Retrieval
TL;DR: This article proposed GENRE, the first system that retrieves entities by generating their unique names, left to right and token by token, in an autoregressive fashion, effectively cross-encoding both context and entity name.
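The decoding idea can be sketched with a prefix trie over tokenized entity names and the prefix_allowed_tokens_fn hook in transformers' generate. The BART checkpoint and toy entity set below are placeholders, not the released GENRE model.

```python
# Sketch of constrained decoding: at each step, only tokens that extend
# some valid entity name are allowed, so the beam can only spell entities.
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

trie = {}
for name in ["Night's Watch", "Winterfell"]:  # toy entity catalogue
    node = trie
    for tok in tokenizer(name).input_ids:  # includes BOS/EOS, matching decoder output
        node = node.setdefault(tok, {})

def allowed_tokens(batch_id, generated):
    # Walk the trie along the tokens generated so far; the children of the
    # current node are the only valid next tokens.
    node = trie
    for tok in generated.tolist()[1:]:  # skip the decoder start token
        node = node.get(tok, {})
    return list(node) or [tokenizer.eos_token_id]

inputs = tokenizer("He joined the [START] Night's Watch [END].", return_tensors="pt")
out = model.generate(**inputs, num_beams=2, prefix_allowed_tokens_fn=allowed_tokens)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```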
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation
TL;DR: The authors proposed a unified model for Knowledge Embedding and Pre-trained LanguagE Representation (KEPLER), which can not only better integrate factual knowledge into PLMs but also produce effective text-enhanced KE with the strong PLMs.
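The joint objective can be sketched as a masked-LM loss plus a knowledge-embedding loss computed on encoded entity descriptions. The TransE-style margin loss, the stand-in encoder, and all names below are simplifications of the paper's negative-sampling formulation.

```python
# Sketch of KEPLER's joint objective: one shared encoder serves both
# masked-LM pretraining and knowledge embedding, where an entity's
# embedding is the encoding of its textual description.
import torch
import torch.nn.functional as F

def ke_loss(encode, rel_emb, head_desc, rel_id, tail_desc, neg_tail_desc, margin=1.0):
    h, t, t_neg = encode(head_desc), encode(tail_desc), encode(neg_tail_desc)
    r = rel_emb(rel_id)
    pos = (h + r - t).norm(p=1, dim=-1)        # TransE: h + r should be close to t
    neg = (h + r - t_neg).norm(p=1, dim=-1)
    return F.relu(margin + pos - neg).mean()   # rank true triples above corrupted ones

dim = 8
rel_emb = torch.nn.Embedding(4, dim)
encode = lambda texts: torch.randn(len(texts), dim)  # stand-in for the shared encoder
loss = ke_loss(encode, rel_emb,
               ["description of head entity"], torch.tensor([0]),
               ["description of tail entity"], ["description of a corrupted tail"])
# total_loss = mlm_loss + loss  # both terms backpropagate into the same encoder
```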
References
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposed the Transformer, a network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieved state-of-the-art performance on English-to-German and English-to-French translation.
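For reference, the core operation is scaled dot-product attention, softmax(QKᵀ/√d_k)V; a compact PyTorch rendering with toy shapes:

```python
# Scaled dot-product attention, the Transformer's core operation.
import torch
import torch.nn.functional as F

def attention(q, k, v, mask=None):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # query-key similarities
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return F.softmax(scores, dim=-1) @ v           # attention-weighted values

q = k = v = torch.randn(1, 5, 64)  # (batch, sequence, d_k), toy shapes
print(attention(q, k, v).shape)    # torch.Size([1, 5, 64])
```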
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, and can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
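The bidirectional conditioning is easy to see with masked-token prediction; a short demonstration using the transformers fill-mask pipeline:

```python
# BERT predicts a masked token from context on both sides at once,
# which is the "deep bidirectional" conditioning described above.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
print(fill("The capital of France is [MASK].")[0]["token_str"])  # -> "paris"
```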
Deep contextualized word representations
Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer
TL;DR: This paper introduced a new type of deep contextualized word representation that models both complex characteristics of word use (e.g., syntax and semantics), and how these uses vary across linguistic contexts (i.e., to model polysemy).
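The polysemy point can be illustrated by comparing one word's vectors in two contexts. The sketch below uses BERT as a stand-in contextual encoder purely for convenience; ELMo itself derives its representations from a bidirectional LSTM language model.

```python
# Context-dependent word vectors: the same surface word receives
# different representations in different sentences.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")

def word_vec(sentence, word):
    ids = tok(sentence, return_tensors="pt")
    pos = ids.input_ids[0].tolist().index(tok.convert_tokens_to_ids(word))
    with torch.no_grad():
        return enc(**ids).last_hidden_state[0, pos]

a = word_vec("The bank approved my loan.", "bank")
b = word_vec("We sat on the river bank.", "bank")
print(torch.cosine_similarity(a, b, dim=0))  # well below 1.0: different senses
```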
Domain-adversarial training of neural networks
Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Victor Lempitsky
TL;DR: In this article, a new representation learning approach for domain adaptation is proposed for the setting where training and test data come from similar but different distributions; it promotes the emergence of features that are discriminative for the main learning task on the source domain while being unable to discriminate between the training (source) and test (target) domains.
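The mechanism that produces such domain-indiscriminate features is the gradient reversal layer: identity in the forward pass, negated (scaled) gradient in the backward pass. A minimal PyTorch sketch:

```python
# Gradient reversal: the feature extractor receives the *negated*
# gradient of the domain-classification loss, so it learns features
# that confuse the domain classifier.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # None: no gradient for lam

features = torch.randn(4, 16, requires_grad=True)
domain_logits = torch.nn.Linear(16, 2)(GradReverse.apply(features, 1.0))
# Minimizing the domain loss on these logits pushes `features`
# toward domain invariance.
```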
Universal Language Model Fine-tuning for Text Classification
Jeremy Howard, Sebastian Ruder
TL;DR: Universal Language Model Fine-tuning (ULMFiT) is proposed as an effective transfer learning method that can be applied to any task in NLP, together with techniques that are key for fine-tuning a language model.
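Two of those key techniques, discriminative learning rates and gradual unfreezing, can be sketched in a few lines of PyTorch. The generic layer stack and the rates below are illustrative; ULMFiT additionally uses slanted triangular learning-rate schedules.

```python
# Discriminative learning rates: lower layers take smaller steps.
# Gradual unfreezing: start with only the top layer trainable.
import torch

model = torch.nn.Sequential(*[torch.nn.Linear(32, 32) for _ in range(4)])
base_lr, decay = 1e-3, 2.6  # ULMFiT divides the rate by 2.6 per layer going down
groups = [{"params": layer.parameters(),
           "lr": base_lr / decay ** (len(model) - 1 - i)}
          for i, layer in enumerate(model)]
optimizer = torch.optim.Adam(groups)

# Train only the top layer first, then unfreeze one more layer per epoch.
for layer in model[:-1]:
    for p in layer.parameters():
        p.requires_grad = False
```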