Open Access Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
TL;DR: This article introduces a unified framework that converts all text-based language problems into a text-to-text format and compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract: Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
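The text-to-text framing the abstract describes can be sketched in a few lines: every task is cast as a pair of strings, with a task prefix prepended to the input. The task prefixes follow the convention described in the paper; the example dictionary keys and label strings below are illustrative stand-ins, not the paper's exact preprocessing code.

```python
def to_text_to_text(task: str, example: dict) -> tuple[str, str]:
    """Cast a task-specific example to an (input_text, target_text) pair.

    Hypothetical sketch: even classification targets are emitted as
    literal text labels, so one seq2seq model handles every task.
    """
    if task == "summarization":
        return "summarize: " + example["document"], example["summary"]
    if task == "translation":
        return "translate English to German: " + example["en"], example["de"]
    if task == "classification":
        # The target is the label word itself, decoded as text.
        return "cola sentence: " + example["sentence"], example["label"]
    raise ValueError(f"unknown task: {task}")
```

Because inputs and targets are both plain text, the same model, loss, and decoding procedure apply unchanged across all tasks.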
Citations
Proceedings Article
Identifying inherent disagreement in natural language inference
TL;DR: This paper investigates how to tease systematic inferences apart from disagreement items, and proposes Artificial Annotators (AAs) to simulate the uncertainty in the annotation process by capturing the modes in annotations.
Posted Content
Dealing with Typos for BERT-based Passage Retrieval and Ranking
Shengyao Zhuang, Guido Zuccon +1 more
TL;DR: This article proposed a simple typos-aware training framework for dense retrievers (DR) and BERT re-rankers for passage retrieval and ranking.
Journal Article
Exploring the Data Efficiency of Cross-Lingual Post-Training in Pretrained Language Models
TL;DR: Quantitative results from intrinsic and extrinsic evaluations show that the novel cross-lingual post-training approach outperforms several massively multilingual and monolingual pretrained language models in most settings and improves data efficiency by a factor of up to 32 compared to monolingual training.
Posted Content
Perceiver IO: A General Architecture for Structured Inputs & Outputs
Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier J. Hénaff, Matthew Botvinick, Andrew Zisserman, Oriol Vinyals, Joao Carreira
TL;DR: Perceiver IO learns to flexibly query the model's latent space to produce outputs of arbitrary size and semantics, achieving state-of-the-art results on tasks with highly structured output spaces.
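The query mechanism the blurb describes can be sketched as two cross-attention steps: a fixed-size latent array attends over the inputs (encode), then task-specific output queries attend over the latents (decode). This is a heavily simplified single-head sketch under stated assumptions; the real architecture adds learned projections, MLP blocks, and repeated latent self-attention.

```python
import numpy as np

def attend(q, k, v):
    # Scaled dot-product attention (single head, no masking, no projections).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def perceiver_io(inputs, latents, output_queries):
    """Toy Perceiver-IO-style pass: compress arbitrary-length inputs into a
    fixed-size latent array, then decode with queries of arbitrary length."""
    latents = attend(latents, inputs, inputs)        # encode: latents query inputs
    return attend(output_queries, latents, latents)  # decode: outputs query latents
```

Because the output shape is set entirely by `output_queries`, the same backbone can produce outputs of any size or semantics by swapping the query array.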
Proceedings Article
Shortformer: Better Language Modeling using Shorter Inputs
TL;DR: This article showed that adding absolute position embeddings to queries and keys instead of to word embeddings improves perplexity and speeds up the training of a language model with short input lengths.
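The position-infused idea in this blurb can be sketched directly: position embeddings are added to the queries and keys inside the attention computation, while the values (and hence the residual stream) stay position-free. A minimal single-head sketch, assuming precomputed weight matrices `wq`, `wk`, `wv` and a position-embedding matrix `pos` of the same shape as the input `x`:

```python
import numpy as np

def position_infused_attention(x, pos, wq, wk, wv):
    """Shortformer-style attention sketch: `pos` is injected into queries
    and keys only, not into the word embeddings or the values."""
    q = (x + pos) @ wq
    k = (x + pos) @ wk
    v = x @ wv  # values carry no position information
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Keeping positions out of the residual stream is what allows previously computed representations to be reused across overlapping windows, which is where the speedup at short input lengths comes from.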