Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR
This article introduces a unified framework that converts all text-based language problems into a text-to-text format and compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
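The text-to-text framing means every task is posed as feeding the model input text and training it to produce output text, with a short task prefix selecting the task. As an illustration only, the sketch below uses the Hugging Face transformers port of a released T5 checkpoint; the model name, prefixes, and generation settings are example choices for showing the framing, not the paper's own training setup or codebase.

```python
# Minimal sketch of the text-to-text framing using the Hugging Face port of T5.
# "t5-small" is just one of the released checkpoints.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is cast as text in, text out, selected by a task prefix.
examples = [
    "translate English to German: The house is wonderful.",
    # acceptability judgment: the model is trained to emit "acceptable"/"unacceptable"
    "cola sentence: The car jumped quickly the fence over.",
    "summarize: Transfer learning, where a model is first pre-trained on a "
    "data-rich task before being fine-tuned on a downstream task, has emerged "
    "as a powerful technique in natural language processing.",
]

for text in examples:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=64)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```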



Citations
Proceedings Article

AgreeSum: Agreement-Oriented Multi-Document Summarization

TL;DR: This paper proposes AgreeSum, an agreement-oriented multi-document summarization (MDS) task in which the goal is to provide abstractive summaries that represent information that is common and faithful to all input articles.
Posted Content

Knowledge Enhanced Sports Game Summarization.

TL;DR: This paper proposes a knowledge-enhanced summarizer that uses both live commentaries and external knowledge to generate sports news, achieving state-of-the-art performance.
Posted Content

CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation.

TL;DR: This paper proposes CPT, a Chinese Pre-trained Unbalanced Transformer that consists of three parts: a shared encoder, an understanding decoder, and a generation decoder.
Posted Content

EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks

TL;DR: This paper proposes EncT5, which fine-tunes pre-trained T5 models for classification and regression tasks using only the encoder layers, arguing that the encoder-decoder architecture is more favorable than BERT for large-scale pre-training on the language modeling task.
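To make the encoder-only idea concrete, here is a minimal sketch assuming PyTorch and the Hugging Face `T5EncoderModel`: the decoder is dropped and a small linear head is trained on a pooled encoder representation. The `T5EncoderClassifier` class, the mean-pooling choice, and `num_labels` are illustrative assumptions, not the exact EncT5 recipe.

```python
# Sketch of encoder-only fine-tuning in the spirit of EncT5: drop the T5 decoder
# and train a classification head on top of the encoder output.
import torch
from transformers import T5EncoderModel, T5Tokenizer


class T5EncoderClassifier(torch.nn.Module):
    def __init__(self, model_name: str = "t5-small", num_labels: int = 2):
        super().__init__()
        self.encoder = T5EncoderModel.from_pretrained(model_name)
        self.head = torch.nn.Linear(self.encoder.config.d_model, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        # Mean-pool over non-padding tokens (one simple pooling choice).
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        return self.head(pooled)


tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5EncoderClassifier()
batch = tokenizer(["a delightful, well-made film"], return_tensors="pt", padding=True)
logits = model(batch["input_ids"], batch["attention_mask"])  # shape: (1, num_labels)
```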
Proceedings Article

Case-based Reasoning for Natural Language Queries over Knowledge Bases.

TL;DR: This article proposes CBR-KBQA, a neuro-symbolic case-based reasoning approach for question answering over large knowledge bases. It consists of a nonparametric memory that stores cases (questions and their logical forms) and a parametric model that generates a logical form for a new question by retrieving cases relevant to it.
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not explicitly enumerate the limitations of transfer learning with a unified text-to-text transformer.