Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduces a unified framework that converts all text-based language problems into a text-to-text format and compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
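The unifying idea is that every task, whether translation, classification, regression-style similarity scoring, or summarization, is posed as mapping an input text string to a target text string, usually signalled by a short task prefix. The snippet below is a minimal illustrative sketch of that framing, assuming the Hugging Face transformers library and the publicly released t5-small checkpoint; the prefixes mirror the conventions used by the released T5 code, but the snippet itself is not part of the paper.

```python
# Minimal sketch of the text-to-text framing with a released T5 checkpoint.
# Assumes the Hugging Face `transformers` library (plus sentencepiece) is installed.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is "text in, text out"; the prefix tells the model which task it is.
examples = [
    "translate English to German: The house is wonderful.",
    "summarize: Transfer learning, where a model is first pre-trained on a "
    "data-rich task before being fine-tuned on a downstream task, has emerged "
    "as a powerful technique in natural language processing.",
    "cola sentence: The books was on the table.",  # grammatical acceptability
    "stsb sentence1: A man is playing a guitar. sentence2: A person plays music.",
]

for text in examples:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Note that even the STS-B similarity task, normally treated as regression, fits the same interface: the model simply generates the similarity score as a string, which is the point of the unified text-to-text format.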



Citations
Proceedings Article

TruthfulQA: Measuring How Models Mimic Human Falsehoods

TL;DR: This paper proposes a benchmark for measuring whether a language model is truthful in generating answers to questions; it consists of 817 questions spanning 38 categories, including health, law, finance, and politics.
Proceedings Article

Can Generative Pre-trained Language Models Serve As Knowledge Bases for Closed-book QA?

TL;DR: The authors construct a new closed-book QA dataset based on SQuAD and investigate the performance of BART, finding that it is challenging for BART to memorize training facts with high precision, and also challenging for it to answer closed-book questions even when the relevant knowledge has been retained.
Proceedings Article

NewsEmbed: Modeling News through Pre-trained Document Representations

TL;DR: NewsEmbed proposes an approach to mine semantically relevant, fresh documents and their topic labels with little human supervision, and designs a multitask model that alternates between a contrastive-learning objective and multi-label classification to derive a universal document encoder.
Posted Content

$Q^{2}$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering.

TL;DR: This paper proposes an automatic evaluation metric for factual consistency in knowledge-grounded dialogue models based on automatic question generation and question answering, which also makes use of coreference resolution and natural language inference.
Proceedings Article

Deduplicating Training Data Makes Language Models Better

TL;DR: Lee et al. show that common language-modeling training sets contain many near-duplicate examples and long repeated substrings, and that deduplicating the training data lets models emit memorized text far less often while reaching the same or better accuracy in fewer training steps; the paper was presented at the 60th Annual Meeting of the Association for Computational Linguistics (ACL).