Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduces a unified framework that converts all text-based language problems into a text-to-text format and compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
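The unifying idea is that every task, whether translation, classification, regression-style similarity scoring, or summarization, is posed as mapping an input text string to a target text string, usually signalled by a short task prefix. The snippet below is a minimal illustrative sketch of that framing, assuming the Hugging Face transformers library and the publicly released t5-small checkpoint; the prefixes mirror the conventions used by the released T5 code, but the snippet itself is not part of the paper.

```python
# Minimal sketch of the text-to-text framing with a released T5 checkpoint.
# Assumes the Hugging Face `transformers` library (plus sentencepiece) is installed.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is "text in, text out"; the prefix tells the model which task it is.
examples = [
    "translate English to German: The house is wonderful.",
    "summarize: Transfer learning, where a model is first pre-trained on a "
    "data-rich task before being fine-tuned on a downstream task, has emerged "
    "as a powerful technique in natural language processing.",
    "cola sentence: The books was on the table.",  # grammatical acceptability
    "stsb sentence1: A man is playing a guitar. sentence2: A person plays music.",
]

for text in examples:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Note that even the STS-B similarity task, normally treated as regression, fits the same interface: the model simply generates the similarity score as a string, which is the point of the unified text-to-text format.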



Citations
Proceedings Article

TruthfulQA: Measuring How Models Mimic Human Falsehoods

TL;DR: This paper proposes a benchmark for measuring whether a language model is truthful in generating answers to questions; it consists of 817 questions spanning 38 categories, including health, law, finance, and politics.
Proceedings Article

Can Generative Pre-trained Language Models Serve As Knowledge Bases for Closed-book QA?

TL;DR: The authors construct a new closed-book QA dataset based on SQuAD and investigate the performance of BART, finding that it is challenging for BART to memorize training facts with high precision, and also challenging for it to answer closed-book questions even when the relevant knowledge has been retained.
Proceedings Article

NewsEmbed: Modeling News through Pre-trained Document Representations

TL;DR: NewsEmbed proposes an approach to mine semantically relevant, fresh documents and their topic labels with little human supervision, and designs a multitask model that alternates between a contrastive-learning objective and multi-label classification to derive a universal document encoder.
Posted Content

$Q^{2}$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering.

TL;DR: This paper proposes an automatic evaluation metric for factual consistency in knowledge-grounded dialogue models based on automatic question generation and question answering, which also makes use of coreference resolution and natural language inference.
Proceedings Article

Deduplicating Training Data Makes Language Models Better

TL;DR: Lee et al. show that common language-modeling training sets contain many near-duplicate examples and long repeated substrings, and that deduplicating the training data lets models emit memorized text far less often while reaching the same or better accuracy in fewer training steps; the paper was presented at the 60th Annual Meeting of the Association for Computational Linguistics (ACL).