Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
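
To make the text-to-text framing concrete, the sketch below shows how different tasks reduce to the same feed-text-in, generate-text-out interface using the publicly released T5 checkpoints through the Hugging Face transformers library. The task prefixes follow the released model's conventions; the snippet is an illustration of the framework, not the paper's training code.

```python
# Minimal sketch: every task is cast as "text in, text out" with one model.
# Requires: pip install transformers sentencepiece torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Different NLP problems are distinguished only by a textual task prefix.
inputs = [
    "translate English to German: The house is wonderful.",      # translation
    "summarize: Transfer learning, where a model is first pre-trained on a "
    "data-rich task before being fine-tuned, has emerged as a powerful "
    "technique in NLP.",                                          # summarization
    "cola sentence: The book was wrote by the author.",           # acceptability classification
]

for text in inputs:
    ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(ids, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Because inputs and outputs are plain strings, the same architecture, loss, and decoding procedure serve translation, summarization, and classification alike.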


Citations
Proceedings ArticleDOI

Curriculum Learning for Natural Language Understanding

TL;DR: By reviewing the training set in a cross-review fashion, this work distinguishes easy examples from difficult ones, arranges a curriculum for language models accordingly, and obtains significant and consistent performance improvements on a wide range of NLU tasks.
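
As a rough illustration of the curriculum idea (ordering training data from easy to hard once per-example difficulty scores are available), here is a minimal sketch. The difficulty scores are assumed to come from some scoring procedure such as the cross review described above, and the staging scheme is a hypothetical simplification, not the cited paper's implementation.

```python
# Minimal sketch of difficulty-ordered (curriculum) training.
# Assumes each example already carries a difficulty score (e.g. produced by a
# cross-review pass); the staging scheme below is a hypothetical simplification.
from typing import List, Tuple

def curriculum_stages(examples: List[Tuple[dict, float]],
                      num_stages: int = 3) -> List[List[dict]]:
    """Sort examples by difficulty and split them into progressively harder stages."""
    ordered = sorted(examples, key=lambda pair: pair[1])  # easy -> hard
    stage_size = max(1, len(ordered) // num_stages)
    stages = []
    for i in range(num_stages):
        # each stage adds the next slice of harder examples to the training pool
        pool = [ex for ex, _ in ordered[: stage_size * (i + 1)]]
        stages.append(pool)
    return stages

# Usage: train first on stages[0] (easiest), then stages[1], then stages[2].
```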
Posted Content

Just Ask: Learning to Answer Questions from Millions of Narrated Videos

TL;DR: This work proposes to avoid manual annotation and instead generate a large-scale training dataset for video question answering by making use of automatic cross-modal supervision, and introduces iVQA, a new VideoQA dataset with reduced language biases and high-quality redundant manual annotations.
Proceedings ArticleDOI

RobBERT: a Dutch RoBERTa-based Language Model.

TL;DR: RobBERT is found to improve state-of-the-art results on various tasks and, in particular, to significantly outperform other models when dealing with smaller datasets, indicating that it is a powerful pre-trained model for a large variety of Dutch language tasks.
Proceedings ArticleDOI

FUDGE: Controlled Text Generation With Future Discriminators

TL;DR: This work proposes Future Discriminators for Generation (FUDGE), a flexible and modular method for controlled text generation that enables conditioning on a desired attribute while requiring access only to the base generator's output logits.
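
The core mechanism, steering generation by combining the generator's next-token distribution with a discriminator's estimate that the desired attribute will hold, can be sketched as a simple logit reweighting step at each decoding step. This is a simplified illustration of the idea, not the authors' implementation; the `weight` parameter is a hypothetical knob for attribute strength.

```python
# Minimal sketch of attribute-conditioned decoding: reweight the base model's
# next-token log-probabilities with a discriminator's log P(attribute | prefix
# + candidate token). Simplified illustration, not the cited implementation.
import torch
import torch.nn.functional as F

def attribute_guided_logits(lm_logits: torch.Tensor,
                            attr_log_probs: torch.Tensor,
                            weight: float = 1.0) -> torch.Tensor:
    """Combine LM next-token log-probs with per-token attribute log-probs."""
    lm_log_probs = F.log_softmax(lm_logits, dim=-1)
    return lm_log_probs + weight * attr_log_probs

# Usage: at each decoding step, score every candidate next token with the
# discriminator, add the two log-probabilities, and pick the next token as usual.
vocab_size = 8
lm_logits = torch.randn(vocab_size)                                # from the base generator
attr_log_probs = torch.log_softmax(torch.randn(vocab_size), dim=-1)  # from the discriminator
next_token = torch.argmax(attribute_guided_logits(lm_logits, attr_log_probs))
```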
Posted Content

Neural Abstractive Text Summarization with Sequence-to-Sequence Models

TL;DR: This article provides a comprehensive literature survey on different seq2seq models for abstractive text summarization from the viewpoint of network structures, training strategies, and summary generation algorithms.
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.