Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
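To make the text-to-text format concrete, the sketch below runs a few tasks through a public T5 checkpoint via the Hugging Face transformers library. It is an illustrative example, not the authors' released code: the task prefixes ("translate English to German:", "summarize:", "cola sentence:") follow the paper's conventions, while the checkpoint name and surrounding script are assumptions for demonstration.

```python
# Illustrative sketch (not the authors' released code): every task is cast as
# "text in, text out", with the task named by a prefix on the input string.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")  # assumed public checkpoint
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def run(task_prefixed_input: str) -> str:
    # Encode the prefixed input, generate output tokens, and decode them back to text.
    input_ids = tokenizer(task_prefixed_input, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Translation, summarization, and classification all share one interface;
# classification labels are simply generated as literal strings.
print(run("translate English to German: The house is wonderful."))
print(run("summarize: Transfer learning, where a model is first pre-trained on a "
          "data-rich task before being fine-tuned on a downstream task, has emerged "
          "as a powerful technique in natural language processing."))
print(run("cola sentence: The course is jumping well."))  # -> "acceptable"/"unacceptable"
```

Casting every task as string generation is what lets a single model, loss, and decoding procedure cover translation, summarization, and classification alike.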



Citations
Proceedings ArticleDOI

RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark

TL;DR: This paper introduces RussianSuperGLUE, an advanced Russian general language understanding evaluation benchmark, presents the first results of comparing multilingual models on the translated diagnostic test set, and offers first steps toward further expanding and assessing state-of-the-art models independently of language.
Proceedings ArticleDOI

Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction

TL;DR: The authors propose Text2Event, a sequence-to-structure generation paradigm that directly extracts events from text in an end-to-end manner and achieves competitive performance using only record-level annotations in both supervised learning and transfer learning settings.
Proceedings ArticleDOI

Improving Neural Topic Models using Knowledge Distillation

TL;DR: This work uses knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers to improve topic quality, and shows that the adaptable framework improves performance not only in the aggregate over all estimated topics but also in head-to-head comparisons of aligned topics.
Posted Content

Robustness Gym: Unifying the NLP Evaluation Landscape

TL;DR: The authors present Robustness Gym, a simple and extensible evaluation toolkit that unifies four standard evaluation paradigms: subpopulations, transformations, evaluation sets, and adversarial attacks.
Proceedings Article

Rethinking Positional Encoding in Language Pre-training

TL;DR: The authors propose a new positional encoding method, Transformer with Untied Positional Embeddings (TUPE), which unties the [CLS] symbol from other positions, making it easier to capture information from all positions.
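For context, the untied design computes word-to-word and position-to-position attention correlations with separate projections and sums them, rather than adding word and positional embeddings before a single attention computation. The snippet below is a minimal sketch of that idea; the function name, shapes, and random inputs are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed names/shapes) of "untied" attention logits:
# content and position contributions use separate projection matrices.
import math
import torch

def untied_attention_logits(x, pos, Wq, Wk, Uq, Uk):
    """x: (seq, d) token embeddings; pos: (seq, d) positional embeddings."""
    d = x.size(-1)
    content = (x @ Wq) @ (x @ Wk).T       # word-to-word correlation
    position = (pos @ Uq) @ (pos @ Uk).T  # position-to-position correlation
    return (content + position) / math.sqrt(2 * d)

seq, d = 8, 16
logits = untied_attention_logits(
    torch.randn(seq, d), torch.randn(seq, d),
    torch.randn(d, d), torch.randn(d, d), torch.randn(d, d), torch.randn(d, d),
)
attn = logits.softmax(dim=-1)  # (seq, seq) attention weights
```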
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.