Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduces a unified framework that converts all text-based language problems into a text-to-text format and systematically compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
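The text-to-text format described above is easy to see in code. Below is a minimal sketch, assuming the Hugging Face transformers library and the publicly released t5-small checkpoint (neither is part of this page); the task prefixes follow the convention used in the paper, and every task, whether translation, summarization, or classification, is handled by the same model generating its answer as text.

```python
# Minimal sketch of the unified text-to-text format, assuming the Hugging Face
# `transformers` library and the released "t5-small" checkpoint.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is cast as text in, text out: a prefix names the task, and the
# answer (a translation, a summary, a class label) is generated as text.
examples = [
    "translate English to German: The house is wonderful.",
    "summarize: Transfer learning, where a model is first pre-trained on a "
    "data-rich task before being fine-tuned on a downstream task, has emerged "
    "as a powerful technique in natural language processing.",
    "cola sentence: The course is jumping well.",  # acceptability judged as text
]

for text in examples:
    batch = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**batch, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```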



Citations
Proceedings Article

Identifying inherent disagreement in natural language inference

TL;DR: This paper investigates how to tease systematic inferences apart from disagreement items, and proposes Artificial Annotators (AAs) to simulate the uncertainty in the annotation process by capturing the modes in annotations.
Posted Content

Dealing with Typos for BERT-based Passage Retrieval and Ranking

TL;DR: This article proposes a simple typos-aware training framework for dense retrievers (DR) and BERT re-rankers for passage retrieval and ranking.
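To make "typos-aware training" concrete, the sketch below injects synthetic character-level typos into training queries. It is a generic augmentation routine written for this summary, not the authors' released framework; the operation set and corruption rate are illustrative assumptions.

```python
# Illustrative typo augmentation for training queries (not the authors' code):
# randomly swap, delete, insert, or substitute a character in some words.
import random

def inject_typo(word, rng=random):
    if len(word) < 3:
        return word
    op = rng.choice(["swap", "delete", "insert", "substitute"])
    i = rng.randrange(1, len(word) - 1)
    if op == "swap":
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "insert":
        return word[:i] + rng.choice("abcdefghijklmnopqrstuvwxyz") + word[i:]
    return word[:i] + rng.choice("abcdefghijklmnopqrstuvwxyz") + word[i + 1:]

def augment_query(query, p=0.3, rng=random):
    # Corrupt each word with probability p so the retriever/re-ranker sees
    # noisy queries during training.
    return " ".join(inject_typo(w, rng) if rng.random() < p else w for w in query.split())

print(augment_query("what is transfer learning in nlp"))
```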
Journal Article

Exploring the Data Efficiency of Cross-Lingual Post-Training in Pretrained Language Models

TL;DR: Quantitative results from intrinsic and extrinsic evaluations show that the novel cross-lingual post-training approach outperforms several massively multilingual and monolingual pretrained language models in most settings and improves data efficiency by a factor of up to 32 compared to monolingual training.
Posted Content

Perceiver IO: A General Architecture for Structured Inputs & Outputs

TL;DR: Perceiver IO learns to flexibly query the model's latent space to produce outputs of arbitrary size and semantics, achieving state-of-the-art results on tasks with highly structured output spaces.
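A rough sketch of the "query the latent space" idea, assuming PyTorch; the latent size, number of output queries, and dimensions are illustrative and not taken from the paper. One learned query per desired output element cross-attends into a fixed-size latent array, so outputs of arbitrary size and structure can be decoded.

```python
# Illustrative output decoding by cross-attending learned queries into a latent
# array (dimensions are made up for the example).
import torch
import torch.nn as nn

latents = torch.randn(1, 256, 512)        # (batch, num_latents, dim): fixed-size latent space
out_queries = torch.randn(1, 1000, 512)   # one query per desired output element

cross_attn = nn.MultiheadAttention(embed_dim=512, num_heads=8, batch_first=True)
outputs, _ = cross_attn(query=out_queries, key=latents, value=latents)
print(outputs.shape)  # torch.Size([1, 1000, 512]): output size set by the queries, not the latents
```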
Proceedings Article

Shortformer: Better Language Modeling using Shorter Inputs

TL;DR: This article shows that adding absolute position embeddings to queries and keys, instead of to the word embeddings, improves perplexity and speeds up the training of language models with short input lengths.
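A minimal sketch of that mechanism, assuming PyTorch: position embeddings are added to the inputs of the query and key projections only, so the values (and the word embeddings carried through the network) stay position-free. Names and shapes are illustrative, not the authors' implementation.

```python
# Illustrative attention where absolute positions are added to queries and keys
# rather than to the word embeddings (values remain position-free).
import torch
import torch.nn.functional as F

def position_infused_attention(x, pos_emb, w_q, w_k, w_v):
    q = (x + pos_emb) @ w_q            # queries see positions
    k = (x + pos_emb) @ w_k            # keys see positions
    v = x @ w_v                        # values (and the residual stream) do not
    scores = q @ k.T / k.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

seq_len, d_model = 8, 16
x = torch.randn(seq_len, d_model)      # token representations without positions
pos = torch.randn(seq_len, d_model)    # absolute position embeddings
w_q, w_k, w_v = (torch.randn(d_model, d_model) for _ in range(3))
print(position_infused_attention(x, pos, w_q, w_k, w_v).shape)  # torch.Size([8, 16])
```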
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.