Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR
This article introduces a unified framework that converts all text-based language problems into a text-to-text format and systematically compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
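The text-to-text framing means every task, whether translation, summarization, or classification, is expressed as plain text in and plain text out, with a task prefix selecting the problem. A minimal sketch of this idea, assuming the released T5 checkpoints accessed through the Hugging Face `transformers` library (the library and the `t5-small` checkpoint name are assumptions of this sketch, not something described on this page):

```python
# Sketch: casting several NLP tasks into a single text-to-text format.
# Assumes the Hugging Face `transformers` library and the public
# "t5-small" checkpoint; not an official reference implementation.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is plain text in, plain text out; a task prefix tells the
# model which problem to solve.
prompts = [
    "translate English to German: The house is wonderful.",   # translation
    "summarize: Transfer learning, where a model is first "
    "pre-trained on a data-rich task, has emerged as a powerful "
    "technique in natural language processing.",              # summarization
    "cola sentence: The books was on the table.",             # acceptability classification
]

for prompt in prompts:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Because inputs and outputs are always strings, the same model, loss, and decoding procedure serve every task; only the prefix changes.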



Citations
Proceedings Article

SB_NITK at MEDIQA 2021: Leveraging Transfer Learning for Question Summarization in Medical Domain

TL;DR: In this paper, the authors leveraged transfer learning to summarize consumer health questions in the MEDIQA 2021 Question Summarization shared task, achieving a ROUGE-2 F1 score of 0.139.
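A ROUGE-2 F1 score like the one reported above measures bigram overlap between a generated summary and a reference. A minimal sketch of computing it, assuming Google's `rouge-score` package (the package choice and the example strings are assumptions for illustration, not taken from the cited paper):

```python
# Sketch: computing a ROUGE-2 F1 score between a reference summary and a
# model-generated one. Assumes the `rouge-score` package; the example
# strings below are invented for illustration.
from rouge_score import rouge_scorer

reference = "what are the treatments for chronic migraine?"
generated = "what treatments are available for chronic migraine?"

scorer = rouge_scorer.RougeScorer(["rouge2"], use_stemmer=True)
scores = scorer.score(reference, generated)
print(scores["rouge2"].fmeasure)  # bigram-overlap F1 between the two texts
```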
Book Chapter

Information Extraction/Entailment of Common Law and Civil Code

TL;DR: This paper evaluated different approaches to entailment tasks on the small, domain-specific data sets provided in the Competition on Legal Information Extraction/Entailment (COLIEE), which focuses on legal information processing and textual entailment over legal data.
Proceedings Article

DuoRAT: Towards Simpler Text-to-SQL Models

TL;DR: This paper begins by building DuoRAT, a re-implementation of the state-of-the-art RAT-SQL model that, unlike RAT-SQL, uses only relation-aware or vanilla transformers as building blocks, and then performs several ablation experiments with DuoRAT as the baseline model.
Posted Content

Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language.

TL;DR: This paper proposes a new model architecture for learning multi-modal neuro-symbolic representations for video captioning, using a dictionary-learning-based method that incorporates modality-specific inductive biases into the captioning task.
Proceedings Article

SUPER: SUb-Graph Parallelism for TransformERs

TL;DR: In this paper, the authors propose sub-graph parallelism to accelerate the training of Transformer models and generalize the concept to any neural network with multiple branches.
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.