Open Access Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.

Abstract:
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
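The text-to-text casting described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' released code; the task prefixes below follow the conventions reported for T5, but treat the exact strings and the helper `to_text_to_text` as illustrative assumptions.

```python
def to_text_to_text(task: str, **fields) -> dict:
    """Cast a task instance into an (input_text, target_text) string pair,
    so that every task is handled by the same text-in, text-out model."""
    if task == "translation":
        inp = f"translate English to German: {fields['source']}"
        tgt = fields["target"]
    elif task == "summarization":
        inp = f"summarize: {fields['document']}"
        tgt = fields["summary"]
    elif task == "classification":  # e.g. MNLI-style entailment
        inp = f"mnli premise: {fields['premise']} hypothesis: {fields['hypothesis']}"
        tgt = fields["label"]  # even class labels are emitted as plain text
    else:
        raise ValueError(f"unknown task: {task}")
    return {"input_text": inp, "target_text": tgt}

example = to_text_to_text(
    "classification",
    premise="A cat sat on the mat.",
    hypothesis="An animal is on the mat.",
    label="entailment",
)
```

The key design point is that classification targets are generated as literal text (e.g. the word "entailment"), so no task-specific output heads are needed.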
Citations
Proceedings Article
What Makes Good In-Context Examples for GPT-3?
TL;DR: This paper studies how the choice of in-context examples affects GPT-3's few-shot performance and proposes retrieving examples that are semantically similar to the test input, yielding substantial gains over random selection.
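The retrieval idea in the entry above can be sketched as nearest-neighbour search over a pool of labeled examples. This is a hedged illustration, not the paper's implementation: the paper retrieves with sentence encoders, while here a simple bag-of-words cosine similarity stands in so the sketch is self-contained.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_examples(query: str, pool: list[str], k: int = 2) -> list[str]:
    """Return the k pool examples most similar to the query."""
    q = Counter(query.lower().split())
    scored = sorted(pool, key=lambda x: cosine(q, Counter(x.lower().split())), reverse=True)
    return scored[:k]

pool = [
    "the movie was fantastic -> positive",
    "terrible acting and plot -> negative",
    "stock prices rose sharply -> business",
]
best = retrieve_examples("the film was fantastic", pool, k=1)
# best == ["the movie was fantastic -> positive"]
```

The retrieved examples would then be concatenated into the prompt ahead of the test input; swapping the bag-of-words similarity for a learned sentence encoder is the paper's actual contribution.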
Proceedings Article
SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning.
TL;DR: The SemEval-2021 shared task 4: Reading Comprehension of Abstract Meaning (ReCAM) as discussed by the authors was designed to help evaluate the ability of machines to represent and understand abstract concepts.
Proceedings Article
Efficient Meta Lifelong-Learning with Limited Memory
TL;DR: This paper proposed a meta-lifelong framework that combines three common principles of lifelong learning methods, achieving state-of-the-art performance on text classification and question answering benchmarks while mitigating catastrophic forgetting and negative transfer, which prior methods suffer from simultaneously.
Proceedings Article
CoTexT: Multi-task Learning with Code-Text Transformer
TL;DR: CoTexT as discussed by the authors is a transformer-based encoder-decoder model that learns the representative context between natural language (NL) and programming language (PL) using self-supervision.
Proceedings Article
Learning to Decompose and Organize Complex Tasks
TL;DR: This paper proposes a novel end-to-end pipeline that consumes a complex task and induces a dependency graph from unstructured text to represent sub-tasks and their relationships; the proposed models significantly outperform a state-of-the-art text generator.