Open Access Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.

Abstract:
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
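The unified format described in the abstract reduces every task to an input-string/target-string pair by prepending a task prefix. A minimal sketch of that casting step, using the prefix conventions associated with T5 (the field names and this helper are illustrative, not the authors' code):

```python
# Illustrative sketch: casting heterogeneous NLP tasks into a single
# text-to-text format. Each task becomes an (input, target) string pair
# distinguished only by a task prefix; even classification labels are
# emitted as literal text.

def to_text_to_text(task: str, **fields) -> tuple:
    """Return an (input_text, target_text) pair for a given task."""
    if task == "translation":
        return (f"translate English to German: {fields['source']}",
                fields["target"])
    if task == "summarization":
        return (f"summarize: {fields['document']}", fields["summary"])
    if task == "classification":
        # The class label is generated as text (e.g. "acceptable").
        return (f"cola sentence: {fields['sentence']}", fields["label"])
    raise ValueError(f"unknown task: {task}")

inp, tgt = to_text_to_text("summarization",
                           document="The quick brown fox jumped.",
                           summary="A fox jumped.")
print(inp)  # summarize: The quick brown fox jumped.
print(tgt)  # A fox jumped.
```

Because every task shares this interface, a single encoder-decoder model with one training objective can serve all of them.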
Citations
Proceedings Article
RetroNLU: Retrieval Augmented Task-Oriented Semantic Parsing
TL;DR: This paper extended a sequence-to-sequence model with a retrieval component that retrieves existing similar samples and presents them as additional context to the model, and analyzed the quality, model sensitivity, and performance of the nearest-neighbor retrieval component for semantic parses of varied utterance complexity.
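The retrieval step this TL;DR describes can be sketched as nearest-neighbor lookup over stored (utterance, parse) examples, with the retrieved exemplar prepended to the parser's input. Everything below is a hypothetical stand-in (the embeddings, input template, and function names are not from the paper):

```python
# Hypothetical sketch of retrieval-augmented semantic parsing: find the
# stored example most similar to the query and present it as extra context.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def retrieve_nearest(query_vec, index):
    """index: list of (embedding, utterance, parse) triples."""
    return max(index, key=lambda item: cosine(query_vec, item[0]))

def build_input(utterance, query_vec, index):
    """Concatenate the retrieved exemplar with the new utterance."""
    _, nn_utt, nn_parse = retrieve_nearest(query_vec, index)
    return f"example: {nn_utt} => {nn_parse} | parse: {utterance}"
```

A real system would use learned dense embeddings and an approximate-nearest-neighbor index rather than this exhaustive scan.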
Book Chapter
High Performance Computing for Understanding Natural Language
TL;DR: This chapter gives an overview of state-of-the-art natural language processing problems, algorithms, models, and libraries, and details of a few specific applications that use pre-training or self-supervised learning on large amounts of data in text understanding.
Proceedings Article
K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce
TL;DR: The authors proposed a knowledge-injected pre-trained language model based on the encoder-decoder transformer that learns a diverse set of domain-specific knowledge and can be transferred to both natural language understanding and generation tasks.
Journal Article
CINS: Comprehensive Instruction for Few-Shot Learning in Task-Oriented Dialog Systems
TL;DR: The authors proposed Comprehensive Instruction (CINS), which augments pre-trained language models (PLMs) with extra task-specific instructions to better exploit their power for few-shot learning in task-oriented dialog.
Posted Content
Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News Detection in English
TL;DR: This article proposed an ensemble of different pre-trained language models such as BERT, RoBERTa, and ERNIE, combined with training strategies including warm-up, learning-rate scheduling, and k-fold cross-validation.