Open Access · Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu +8 more
TL;DR: This article introduces a unified framework that converts all text-based language problems into a text-to-text format and compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.

Abstract:
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
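The text-to-text framing described in the abstract can be illustrated with a minimal sketch: every task instance becomes a pair of plain-text strings, with a task-specific prefix on the input. The function name and field names below are illustrative (not the authors' code); the prefixes follow the conventions described in the paper.

```python
def to_text_to_text(task: str, **fields) -> dict:
    """Cast a task instance into an (input, target) pair of plain strings.

    Classification labels, translations, and summaries are all emitted as
    text, so a single sequence-to-sequence model handles every task.
    """
    if task == "translate_en_de":
        return {
            "input": f"translate English to German: {fields['sentence']}",
            "target": fields["translation"],
        }
    if task == "summarize":
        return {
            "input": f"summarize: {fields['document']}",
            "target": fields["summary"],
        }
    if task == "cola":
        # Classification: the label itself is generated as a text string.
        return {
            "input": f"cola sentence: {fields['sentence']}",
            "target": fields["label"],
        }
    raise ValueError(f"unknown task: {task}")


example = to_text_to_text(
    "translate_en_de",
    sentence="That is good.",
    translation="Das ist gut.",
)
print(example["input"])   # translate English to German: That is good.
print(example["target"])  # Das ist gut.
```

Because inputs and targets are always strings, the same model, loss, and decoding procedure apply unchanged across tasks; only the prefix tells the model which task to perform.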
Citations
Posted Content
Text-to-Text Multi-view Learning for Passage Re-ranking
TL;DR: This article proposed a text-to-text multi-view learning framework that incorporates an additional view, the text generation view, into a typical single-view passage ranking model, which improves ranking performance compared to its single-view counterpart.
Posted Content
Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction
Lu Xu, Yew Ken Chia, Lidong Bing +2 more
TL;DR: This paper proposed a dual-channel span pruning strategy by incorporating supervision from the Aspect Term Extraction (ATE) and Opinion Term Extraction (OTE) tasks, which not only improves computational efficiency but also distinguishes the opinion and target spans more properly.
Proceedings ArticleDOI
Improving Social Meaning Detection with Pragmatic Masking and Surrogate Fine-Tuning
TL;DR: The authors propose pragmatic masking and surrogate fine-tuning as two complementing strategies that exploit social cues to drive pre-trained representations toward a broad set of concepts useful for a wide class of social meaning tasks.
Proceedings Article
Automatic Text Evaluation through the Lens of Wasserstein Barycenters.
TL;DR: In this paper, a new metric, BaryScore, is proposed to evaluate text generation based on deep contextualized embeddings (e.g., BERT, RoBERTa, ELMo).
Proceedings Article
ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning.
TL;DR: This article presented ExplaGraphs, a new generative and structured commonsense-reasoning task (and an associated dataset) of explanation graph generation for stance prediction: given a belief and an argument, a model has to predict whether the argument supports or counters the belief, and also generate a commonsense-augmented graph that serves as a non-trivial, complete, and unambiguous explanation for the predicted stance.