Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
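To make the text-to-text format concrete, here is a minimal sketch using the Hugging Face transformers library (the library choice is an assumption for illustration; it is not the paper's released code) showing how different tasks are cast as plain text in, plain text out via task prefixes:

```python
# Minimal sketch of T5's unified text-to-text interface, using the
# Hugging Face transformers library for illustration only.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is expressed as text input -> text output via a prefix.
tasks = [
    "translate English to German: The house is wonderful.",
    "summarize: Transfer learning, where a model is first pre-trained "
    "on a data-rich task before being fine-tuned on a downstream task, "
    "has emerged as a powerful technique in NLP.",
    "cola sentence: The course is jumping well.",  # grammaticality judgment
]

for prompt in tasks:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Classification tasks like CoLA are handled the same way as generation tasks: the model simply emits the label ("acceptable" or "unacceptable") as text.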



Citations
Posted Content

LAnoBERT: System Log Anomaly Detection Based on BERT Masked Language Model

TL;DR: This article proposes LAnoBERT, a parser-free system log anomaly detection method built on BERT, a model with strong natural language processing performance. The model is trained with masked language modeling, BERT's pre-training objective, and performs unsupervised anomaly detection at inference time using the masked-language-modeling loss computed for each key word in a log.
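As a rough illustration of the per-token masked-language-model scoring described in the TL;DR, a minimal sketch (not LAnoBERT's actual code; the base model, the averaging of per-token losses, and the example log line are assumptions) could look like:

```python
# Sketch: score a log line by masking each token in turn and averaging
# the masked-LM loss; a high score suggests an anomalous line.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def anomaly_score(log_line: str) -> float:
    input_ids = tokenizer(log_line, return_tensors="pt")["input_ids"][0]
    losses = []
    # Skip [CLS] and [SEP] at the first and last positions.
    for pos in range(1, input_ids.size(0) - 1):
        masked = input_ids.clone()
        masked[pos] = tokenizer.mask_token_id
        labels = torch.full_like(input_ids, -100)  # ignore all but the masked slot
        labels[pos] = input_ids[pos]
        with torch.no_grad():
            out = model(masked.unsqueeze(0), labels=labels.unsqueeze(0))
        losses.append(out.loss.item())
    return sum(losses) / len(losses)

print(anomaly_score("kernel: usb 1-1: device descriptor read error"))
```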
Posted Content

ContraQA: Question Answering under Contradicting Contexts

TL;DR: This paper studies the risk misinformation poses to QA models by investigating model behavior under contradicting contexts that mix real and fake information, and builds a misinformation-aware QA system as a countermeasure that integrates question answering and misinformation detection in a joint fashion.
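One hypothetical reading of "integrates question answering and misinformation detection in a joint fashion" is a shared encoder with two heads trained together. The sketch below is an assumption for illustration, not ContraQA's actual architecture:

```python
# Hypothetical joint QA + misinformation-detection model: a shared
# encoder feeds a span-extraction head and a context-credibility head.
import torch.nn as nn
from transformers import AutoModel

class JointQAMisinfo(nn.Module):
    def __init__(self, encoder_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.span_head = nn.Linear(hidden, 2)     # start/end logits per token
        self.misinfo_head = nn.Linear(hidden, 2)  # real vs. fake context

    def forward(self, input_ids, attention_mask):
        h = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        start_logits, end_logits = self.span_head(h).split(1, dim=-1)
        misinfo_logits = self.misinfo_head(h[:, 0])  # [CLS] representation
        return start_logits.squeeze(-1), end_logits.squeeze(-1), misinfo_logits
```

During training, the two objectives would typically be combined into a single loss, e.g. the span-extraction loss plus a weighted misinformation-classification loss.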
Proceedings Article

kFolden: k-Fold Ensemble for Out-Of-Distribution Detection.

TL;DR: This paper proposes kFolden, a simple yet effective framework that mimics the behavior of OOD detection during training without using any external data. It induces k sub-models, each trained on a subset covering k-1 categories, with the remaining category masked as unknown to that sub-model.
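A minimal sketch of the k-fold ensemble idea follows; the bag-of-words classifier and the score-aggregation rule are simplifying assumptions for illustration, not kFolden's implementation:

```python
# Sketch: k sub-models, each trained with one of the k labels held out,
# then averaged at test time; low maximum probability suggests OOD input.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def train_kfolden(texts, labels, k):
    vec = TfidfVectorizer().fit(texts)
    X = vec.transform(texts)
    y = np.array(labels)
    models = []
    for held_out in range(k):
        keep = y != held_out  # drop one category per sub-model
        models.append(LogisticRegression(max_iter=1000).fit(X[keep], y[keep]))
    return vec, models

def ood_score(vec, models, text, k):
    x = vec.transform([text])
    probs = np.zeros(k)
    for clf in models:
        for cls, pr in zip(clf.classes_, clf.predict_proba(x)[0]):
            probs[cls] += pr
    probs /= len(models)
    return 1.0 - probs.max()  # higher score = more likely out-of-distribution
```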
Posted Content

Constrained Text Generation with Global Guidance - Case Study on CommonGen.

TL;DR: This paper uses reinforcement learning to enforce global constraints on generation: fluency, common sense, and concept coverage are measured with a comprehensive score that serves as the reward, and a guided decoding method is designed at the word, fragment, and sentence levels.
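To illustrate what a comprehensive score serving as an RL reward might look like, here is a hedged sketch combining a fluency proxy with concept coverage; the GPT-2 fluency proxy, the weighting, and the coverage heuristic are assumptions, not the paper's actual score:

```python
# Sketch of a composite "global" reward: fluency (negative LM loss)
# blended with the fraction of required concepts that appear verbatim.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def fluency(sentence: str) -> float:
    ids = tok(sentence, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss  # mean negative log-likelihood per token
    return float(-loss)                  # higher = more fluent

def coverage(sentence: str, concepts: list[str]) -> float:
    words = sentence.lower().split()
    return sum(c in words for c in concepts) / len(concepts)

def reward(sentence: str, concepts: list[str], alpha: float = 0.5) -> float:
    # This scalar would serve as the return in a policy-gradient update.
    return alpha * fluency(sentence) + (1 - alpha) * coverage(sentence, concepts)

print(reward("A dog catches a frisbee in the park.", ["dog", "frisbee", "park"]))
```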
Posted Content

Domain-robust VQA with diverse datasets and methods but no target labels

TL;DR: In this article, the authors quantify domain shifts between popular VQA datasets in both the visual and textual spaces, and test the robustness of different families of visual question answering methods (two-stream, transformer, and neuro-symbolic) to these shifts.
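As an illustration of quantifying a shift between two datasets' feature distributions, the sketch below uses maximum mean discrepancy (MMD); the choice of MMD and the RBF kernel are assumptions, and the article's actual shift measures may differ:

```python
# Sketch: estimate the discrepancy between two sets of embeddings
# (e.g. image or question features from two VQA datasets) with MMD.
import numpy as np

def rbf_kernel(a: np.ndarray, b: np.ndarray, gamma: float = 1.0) -> np.ndarray:
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd(x: np.ndarray, y: np.ndarray, gamma: float = 1.0) -> float:
    # Biased estimator of squared MMD between samples x and y.
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2 * rbf_kernel(x, y, gamma).mean())

# e.g. x = embeddings from dataset A, y = embeddings from dataset B
x = np.random.randn(100, 64)
y = np.random.randn(100, 64) + 0.5  # shifted distribution
print(mmd(x, y))
```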
Trending Questions
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.