Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduces a unified framework that converts all text-based language problems into a text-to-text format and systematically compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors across dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
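The abstract's central idea is that every task, whether translation, summarization, or classification, is cast as feeding the model text and training it to produce text. As a minimal sketch of what that looks like in practice, assuming the Hugging Face transformers library and the publicly released t5-small checkpoint (neither is part of this page), the same model handles different tasks purely through task prefixes in the input string:

```python
# Minimal sketch, not from the paper itself: the text-to-text framing with
# the Hugging Face `transformers` library and the released t5-small checkpoint.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is expressed as plain text with a task prefix; the model always emits text.
examples = [
    "translate English to German: The house is wonderful.",
    "summarize: Transfer learning, where a model is first pre-trained on a data-rich "
    "task before being fine-tuned on a downstream task, has emerged as a powerful "
    "technique in natural language processing.",
    "cola sentence: The course is jumping well.",  # linguistic acceptability, also answered as text
]

for text in examples:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The prefixes shown ("translate English to German:", "summarize:", "cola sentence:") follow the convention described in the paper; a new downstream task is added simply by defining a textual input and output format for it.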

Citations
Proceedings Article

Applying Transfer Learning for Improving Domain-Specific Search Experience Using Query to Question Similarity

TL;DR: This paper discusses a framework for computing the similarity between a given input query and a set of predefined questions in order to retrieve the question that best matches the query; the approach can be generalized to any domain-specific search engine and applied in other domains as well.
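As a rough illustration of the query-to-question retrieval idea summarized above (not the cited paper's actual method), one common approach is to embed the query and the predefined questions with a sentence encoder and return the question with the highest cosine similarity. The sentence-transformers library and the all-MiniLM-L6-v2 checkpoint used here are assumptions for the sketch:

```python
# Illustrative sketch only; the cited paper's actual models and similarity
# function may differ. Assumes the sentence-transformers library and the
# all-MiniLM-L6-v2 checkpoint.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical predefined questions for a domain-specific search engine.
predefined_questions = [
    "How do I reset my account password?",
    "What is the refund policy for annual plans?",
    "How can I export my data to CSV?",
]

def best_matching_question(query: str) -> str:
    """Return the predefined question with the highest cosine similarity to the query."""
    query_emb = encoder.encode(query, convert_to_tensor=True)
    question_embs = encoder.encode(predefined_questions, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, question_embs)[0]  # shape: (len(predefined_questions),)
    return predefined_questions[int(scores.argmax())]

print(best_matching_question("forgot my password"))
```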
Posted Content

Does BERT Pretrained on Clinical Notes Reveal Sensitive Data?

TL;DR: This paper designs a battery of approaches intended to recover Personal Health Information (PHI) from a BERT model trained on clinical notes, attempting to recover patient names and the conditions associated with them.
Proceedings Article

$Q^2$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering.

TL;DR: This paper proposes an automatic evaluation metric for factual consistency in knowledge-grounded dialogue based on automatic question generation and question answering; the metric compares answer spans using natural language inference (NLI) rather than the token-based matching used in previous work.
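A rough sketch of the distinctive comparison step only, not the full Q^2 pipeline: a generated question is answered once against the dialogue response and once against the grounding knowledge, and the two answer spans are compared with NLI rather than token overlap. The model choices here (distilbert-base-cased-distilled-squad for QA, roberta-large-mnli for NLI) are illustrative assumptions, not the cited paper's:

```python
# Rough sketch of the answer-comparison step only, not the full Q^2 pipeline.
# The question would come from a question-generation model in the real metric;
# model names here are illustrative assumptions, not the cited paper's choices.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
nli_tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
nli_model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

question = "Where was the restaurant opened?"  # assumed to be produced by a QG model
response = "The restaurant opened in Paris last year."    # dialogue system output
knowledge = "The restaurant was opened in Paris in 2022."  # grounding document

# Answer the same question against the response and against the knowledge.
span_from_response = qa(question=question, context=response)["answer"]
span_from_knowledge = qa(question=question, context=knowledge)["answer"]

# Compare the two answer spans with NLI instead of token-based matching:
# the knowledge-based span is the premise, the response-based span the hypothesis.
inputs = nli_tokenizer(span_from_knowledge, span_from_response, return_tensors="pt")
with torch.no_grad():
    logits = nli_model(**inputs).logits
label = nli_model.config.id2label[logits.argmax(dim=-1).item()]
print(span_from_response, span_from_knowledge, label)  # ENTAILMENT counts as consistent
```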
Posted Content

Aligning the Pretraining and Finetuning Objectives of Language Models

TL;DR: It is demonstrated that explicitly aligning the pretraining objectives with the finetuning objectives during language model training significantly improves finetuning task performance and reduces the minimum number of finetuning examples required, allowing language models of smaller sizes to be built for tasks with less available training data.
Proceedings Article

Doing more with less: training large DNN models on commodity servers for the masses

TL;DR: In this article, the authors advocate rethinking how DNN frameworks schedule computation and move data to push the boundaries of training large models efficiently on modest multi-GPU deployments, and propose an approach to train large DNN models on commodity servers.
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.