Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR: This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract: Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new ``Colossal Clean Crawled Corpus'', we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
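The text-to-text framing described in the abstract can be illustrated with a minimal sketch using the Hugging Face transformers implementation of T5. The checkpoint, prompts, and generation settings below are illustrative choices rather than the paper's exact experimental setup; the task prefixes ("translate English to German:", "summarize:", "cola sentence:") follow the convention used in the paper.

```python
# Minimal sketch: every task is expressed as "input text -> output text",
# so translation, summarization, and classification all share one model.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

prompts = [
    "translate English to German: The house is wonderful.",
    "summarize: Transfer learning, where a model is first pre-trained on a "
    "data-rich task before being fine-tuned on a downstream task, has "
    "emerged as a powerful technique in natural language processing.",
    "cola sentence: The books was on the table.",  # label is generated as text
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```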


Citations
Proceedings ArticleDOI
01 Aug 2021
TL;DR: This paper presents a survey of societal biases in language generation, focusing on how data and techniques contribute to biases and on progress toward reducing them, and conducts experiments to quantify the effects of decoding techniques on bias.
Abstract: Technology for language generation has advanced rapidly, spurred by advancements in pre-training large models on massive amounts of data and the need for intelligent agents to communicate in a natural manner. While techniques can effectively generate fluent text, they can also produce undesirable societal biases that can have a disproportionately negative impact on marginalized populations. Language generation presents unique challenges for biases in terms of direct user interaction and the structure of decoding techniques. To better understand these challenges, we present a survey on societal biases in language generation, focusing on how data and techniques contribute to biases and progress towards reducing biases. Motivated by a lack of studies on biases from decoding techniques, we also conduct experiments to quantify the effects of these techniques. By further discussing general trends and open challenges, we call to attention promising directions for research and the importance of fairness and inclusivity considerations for language generation applications.

11 citations

Proceedings ArticleDOI
01 Aug 2021
TL;DR: This paper proposes unsupervised pre-training with the Inverse Cloze Task and masked salient spans, followed by supervised finetuning on question-context pairs, yielding absolute gains of 2+ points over the previous best top-20 retrieval accuracy on the Natural Questions and TriviaQA datasets.
Abstract: Recent work on training neural retrievers for open-domain question answering (OpenQA) has employed both supervised and unsupervised approaches. However, it remains unclear how unsupervised and supervised methods can be used most effectively for neural retrievers. In this work, we systematically study retriever pre-training. We first propose an approach of unsupervised pre-training with the Inverse Cloze Task and masked salient spans, followed by supervised finetuning using question-context pairs. This approach leads to absolute gains of 2+ points over the previous best result in the top-20 retrieval accuracy on Natural Questions and TriviaQA datasets. We next explore two approaches for end-to-end training of the reader and retriever components in OpenQA models, which differ in the manner the reader ingests the retrieved documents. Our experiments demonstrate the effectiveness of these approaches as we obtain state-of-the-art results. On the Natural Questions dataset, we obtain a top-20 retrieval accuracy of 84%, an improvement of 5 points over the recent DPR model. We also achieve good results on answer extraction, outperforming recent models like REALM and RAG by 3+ points.
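As a rough illustration of the Inverse Cloze Task pre-training signal mentioned above, the sketch below builds (pseudo-query, context) pairs from a passage and scores a batch of pairs with an in-batch contrastive loss over a dual encoder. The stub embeddings, dimensions, and helper names are placeholders for illustration, not the paper's retriever.

```python
# Hedged sketch of Inverse Cloze Task (ICT) pre-training data and loss:
# one sentence of a passage is treated as a pseudo-query, the remaining
# sentences as its positive context, and a dual encoder is trained so each
# query matches its own context within the batch.
import random
import torch
import torch.nn.functional as F


def make_ict_pair(passage_sentences):
    """Split one passage into a (pseudo-query, context) training pair."""
    i = random.randrange(len(passage_sentences))
    query = passage_sentences[i]
    context = " ".join(passage_sentences[:i] + passage_sentences[i + 1:])
    return query, context


def ict_loss(query_vecs, context_vecs):
    """In-batch softmax loss: the diagonal entries are the positives."""
    scores = query_vecs @ context_vecs.t()   # [B, B] similarity matrix
    targets = torch.arange(scores.size(0))   # query i pairs with context i
    return F.cross_entropy(scores, targets)


sentences = ["Sentence one is here.", "Sentence two follows.", "Sentence three ends it."]
query, context = make_ict_pair(sentences)
print("query  :", query)
print("context:", context)

# Random unit vectors stand in for dual-encoder outputs on a batch of 4 pairs.
q = F.normalize(torch.randn(4, 128), dim=-1)
c = F.normalize(torch.randn(4, 128), dim=-1)
print("in-batch ICT loss:", ict_loss(q, c).item())
```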

11 citations

Proceedings ArticleDOI
01 Aug 2021
TL;DR: The authors introduce a meta-evaluation framework for evaluating factual consistency metrics and experiment with nine factuality metrics using synthetic and human-labeled factuality data from short news, long news and dialogue summarization domains.
Abstract: Text generation models can generate factually inconsistent text containing distorted or fabricated facts about the source text. Recent work has focused on building evaluation models to verify the factual correctness of semantically constrained text generation tasks such as document summarization. While the field of factuality evaluation is growing fast, we don't have well-defined criteria for measuring the effectiveness, generalizability, reliability, or sensitivity of the factuality metrics. Focusing on these aspects, in this paper, we introduce a meta-evaluation framework for evaluating factual consistency metrics. We introduce five necessary, common-sense conditions for effective factuality metrics and experiment with nine recent factuality metrics using synthetic and human-labeled factuality data from short news, long news and dialogue summarization domains. Our framework enables assessing the efficiency of any new factual consistency metric on a variety of dimensions over multiple summarization domains and can be easily extended with new meta-evaluation criteria. We also present our conclusions towards standardizing the factuality evaluation metrics.
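To make the idea of a "necessary, common-sense condition" concrete, here is a hypothetical sketch of one plausible check, not the paper's actual framework: a useful factuality metric should score a faithful summary above a factually perturbed version of it. The rank_agreement helper, the toy overlap metric, and the example triple are invented for illustration.

```python
# Hypothetical meta-evaluation check (illustration only): count how often a
# candidate factuality metric prefers the faithful summary over a perturbed one.
def rank_agreement(metric, triples):
    """Fraction of (source, faithful, perturbed) triples the metric gets right."""
    wins = sum(metric(src, good) > metric(src, bad) for src, good, bad in triples)
    return wins / len(triples)


def toy_overlap_metric(source, summary):
    """Crude stand-in metric: fraction of summary tokens that appear in the source."""
    src = set(source.lower().replace(".", "").split())
    summ = set(summary.lower().replace(".", "").split())
    return len(src & summ) / max(len(summ), 1)


triples = [(
    "The company reported record profits in 2020.",
    "The company reported record profits.",   # faithful summary
    "The company reported record losses.",    # factually perturbed summary
)]
print(rank_agreement(toy_overlap_metric, triples))  # 1.0 for this toy example
```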

11 citations

Proceedings ArticleDOI
01 Jan 2022
Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022.

11 citations

Posted Content
TL;DR: This paper proposes GLM (General Language Model), a novel pretraining framework designed to address the fact that no existing pretraining framework performs best across all task types, which complicates model development and selection.
Abstract: There have been various types of pretraining architectures including autoregressive models (e.g., GPT), autoencoding models (e.g., BERT), and encoder-decoder models (e.g., T5). On the other hand, NLP tasks are different in nature, with three main categories being classification, unconditional generation, and conditional generation. However, none of the pretraining frameworks performs the best for all tasks, which introduces inconvenience for model development and selection. We propose a novel pretraining framework GLM (General Language Model) to address this challenge. Compared to previous work, our architecture has three major benefits: (1) it performs well on classification, unconditional generation, and conditional generation tasks with one single pretrained model; (2) it outperforms BERT-like models on classification due to improved pretrain-finetune consistency; (3) it naturally handles variable-length blank filling which is crucial for many downstream tasks. Empirically, GLM substantially outperforms BERT on the SuperGLUE natural language understanding benchmark with the same amount of pre-training data. Moreover, GLM with 1.25x parameters of BERT-Large achieves the best performance in NLU, conditional and unconditional generation at the same time, which demonstrates its generalizability to different downstream tasks.
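The variable-length blank filling highlighted in this abstract can be sketched as a data-construction step: spans are removed from the input and concatenated into an autoregressive generation target. The span positions and special tokens below are simplified placeholders for illustration, not the released GLM implementation.

```python
# Simplified sketch of a blank-filling training example: selected spans are
# replaced with [MASK] in the input, and the target is the sequence of removed
# spans, each introduced by a start-of-span marker.
def make_blank_filling_example(tokens, spans):
    """spans: list of (start, length) over the original token positions."""
    corrupted, target, cursor = [], [], 0
    for start, length in sorted(spans):
        corrupted += tokens[cursor:start] + ["[MASK]"]
        target += ["[SOS]"] + tokens[start:start + length]
        cursor = start + length
    corrupted += tokens[cursor:]
    return corrupted, target


src = "the quick brown fox jumps over the lazy dog".split()
corrupted, target = make_blank_filling_example(src, [(1, 2), (6, 1)])
print("input :", " ".join(corrupted))  # the [MASK] fox jumps over [MASK] lazy dog
print("target:", " ".join(target))     # [SOS] quick brown [SOS] the
```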

11 citations

Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.