Open Access · Posted Content

Finetuning Pretrained Transformers into Variational Autoencoders

TL;DR
This article proposes a simple two-phase training scheme for converting a sequence-to-sequence Transformer into a VAE through finetuning alone; the resulting model is competitive with massively pretrained Transformer-based VAEs on some internal metrics while falling short on others.
Abstract
Text variational autoencoders (VAEs) are notorious for posterior collapse, a phenomenon where the model's decoder learns to ignore signals from the encoder. Because posterior collapse is known to be exacerbated by expressive decoders, Transformers have seen limited adoption as components of text VAEs. Existing studies that incorporate Transformers into text VAEs (Li et al., 2020; Fang et al., 2021) mitigate posterior collapse using massive pretraining, a technique unavailable to most of the research community without extensive computing resources. We present a simple two-phase training scheme to convert a sequence-to-sequence Transformer into a VAE with just finetuning. The resulting language model is competitive with massively pretrained Transformer-based VAEs in some internal metrics while falling short on others. To facilitate training, we comprehensively explore the impact of common posterior collapse alleviation techniques in the literature. We release our code for reproducibility.
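The posterior collapse alleviation techniques the abstract alludes to (KL-weight annealing, free bits, and similar tricks) all act on the VAE training objective. Below is a minimal, hedged sketch of such an objective in PyTorch; the function name, hyperparameters, and the specific combination of linear annealing plus free bits are illustrative assumptions, not the authors' released training code.

```python
import torch
import torch.nn.functional as F

def vae_step(logits, labels, mu, logvar, step, total_steps,
             free_bits=0.5, pad_id=0):
    """One evaluation of a text-VAE training objective (illustrative sketch).

    logits: decoder output, shape (batch, seq_len, vocab)
    labels: target token ids, shape (batch, seq_len)
    mu, logvar: parameters of the Gaussian posterior q(z|x), shape (batch, latent_dim)
    """
    # Token-level reconstruction loss (cross-entropy), ignoring padding tokens.
    recon = F.cross_entropy(logits.transpose(1, 2), labels, ignore_index=pad_id)

    # Closed-form KL divergence between q(z|x) and the standard normal prior,
    # kept per latent dimension.
    kl_per_dim = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp())

    # Free bits: dimensions whose KL is already below the floor are not pushed
    # further toward zero, so the latent code cannot collapse entirely.
    kl = torch.clamp(kl_per_dim, min=free_bits).sum(dim=-1).mean()

    # Linear KL annealing: ramp the KL weight from 0 to 1 over the first half
    # of training so the decoder learns to rely on the latent code early on.
    beta = min(1.0, step / (0.5 * total_steps))

    return recon + beta * kl
```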


Citations
Posted Content (DOI)

ProT-VAE: Protein Transformer Variational AutoEncoder for Functional Protein Design

TL;DR: The Protein Transformer Variational AutoEncoder (ProT-VAE) is a deep generative model for sequence-to-function mapping that learns interpretable, low-dimensional latent embeddings and fully generative decoding for conditional sequence design, combined with the expressive, alignment-free featurization offered by Transformers.
Journal Article (DOI)

PCAE: A Framework of Plug-in Conditional Auto-Encoder for Controllable Text Generation

TL;DR: This paper proposes PCAE, a model-agnostic Plug-in Conditional Auto-Encoder framework for flexible, semi-supervised controllable text generation.

Variational Sentence Augmentation for Masked Language Modeling

TL;DR: The authors propose a variational sentence augmentation method for data augmentation that combines a variational autoencoder with a Gated Recurrent Unit (GRU) to encode semantic and syntactic properties of the language.
Posted Content (DOI)

Assessment of Emerging Pretraining Strategies in Interpretable Multimodal Deep Learning for Cancer Prognostication

TL;DR: This article develops an interpretable multimodal modeling framework that combines DNA methylation, gene expression, and histopathology (tissue slide) data to predict cancer patient prognosis from molecular and anatomic pathology information.
References
Proceedings Article

Attention is All you Need

TL;DR: This paper proposes the Transformer, a simple network architecture based solely on attention mechanisms that dispenses with recurrence and convolutions entirely and achieves state-of-the-art performance on English-to-French translation.
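For reference, the scaled dot-product attention at the heart of the Transformer can be written in a few lines. This is a generic sketch of the published formula Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, not code from the cited paper.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (..., seq_len, d_k); mask broadcasts over the score matrix.
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)   # attention distribution over keys
    return weights @ v                        # weighted sum of value vectors
```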
Proceedings Article (DOI)

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pretrained model can then be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
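As an illustration of that "one additional output layer" recipe, the snippet below uses the Transformers library (introduced further down this list) to attach a freshly initialized classification head to a pretrained BERT checkpoint; the checkpoint name and label count are placeholder choices, not tied to this paper's experiments.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)   # new, randomly initialized output layer

batch = tokenizer(["a great movie", "a dull movie"],
                  padding=True, return_tensors="pt")
outputs = model(**batch)                 # outputs.logits has shape (batch, num_labels)
```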
Report (DOI)

Building a Large Annotated Corpus of English: The Penn Treebank

TL;DR: As a result of this grant, the researchers have published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, which includes a fully hand-parsed version of the classic Brown corpus.
Proceedings Article (DOI)

Transformers: State-of-the-Art Natural Language Processing

TL;DR: Transformers is an open-source library that consists of carefully engineered state-of-the-art Transformer architectures under a unified API and a curated collection of pretrained models made by and available for the community.
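A minimal usage sketch of that unified API, loading a pretrained sequence-to-sequence checkpoint of the kind the present paper finetunes; the specific checkpoint ("t5-small") and the translation prompt are only examples.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: The house is small.",
                   return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```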
Proceedings Article

beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework

TL;DR: This paper proposes beta-VAE, a modification of the variational autoencoder (VAE) framework that learns interpretable, factorised latent representations from raw image data in a completely unsupervised manner.
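The beta-VAE change amounts to reweighting the KL term of the standard ELBO with a constant beta > 1. A short sketch follows; the default value beta = 4 is illustrative, not prescribed by this listing.

```python
import torch

def beta_vae_loss(recon_loss, mu, logvar, beta=4.0):
    # KL( q(z|x) || N(0, I) ) in closed form for a diagonal Gaussian posterior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1).mean()
    # beta > 1 strengthens the pressure toward the factorised prior,
    # trading reconstruction quality for disentanglement.
    return recon_loss + beta * kl
```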