Open Access · Posted Content
Finetuning Pretrained Transformers into Variational Autoencoders
Seongmin Park, Jihwa Lee, et al.
TL;DR
This article proposes a simple two-phase training scheme that converts a sequence-to-sequence Transformer into a VAE through finetuning alone; the resulting model is competitive with massively pretrained Transformer-based VAEs on some internal metrics while falling short on others.

Abstract
Text variational autoencoders (VAEs) are notorious for posterior collapse, a phenomenon in which the model's decoder learns to ignore signals from the encoder. Because posterior collapse is known to be exacerbated by expressive decoders, Transformers have seen limited adoption as components of text VAEs. Existing studies that incorporate Transformers into text VAEs (Li et al., 2020; Fang et al., 2021) mitigate posterior collapse with massive pretraining, a technique unavailable to most of the research community without extensive computing resources. We present a simple two-phase training scheme to convert a sequence-to-sequence Transformer into a VAE with just finetuning. The resulting language model is competitive with massively pretrained Transformer-based VAEs on some internal metrics while falling short on others. To facilitate training, we comprehensively explore the impact of common posterior collapse alleviation techniques in the literature. We release our code for reproducibility.
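The posterior collapse alleviation techniques the abstract refers to commonly include KL weight annealing, where the KL term of the VAE objective is ramped up gradually during training. As a minimal sketch of that idea (the function names, the linear schedule, and `warmup_steps` below are illustrative assumptions, not taken from the paper's released code):

```python
import math

def gaussian_kl(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, I) ) for a diagonal Gaussian posterior,
    summed over latent dimensions."""
    return sum(0.5 * (math.exp(lv) + m * m - 1.0 - lv)
               for m, lv in zip(mu, logvar))

def kl_weight(step, warmup_steps=1000):
    """Linear KL annealing: ramp the KL coefficient from 0 to 1 over
    warmup_steps, then hold it at 1."""
    return min(1.0, step / warmup_steps)

def vae_loss(recon_nll, mu, logvar, step):
    """Annealed negative ELBO: reconstruction NLL plus the
    annealing-weighted KL term."""
    return recon_nll + kl_weight(step) * gaussian_kl(mu, logvar)
```

Because the KL coefficient starts near zero, the decoder is forced to use the latent code early in training before the prior-matching pressure kicks in, which is the standard motivation for annealing schedules in text VAEs.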
Citations
Posted Content · DOI
ProT-VAE: Protein Transformer Variational AutoEncoder for Functional Protein Design
Emre Sevgen, Joshua Moller, Adrian Lange, John Parker, Sean Quigley, Jeff Mayer, Poonam Srivastava, Sitaram Gayatri, David J. Hosfield, Maria Korshunova, Micha Livne, Michelle Gill, Rama Ranganathan, Anthony Costa, Andrew L. Ferguson, et al.
TL;DR: The Protein Transformer Variational AutoEncoder (ProT-VAE) is a deep generative model for sequence-to-function mapping that combines interpretable low-dimensional latent embeddings and fully generative decoding for conditional sequence design with the expressive, alignment-free featurization offered by Transformers.
Journal Article · DOI
PCAE: A Framework of Plug-in Conditional Auto-Encoder for Controllable Text Generation
TL;DR: This paper proposes PCAE, a model-agnostic Plug-in Conditional Auto-Encoder framework for flexible and semi-supervised controllable text generation.
Variational Sentence Augmentation for Masked Language Modeling
TL;DR: The authors propose a variational sentence augmentation method for data augmentation, combining a variational autoencoder with a Gated Recurrent Unit (GRU), that encodes semantic and syntactic properties of the language.
Posted Content · DOI
Assessment of Emerging Pretraining Strategies in Interpretable Multimodal Deep Learning for Cancer Prognostication
Zarif L. Azher, Anish Suvarna, Ji-Qing Chen, Ze Zhang, Brock C. Christensen, Lucas A. Salas, Louis J. Vaickus, Joshua J. Levy, et al.
TL;DR: This article develops an interpretable multimodal modeling framework that combines DNA methylation, gene expression, and histopathology (i.e., tissue slide) data to predict cancer patient prognosis from molecular and anatomic pathology information.
References
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposes the Transformer, a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art results on English-to-French translation.
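The attention mechanism this architecture is built on is scaled dot-product attention, which the paper defines as:

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V
```

where \(Q\), \(K\), and \(V\) are the query, key, and value matrices and \(d_k\) is the key dimension; the \(\sqrt{d_k}\) scaling keeps the dot products from pushing the softmax into regions of vanishing gradient.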
Proceedings Article · DOI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pretrained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Report · DOI
Building a Large Annotated Corpus of English: The Penn Treebank
TL;DR: As a result of this grant, the researchers published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, including a fully hand-parsed version of the classic Brown corpus.
Proceedings Article · DOI
Transformers: State-of-the-Art Natural Language Processing
Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, Alexander M. Rush, et al.
TL;DR: Transformers is an open-source library that consists of carefully engineered state-of-the-art Transformer architectures under a unified API, together with a curated collection of pretrained models made by and available for the community.
Proceedings Article
beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework
Irina Higgins, Loic Matthey, Arka Pal, Christopher P. Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, Alexander Lerchner
TL;DR: This article proposes beta-VAE, a modification of the variational autoencoder (VAE) framework that learns interpretable, factorised latent representations from raw image data in a completely unsupervised manner.
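The modification at the heart of beta-VAE is a single hyperparameter \(\beta\) that reweights the KL term of the standard ELBO:

```latex
\mathcal{L}(\theta, \phi; x, z, \beta)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
  - \beta \, D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)
```

Setting \(\beta = 1\) recovers the standard VAE objective, while \(\beta > 1\) strengthens the pressure toward the factorised prior, which is what encourages disentangled latent factors.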