Open Access · Posted Content
Finetuning Pretrained Transformers into Variational Autoencoders
Seongmin Park, Jihwa Lee, et al.
TL;DR
This article proposes a simple two-phase training scheme that converts a sequence-to-sequence Transformer into a VAE through finetuning alone; the resulting model is competitive with massively pretrained Transformer-based VAEs on some internal metrics while falling short on others.

Abstract
Text variational autoencoders (VAEs) are notorious for posterior collapse, a phenomenon in which the model's decoder learns to ignore signals from the encoder. Because posterior collapse is known to be exacerbated by expressive decoders, Transformers have seen limited adoption as components of text VAEs. Existing studies that incorporate Transformers into text VAEs (Li et al., 2020; Fang et al., 2021) mitigate posterior collapse with massive pretraining, a technique unavailable to most of the research community without extensive computing resources. We present a simple two-phase training scheme to convert a sequence-to-sequence Transformer into a VAE with just finetuning. The resulting language model is competitive with massively pretrained Transformer-based VAEs on some internal metrics while falling short on others. To facilitate training, we comprehensively explore the impact of common posterior collapse alleviation techniques in the literature. We release our code for reproducibility.
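The posterior collapse alleviation techniques the abstract refers to commonly include KL weight annealing, where the KL term of the VAE objective is ramped up gradually during training. As a minimal sketch of that idea (the function names, the linear schedule, and `warmup_steps` below are illustrative assumptions, not taken from the paper's released code):

```python
import math

def gaussian_kl(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, I) ) for a diagonal Gaussian posterior,
    summed over latent dimensions."""
    return sum(0.5 * (math.exp(lv) + m * m - 1.0 - lv)
               for m, lv in zip(mu, logvar))

def kl_weight(step, warmup_steps=1000):
    """Linear KL annealing: ramp the KL coefficient from 0 to 1 over
    warmup_steps, then hold it at 1."""
    return min(1.0, step / warmup_steps)

def vae_loss(recon_nll, mu, logvar, step):
    """Annealed negative ELBO: reconstruction NLL plus the
    annealing-weighted KL term."""
    return recon_nll + kl_weight(step) * gaussian_kl(mu, logvar)
```

Because the KL coefficient starts near zero, the decoder is forced to use the latent code early in training before the prior-matching pressure kicks in, which is the standard motivation for annealing schedules in text VAEs.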
Citations
Posted Content · DOI
ProT-VAE: Protein Transformer Variational AutoEncoder for Functional Protein Design
Emre Sevgen, Joshua Moller, Adrian Lange, John Parker, Sean Quigley, Jeff Mayer, Poonam Srivastava, Sitaram Gayatri, David J. Hosfield, Maria Korshunova, Micha Livne, Michelle Gill, Rama Ranganathan, Anthony Costa, Andrew L. Ferguson, et al.
TL;DR: The Protein Transformer Variational AutoEncoder (ProT-VAE) is a deep generative model for sequence-to-function mapping that combines interpretable low-dimensional latent embeddings and fully generative decoding for conditional sequence design with the expressive, alignment-free featurization offered by Transformers.
Journal Article · DOI
PCAE: A Framework of Plug-in Conditional Auto-Encoder for Controllable Text Generation
TL;DR: This paper proposes PCAE, a model-agnostic Plug-in Conditional Auto-Encoder framework for flexible and semi-supervised controllable text generation.
Variational Sentence Augmentation for Masked Language Modeling
TL;DR: The authors propose a variational sentence augmentation method for data augmentation, combining a variational autoencoder with a Gated Recurrent Unit (GRU), that encodes semantic and syntactic properties of the language.
Posted Content · DOI
Assessment of Emerging Pretraining Strategies in Interpretable Multimodal Deep Learning for Cancer Prognostication
Zarif L. Azher, Anish Suvarna, Ji-Qing Chen, Ze Zhang, Brock C. Christensen, Lucas A. Salas, Louis J. Vaickus, Joshua J. Levy, et al.
TL;DR: This article develops an interpretable multimodal modeling framework that combines DNA methylation, gene expression, and histopathology (i.e., tissue slide) data to predict cancer patient prognosis from molecular and anatomic pathology information.
References
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposes the Transformer, a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art results on English-to-French translation.
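The attention mechanism this architecture is built on is scaled dot-product attention, which the paper defines as:

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V
```

where \(Q\), \(K\), and \(V\) are the query, key, and value matrices and \(d_k\) is the key dimension; the \(\sqrt{d_k}\) scaling keeps the dot products from pushing the softmax into regions of vanishing gradient.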
Proceedings Article · DOI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pretrained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Report · DOI
Building a Large Annotated Corpus of English: The Penn Treebank
TL;DR: As a result of this grant, the researchers published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, including a fully hand-parsed version of the classic Brown corpus.
Proceedings Article · DOI
Transformers: State-of-the-Art Natural Language Processing
Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, Alexander M. Rush, et al.
TL;DR: Transformers is an open-source library that consists of carefully engineered state-of-the-art Transformer architectures under a unified API, together with a curated collection of pretrained models made by and available for the community.
Proceedings Article
beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework
Irina Higgins, Loic Matthey, Arka Pal, Christopher P. Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, Alexander Lerchner
TL;DR: This article proposes beta-VAE, a modification of the variational autoencoder (VAE) framework that learns interpretable, factorised latent representations from raw image data in a completely unsupervised manner.
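The modification at the heart of beta-VAE is a single hyperparameter \(\beta\) that reweights the KL term of the standard ELBO:

```latex
\mathcal{L}(\theta, \phi; x, z, \beta)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
  - \beta \, D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)
```

Setting \(\beta = 1\) recovers the standard VAE objective, while \(\beta > 1\) strengthens the pressure toward the factorised prior, which is what encourages disentangled latent factors.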