Vaclav Cvicek
Publications - 3
Citations - 271
Vaclav Cvicek is an academic researcher. The author has contributed to research on topics including Transformer (machine learning model) and Generalization, has an h-index of 3, and has co-authored 3 publications receiving 115 citations.
Papers
Proceedings ArticleDOI
ETC: Encoding Long and Structured Inputs in Transformers
Joshua Ainslie,Santiago Ontañón,Chris Alberti,Vaclav Cvicek,Zachary Fisher,Philip Pham,Anirudh Ravula,Sumit Sanghai,Qifan Wang,Li Yang +9 more
TL;DR: Extended Transformer Construction (ETC) introduces a novel global-local attention mechanism between global tokens and regular input tokens to scale attention to longer inputs, achieving state-of-the-art results on four natural language datasets.
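The global-local attention pattern summarized above can be illustrated with a small sketch: a few global tokens attend to (and are attended by) everything, while regular "long" input tokens only attend to global tokens plus a local window of neighbors. This is a simplified illustration under assumed parameter names (`n_global`, `n_long`, `radius`), not the paper's actual implementation.

```python
import numpy as np

def etc_style_attention_mask(n_global, n_long, radius):
    """Boolean attention mask in the spirit of ETC's global-local
    attention (a simplified sketch, not the authors' code).

    Token layout: [global tokens | long input tokens].
    mask[i, j] is True if token i may attend to token j.
    """
    n = n_global + n_long
    mask = np.zeros((n, n), dtype=bool)
    # Global tokens attend to every token (global-to-global and global-to-long).
    mask[:n_global, :] = True
    # Long tokens attend to all global tokens (long-to-global).
    mask[n_global:, :n_global] = True
    # Long tokens attend only to a local window of other long tokens
    # (long-to-long), which keeps attention cost linear in input length.
    for i in range(n_long):
        lo = max(0, i - radius)
        hi = min(n_long, i + radius + 1)
        mask[n_global + i, n_global + lo:n_global + hi] = True
    return mask

# Example: 2 global tokens, 6 long tokens, local radius 1.
mask = etc_style_attention_mask(n_global=2, n_long=6, radius=1)
```

Because each long token touches only `2 * radius + 1` long neighbors plus the fixed set of global tokens, the number of attended pairs grows linearly with input length rather than quadratically.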
Posted Content
ETC: Encoding Long and Structured Inputs in Transformers
Joshua Ainslie,Santiago Ontañón,Chris Alberti,Vaclav Cvicek,Zachary Fisher,Philip Pham,Anirudh Ravula,Sumit Sanghai,Qifan Wang,Li Yang +9 more
TL;DR: A new Transformer architecture, Extended Transformer Construction (ETC), is presented that addresses two key challenges of standard Transformer architectures, namely scaling input length and encoding structured inputs.
Posted Content
Making Transformers Solve Compositional Tasks
TL;DR: The authors explore the design space of Transformer models, showing that the inductive biases introduced by several design decisions significantly impact compositional generalization, and identify Transformer configurations that generalize compositionally far better than previously reported across a diverse set of compositional tasks.