Philip Pham
Researcher at Google
Publications - 17
Citations - 1863
Philip Pham is an academic researcher at Google. He has contributed to research on topics including the Transformer (machine learning model) and Turing completeness. He has an h-index of 10 and has co-authored 17 publications receiving 728 citations. Previous affiliations of Philip Pham include the University of Pennsylvania and Duke University.
Papers
Posted Content
Big Bird: Transformers for Longer Sequences
Manzil Zaheer,Guru Guruganesh,Avinava Dubey,Joshua Ainslie,Chris Alberti,Santiago Ontañón,Philip Pham,Anirudh Ravula,Qifan Wang,Li Yang,Amr Ahmed +10 more
TL;DR: BigBird is shown to be a universal approximator of sequence functions and Turing complete, thereby preserving these properties of the quadratic full-attention model.
Proceedings ArticleDOI
ETC: Encoding Long and Structured Inputs in Transformers
Joshua Ainslie,Santiago Ontañón,Chris Alberti,Vaclav Cvicek,Zachary Fisher,Philip Pham,Anirudh Ravula,Sumit Sanghai,Qifan Wang,Li Yang +9 more
TL;DR: Extended Transformer Construction (ETC) introduces a novel global-local attention mechanism between global tokens and regular input tokens to scale attention to longer inputs, achieving state-of-the-art results on four natural language datasets.
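The global-local pattern described above can be illustrated with a small sketch. This is not ETC's actual implementation, just a hypothetical boolean attention mask in which a few global tokens attend everywhere (and are attended to by everyone), while the remaining long-input tokens attend only to a local sliding window; `num_global`, `num_long`, and `radius` are illustrative parameter names.

```python
import numpy as np

def global_local_mask(num_global: int, num_long: int, radius: int) -> np.ndarray:
    """Boolean attention mask for a global-local pattern (illustrative only):
    global tokens attend to every position, every position attends to the
    global tokens, and long-input tokens also attend to a +/-radius window."""
    n = num_global + num_long
    mask = np.zeros((n, n), dtype=bool)
    mask[:num_global, :] = True      # global rows attend to everything
    mask[:, :num_global] = True      # every token attends to the global tokens
    for i in range(num_global, n):   # local sliding window over long-input tokens
        lo = max(num_global, i - radius)
        hi = min(n, i + radius + 1)
        mask[i, lo:hi] = True
    return mask

m = global_local_mask(num_global=2, num_long=6, radius=1)
print(int(m.sum()))  # → 44 allowed attention pairs out of 64
```

Because the window width and the number of global tokens are constants, the number of allowed pairs grows linearly in sequence length rather than quadratically.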
Posted Content
Long Range Arena: A Benchmark for Efficient Transformers
Yi Tay,Mostafa Dehghani,Samira Abnar,Yikang Shen,Dara Bahri,Philip Pham,Jinfeng Rao,Liu Yang,Sebastian Ruder,Donald Metzler +9 more
TL;DR: LRA, a systematic and unified benchmark focused on evaluating model quality under long-context scenarios, paves the way toward better understanding this class of efficient Transformer models, facilitates further research in this direction, and presents new challenging tasks to tackle.
Journal ArticleDOI
The Perils of Balance Testing in Experimental Design: Messy Analyses of Clean Data
TL;DR: It is shown that balance tests can destroy the basis on which scientific conclusions are formed and can lead to erroneous and even fraudulent conclusions, and it is advocated that scientists and journal editors resist the use of balance tests in all analyses of clean data.
Proceedings Article
Big Bird: Transformers for Longer Sequences
Manzil Zaheer,Guru Guruganesh,Kumar Avinava Dubey,Joshua Ainslie,Chris Alberti,Santiago Ontañón,Philip Pham,Anirudh Ravula,Qifan Wang,Li Yang,Amr Ahmed +10 more
TL;DR: BigBird proposes a sparse attention mechanism that reduces the quadratic dependency on sequence length imposed by full attention, while remaining a universal approximator of sequence functions and Turing complete.
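The sparse pattern summarized above combines three components: a sliding window, a handful of global tokens, and random long-range links. The sketch below is a minimal illustration of that idea as a boolean mask, not the paper's implementation; the function name and parameters (`window`, `num_global`, `num_random`) are assumptions for the example.

```python
import numpy as np

def sparse_attention_mask(n: int, window: int, num_global: int,
                          num_random: int, seed: int = 0) -> np.ndarray:
    """Boolean mask combining three sparse components (illustrative only):
    a sliding window, a few global tokens, and random long-range links."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):                    # sliding window around each token
        lo, hi = max(0, i - window), min(n, i + window + 1)
        mask[i, lo:hi] = True
    mask[:num_global, :] = True           # global tokens attend to all positions
    mask[:, :num_global] = True           # all positions attend to global tokens
    for i in range(n):                    # a few random connections per row
        cols = rng.choice(n, size=num_random, replace=False)
        mask[i, cols] = True
    return mask

m = sparse_attention_mask(n=16, window=1, num_global=1, num_random=2)
print(m.mean())  # fraction of attended pairs; far below the 1.0 of full attention
```

With fixed `window`, `num_global`, and `num_random`, each row allows only O(1) entries, so total attention cost scales linearly in `n` instead of quadratically.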