Jared Davis

Researcher at Google

Publications - 17
Citations - 956

Jared Davis is an academic researcher from Google. The author has contributed to research in topics including ODEs and matrix representation. The author has an h-index of 6 and has co-authored 17 publications receiving 364 citations. Previous affiliations of Jared Davis include Stanford University.

Papers
Posted Content

Rethinking Attention with Performers

TL;DR: This paper introduces Performers, Transformer architectures that can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, using only linear space and time complexity and without relying on any priors such as sparsity or low-rankness.
Proceedings Article

Rethinking Attention with Performers

TL;DR: Performers, as described in this paper, use Fast Attention Via positive Orthogonal Random features (FAVOR+) to approximate softmax attention kernels; they can estimate regular (softmax) full-rank attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity.
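
The TL;DR above describes the core FAVOR+ idea: the softmax kernel exp(q·k) is estimated with positive random features, so attention can be computed without ever materializing the L×L attention matrix. Below is a minimal NumPy sketch of that estimator, under simplifying assumptions: the function names (positive_random_features, favor_plus_attention) are illustrative rather than from the paper, and plain Gaussian features are used where FAVOR+ additionally orthogonalizes them for variance reduction.

```python
import numpy as np

def positive_random_features(x, w):
    """Positive feature map phi(x) = exp(w.x - ||x||^2 / 2) / sqrt(m); dot products
    of these features give an unbiased estimate of the softmax kernel exp(q.k)."""
    m = w.shape[0]
    projection = x @ w.T                                  # (L, m)
    sq_norm = 0.5 * np.sum(x ** 2, axis=-1, keepdims=True)
    return np.exp(projection - sq_norm) / np.sqrt(m)

def favor_plus_attention(q, k, v, num_features=256, seed=0):
    """Approximate softmax attention in O(L * m * d) time and memory:
    softmax(Q K^T / sqrt(d)) V  ~=  D^-1 (phi(Q') (phi(K')^T V)),
    with Q' = Q / d**0.25, K' = K / d**0.25 and D a per-row normalizer."""
    L, d = q.shape
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((num_features, d))            # FAVOR+ would also orthogonalize these rows
    scale = d ** -0.25                                    # split the usual 1/sqrt(d) between Q and K
    q_prime = positive_random_features(q * scale, w)      # (L, m)
    k_prime = positive_random_features(k * scale, w)      # (L, m)
    kv = k_prime.T @ v                                     # (m, d_v): the L x L matrix is never formed
    normalizer = q_prime @ k_prime.sum(axis=0)             # (L,)
    return (q_prime @ kv) / normalizer[:, None]
```

On small inputs and with a large num_features, the output of this sketch should closely track exact softmax attention; the paper itself is what establishes the unbiasedness and accuracy guarantees.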
Posted Content

On the Opportunities and Risks of Foundation Models.

Rishi Bommasani, +113 more
16 Aug 2021
TL;DR: The authors provide a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications.
Posted Content

Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers

TL;DR: This paper presents a new Transformer architecture, Performer, based on Fast Attention Via Orthogonal Random features (FAVOR); it demonstrates its effectiveness on the challenging task of protein sequence modeling and provides strong theoretical guarantees: unbiased estimation of the attention matrix and uniform convergence.
Posted Content

Sub-Linear Memory: How to Make Performers SLiM.

TL;DR: A thorough analysis of Performers, recent Transformer mechanisms with linear self-attention, reveals a remarkable computational flexibility: forward and backward propagation can be performed with no approximations using sublinear memory as a function of the sequence length $L$ (in addition to negligible storage for the input sequence), at the cost of greater time complexity in the parallel setting.
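
The flexibility described above comes from the fact that linear attention can be accumulated through a constant-size running state rather than a full set of stored activations. The sketch below is a simplified, non-causal illustration of that streaming structure, not the paper's algorithm: the function name is hypothetical, and q_prime/k_prime are assumed to be feature maps such as the ones sketched earlier; the actual SLiM analysis also covers causal attention and backward-pass recomputation.

```python
import numpy as np

def streaming_linear_attention(q_prime, k_prime, v, chunk_size=64):
    """Non-causal linear attention computed with a constant-size running state.
    Only an (m x d_v) accumulator plus one chunk of keys/values is live at a time,
    which is the kind of streaming structure that allows sublinear-memory
    forward/backward passes (chunks are recomputed during backprop)."""
    L, m = k_prime.shape
    d_v = v.shape[-1]
    kv_state = np.zeros((m, d_v))   # running sum of phi(k_j) v_j^T
    k_sum = np.zeros(m)             # running sum of phi(k_j), for the normalizer
    for start in range(0, L, chunk_size):
        kc = k_prime[start:start + chunk_size]
        vc = v[start:start + chunk_size]
        kv_state += kc.T @ vc
        k_sum += kc.sum(axis=0)
    # Queries can likewise be processed chunk by chunk against the final state.
    return (q_prime @ kv_state) / (q_prime @ k_sum)[:, None]
```

Given the same q_prime and k_prime features, this returns the same result as the favor_plus_attention sketch above while never holding more than one chunk of keys and values alongside the small accumulator.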