
Omer Levy

Researcher at Tel Aviv University

Publications -  130
Citations -  43830

Omer Levy is an academic researcher from Tel Aviv University. The author has contributed to research in topics including language models and computer science, has an h-index of 45, and has co-authored 111 publications receiving 25357 citations. Previous affiliations of Omer Levy include Facebook and Stanford University.

Papers
Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

TL;DR: It is found that BERT was significantly undertrained and can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.
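As a brief, hedged illustration (not part of the paper itself), the released RoBERTa weights are commonly loaded through the Hugging Face transformers library; the checkpoint name "roberta-base" and the use of PyTorch tensors are assumptions about the reader's setup.

# Minimal sketch, assuming the Hugging Face transformers library and PyTorch
# are installed and that "roberta-base" points to the released RoBERTa weights.
from transformers import RobertaTokenizer, RobertaModel

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")

# Encode a sentence and inspect the contextual token representations.
inputs = tokenizer("RoBERTa revisits BERT's pretraining design choices.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)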
Proceedings Article

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

TL;DR: BART, a denoising autoencoder for pretraining sequence-to-sequence models, is presented; it matches the performance of RoBERTa on GLUE and SQuAD and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks.
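A brief, hedged sketch of how a fine-tuned BART checkpoint is typically used for abstractive summarization; the checkpoint name "facebook/bart-large-cnn" and the generation parameters are assumptions about the reader's setup, not details from the paper.

# Minimal sketch, assuming Hugging Face transformers and PyTorch are installed
# and "facebook/bart-large-cnn" is a BART checkpoint fine-tuned for summarization.
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

article = ("BART corrupts text with an arbitrary noising function and trains a "
           "sequence-to-sequence model to reconstruct the original text.")
inputs = tokenizer(article, return_tensors="pt", truncation=True)

# Beam-search decoding of a short abstractive summary.
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=30)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))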
Proceedings Article

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

TL;DR: The GLUE benchmark comprises nine diverse NLU tasks, an auxiliary dataset for probing models' understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models.
Proceedings Article

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

TL;DR: A benchmark of nine diverse NLU tasks, an auxiliary dataset for probing models' understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models; the benchmark favors models that can represent linguistic knowledge in a way that facilitates sample-efficient learning and effective knowledge transfer across tasks.
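For reference, a hedged sketch of one way the GLUE tasks can be pulled into a script; the dataset name "glue" and the configuration "mrpc" refer to the Hugging Face datasets hub and are assumptions about the reader's tooling, not part of the paper.

# Minimal sketch, assuming the Hugging Face datasets library is installed and
# hosts the GLUE tasks under the name "glue" ("mrpc" is one of the nine tasks).
from datasets import load_dataset

mrpc = load_dataset("glue", "mrpc")
print(mrpc)              # train / validation / test splits
print(mrpc["train"][0])  # a paraphrase-detection example: sentence1, sentence2, label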
Proceedings Article

Neural Word Embedding as Implicit Matrix Factorization

TL;DR: It is shown that using a sparse Shifted Positive PMI word-context matrix to represent words improves results on two word similarity tasks and one of two analogy tasks; the authors conjecture that SGNS's advantage on the remaining analogy task stems from the weighted nature of its factorization.
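The Shifted Positive PMI (SPPMI) matrix referenced above has a simple closed form, max(PMI(w, c) - log k, 0), where k corresponds to SGNS's number of negative samples. Below is a minimal sketch, assuming a dense NumPy word-context co-occurrence count matrix; the function name and example counts are illustrative, not from the paper.

# Minimal sketch, assuming counts[w, c] holds word-context co-occurrence counts.
# SPPMI(w, c) = max(PMI(w, c) - log k, 0), with k the number of negative samples.
import numpy as np

def sppmi(counts, k=5):
    total = counts.sum()
    p_wc = counts / total                              # joint P(w, c)
    p_w = counts.sum(axis=1, keepdims=True) / total    # marginal P(w)
    p_c = counts.sum(axis=0, keepdims=True) / total    # marginal P(c)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_wc / (p_w * p_c))
    pmi[~np.isfinite(pmi)] = 0.0                       # zero counts contribute 0
    return np.maximum(pmi - np.log(k), 0.0)            # shift by log k, clip at 0

counts = np.array([[10., 0., 2.],
                   [3., 7., 0.],
                   [0., 1., 5.]])
print(sppmi(counts, k=5))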