Benjamin Van Durme
Researcher at Johns Hopkins University
Publications - 259
Citations - 11006
Benjamin Van Durme is an academic researcher at Johns Hopkins University. He has contributed to research in topics including parsing and information extraction. He has an h-index of 49 and has co-authored 259 publications receiving 8,933 citations. His previous affiliations include Carnegie Mellon University and Google.
Papers
Proceedings Article
PPDB: The Paraphrase Database
TL;DR: The 1.0 release of the paraphrase database, PPDB, contains over 220 million paraphrase pairs, consisting of 73 million phrasal and 8 million lexical paraphrases, as well as 140 million paraphrase patterns, which capture many meaning-preserving syntactic transformations.
Proceedings ArticleDOI
Information Extraction over Structured Data: Question Answering with Freebase
Xuchen Yao, Benjamin Van Durme +1 more
TL;DR: It is shown that relatively modest information extraction techniques, when paired with a web-scale corpus, can outperform more sophisticated approaches by roughly a 34% relative gain.
Proceedings ArticleDOI
Hypothesis Only Baselines in Natural Language Inference
TL;DR: This paper proposes a hypothesis-only baseline for diagnosing NLI, which significantly outperforms a majority-class baseline across a number of NLI datasets, showing that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context.
Posted Content
What do you learn from context? Probing for sentence structure in contextualized word representations
Ian Tenney, Patrick Xia, Berlin Chen, Alex Wang, Adam Poliak, R. Thomas McCoy, Najoung Kim, Benjamin Van Durme, Samuel R. Bowman, Dipanjan Das, Ellie Pavlick +10 more
TL;DR: The authors probe word-level contextual representations from four recent models, examining how they encode sentence structure across a range of syntactic, semantic, local, and long-range phenomena. They find that existing models trained on language modeling and translation produce strong representations for syntactic phenomena, but offer only comparably small improvements on semantic tasks over a non-contextual baseline.
Annotated Gigaword
TL;DR: This work creates layers of annotation on the English Gigaword v.5 corpus to render it useful as a standardized corpus for knowledge extraction and distributional semantics, and provides the community a public reference set based on current state-of-the-art syntactic analysis and coreference resolution.