Oren Etzioni

Researcher at Allen Institute for Artificial Intelligence

Publications -  248
Citations -  35498

Oren Etzioni is an academic researcher at the Allen Institute for Artificial Intelligence. The author has contributed to research on the topics Information extraction and Web page. The author has an h-index of 89 and has co-authored 245 publications receiving 33044 citations. Previous affiliations of Oren Etzioni include Vulcan Inc. and the University of Washington.

Papers
Proceedings Article

Extracting Product Features and Opinions from Reviews

TL;DR: Opine is introduced, an unsupervised information-extraction system which mines reviews in order to build a model of important product features, their evaluation by reviewers, and their relative quality across products.
Proceedings Article

Open information extraction from the web

TL;DR: Open Information Extraction (OIE) is introduced, a new extraction paradigm in which the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input.
Proceedings Article

Named Entity Recognition in Tweets: An Experimental Study

TL;DR: The novel T-NER system doubles F1 score compared with the Stanford NER system, and leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision.
Proceedings Article

Identifying Relations for Open Information Extraction

TL;DR: Two simple syntactic and lexical constraints on binary relations expressed by verbs are introduced in the ReVerb Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TextRunner and WOEpos.
Proceedings Article

Web document clustering: a feasibility demonstration

TL;DR: To satisfy the stringent requirements of the Web domain, an incremental, linear time algorithm called Suffix Tree Clustering (STC) is introduced which creates clusters based on phrases shared between documents, showing that STC is faster than standard clustering methods in this domain.