Open Access
Proceedings Article

EntityRank: searching entities directly and holistically

TLDR
This work focuses on the core challenge of ranking entities by distilling its underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that seamlessly integrates both local and global information in ranking.
Abstract
As the Web has evolved into a data-rich repository, with the standard "page view," current search engines are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. While entities appear in many pages, current engines only find each page individually. Toward searching directly and holistically for information of finer granularity, we study the problem of entity search, a significant departure from traditional document retrieval. We focus on the core challenge of ranking entities by distilling its underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking. We evaluate our online prototype over a 2 TB Web corpus and show that EntityRank performs effectively.


Citations
Journal Article

YAGO: A Large Ontology from Wikipedia and WordNet

TL;DR: YAGO is a large ontology with high coverage and precision, based on a clean logical model with a decidable consistency that allows representing n-ary relations in a natural way while maintaining compatibility with RDFS.
Journal Article

Entity Linking with a Knowledge Base: Issues, Techniques, and Solutions

TL;DR: A thorough overview and analysis of the main approaches to entity linking is presented, and various applications, the evaluation of entity linking systems, and future directions are discussed.
Journal Article

Annotating and searching web tables using entities, types and relationships

TL;DR: This paper proposes new machine learning techniques to annotate table cells with entities that they likely mention, table columns with types from which entities are drawn for cells in the column, and relations that pairs of table columns seek to express, and a new graphical model for making all these labeling decisions for each table simultaneously.
Book Chapter

New Regularized Algorithms for Transductive Learning

TL;DR: This work proposes a new graph-based label propagation algorithm for transductive learning that can be extended to incorporate additional prior information, and demonstrates it with classifying data where the labels are not mutually exclusive.
Proceedings Article

From information to knowledge: harvesting entities and relationships from web sources

TL;DR: This tutorial discusses state-of-the-art methods, research opportunities, and open challenges along this avenue of knowledge harvesting, to automatically construct and maintain a comprehensive knowledge base of facts about named entities, their semantic classes, and their mutual relations as well as temporal contexts, with high precision and high recall.
References
Journal Article

The anatomy of a large-scale hypertextual Web search engine

TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
Journal Article

The Anatomy of a Large-Scale Hypertextual Web Search Engine.

Sergey Brin, +1 more - 01 Jan 1998
TL;DR: Google is a prototype of a large-scale search engine that makes heavy use of the structure present in hypertext and is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems.
Journal Article

Accurate methods for the statistics of surprise and coincidence

TL;DR: The basis of a measure based on likelihood ratios that can be applied to the analysis of text is described, and in cases where traditional contingency table methods work well, the likelihood ratio tests described here are nearly identical.
Proceedings Article

Web-scale information extraction in KnowItAll: (preliminary results)

TL;DR: KnowItAll, a system that aims to automate the tedious process of extracting large collections of facts from the web in an autonomous, domain-independent, and scalable manner, is introduced.
Proceedings Article

GATE - a General Architecture for Text Engineering

TL;DR: GATE lies at the intersection of human language computation and software engineering, and constitutes an infrastructural system supporting research and development of language processing software.