scispace - formally typeset
Search or ask a question
Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over the lifetime, 21109 publications have been published within this topic receiving 435130 citations.


Papers
More filters
Journal ArticleDOI
TL;DR: A search engine can use training data extracted from the logs to automatically tailor ranking functions to a particular user group or collection, and machine-learning techniques can harness to improve search quality.
Abstract: Search-engine logs provide a wealth of information that machine-learning techniques can harness to improve search quality. With proper interpretations that avoid inherent biases, a search engine can use training data extracted from the logs to automatically tailor ranking functions to a particular user group or collection.

203 citations

Proceedings ArticleDOI
M.F. Wyle1
09 Oct 1991
TL;DR: A discussion is given on an intelligent wide area network-based clipping service used as a test-bed for the development and effectiveness measurement of high-performance retrieval algorithms.
Abstract: A discussion is given on an intelligent wide area network-based clipping service used as a test-bed for the development and effectiveness measurement of high-performance retrieval algorithms. The system taps news wire and other information sources aperiodically, several times daily. It maintains a local 100 Mbyte news database that changes at a rate of 10 Mbytes per day. The system stores and manages user interest profiles, each of which includes several query texts. News items are selected for presentation by a completely automatic four-stage information filtering process; expert system rules and database knowledge criteria are used in the first three stages; vector-space ranking is used in the final stage. News items are actively and selectively disseminated to users via electronic mail. >

203 citations

Journal Article
TL;DR: A new family of topic- ranking algorithms for multi-labeled documents that achieve state-of-the-art results and outperforms topic-ranking adaptations of Rocchio's algorithm and of the Perceptron algorithm are described.
Abstract: We describe a new family of topic-ranking algorithms for multi-labeled documents. The motivation for the algorithms stem from recent advances in online learning algorithms. The algorithms are simple to implement and are also time and memory efficient. We provide a unified analysis of the family of algorithms in the mistake bound model. We then discuss experiments with the proposed family of topic-ranking algorithms on the Reuters-21578 corpus and the new corpus released by Reuters in 2000. On both corpora, the algorithms we present achieve state-of-the-art results and outperforms topic-ranking adaptations of Rocchio's algorithm and of the Perceptron algorithm.

201 citations

Proceedings ArticleDOI
10 May 2005
TL;DR: A ranking framework which models the process of generation of a stream of news articles, the news articles clustering by topics, and the evolution of news story over the time is proposed and can be obtained without a predefined sliding window of observation over the stream.
Abstract: According to a recent survey made by Nielsen NetRatings, searching on news articles is one of the most important activity online. Indeed, Google, Yahoo, MSN and many others have proposed commercial search engines for indexing news feeds. Despite this commercial interest, no academic research has focused on ranking a stream of news articles and a set of news sources. In this paper, we introduce this problem by proposing a ranking framework which models: (1) the process of generation of a stream of news articles, (2) the news articles clustering by topics, and (3) the evolution of news story over the time. The ranking algorithm proposed ranks news information, finding the most authoritative news sources and identifying the most interesting events in the different categories to which news article belongs. All these ranking measures take in account the time and can be obtained without a predefined sliding window of observation over the stream. The complexity of our algorithm is linear in the number of pieces of news still under consideration at the time of a new posting. This allow a continuous on-line process of ranking. Our ranking framework is validated on a collection of more than 300,000 pieces of news, produced in two months by more then 2000 news sources belonging to 13 different categories (World, U.S, Europe, Sports, Business, etc). This collection is extracted from the index of comeToMyHead, an academic news search engine available online.

200 citations

Proceedings Article
23 Sep 2007
TL;DR: This work focuses on the core challenge of ranking entities, by distilling its underlying conceptual model Impression Model and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking.
Abstract: As the Web has evolved into a data-rich repository, with the standard "page view," current search engines are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. While entities appear in many pages, current engines only find each page individually. Toward searching directly and holistically for finding information of finer granularity, we study the problem of entity search, a significant departure from traditional document retrieval. We focus on the core challenge of ranking entities, by distilling its underlying conceptual model Impression Model and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking. We evaluate our online prototype over a 2TB Web corpus, and show that EntityRank performs effectively.

200 citations


Network Information
Related Topics (5)
Web page
50.3K papers, 975.1K citations
83% related
Ontology (information science)
57K papers, 869.1K citations
82% related
Graph (abstract data type)
69.9K papers, 1.2M citations
82% related
Feature learning
15.5K papers, 684.7K citations
81% related
Supervised learning
20.8K papers, 710.5K citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
20241
20233,112
20226,541
20211,105
20201,082
20191,168