Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have appeared within this topic, receiving 435,130 citations.


Papers
Proceedings Article
22 Jun 2003
TL;DR: This paper describes a fully distributed implementation of Google's PageRank algorithm for peer-to-peer (P2P) systems, based on chaotic (asynchronous) iterative solution of linear systems. It also proposes an incremental keyword search, with hits sorted by PageRank, that reduced network traffic approximately ten-fold for two-word and three-word queries.
Abstract: This paper defines and describes a fully distributed implementation of Google's highly effective pagerank algorithm, for "peer to peer" (P2P) systems. The implementation is based on chaotic (asynchronous) iterative solution of linear systems. The P2P implementation also enables incremental computation of pageranks as new documents are entered into or deleted from the network. Incremental update enables continuously accurate pageranks whereas the currently centralized web crawl and computation over Internet documents requires several days. This suggests possible applicability of the distributed algorithm to pagerank computations as a replacement for the centralized Web crawler based implementation for Internet documents. A complete solution of the distributed pagerank computation for an in-place network converges rapidly (1% accuracy in 10 iterations) for large systems although the time for iteration may be long. The incremental computation resulting from addition of a single document converges extremely rapidly, typically requiring update path lengths of fewer than 15 nodes even for large networks and very accurate solutions. This implementation of pagerank provides a uniform ranking scheme for documents in P2P systems, and its integration with P2P keyword search provides one solution to the network traffic problems engendered by return of document hits. In basic P2P keyword search, all the document hits must be returned to the querying node causing large network traffic. An incremental keyword search algorithm for P2P keyword search where document hits are sorted by pagerank, and incrementally returned to the querying node is proposed and evaluated. Integration of this algorithm into P2P keyword search can produce dramatic benefit both in terms of effectiveness for users and decrease in network traffic. The incremental search algorithm provided approximately a ten-fold reduction in network traffic for two-word and three-word querying.

111 citations
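
The chaotic (asynchronous) iteration mentioned in the abstract above can be illustrated with a minimal single-process sketch. It is not the paper's distributed implementation: the function name `async_pagerank`, the `out_links` dictionary, the damping factor 0.85, the update rule R(n) = (1 - d) + d * sum(R(s) / outdegree(s)), and the relative-change stopping test are all assumptions made for illustration. Each document is updated in place, in random order, using whatever neighbor values are current at that moment, which mimics peers updating without global synchronization.

```python
import random


def async_pagerank(out_links, damping=0.85, tol=1e-3, max_sweeps=100, seed=0):
    """In-place (asynchronous-style) PageRank sketch.

    out_links maps each document to the list of documents it links to.
    Every document, including link targets, must appear as a key.
    Returns (rank dict, number of sweeps performed).
    """
    rng = random.Random(seed)
    nodes = list(out_links)

    # Invert the graph so each node knows who links to it.
    in_links = {n: [] for n in nodes}
    for src, targets in out_links.items():
        for dst in targets:
            in_links[dst].append(src)

    rank = {n: 1.0 for n in nodes}
    for sweep in range(1, max_sweeps + 1):
        rng.shuffle(nodes)  # arbitrary update order, as in a chaotic iteration
        max_rel_change = 0.0
        for n in nodes:
            # Uses the latest available ranks of in-neighbors, not a snapshot.
            new = (1 - damping) + damping * sum(
                rank[s] / len(out_links[s]) for s in in_links[n]
            )
            max_rel_change = max(max_rel_change, abs(new - rank[n]) / rank[n])
            rank[n] = new
        if max_rel_change < tol:
            return rank, sweep
    return rank, max_sweeps


if __name__ == "__main__":
    # Hypothetical three-document graph: a and b both link to c, c links to a.
    graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
    ranks, sweeps = async_pagerank(graph)
    for doc in sorted(ranks, key=ranks.get, reverse=True):
        print(doc, round(ranks[doc], 3))  # c ranks highest, b lowest
```

Updating in place means a change propagates along links within a single sweep, which is one plausible reading of why the abstract reports short update paths for incremental changes; the sketch makes no claim about matching the paper's reported convergence figures.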

Patent
Simon Tong, Mark Pearson
10 Sep 2004
TL;DR: This patent describes systems and methods that improve search rankings for a query by using data associated with related queries: a population associated with the query is determined, and the ranking score for an article (such as a webpage) is based at least in part on data associated with that population.
Abstract: Systems and methods that improve search rankings for a search query by using data associated with queries related to the search query are described. In one aspect, a search query is received, a population associated with the search query is determined, an article (such as a webpage) associated with the search query is determined, and a ranking score for the article based at least in part on data associated with the population is determined. Algorithms and types of data associated with a population useful in carrying out such systems and methods are described.

111 citations

Patent
17 Jun 2004
TL;DR: This patent describes a system that generates a model from feature data about a link from a linking document to a linked document, together with user behavior data on navigational actions associated with the link, and assigns a rank to a document based on that model.
Abstract: A system generates a model based on feature data relating to different features of a link from a linking document to a linked document and user behavior data relating to navigational actions associated with the link. The system also assigns a rank to a document based on the model.

110 citations

Proceedings Article
Yun-Wu Huang, Philip S. Yu
01 Aug 1999
Affiliation: IBM T.J. Watson Research Center, 30 Saw Mill River Road, Hawthorne, NY 10532

110 citations

Proceedings Article
20 Jul 2008
TL;DR: This paper presents a framework for transductive learning of ranking functions and shows that unlabeled (test) data can be exploited to improve ranking performance.
Abstract: Ranking algorithms, whose goal is to appropriately order a set of objects/documents, are an important component of information retrieval systems. Previous work on ranking algorithms has focused on cases where only labeled data is available for training (i.e. supervised learning). In this paper, we consider the question whether unlabeled (test) data can be exploited to improve ranking performance. We present a framework for transductive learning of ranking functions and show that the answer is affirmative. Our framework is based on generating better features from the test data (via KernelPCA) and incorporating such features via Boosting, thus learning different ranking functions adapted to the individual test queries. We evaluate this method on the LETOR (TREC, OHSUMED) dataset and demonstrate significant improvements.

110 citations


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 83% related
Ontology (information science): 57K papers, 869.1K citations, 82% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 82% related
Feature learning: 15.5K papers, 684.7K citations, 81% related
Supervised learning: 20.8K papers, 710.5K citations, 81% related
Performance
Metrics
No. of papers in the topic in previous years
Year: Papers
2024: 1
2023: 3,112
2022: 6,541
2021: 1,105
2020: 1,082
2019: 1,168