Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.


Papers
Patent
07 Apr 2014
TL;DR: A user who is composing or reading a document can identify and link multiple sets of key words into separate search queries by highlighting them and assigning unique search numbers, colors, or other readily ascertained indicators of their logical relation.
Abstract: Systems and methods allow a user of a text or graphics editor to quickly create multiple robust internet search queries by selecting and ranking groups or individual key words from a document. A user who is composing or reading a document can identify and link multiple sets of key words into separate search queries by highlighting and assigning either unique search numbers, colors or other readily ascertained indicators of their logical relation. Each individual search query is routed to selected internet search engines, and the results are returned to the user in the same viewed document. The user may select the form in which the results are displayed. For example, results may be listed within the document by way of footnotes, endnotes, or separate hover or pull-down windows accessible from the search terms. In addition, the user can browse, sort, rank, edit or eliminate portions of the results.

269 citations

Journal ArticleDOI
01 Jan 2006
TL;DR: It is shown that the feedback arc set problem for tournaments is NP-hard, strengthening an earlier result that established NP-hardness only under randomized reductions; this settles a conjecture of Bang-Jensen and Thomassen.
Abstract: A tournament is an oriented complete graph. The feedback arc set problem for tournaments is the optimization problem of determining the minimum possible number of edges of a given input tournament T whose reversal makes T acyclic. Ailon, Charikar, and Newman showed that this problem is NP-hard under randomized reductions. Here we show that it is in fact NP-hard. This settles a conjecture of Bang-Jensen and Thomassen.

269 citations
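
To make the definition in the abstract above concrete, here is a minimal brute-force sketch of the feedback arc set problem on a tiny tournament. It is not the paper's method (the paper is a hardness result); the function name `min_feedback_arc_set` and the arc-set representation are assumptions chosen for illustration. The sketch relies on the fact that reversing every arc that points backward in some vertex ordering leaves an acyclic tournament, so the optimum is the minimum number of backward arcs over all orderings.

```python
from itertools import permutations


def min_feedback_arc_set(n, arcs):
    """Return a smallest list of arcs whose reversal makes the tournament acyclic.

    n    -- number of vertices, labelled 0..n-1
    arcs -- set of (u, v) pairs meaning u -> v; a tournament has exactly one
            arc between every pair of distinct vertices (assumed, not checked)

    Exhaustive search over all n! orderings, so only usable for very small n.
    """
    best = None
    for order in permutations(range(n)):
        position = {v: i for i, v in enumerate(order)}
        # Arcs that point from a later vertex to an earlier one in this order.
        backward = [(u, v) for (u, v) in arcs if position[u] > position[v]]
        if best is None or len(backward) < len(best):
            best = backward
    return best


# A directed 3-cycle needs exactly one reversal to become acyclic.
cycle = {(0, 1), (1, 2), (2, 0)}
print(len(min_feedback_arc_set(3, cycle)))
```

The factorial running time is consistent with the NP-hardness result: no polynomial-time exact algorithm is known.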

Proceedings ArticleDOI
17 May 2004
TL;DR: This paper analyzes the rapidly growing "frontier" of the web, namely the part of the web that crawlers are unable to cover for one reason or another, and suggests ways to improve ranking quality by modeling the growing presence of "link rot" as more sites and pages fall out of maintenance.
Abstract: The celebrated PageRank algorithm has proved to be a very effective paradigm for ranking results of web search algorithms. In this paper we refine this basic paradigm to take into account several evolving prominent features of the web, and propose several algorithmic innovations. First, we analyze features of the rapidly growing "frontier" of the web, namely the part of the web that crawlers are unable to cover for one reason or another. We analyze the effect of these pages and find it to be significant. We suggest ways to improve the quality of ranking by modeling the growing presence of "link rot" on the web as more sites and pages fall out of maintenance. Finally we suggest new methods of ranking that are motivated by the hierarchical structure of the web, are more efficient than PageRank, and may be more resistant to direct manipulation.

269 citations
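
For orientation, the basic PageRank paradigm that the abstract above refines can be sketched as a power iteration. This is the textbook formulation, not the refinements proposed in the paper; the function name `pagerank`, the dictionary-of-lists graph format, the damping factor 0.85, and the choice to spread the rank of pages without out-links uniformly are all assumptions made for this illustration.

```python
def pagerank(links, damping=0.85, iters=100):
    """Textbook PageRank by power iteration.

    links -- dict mapping each page to the list of pages it links to.
    Pages with no out-links (for example uncrawled "frontier" pages) have
    their rank redistributed uniformly, which is one common convention.
    """
    nodes = sorted(set(links) | {v for outs in links.values() for v in outs})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Total rank currently held by pages with no out-links.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for u, outs in links.items():
            for v in outs:
                new[v] += damping * rank[u] / len(outs)
        rank = new
    return rank


ranks = pagerank({"a": ["b"], "b": ["a"], "c": ["a"]})
print(sorted(ranks, key=ranks.get, reverse=True))
```

Each iteration preserves the total rank of 1, so the result can be read as a probability distribution over pages.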

Proceedings ArticleDOI
S. Muthukrishnan
06 Jan 2002
TL;DR: This paper considers document retrieval problems that are motivated by online query processing in databases, Information Retrieval systems and Computational Biology, and provides the first known optimal algorithm for the document listing problem.
Abstract: We are given a collection D of text documents d1,…,dk, with ∑i |di| = n, which may be preprocessed. In the document listing problem, we are given an online query comprising a pattern string p of length m and our goal is to return the set of all documents that contain one or more copies of p. In the closely related occurrence listing problem, we output the set of all positions within the documents where pattern p occurs. In 1973, Weiner [24] presented an algorithm with O(n) time and space preprocessing following which the occurrence listing problem can be solved in time O(m + output) where output is the number of positions where p occurs; this algorithm is clearly optimal. In contrast, no optimal algorithm is known for the closely related document listing problem, which is perhaps more natural and certainly well-motivated. We provide the first known optimal algorithm for the document listing problem. More generally, we initiate the study of pattern matching problems that require retrieving documents matched by the patterns; this contrasts with pattern matching problems that have been studied more frequently, namely, those that involve retrieving all occurrences of patterns. We consider document retrieval problems that are motivated by online query processing in databases, Information Retrieval systems and Computational Biology. We present very efficient (optimal) algorithms for our document retrieval problems. Our approach for solving such problems involves performing "local" encodings whereby they are reduced to range query problems on colored geometric objects (points and lines). We present improved algorithms for these colored range query problems that arise in our reductions using the structural properties of strings. This approach is quite general and yields simple, efficient, implementable algorithms for all the document retrieval problems in this paper.

267 citations
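
The difference between the two problems defined in the abstract above can be shown with a naive baseline. This sketch is not the optimal algorithm from the paper; it scans every document at query time, and the function names `occurrence_listing` and `document_listing` are assumptions chosen for illustration. It assumes a non-empty pattern.

```python
def occurrence_listing(docs, p):
    """Return every (document index, position) at which pattern p occurs."""
    hits = []
    for i, d in enumerate(docs):
        start = d.find(p)
        while start != -1:
            hits.append((i, start))
            # Advance by one character so overlapping occurrences are found.
            start = d.find(p, start + 1)
    return hits


def document_listing(docs, p):
    """Return the indices of documents containing at least one copy of p."""
    return [i for i, d in enumerate(docs) if p in d]


docs = ["abracadabra", "cadabra", "xyz"]
print(occurrence_listing(docs, "abra"))
print(document_listing(docs, "abra"))
```

The output of `occurrence_listing` can be much larger than the output of `document_listing` when a pattern repeats many times in a few documents, which is why an algorithm whose cost depends only on the number of matching documents is the harder goal.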

Journal ArticleDOI
TL;DR: In this article, a knowledge-based extended Boolean model (kb‐ebm) is proposed that evaluates weighted queries and documents effectively and avoids the problems of previous methods.
Abstract: Several document ranking methods have been proposed to calculate the conceptual distance or closeness between a Boolean query and a document. Though they provide good retrieval effectiveness in many cases, they do not support effective weighting schemes for queries and documents and also have several problems resulting from inappropriate evaluation of Boolean operators. We propose a new method called Knowledge‐Based Extended Boolean Model (kb‐ebm) in which Salton's extended Boolean model is incorporated. kb‐ebm evaluates weighted queries and documents effectively, and avoids the problems of the previous methods. kb‐ebm provides high quality document rankings by using term dependence information from is‐a hierarchies. The performance experiments show that the proposed method closely simulates human behaviour.

266 citations
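
The abstract above builds on Salton's extended Boolean model. As background only, the standard p-norm form of that model for unweighted queries can be sketched as follows. This is not kb‐ebm itself, whose formulas are not given here; the function names `sim_or` and `sim_and` and the default p = 2 are assumptions made for this illustration. Each input weight is a document term weight in [0, 1].

```python
def sim_or(weights, p=2.0):
    """p-norm similarity of a document to the query t1 OR t2 OR ... OR tn."""
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


def sim_and(weights, p=2.0):
    """p-norm similarity of a document to the query t1 AND t2 AND ... AND tn."""
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)


# A document containing only the first of two query terms.
weights = [1.0, 0.0]
print(sim_or(weights), sim_and(weights))
```

With p = 1 both operators reduce to the mean of the term weights, and as p grows they approach the strict Boolean maximum (OR) and minimum (AND), which is how the model interpolates between vector-style and Boolean ranking.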


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 83% related
Ontology (information science): 57K papers, 869.1K citations, 82% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 82% related
Feature learning: 15.5K papers, 684.7K citations, 81% related
Supervised learning: 20.8K papers, 710.5K citations, 81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    3,112
2022    6,541
2021    1,105
2020    1,082
2019    1,168