Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have appeared within this topic, receiving 435,130 citations.


Papers
Journal Article
TL;DR: This work presents a computationally simple and theoretically justified method for assigning scores to candidate expansion terms within Rocchio's framework for query reweighting, and discusses the effect on retrieval effectiveness of the main parameters involved in automatic query expansion.
Abstract: Techniques for automatic query expansion from top retrieved documents have shown promise for improving retrieval effectiveness on large collections; however, they often rely on an empirical ground, and there is a shortage of cross-system comparisons. Using ideas from Information Theory, we present a computationally simple and theoretically justified method for assigning scores to candidate expansion terms. Such scores are used to select and weight expansion terms within Rocchio's framework for query reweighting. We compare ranking with information-theoretic query expansion versus ranking with other query expansion techniques, showing that the former achieves better retrieval effectiveness on several performance measures. We also discuss the effect on retrieval effectiveness of the main parameters involved in automatic query expansion, such as data sparseness, query difficulty, number of selected documents, and number of selected terms, pointing out interesting relationships.

404 citations
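
The abstract above describes scoring candidate expansion terms with an information-theoretic measure and then reweighting the query in Rocchio's framework. A minimal sketch of that idea follows. The abstract does not give the scoring formula, so the Kullback-Leibler-style score p_R(t) * log(p_R(t) / p_C(t)), the function names, and the parameters alpha, beta and n_terms are all assumptions made for illustration.

```python
import math
from collections import Counter


def kld_term_scores(feedback_docs, collection_docs):
    """Score terms from the top retrieved docs by p_R(t) * log(p_R(t) / p_C(t)).

    Assumption of this sketch: feedback_docs is a subset of collection_docs,
    so every feedback term has a non-zero collection probability.
    """
    fb = Counter(t for doc in feedback_docs for t in doc)
    coll = Counter(t for doc in collection_docs for t in doc)
    fb_total = sum(fb.values())
    coll_total = sum(coll.values())
    scores = {}
    for term, count in fb.items():
        p_r = count / fb_total          # term probability in feedback docs
        p_c = coll[term] / coll_total   # term probability in the collection
        scores[term] = p_r * math.log(p_r / p_c)
    return scores


def rocchio_expand(query_weights, term_scores, n_terms=2, alpha=1.0, beta=0.5):
    """Rocchio-style reweighting: alpha * original weight + beta * normalized score."""
    ranked = sorted(term_scores.items(), key=lambda kv: kv[1], reverse=True)
    top = [(t, s) for t, s in ranked[:n_terms] if s > 0]
    expanded = {t: alpha * w for t, w in query_weights.items()}
    if not top:
        return expanded
    max_score = top[0][1]
    for term, s in top:
        expanded[term] = expanded.get(term, 0.0) + beta * s / max_score
    return expanded


collection = [
    ["jaguar", "car", "speed"],
    ["jaguar", "cat", "jungle"],
    ["car", "engine", "speed"],
    ["cat", "pet", "food"],
]
feedback = [collection[0], collection[2]]  # pretend these were retrieved first
scores = kld_term_scores(feedback, collection)
expanded = rocchio_expand({"jaguar": 1.0}, scores)
```

In this toy collection, terms that are proportionally more frequent in the feedback documents than in the collection ("car", "speed") receive positive scores and are added to the query, while "jaguar" is equally frequent in both and scores zero.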

Book Chapter
06 Aug 1995
TL;DR: An algorithm for ranking spatial objects according to increasing distance from a query object is introduced and analyzed, which is well suited for k nearest neighbor queries, and has the property that k need not be fixed in advance.
Abstract: An algorithm for ranking spatial objects according to increasing distance from a query object is introduced and analyzed. The algorithm makes use of a hierarchical spatial data structure. The intended application area is a database environment, where the spatial data structure serves as an index. The algorithm is incremental in the sense that objects are reported one by one, so that a query processor can use the algorithm in a pipelined fashion for complex queries involving proximity. It is well suited for k nearest neighbor queries, and has the property that k need not be fixed in advance.

400 citations
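
The incremental ranking described in the abstract above can be sketched with a priority queue over a hierarchical index. This is a minimal sketch, not the chapter's algorithm as published: the dictionary-based tree with "box", "children" and "points" keys is a hypothetical stand-in for a real spatial index, and the function names are assumptions.

```python
import heapq
import itertools
import math


def min_dist_to_box(q, box):
    """Smallest Euclidean distance from point q to an axis-aligned box."""
    xmin, ymin, xmax, ymax = box
    dx = max(xmin - q[0], 0.0, q[0] - xmax)
    dy = max(ymin - q[1], 0.0, q[1] - ymax)
    return math.hypot(dx, dy)


def rank_by_distance(root, q):
    """Yield (distance, point) pairs in order of increasing distance from q.

    Nodes and points share one priority queue. A node's key is a lower bound
    on the distance of everything inside it, so a point popped from the queue
    is guaranteed to be the next nearest one.
    """
    tie = itertools.count()  # tie-breaker so the heap never compares dicts
    heap = [(min_dist_to_box(q, root["box"]), next(tie), "node", root)]
    while heap:
        dist, _, kind, item = heapq.heappop(heap)
        if kind == "point":
            yield dist, item
            continue
        for child in item.get("children", []):
            key = min_dist_to_box(q, child["box"])
            heapq.heappush(heap, (key, next(tie), "node", child))
        for p in item.get("points", []):
            heapq.heappush(heap, (math.dist(q, p), next(tie), "point", p))


leaf_a = {"box": (0, 0, 2, 2), "points": [(1, 1), (2, 2)]}
leaf_b = {"box": (5, 5, 9, 9), "points": [(6, 6), (9, 9)]}
root = {"box": (0, 0, 9, 9), "children": [leaf_a, leaf_b]}

# k does not have to be chosen up front: the caller just stops consuming.
nearest_two = [p for _, p in itertools.islice(rank_by_distance(root, (3, 3)), 2)]
```

Because the function is a generator, a consumer can take two neighbors, then decide it needs a third, without restarting the search. This mirrors the pipelined use the abstract mentions.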

Journal Article
TL;DR: In this article, a cost-effectiveness methodology is constructed, which results in a particular formula that can be used as a criterion to rank projects, and the ranking criterion is sufficiently operational to be useful in suggesting what to look at when determining actual conservation priorities among endangered species.
Abstract: This paper is about the economic theory of biodiversity preservation. A cost-effectiveness methodology is constructed, which results in a particular formula that can be used as a criterion to rank projects. The ranking criterion is sufficiently operational to be useful in suggesting what to look at when determining actual conservation priorities among endangered species. At the same time, the formula is firmly rooted in a mathematically rigorous optimization framework, so that its theoretical underpinnings are clear. The underlying model, called the Noah's Ark Problem, is intended to be a kind of canonical form that hones down to its analytical essence the problem of best preserving diversity under a limited budget constraint.

400 citations

Journal Article
TL;DR: In this paper, the authors consider the situation where no relevance information is available, that is, at the start of the search, and propose strategies based on a probabilistic model for the initial search and an intermediate search.
Abstract: Most probabilistic retrieval models incorporate information about the occurrence of index terms in relevant and non‐relevant documents. In this paper we consider the situation where no relevance information is available, that is, at the start of the search. Based on a probabilistic model, strategies are proposed for the initial search and an intermediate search. Retrieval experiments with the Cranfield collection of 1,400 documents show that this initial search strategy is better than conventional search strategies both in terms of retrieval effectiveness and in terms of the number of queries that retrieve relevant documents. The intermediate search is shown to be a useful substitute for a relevance feedback search. Experiments with queries that do not retrieve relevant documents at high rank positions indicate that a cluster search would be an effective alternative strategy.

399 citations
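
To illustrate the kind of initial search the abstract above refers to, the sketch below ranks documents without any relevance information. The abstract does not state the weighting formula, so the term weight c + log((N - n) / n), where N is the collection size and n the number of documents containing the term, is an assumption of this sketch, as are the function name and the constant c.

```python
import math


def initial_search(query_terms, docs, c=1.0):
    """Rank documents by summed term weights c + log((N - n) / n).

    Terms that occur in no document or in every document are skipped,
    since the assumed weight is undefined or uninformative for them.
    Returns (score, doc_id) pairs, best first.
    """
    n_docs = len(docs)
    doc_sets = [set(d) for d in docs]
    unique_terms = set(query_terms)
    df = {t: sum(t in d for d in doc_sets) for t in unique_terms}
    ranking = []
    for doc_id, terms in enumerate(doc_sets):
        score = 0.0
        for t in unique_terms:
            n = df[t]
            if t in terms and 0 < n < n_docs:
                score += c + math.log((n_docs - n) / n)
        ranking.append((score, doc_id))
    ranking.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranking


docs = [
    ["probabilistic", "retrieval", "model"],
    ["retrieval", "feedback"],
    ["cluster", "search"],
    ["boolean", "search"],
]
ranking = initial_search(["probabilistic", "retrieval"], docs)
```

The constant c rewards a document for each query term it matches, while the logarithmic part favors rarer terms, so a document matching a rare term outranks one matching only a common term.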

Proceedings Article
28 Jun 2009
TL;DR: This paper proposes a method for tag recommendation based on tensor factorization (TF), provides a gradient descent algorithm to solve the optimization problem, and demonstrates that the method outperforms other state-of-the-art tag recommendation methods such as FolkRank, PageRank and HOSVD in both quality and prediction runtime.
Abstract: Tag recommendation is the task of predicting a personalized list of tags for a user given an item. This is important for many websites with tagging capabilities like last.fm or delicious. In this paper, we propose a method for tag recommendation based on tensor factorization (TF). In contrast to other TF methods like higher order singular value decomposition (HOSVD), our method RTF ('ranking with tensor factorization') directly optimizes the factorization model for the best personalized ranking. RTF handles missing values and learns from pairwise ranking constraints. Our optimization criterion for TF is motivated by a detailed analysis of the problem and of interpretation schemes for the observed data in tagging systems. In all, RTF directly optimizes for the actual problem using a correct interpretation of the data. We provide a gradient descent algorithm to solve our optimization problem. We also provide an improved learning and prediction method with runtime complexity analysis for RTF. The prediction runtime of RTF is independent of the number of observations and only depends on the factorization dimensions. Besides the theoretical analysis, we empirically show that our method outperforms other state-of-the-art tag recommendation methods like FolkRank, PageRank and HOSVD both in quality and prediction runtime.

399 citations
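
The abstract above mentions learning from pairwise ranking constraints with gradient descent. The sketch below shows one gradient step on a single constraint "tag t_pos should rank above tag t_neg for this user and item". It is not the RTF model itself: the simplified score (a sum of elementwise products of user, item and tag factor vectors), the log-sigmoid objective, and all names and values are assumptions chosen to keep the example small.

```python
import math


def score(u, i, t):
    """Simplified factor score for a (user, item, tag) triple."""
    return sum(uf * jf * tf for uf, jf, tf in zip(u, i, t))


def pairwise_step(u, i, t_pos, t_neg, lr=0.1):
    """One gradient ascent step on ln sigmoid(score(t_pos) - score(t_neg)).

    Returns updated copies of the four factor vectors; inputs are not mutated.
    """
    x = score(u, i, t_pos) - score(u, i, t_neg)
    d = 1.0 - 1.0 / (1.0 + math.exp(-x))  # derivative of ln sigmoid(x)
    step = lr * d
    new_u = [uf + step * jf * (p - n) for uf, jf, p, n in zip(u, i, t_pos, t_neg)]
    new_i = [jf + step * uf * (p - n) for uf, jf, p, n in zip(u, i, t_pos, t_neg)]
    new_pos = [p + step * uf * jf for uf, jf, p in zip(u, i, t_pos)]
    new_neg = [n - step * uf * jf for uf, jf, n in zip(u, i, t_neg)]
    return new_u, new_i, new_pos, new_neg


# Hypothetical two-dimensional factors in which the observed tag starts out
# ranked below the unobserved one.
u, i = [1.0, 0.5], [0.5, 1.0]
t_pos, t_neg = [0.2, 0.2], [0.4, 0.1]
before = score(u, i, t_pos) - score(u, i, t_neg)
u2, i2, t_pos2, t_neg2 = pairwise_step(u, i, t_pos, t_neg)
after = score(u2, i2, t_pos2) - score(u2, i2, t_neg2)
```

A full trainer would loop such steps over many sampled (user, item, observed tag, unobserved tag) combinations and would usually add regularization. Prediction then only needs the factor vectors, which is consistent with the abstract's remark that prediction runtime depends on the factorization dimensions rather than on the number of observations.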


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 83% related
Ontology (information science): 57K papers, 869.1K citations, 82% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 82% related
Feature learning: 15.5K papers, 684.7K citations, 81% related
Supervised learning: 20.8K papers, 710.5K citations, 81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    3,112
2022    6,541
2021    1,105
2020    1,082
2019    1,168