Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.


Papers
Book Chapter
01 Jan 2010
TL;DR: Label ranking is a complex prediction task whose goal is to map instances to a total order over a finite set of predefined labels. It subsumes several supervised learning problems, such as multiclass prediction, multilabel classification, and hierarchical classification.
Abstract: Label ranking is a complex prediction task where the goal is to map instances to a total order over a finite set of predefined labels. An interesting aspect of this problem is that it subsumes several supervised learning problems, such as multiclass prediction, multilabel classification, and hierarchical classification. Unsurprisingly, there exists a plethora of label ranking algorithms in the literature due, in part, to this versatile nature of the problem. In this paper, we survey these algorithms.
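To make the task definition concrete, here is a minimal sketch in plain Python of how a label ranking prediction can be represented and scored with Kendall's tau, a rank correlation commonly used for this task; the label set and the two orders are invented for illustration.

```python
# Label ranking: each instance is mapped to a total order over a fixed,
# finite label set; predictions are scored by rank correlation.
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall's tau between two total orders over the same labels,
    given as lists with the most-preferred label first. Range: [-1, 1]."""
    pos_a = {label: i for i, label in enumerate(rank_a)}
    pos_b = {label: i for i, label in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        # Pair is concordant if both orders rank x and y the same way round.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical label set and orders:
true_order = ["tech", "sports", "music", "politics"]
predicted  = ["tech", "music", "sports", "politics"]
print(kendall_tau(true_order, predicted))  # 0.666..., one swapped pair of six
```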

126 citations

Proceedings Article
13 Jun 2011
TL;DR: This paper proposes a novel source-independent framework for research paper recommendation that requires as input only a single research paper and generates several potential queries from terms in that paper, which are then submitted to existing Web information sources that hold research papers.
Abstract: As the number of research papers available on the Web has increased enormously over the years, paper recommender systems have been proposed to help researchers automatically find works of interest. The main problem with current approaches is that they assume the recommending algorithms are provided with a rich set of evidence (e.g., document collections, citations, profiles) which is normally not widely available. In this paper we propose a novel source-independent framework for research paper recommendation. The framework requires as input only a single research paper and generates several potential queries by using terms in that paper, which are then submitted to existing Web information sources that hold research papers. Once a set of candidate papers for recommendation is generated, the framework applies content-based recommending algorithms to rank the candidates in order to recommend the ones most related to the input paper. This is done by using only publicly available metadata (i.e., title and abstract). We evaluate our proposed framework by performing extensive experimentation in which we analyzed several strategies for query generation and several ranking strategies for paper recommendation. Our results show that good recommendations can be obtained with simple and low-cost strategies.
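The content-based ranking step lends itself to a short sketch. The hypothetical Python below ranks candidate papers against an input paper by cosine similarity over TF-IDF vectors built only from public metadata (title/abstract text); the titles are made up, and this is one plausible low-cost strategy rather than the paper's exact pipeline.

```python
# Rank candidate papers by TF-IDF cosine similarity to an input paper,
# using only freely available metadata text.
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of strings. Returns one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * \
           math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

# Invented input paper and candidates (title/abstract text only):
input_paper = "learning to rank documents with clickthrough data"
candidates = [
    "ranking documents using clickthrough logs",
    "a survey of image segmentation methods",
    "learning to rank for information retrieval",
]
vecs = tfidf_vectors([input_paper] + candidates)
ranked = sorted(
    ((cosine(vecs[0], v), c) for v, c in zip(vecs[1:], candidates)),
    reverse=True,
)
for score, title in ranked:
    print(f"{score:.3f}  {title}")
```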

126 citations

Proceedings Article
16 Jun 2013
TL;DR: This work presents a general approach for converting an algorithm whose running time is linear in the size of the label set into a sublinear one via label partitioning, which consists of learning an input partition and a label assignment to each partition of the space such that precision at k is optimized.
Abstract: We consider the case of ranking a very large set of labels, items, or documents, which is common to information retrieval, recommendation, and large-scale annotation tasks. We present a general approach for converting an algorithm whose running time is linear in the size of the set into a sublinear one via label partitioning. Our method consists of learning an input partition and a label assignment to each partition of the space such that precision at k, the loss function of interest in this setting, is optimized. Experiments on large-scale ranking and recommendation tasks show that our method not only makes the original linear-time algorithm computationally tractable, but can also improve its performance.
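A rough sketch of the inference-time saving, with synthetic data: inputs are routed to a partition (a plain nearest-centroid rule here), and the base scorer runs only over that partition's short label list. In the paper both the partition and the label assignment are learned to optimize precision at k; the random embeddings and fixed heuristic assignment below are stand-ins.

```python
# Label partitioning for sublinear ranking over a large label set.
import numpy as np

rng = np.random.default_rng(0)
n_labels, dim, n_parts, labels_per_part = 10_000, 32, 50, 200

label_vecs = rng.normal(size=(n_labels, dim))  # base model's label embeddings
centroids = rng.normal(size=(n_parts, dim))    # stand-in for a learned input partition

# Offline: give each partition the labels scoring highest against its
# centroid (a crude proxy for the learned label assignment).
assignment = [
    np.argsort(centroids[p] @ label_vecs.T)[-labels_per_part:]
    for p in range(n_parts)
]

def rank_topk(x, k=5):
    """Score only the labels assigned to x's partition:
    O(labels_per_part) work instead of O(n_labels)."""
    part = int(np.argmin(np.linalg.norm(centroids - x, axis=1)))
    cands = assignment[part]
    scores = label_vecs[cands] @ x
    return cands[np.argsort(scores)[::-1][:k]]

print(rank_topk(rng.normal(size=dim)))  # top-5 label ids for a random input
```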

126 citations

Proceedings Article
28 Jul 2003
TL;DR: Two new methods for estimating retrieval quality are introduced: the first computes the empirical distribution of the probabilities of relevance from a small library sample and assumes it to be representative of the whole library; the second assumes that the indexing weights follow a normal distribution, leading to a normal distribution for the document scores.
Abstract: In a federated digital library system, it is too expensive to query every accessible library. Resource selection is the task of deciding to which libraries a query should be routed. Most existing resource selection algorithms compute a library ranking in a heuristic way. In contrast, the decision-theoretic framework (DTF) follows a different approach on a better theoretical foundation: it computes a selection which minimises the overall costs (e.g. retrieval quality, time, money) of the distributed retrieval. For estimating retrieval quality, the recall-precision function is proposed. In this paper, we introduce two new methods: the first computes the empirical distribution of the probabilities of relevance from a small library sample, and assumes it to be representative of the whole library. The second method assumes that the indexing weights follow a normal distribution, leading to a normal distribution for the document scores. Furthermore, we present the first evaluation of DTF by comparing this theoretical approach with the heuristic state-of-the-art system CORI; here we find that DTF outperforms CORI in most cases.
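The second estimation method admits a compact illustration: if document scores in a library are modelled as normally distributed, the expected number of documents scoring above a threshold has a closed form. The sketch below uses invented library sizes and fitted parameters, and is not the paper's full cost model.

```python
# Expected count of high-scoring documents per library under a normal
# score model, usable as a retrieval-quality estimate in resource selection.
import math

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma) via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def expected_docs_above(threshold, n_docs, mu, sigma):
    """Expected number of documents in a library whose score exceeds
    the threshold, assuming scores ~ N(mu, sigma)."""
    return n_docs * (1.0 - normal_cdf(threshold, mu, sigma))

# Two hypothetical libraries with fitted score distributions:
libraries = {
    "lib_a": dict(n_docs=120_000, mu=0.31, sigma=0.08),
    "lib_b": dict(n_docs=45_000,  mu=0.42, sigma=0.05),
}
threshold = 0.55  # assumed score above which documents count as relevant
for name, params in libraries.items():
    print(name, round(expected_docs_above(threshold, **params), 1))
```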

126 citations

Proceedings Article
Jianfeng Gao, Wei Yuan, Xiao Li, Kefeng Deng, Jian-Yun Nie
19 Jul 2009
TL;DR: Two smoothing methods to expand clickthrough data are presented: query clustering via random walk on click graphs, and a discounting method inspired by the Good-Turing estimator; ranking models trained on the smoothed clickthrough features consistently outperform those trained on unsmoothed features.
Abstract: Incorporating features extracted from clickthrough data (called clickthrough features) has been demonstrated to significantly improve the performance of ranking models for Web search applications. Such benefits, however, are severely limited by the data sparseness problem, i.e., many queries and documents have no or very few clicks. The ranker thus cannot rely strongly on clickthrough features for document ranking. This paper presents two smoothing methods to expand clickthrough data: query clustering via Random Walk on click graphs and a discounting method inspired by the Good-Turing estimator. Both methods are evaluated on real-world data in three Web search domains. Experimental results show that the ranking models trained on smoothed clickthrough features consistently outperform those trained on unsmoothed features. This study demonstrates both the importance and the benefits of dealing with the sparseness problem in clickthrough data.
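As a rough illustration of the second smoothing method, the sketch below applies Good-Turing-style discounting to raw click counts, so that observed counts are discounted and probability mass is left over for unseen query-document pairs; the counts are made up, and the paper's estimator may differ in detail.

```python
# Good-Turing-style discounting of raw query-document click counts.
from collections import Counter

# Invented click log: (query, document) -> number of clicks.
clicks = {("q1", "d1"): 5, ("q1", "d2"): 1, ("q2", "d3"): 1, ("q2", "d4"): 2}
count_of_counts = Counter(clicks.values())  # N_r: how many pairs got r clicks

def good_turing(r):
    """Discounted count r* = (r + 1) * N_{r+1} / N_r, falling back to the
    raw count when N_{r+1} is unobserved (a common practical shortcut)."""
    n_r, n_r1 = count_of_counts[r], count_of_counts[r + 1]
    return (r + 1) * n_r1 / n_r if n_r1 else r

total = sum(clicks.values())
for pair, r in clicks.items():
    # Smoothed click probability usable as a ranking feature.
    print(pair, r, round(good_turing(r) / total, 3))
```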

126 citations


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations (83% related)
Ontology (information science): 57K papers, 869.1K citations (82% related)
Graph (abstract data type): 69.9K papers, 1.2M citations (82% related)
Feature learning: 15.5K papers, 684.7K citations (81% related)
Supervised learning: 20.8K papers, 710.5K citations (81% related)
Performance Metrics
No. of papers in the topic in previous years:
Year: Papers
2024: 1
2023: 3,112
2022: 6,541
2021: 1,105
2020: 1,082
2019: 1,168