Journal ArticleDOI

Cumulated gain-based evaluation of IR techniques

01 Oct 2002-ACM Transactions on Information Systems (ACM)-Vol. 20, Iss: 4, pp 422-446
TL;DR: This article proposes several novel measures that compute the cumulative gain the user obtains by examining the retrieval result up to a given ranked position, and test results indicate that the proposed measures credit IR methods for their ability to retrieve highly relevant documents and allow testing of statistical significance of effectiveness differences.
Abstract: Modern large retrieval environments tend to overwhelm their users with their large output. Since not all documents are of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation. In order to develop IR techniques in this direction, it is necessary to develop evaluation approaches and methods that credit IR methods for their ability to retrieve highly relevant documents. This can be done by extending traditional evaluation methods, that is, recall and precision based on binary relevance judgments, to graded relevance judgments. Alternatively, novel measures based on graded relevance judgments may be developed. This article proposes several novel measures that compute the cumulative gain the user obtains by examining the retrieval result up to a given ranked position. The first one accumulates the relevance scores of retrieved documents along the ranked result list. The second one is similar but applies a discount factor to the relevance scores in order to devaluate late-retrieved documents. The third one computes the relative-to-the-ideal performance of IR techniques, based on the cumulative gain they are able to yield. These novel measures are defined and discussed and their use is demonstrated in a case study using TREC data: sample system run results for 20 queries in TREC-7. As a relevance base we used novel graded relevance judgments on a four-point scale. The test results indicate that the proposed measures credit IR methods for their ability to retrieve highly relevant documents and allow testing of statistical significance of effectiveness differences. The graphs based on the measures also provide insight into the performance of IR techniques and allow interpretation, for example, from the user point of view.
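To make the three measures concrete, here is a minimal Python sketch of cumulated gain (CG), discounted cumulated gain (DCG), and normalized DCG following the definitions summarized in the abstract (graded relevance scores as gains, discount base b = 2, no discount for ranks below b). The function names and the toy gain vector are illustrative, not taken from the article.

```python
import math

def cg(gains):
    """Cumulated gain: running sum of the graded relevance scores
    along the ranked result list."""
    total, out = 0.0, []
    for g in gains:
        total += g
        out.append(total)
    return out

def dcg(gains, b=2):
    """Discounted cumulated gain: ranks below b are not discounted;
    from rank b onward each gain is divided by log_b(rank)."""
    total, out = 0.0, []
    for rank, g in enumerate(gains, start=1):
        total += g if rank < b else g / math.log(rank, b)
        out.append(total)
    return out

def ndcg(gains, b=2):
    """Normalized DCG: the DCG vector divided position-by-position by
    the DCG of the ideal (descending) ordering of the same gains."""
    ideal = dcg(sorted(gains, reverse=True), b)
    return [d / i if i > 0 else 0.0 for d, i in zip(dcg(gains, b), ideal)]

# Toy run: graded relevance judgments on a four-point scale (0-3).
run = [3, 2, 3, 0, 1, 2]
print(ndcg(run))  # relative-to-the-ideal performance at each rank
```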
Citations
Book
Tie-Yan Liu
27 Jun 2009
TL;DR: Three major approaches to learning to rank are introduced (the pointwise, pairwise, and listwise approaches); the relationship between the loss functions used in these approaches and widely-used IR evaluation measures is analyzed; and the performance of these approaches is evaluated on the LETOR benchmark datasets.
Abstract: This tutorial is concerned with a comprehensive introduction to the research area of learning to rank for information retrieval. In the first part of the tutorial, we will introduce three major approaches to learning to rank, i.e., the pointwise, pairwise, and listwise approaches, analyze the relationship between the loss functions used in these approaches and the widely-used IR evaluation measures, evaluate the performance of these approaches on the LETOR benchmark datasets, and demonstrate how to use these approaches to solve real ranking applications. In the second part of the tutorial, we will discuss some advanced topics regarding learning to rank, such as relational ranking, diverse ranking, semi-supervised ranking, transfer ranking, query-dependent ranking, and training data preprocessing. In the third part, we will briefly mention the recent advances on statistical learning theory for ranking, which explain the generalization ability and statistical consistency of different ranking methods. In the last part, we will conclude the tutorial and show several future research directions.
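As a concrete illustration of the pairwise approach named above, the sketch below computes a RankNet-style logistic loss over all preference pairs induced by graded labels. It is a minimal example under assumed names and toy data, not code from the tutorial or the LETOR benchmarks.

```python
import math

def pairwise_logistic_loss(scores, labels):
    """RankNet-style pairwise loss: for every pair (i, j) where item i
    carries a higher graded label than item j, add the softplus penalty
    log(1 + e^{-(s_i - s_j)}), which is small when the model already
    scores i above j and large when it inverts the pair."""
    loss, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                loss += math.log1p(math.exp(-(scores[i] - scores[j])))
                pairs += 1
    return loss / max(pairs, 1)

# Toy example: model scores for one query, graded relevance labels (0-2).
print(pairwise_logistic_loss([2.0, 0.5, 1.2], [2, 0, 1]))
```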

2,515 citations


Cites background from "Cumulated gain-based evaluation of ..."

  • ...Discounted cumulative gain (DCG): While the aforementioned measures are mainly designed for binary judgments, DCG [65, 66] can leverage the relevance judgment in terms of multiple ordered categories, and has an explicit position discount factor in its definition....


Journal ArticleDOI
01 Aug 2011
TL;DR: Under the meta path framework, a novel similarity measure called PathSim is defined that is able to find peer objects in the network (e.g., finding authors in a similar field and with similar reputation), which turns out to be more meaningful in many scenarios compared with random-walk based similarity measures.
Abstract: Similarity search is a primitive operation in database and Web search engines. With the advent of large-scale heterogeneous information networks that consist of multi-typed, interconnected objects, such as the bibliographic networks and social media networks, it is important to study similarity search in such networks. Intuitively, two objects are similar if they are linked by many paths in the network. However, most existing similarity measures are defined for homogeneous networks. Different semantic meanings behind paths are not taken into consideration. Thus they cannot be directly applied to heterogeneous networks. In this paper, we study similarity search that is defined among the same type of objects in heterogeneous networks. Moreover, by considering different linkage paths in a network, one could derive various similarity semantics. Therefore, we introduce the concept of meta path-based similarity, where a meta path is a path consisting of a sequence of relations defined between different object types (i.e., structural paths at the meta level). No matter whether a user would like to explicitly specify a path combination given sufficient domain knowledge, or choose the best path by experimental trials, or simply provide training examples to learn it, meta path forms a common base for a network-based similarity search engine. In particular, under the meta path framework we define a novel similarity measure called PathSim that is able to find peer objects in the network (e.g., finding authors in a similar field and with similar reputation), which turns out to be more meaningful in many scenarios compared with random-walk based similarity measures. In order to support fast online query processing for PathSim queries, we develop an efficient solution that partially materializes short meta paths and then concatenates them online to compute top-k results. Experiments on real data sets demonstrate the effectiveness and efficiency of our proposed paradigm.
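As a concrete illustration, the sketch below computes PathSim in its standard matrix form, s(x, y) = 2·M[x, y] / (M[x, x] + M[y, y]), where M is the commuting matrix of a symmetric meta path. The toy matrix and names are illustrative assumptions, not data or code from the paper.

```python
import numpy as np

def pathsim(adj):
    """PathSim for a symmetric meta path P = (P_l, P_l^{-1}).
    `adj` is the adjacency matrix of the half path (e.g. author-venue);
    M = adj @ adj.T counts meta path instances between same-typed objects."""
    M = adj @ adj.T
    diag = np.diag(M).astype(float)
    denom = diag[:, None] + diag[None, :]
    # s(x, y) = 2 * M[x, y] / (M[x, x] + M[y, y]); defined as 0 when
    # neither object has any path instance at all
    return np.divide(2.0 * M, denom, out=np.zeros_like(denom), where=denom > 0)

# Toy bibliographic data: rows are authors, columns are venues.
apv = np.array([[3, 1, 0],
                [2, 1, 0],
                [0, 0, 5]])
print(pathsim(apv))  # peers must share both connectivity and visibility
```

Unlike random-walk scores, the normalization by the diagonal keeps a highly connected hub from dominating every similarity ranking, which is why PathSim surfaces peers rather than merely popular objects.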

1,583 citations


Cites result from "Cumulated gain-based evaluation of ..."

  • ...Then we use the measure nDCG (Normalized Discounted Cumulative Gain, with the value between 0 and 1, the higher the better) [9] to evaluate the quality of a ranking algorithm by comparing its output ranking results with the labeled ones (Table 5)....


Book ChapterDOI
Cyril Goutte, Eric Gaussier
21 Mar 2005
TL;DR: A probabilistic setting is used to obtain posterior distributions on precision, recall, and F-score, rather than point estimates; it is applied to the case where different methods are run on different datasets from the same source.
Abstract: We address the problems of (1) assessing the confidence of the standard point estimates, precision, recall and F-score, and (2) comparing the results, in terms of precision, recall and F-score, obtained using two different methods. To do so, we use a probabilistic setting which allows us to obtain posterior distributions on these performance indicators, rather than point estimates. This framework is applied to the case where different methods are run on different datasets from the same source, as well as the standard situation where competing results are obtained on the same data.
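To illustrate the idea, here is a simplified Monte Carlo sketch that puts Beta posteriors on the precision and recall proportions and propagates samples through the F-score. It treats the two posteriors as independent for brevity, whereas the paper derives them from a joint model over the confusion counts, so this is an approximation under assumed names and toy counts, not the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)

def f_score_posterior(tp, fp, fn, n_samples=100_000, prior=0.5):
    """Posterior samples for precision, recall, and F1. Each proportion
    gets a Beta posterior (Jeffreys prior by default); F1 is computed
    per sample, yielding a distribution rather than a point estimate."""
    precision = rng.beta(tp + prior, fp + prior, n_samples)
    recall = rng.beta(tp + prior, fn + prior, n_samples)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = f_score_posterior(tp=45, fp=5, fn=15)
print(f1.mean())                        # posterior mean of F1
print(np.quantile(f1, [0.025, 0.975]))  # 95% credible interval
```

Comparing two methods then amounts to estimating P(F1_A > F1_B) from paired samples, instead of reporting a single difference between two point estimates.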

1,402 citations

Book ChapterDOI
Guy Shani, Asela Gunawardana
01 Jan 2011
TL;DR: This paper discusses how to compare recommenders based on a set of properties that are relevant for the application, and focuses on comparative studies, where a few algorithms are compared using some evaluation metric, rather than absolute benchmarking of algorithms.
Abstract: Recommender systems are now popular both commercially and in the research community, where many approaches have been suggested for providing recommendations. In many cases a system designer that wishes to employ a recommendation system must choose between a set of candidate approaches. A first step towards selecting an appropriate algorithm is to decide which properties of the application to focus upon when making this choice. Indeed, recommendation systems have a variety of properties that may affect user experience, such as accuracy, robustness, scalability, and so forth. In this paper we discuss how to compare recommenders based on a set of properties that are relevant for the application. We focus on comparative studies, where a few algorithms are compared using some evaluation metric, rather than absolute benchmarking of algorithms. We describe experimental settings appropriate for making choices between algorithms. We review three types of experiments, starting with an offline setting, where recommendation approaches are compared without user interaction, then reviewing user studies, where a small group of subjects experiment with the system and report on the experience, and finally describing large scale online experiments, where real user populations interact with the system. In each of these cases we describe the types of questions that can be answered, and suggest protocols for experimentation. We also discuss how to draw trustworthy conclusions from the conducted experiments. We then review a large set of properties, and explain how to evaluate systems given relevant properties. We also survey a large set of evaluation metrics in the context of the properties that they evaluate.

1,238 citations


Cites background from "Cumulated gain-based evaluation of ..."

  • ...Normalized Cumulative Discounted Gain (NDCG) [27] is a measure from information retrieval, where positions are discounted logarithmically....


  • ...Thus, ARHR decays more slowly than R-score but faster than NDCG. [Online evaluation of ranking] In an online experiment designed to evaluate the ranking of the recommendation list, we can look at the interactions of users with the system....


  • ...A measure closely related to R-score and NDCG is Average Reciprocal Hit Rank (ARHR) [14] which is an un-normalized measure that assigns a utility 1/k to a successful recommendation at position k....


  • ...NDCG is the normalized version of DCG, given by NDCG = DCG / DCG∗ (8.18), where DCG∗ is the ideal DCG....

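To make the quoted ARHR definition concrete, here is a minimal sketch; the function signature and toy data are illustrative assumptions, not code from the chapter.

```python
def arhr(hit_ranks, n_users):
    """Average Reciprocal Hit Rank: a successful recommendation at rank k
    earns utility 1/k; users with no hit contribute 0. It is un-normalized,
    and its 1/k discount decays faster than NDCG's logarithmic one."""
    return sum(1.0 / k for k in hit_ranks) / n_users

# Toy offline test: hits at ranks 1, 3, and 4 among five test users.
print(arhr([1, 3, 4], n_users=5))  # (1 + 1/3 + 1/4) / 5 ≈ 0.317
```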

Book
16 Feb 2009
TL;DR: This text provides the background and tools needed to evaluate, compare and modify search engines; numerous programming exercises make extensive use of Galago, a Java-based open source search engine.
Abstract: KEY BENEFIT: Written by a leader in the field of information retrieval, this text provides the background and tools needed to evaluate, compare and modify search engines. KEY TOPICS: Coverage of the underlying IR and mathematical models reinforce key concepts. Numerous programming exercises make extensive use of Galago, a Java-based open source search engine. MARKET: A valuable tool for search engine and information retrieval professionals.

1,050 citations

References
Book
01 Jan 1983
TL;DR: A foundational textbook on information retrieval, covering automatic indexing, retrieval models, and the evaluation of IR systems.
Abstract: Introduction to Modern Information Retrieval (Salton and McGill, 1983) is the classic textbook treatment of automatic text analysis and indexing, Boolean and vector space retrieval models, and retrieval evaluation, including measures such as normalized recall.

12,059 citations


"Cumulated gain-based evaluation of ..." refers background in this paper


  • ...These novel measures are akin to the average search length [briefly, ASL; Losee 1998], sliding ratio [Korfhage 1997], and normalized recall [Pollack 1968; Salton and McGill 1983; Korfhage 1997] measures....


  • ...They are related to some traditional measures such as average search length [Losee 1998], expected search length [Cooper 1968], normalized recall [Rocchio 1966; Salton and McGill 1983], sliding ratio [Pollack 1968; Korfhage 1997], satisfaction-frustration-total measure [Myaeng and Korfhage 1990], and ranked half-life [Borlund and Ingwersen 1998]....


Book
01 Jan 1971
TL;DR: Conover's Practical Nonparametric Statistics, a standard reference covering probability theory, statistical inference, tests based on the binomial distribution, contingency tables, rank-based methods, and statistics of the Kolmogorov-Smirnov type.
Abstract: Probability Theory. Statistical Inference. Some Tests Based on the Binomial Distribution. Contingency Tables. Some Methods Based on Ranks. Statistics of the Kolmogorov-Smirnov Type. References. Appendix Tables. Answers to Odd-Numbered Exercises. Index.

10,382 citations

Journal ArticleDOI
01 Jul 2000
TL;DR: The novel evaluation methods and the case demonstrate that non-dichotomous relevance assessments are applicable in IR experiments, may reveal interesting phenomena, and allow harder testing of IR methods.
Abstract: This paper proposes evaluation methods based on the use of non-dichotomous relevance judgements in IR experiments. It is argued that evaluation methods should credit IR methods for their ability to retrieve highly relevant documents. This is desirable from the user point of view in modern large IR environments. The proposed methods are (1) a novel application of P-R curves and average precision computations based on separate recall bases for documents of different degrees of relevance, and (2) two novel measures computing the cumulative gain the user obtains by examining the retrieval result up to a given ranked position. We then demonstrate the use of these evaluation methods in a case study on the effectiveness of query types, based on combinations of query structures and expansion, in retrieving documents of various degrees of relevance. The test was run with a best match retrieval system (InQuery) in a text database consisting of newspaper articles. The results indicate that the tested strong query structures are most effective in retrieving highly relevant documents. The differences between the query types are practically essential and statistically significant. More generally, the novel evaluation methods and the case demonstrate that non-dichotomous relevance assessments are applicable in IR experiments, may reveal interesting phenomena, and allow harder testing of IR methods.
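As a concrete illustration of method (1), the sketch below computes average precision against a separate recall base: only documents at or above a chosen relevance grade count as relevant, and the denominator is the number of such documents in the collection. The function name, thresholds, and toy data are illustrative assumptions, not the paper's code.

```python
def average_precision(ranked_grades, threshold, recall_base):
    """Average precision with a separate recall base: a document counts
    as relevant only if its graded judgment >= threshold, and the sum of
    precision values at hit ranks is divided by the recall base (the number
    of documents of that grade or higher in the whole collection)."""
    hits, total = 0, 0.0
    for rank, grade in enumerate(ranked_grades, start=1):
        if grade >= threshold:
            hits += 1
            total += hits / rank
    return total / recall_base if recall_base else 0.0

# The same run evaluated liberally (any relevant grade counts) and
# stringently (highly relevant only), each against its own recall base.
run = [3, 0, 2, 3, 1, 0, 2]
print(average_precision(run, threshold=1, recall_base=10))
print(average_precision(run, threshold=3, recall_base=3))
```

Evaluating the same run under both thresholds is what lets the study show that some query types win specifically on highly relevant documents while looking average under binary judgments.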

1,461 citations

Journal ArticleDOI
TL;DR: An evaluation of a large, operational full-text document-retrieval system shows the system to be retrieving less than 20 percent of the documents relevant to a particular search.
Abstract: An evaluation of a large, operational full-text document-retrieval system (containing roughly 350,000 pages of text) shows the system to be retrieving less than 20 percent of the documents relevant to a particular search. The findings are discussed in terms of the theory and practice of full-text document retrieval.

871 citations