Proceedings Article

Learning Rankings via Convex Hull Separation

05 Dec 2005, Vol. 18, pp. 395-402
TL;DR: Experiments indicate that the proposed algorithm for learning ranking functions from order constraints between sets—i.e. classes—of training samples is at least as accurate as the current state-of-the-art and several orders of magnitude faster than current methods.
Abstract: We propose efficient algorithms for learning ranking functions from order constraints between sets—i.e. classes—of training samples. Our algorithms may be used for maximizing the generalized Wilcoxon Mann Whitney statistic that accounts for the partial ordering of the classes: special cases include maximizing the area under the ROC curve for binary classification and its generalization for ordinal regression. Experiments on public benchmarks indicate that: (a) the proposed algorithm is at least as accurate as the current state-of-the-art; (b) computationally, it is several orders of magnitude faster and—unlike current methods—it is easily able to handle even large datasets with over 20,000 samples.
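
The Wilcoxon-Mann-Whitney statistic mentioned in the abstract can be illustrated with a short sketch. The code below is not the paper's learning algorithm; it only evaluates a given set of scores. It assumes that ties count as half a correct pair and that the generalized statistic pools pair counts over every (lower, higher) edge of the class order. Both conventions, and the names `wmw_statistic`, `generalized_wmw`, and `order_pairs`, are illustrative assumptions rather than definitions taken from the paper.

```python
from itertools import product


def wmw_statistic(scores_low, scores_high):
    """Fraction of (low, high) score pairs ranked correctly.

    With two classes this equals the area under the ROC curve.
    Ties count as half a correct pair (an assumed convention).
    """
    total = 0.0
    for lo, hi in product(scores_low, scores_high):
        if hi > lo:
            total += 1.0
        elif hi == lo:
            total += 0.5
    return total / (len(scores_low) * len(scores_high))


def generalized_wmw(scores_by_class, order_pairs):
    """Pooled pair-ordering accuracy over every (lower, higher) class pair.

    order_pairs is a hypothetical encoding of the class order graph:
    each tuple (a, b) says class b should be ranked above class a.
    """
    correct = 0.0
    n_pairs = 0
    for low, high in order_pairs:
        n = len(scores_by_class[low]) * len(scores_by_class[high])
        correct += wmw_statistic(scores_by_class[low], scores_by_class[high]) * n
        n_pairs += n
    return correct / n_pairs


# Toy scores for three ordered classes 0 < 1 < 2 (made-up numbers).
scores = {0: [0.1, 0.4], 1: [0.35, 0.8], 2: [0.7, 0.9]}
auc_0_vs_1 = wmw_statistic(scores[0], scores[1])
ordinal_wmw = generalized_wmw(scores, [(0, 1), (1, 2)])
print(auc_0_vs_1, ordinal_wmw)
```

Passing a single edge such as `[(0, 1)]` recovers the binary AUC case; a chain of edges corresponds to the ordinal-regression case.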


Citations
Book
Tie-Yan Liu
27 Jun 2009
TL;DR: Three major approaches to learning to rank are introduced, i.e., the pointwise, pairwise, and listwise approaches; the relationship between the loss functions used in these approaches and the widely used IR evaluation measures is analyzed; and the performance of these approaches on the LETOR benchmark datasets is evaluated.
Abstract: This tutorial is concerned with a comprehensive introduction to the research area of learning to rank for information retrieval. In the first part of the tutorial, we will introduce three major approaches to learning to rank, i.e., the pointwise, pairwise, and listwise approaches, analyze the relationship between the loss functions used in these approaches and the widely-used IR evaluation measures, evaluate the performance of these approaches on the LETOR benchmark datasets, and demonstrate how to use these approaches to solve real ranking applications. In the second part of the tutorial, we will discuss some advanced topics regarding learning to rank, such as relational ranking, diverse ranking, semi-supervised ranking, transfer ranking, query-dependent ranking, and training data preprocessing. In the third part, we will briefly mention the recent advances on statistical learning theory for ranking, which explain the generalization ability and statistical consistency of different ranking methods. In the last part, we will conclude the tutorial and show several future research directions.

2,515 citations


Cites background from "Learning Rankings via Convex Hull S..."

  • ...Other learning-to-rank algorithms [15, 19, 32, 93, 109, 127, 142, 143, 144] that are based on association rules, decision systems, and other technologies; other theoretical analysis on ranking [50]; and applications of learning-to-rank methods [87, 128]....

    [...]

Proceedings ArticleDOI
20 Jun 2007
TL;DR: It is proposed that learning to rank should adopt the listwise approach in which lists of objects are used as 'instances' in learning, and introduces two probability models, respectively referred to as permutation probability and top k probability, to define a listwise loss function for learning.
Abstract: The paper is concerned with learning to rank, which is to construct a model or a function for ranking objects. Learning to rank is useful for document retrieval, collaborative filtering, and many other applications. Several methods for learning to rank have been proposed, which take object pairs as 'instances' in learning. We refer to them as the pairwise approach in this paper. Although the pairwise approach offers advantages, it ignores the fact that ranking is a prediction task on list of objects. The paper postulates that learning to rank should adopt the listwise approach in which lists of objects are used as 'instances' in learning. The paper proposes a new probabilistic method for the approach. Specifically it introduces two probability models, respectively referred to as permutation probability and top k probability, to define a listwise loss function for learning. Neural Network and Gradient Descent are then employed as model and algorithm in the learning method. Experimental results on information retrieval show that the proposed listwise approach performs better than the pairwise approach.
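
A listwise loss of the kind described in this abstract can be sketched as follows. The abstract names a top k probability model but gives no formulas, so the details here are assumptions: the sketch uses the k = 1 case, turns scores into top-one probabilities with an exponential (softmax) transform, and compares the two distributions with cross entropy. The function names `top_one_probs` and `listwise_loss` are hypothetical, and the neural-network model and gradient-descent training are omitted.

```python
import math


def top_one_probs(scores):
    """Probability of each object being ranked first, assuming a softmax form."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def listwise_loss(true_scores, predicted_scores):
    """Cross entropy between target and predicted top-one distributions."""
    p = top_one_probs(true_scores)
    q = top_one_probs(predicted_scores)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


# One query with three documents (made-up relevance and model scores).
relevance = [3.0, 2.0, 1.0]
loss_same_order = listwise_loss(relevance, [3.0, 2.0, 1.0])
loss_reversed = listwise_loss(relevance, [1.0, 2.0, 3.0])
print(loss_same_order, loss_reversed)
```

The whole list for a query is one training instance here, in contrast to the pairwise approach, where each pair of objects is an instance.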

2,003 citations

Proceedings ArticleDOI
Jun Xu, Hang Li
23 Jul 2007
TL;DR: The proposed novel learning algorithm, referred to as AdaRank, repeatedly constructs 'weak rankers' on the basis of reweighted training data and finally linearly combines the weak rankers for making ranking predictions, which proves that the training process of AdaRank is exactly that of enhancing the performance measure used.
Abstract: In this paper we address the issue of learning to rank for document retrieval. In the task, a model is automatically created with some training data and then is utilized for ranking of documents. The goodness of a model is usually evaluated with performance measures such as MAP (Mean Average Precision) and NDCG (Normalized Discounted Cumulative Gain). Ideally a learning algorithm would train a ranking model that could directly optimize the performance measures with respect to the training data. Existing methods, however, are only able to train ranking models by minimizing loss functions loosely related to the performance measures. For example, Ranking SVM and RankBoost train ranking models by minimizing classification errors on instance pairs. To deal with the problem, we propose a novel learning algorithm within the framework of boosting, which can minimize a loss function directly defined on the performance measures. Our algorithm, referred to as AdaRank, repeatedly constructs 'weak rankers' on the basis of reweighted training data and finally linearly combines the weak rankers for making ranking predictions. We prove that the training process of AdaRank is exactly that of enhancing the performance measure used. Experimental results on four benchmark datasets show that AdaRank significantly outperforms the baseline methods of BM25, Ranking SVM, and RankBoost.

873 citations


Cites background from "Learning Rankings via Convex Hull S..."

  • ...For other approaches to learning to rank, refer to [2, 11, 31]....

    [...]

Journal ArticleDOI
Tie-Yan Liu
TL;DR: A statistical ranking theory is introduced, which can describe different learning-to-rank algorithms, and be used to analyze their query-level generalization abilities.
Abstract: Learning to rank for Information Retrieval (IR) is a task to automatically construct a ranking model using training data, such that the model can sort new objects according to their degrees of relevance, preference, or importance. Many IR problems are by nature ranking problems, and many IR technologies can be potentially enhanced by using learning-to-rank techniques. The objective of this tutorial is to give an introduction to this research direction. Specifically, the existing learning-to-rank algorithms are reviewed and categorized into three approaches: the pointwise, pairwise, and listwise approaches. The advantages and disadvantages with each approach are analyzed, and the relationships between the loss functions used in these approaches and IR evaluation measures are discussed. Then the empirical evaluations on typical learning-to-rank methods are shown, with the LETOR collection as a benchmark dataset, which seems to suggest that the listwise approach be the most effective one among all the approaches. After that, a statistical ranking theory is introduced, which can describe different learning-to-rank algorithms, and be used to analyze their query-level generalization abilities. At the end of the tutorial, we provide a summary and discuss potential future work on learning to rank.

591 citations

References
Journal ArticleDOI
TL;DR: This work describes and analyzes an efficient algorithm called RankBoost for combining preferences based on the boosting approach to machine learning, and gives theoretical results describing the algorithm's behavior both on the training data and on new test data not seen during training.
Abstract: We study the problem of learning to accurately rank a set of objects by combining a given collection of ranking or preference functions. This problem of combining preferences arises in several applications, such as that of combining the results of different search engines, or the "collaborative-filtering" problem of ranking movies for a user based on the movie rankings provided by other users. In this work, we begin by presenting a formal framework for this general problem. We then describe and analyze an efficient algorithm called RankBoost for combining preferences based on the boosting approach to machine learning. We give theoretical results describing the algorithm's behavior both on the training data, and on new test data not seen during training. We also describe an efficient implementation of the algorithm for a particular restricted but common case. We next discuss two experiments we carried out to assess the performance of RankBoost. In the first experiment, we used the algorithm to combine different web search strategies, each of which is a query expansion for a given domain. The second experiment is a collaborative-filtering task for making movie recommendations.

1,889 citations


"Learning Rankings via Convex Hull S..." refers methods in this paper

  • ...In other related work, boosting methods have been proposed for learning preferences [3], and a combinatorial structure called the ranking poset was used for conditional modeling of partially ranked data [8], in the context of combining ranked sets of web pages produced by various web-page search engines....

    [...]

Proceedings Article
24 Jul 1998
TL;DR: RankBoost as discussed by the authors is an algorithm for combining preferences based on the boosting approach to machine learning, which can be applied to several applications, such as that of combining the results of different search engines, or the "collaborative filtering" problem of ranking movies for a user based on movie rankings provided by other users.
Abstract: We study the problem of learning to accurately rank a set of objects by combining a given collection of ranking or preference functions. This problem of combining preferences arises in several applications, such as that of combining the results of different search engines, or the "collaborative-filtering" problem of ranking movies for a user based on the movie rankings provided by other users. In this work, we begin by presenting a formal framework for this general problem. We then describe and analyze an efficient algorithm called RankBoost for combining preferences based on the boosting approach to machine learning. We give theoretical results describing the algorithm's behavior both on the training data, and on new test data not seen during training. We also describe an efficient implementation of the algorithm for a particular restricted but common case. We next discuss two experiments we carried out to assess the performance of RankBoost. In the first experiment, we used the algorithm to combine different web search strategies, each of which is a query expansion for a given domain. The second experiment is a collaborative-filtering task for making movie recommendations.

1,888 citations

Proceedings ArticleDOI
04 Jul 2004
TL;DR: This paper proposes to generalize multiclass Support Vector Machine learning in a formulation that involves features extracted jointly from inputs and outputs, and demonstrates the versatility and effectiveness of the method on problems ranging from supervised grammar learning and named-entity recognition, to taxonomic text classification and sequence alignment.
Abstract: Learning general functional dependencies is one of the main goals in machine learning. Recent progress in kernel-based methods has focused on designing flexible and powerful input representations. This paper addresses the complementary issue of problems involving complex outputs such as multiple dependent output variables and structured output spaces. We propose to generalize multiclass Support Vector Machine learning in a formulation that involves features extracted jointly from inputs and outputs. The resulting optimization problem is solved efficiently by a cutting plane algorithm that exploits the sparseness and structural decomposition of the problem. We demonstrate the versatility and effectiveness of our method on problems ranging from supervised grammar learning and named-entity recognition, to taxonomic text classification and sequence alignment.

1,446 citations


"Learning Rankings via Convex Hull S..." refers methods in this paper

  • ...g.[13] presents SVM based algorithms for handling structured and interdependent output spaces, and [5] discusses automatic document categorization into pre-defined hierarchies or taxonomies of topics....

    [...]

01 Jan 2000

1,049 citations


"Learning Rankings via Convex Hull S..." refers background or methods in this paper

  • ...By contrast, bipartite ranking solutions are evaluated using the Wilcoxon-Mann-Whitney (WMW) statistic which measures the (sample averaged) probability that any pair of samples is ordered correctly; intuitively, the WMW statistic may be interpreted as the area under the ROC curve (AUC)....

    [...]

  • ...Experiments on public benchmarks indicate that: (a) the proposed algorithm is at least as accurate as the current state-of-the-art; (b) computationally, it is several orders of magnitude faster and, unlike current methods, it is easily able to handle even large datasets with over 20,000 samples....

    [...]

  • ...Ordinal regression and methods for handling structured output classes: For a classic description of generalized linear models for ordinal regression, see [11]....

    [...]

  • ...We compare our method against SVM for ranking (e.g. [4, 6]) using the SVM-light package and an efficient Gaussian process method (the informative vector machine) [7]....

    [...]

01 Jan 2000

864 citations


"Learning Rankings via Convex Hull S..." refers background or methods in this paper

  • ...g.[4, 6]) using the SVM-light package and an efficient Gaussian process method (the informative vector machine) [7]....

    [...]

  • ...g.[4]) as a special case when each set A is reduced to a singleton and the order graph is equal to...

    [...]

  • ...Learning Rankings: The problem of learning rankings was first treated as a classification problem on pairs of objects by Herbrich [4] and subsequently used on a web page ranking task by Joachims [6]; a variety of authors have investigated this approach recently....

    [...]

  • ...Relationship to the proposed work: Our algorithm penalizes wrong ordering of pairs of training instances in order to learn ranking functions (similar to [4]), but in addition, it can also utilize the notion of a structured class order graph....

    [...]
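
The idea in the excerpts above, treating ranking as classification on pairs of objects with the pairs drawn along a class order graph, can be sketched as follows. This is an illustrative reduction only, not the paper's convex hull formulation or the SVM method of [4]. The names `pairwise_differences`, `count_misordered`, and `order_pairs` are hypothetical, and a linear scoring function w·x is assumed.

```python
def pairwise_differences(samples_by_class, order_pairs):
    """Build (difference vector, label) examples from ordered classes.

    For each edge (low, high) of the assumed order graph, every sample of
    class `high` should outrank every sample of class `low`.
    """
    examples = []
    for low, high in order_pairs:
        for x_low in samples_by_class[low]:
            for x_high in samples_by_class[high]:
                d = [h - l for h, l in zip(x_high, x_low)]
                examples.append((d, +1))                 # high minus low
                examples.append(([-v for v in d], -1))   # mirrored pair
    return examples


def count_misordered(w, samples_by_class, order_pairs):
    """Count pairs that the linear scorer w.x puts in the wrong order.

    Ties (zero score difference) are counted as errors.
    """
    errors = 0
    for d, label in pairwise_differences(samples_by_class, order_pairs):
        score_gap = sum(wi * di for wi, di in zip(w, d))
        if label == +1 and score_gap <= 0:
            errors += 1
    return errors


# Toy data: class "B" should be ranked above class "A" (made-up points).
samples = {"A": [[0.0, 1.0], [1.0, 0.0]], "B": [[2.0, 1.0]]}
order = [("A", "B")]
errors_good = count_misordered([1.0, 0.0], samples, order)
errors_bad = count_misordered([-1.0, 0.0], samples, order)
print(errors_good, errors_bad)
```

When each class holds a single sample, the construction reduces to ordinary pairwise ranking on individual objects. The number of difference vectors grows with the product of the class sizes, which is one reason pair-based training becomes expensive on large datasets.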