Proceedings Article

Learning Rankings via Convex Hull Separation

05 Dec 2005, Vol. 18, pp. 395-402
TL;DR: Experiments indicate that the proposed algorithm for learning ranking functions from order constraints between sets—i.e. classes—of training samples is at least as accurate as the current state-of-the-art and several orders of magnitude faster than current methods.
Abstract: We propose efficient algorithms for learning ranking functions from order constraints between sets—i.e. classes—of training samples. Our algorithms may be used for maximizing the generalized Wilcoxon Mann Whitney statistic that accounts for the partial ordering of the classes: special cases include maximizing the area under the ROC curve for binary classification and its generalization for ordinal regression. Experiments on public benchmarks indicate that: (a) the proposed algorithm is at least as accurate as the current state-of-the-art; (b) computationally, it is several orders of magnitude faster and—unlike current methods—it is easily able to handle even large datasets with over 20,000 samples.
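For concreteness, the generalized Wilcoxon-Mann-Whitney statistic mentioned above can be read as the fraction of sample pairs from ordered classes j < k whose predicted scores respect that order. The sketch below is my own minimal rendering of that definition, not the authors' code; with exactly two classes it reduces to the area under the ROC curve, matching the special case named in the abstract.

import numpy as np

def generalized_wmw(scores, labels):
    """Fraction of correctly ordered pairs over all class pairs j < k.
    scores: real-valued predictions; labels: ordinal class indices.
    Score ties count as half-correct, as in the usual AUC convention."""
    scores, labels = np.asarray(scores, float), np.asarray(labels)
    classes = np.unique(labels)
    correct, total = 0.0, 0
    for a, j in enumerate(classes):
        for k in classes[a + 1:]:
            lo, hi = scores[labels == j], scores[labels == k]
            diff = hi[None, :] - lo[:, None]   # every (low-class, high-class) pair
            correct += (diff > 0).sum() + 0.5 * (diff == 0).sum()
            total += diff.size
    return correct / total

print(generalized_wmw([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75 (binary AUC)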


Citations
Proceedings Article
Jun Xu, Tie-Yan Liu, Min Lu, Hang Li, Wei-Ying Ma
20 Jul 2008
TL;DR: Experimental results show that the methods based on direct optimization of evaluation measures can always outperform the conventional methods of Ranking SVM and RankBoost; however, no significant difference exists among the performances of the direct optimization methods themselves.
Abstract: One of the central issues in learning to rank for information retrieval is to develop algorithms that construct ranking models by directly optimizing evaluation measures used in information retrieval such as Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG). Several such algorithms including SVMmap and AdaRank have been proposed and their effectiveness has been verified. However, the relationships between the algorithms are not clear, and furthermore no comparisons have been conducted between them. In this paper, we conduct a study on the approach of directly optimizing evaluation measures in learning to rank for Information Retrieval (IR). We focus on the methods that minimize loss functions upper bounding the basic loss function defined on the IR measures. We first provide a general framework for the study and analyze the existing algorithms of SVMmap and AdaRank within the framework. The framework is based on upper bound analysis and two types of upper bounds are discussed. Moreover, we show that we can derive new algorithms on the basis of this analysis and create one example algorithm called PermuRank. We have also conducted comparisons between SVMmap, AdaRank, PermuRank, and conventional methods of Ranking SVM and RankBoost, using benchmark datasets. Experimental results show that the methods based on direct optimization of evaluation measures can always outperform conventional methods of Ranking SVM and RankBoost. However, no significant difference exists among the performances of the direct optimization methods themselves.
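For readers unfamiliar with the target measures, here is a small self-contained sketch of NDCG (my own illustration; the algorithms in the paper optimize upper bounds on losses defined from such measures rather than computing them directly like this):

import numpy as np

def dcg(rel):
    """Discounted cumulative gain of a list of relevance grades in ranked order."""
    rel = np.asarray(rel, float)
    discounts = np.log2(np.arange(2, rel.size + 2))   # position i gets log2(i + 1)
    return np.sum((2.0 ** rel - 1.0) / discounts)

def ndcg(rel):
    """DCG normalized by the DCG of the ideal (relevance-sorted) ordering."""
    ideal = dcg(np.sort(rel)[::-1])
    return dcg(rel) / ideal if ideal > 0 else 0.0

print(ndcg([3, 2, 3, 0, 1]))   # near 1.0 when the most relevant items rank first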

159 citations


Cites methods from "Learning Rankings via Convex Hull S..."

  • ...For other methods belonging to the approach, refer to [12, 29, 6, 25, 21, 30, 24, 31]....


Book Chapter
01 Jan 2010
TL;DR: Label ranking is a complex prediction task where the goal is to map instances to a total order over a finite set of predefined labels; it subsumes several supervised learning problems, such as multiclass prediction, multilabel classification, and hierarchical classification.
Abstract: Label ranking is a complex prediction task where the goal is to map instances to a total order over a finite set of predefined labels. An interesting aspect of this problem is that it subsumes several supervised learning problems, such as multiclass prediction, multilabel classification, and hierarchical classification. Unsurprisingly, there exists a plethora of label ranking algorithms in the literature due, in part, to this versatile nature of the problem. In this paper, we survey these algorithms.

126 citations

Journal Article
TL;DR: This work discusses the problem of learning to rank labels from real-valued feedback associated with each label and describes an efficient algorithm, called SOPOPO, for solving the reduced problem by employing a soft projection onto the polyhedron defined by a reduced set of constraints.
Abstract: We discuss the problem of learning to rank labels from real-valued feedback associated with each label. We cast the feedback as a preference graph where the nodes of the graph are the labels and edges express preferences over labels. We tackle the learning problem by defining a loss function for comparing a predicted graph with a feedback graph. This loss is materialized by decomposing the feedback graph into bipartite sub-graphs. We then adopt the maximum-margin framework, which leads to a quadratic optimization problem with linear constraints. While the size of the problem grows quadratically with the number of nodes in the feedback graph, we derive a problem of significantly smaller size and prove that it attains the same minimum. We then describe an efficient algorithm, called SOPOPO, for solving the reduced problem by employing a soft projection onto the polyhedron defined by a reduced set of constraints. We also describe and analyze a wrapper procedure for batch learning when multiple graphs are provided for training. We conclude with a set of experiments which show significant improvements in run time over a state-of-the-art interior-point algorithm.
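One way to picture the decomposition step described above: labels with equal feedback are incomparable, so all preference edges run between distinct feedback levels, and the edges between any two levels form a complete bipartite subgraph. The sketch below is my own toy rendering of that idea, not the paper's construction:

from itertools import combinations

def bipartite_decomposition(feedback):
    """Split the preference graph induced by real-valued feedback into
    complete bipartite edge sets, one per pair of feedback levels.
    feedback: dict mapping label -> real-valued feedback."""
    levels = {}
    for label, value in feedback.items():
        levels.setdefault(value, []).append(label)
    subgraphs = {}
    for hi, lo in combinations(sorted(levels, reverse=True), 2):
        subgraphs[(hi, lo)] = [(a, b) for a in levels[hi] for b in levels[lo]]
    return subgraphs

print(bipartite_decomposition({"a": 2.0, "b": 2.0, "c": 1.0, "d": 0.0}))
# {(2.0, 1.0): [('a', 'c'), ('b', 'c')], (2.0, 0.0): ..., (1.0, 0.0): [('c', 'd')]}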

102 citations

Proceedings Article
20 Jul 2008
TL;DR: This paper analyzes the properties of ties and develops novel learning frameworks which combine ties and preference data using statistical paired comparison models to improve the performance of learned ranking functions.
Abstract: Designing effective ranking functions is a core problem for information retrieval and Web search since the ranking functions directly impact the relevance of the search results. The problem has been the focus of much of the research at the intersection of Web search and machine learning, and learning ranking functions from preference data in particular has recently attracted much interest. The objective of this paper is to empirically examine several objective functions that can be used for learning ranking functions from preference data. Specifically, we investigate the roles of ties in the learning process. By ties, we mean preference judgments that two documents have equal degree of relevance with respect to a query. This type of data has largely been ignored or not properly modeled in the past. In this paper, we analyze the properties of ties and develop novel learning frameworks which combine ties and preference data using statistical paired comparison models to improve the performance of learned ranking functions. The resulting optimization problems explicitly incorporating ties and preference data are solved using gradient boosting methods. Experimental studies are conducted using three publicly available data sets which demonstrate the effectiveness of the proposed new methods.
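As background on the kind of statistical paired-comparison model with ties the abstract refers to, here is a hedged sketch of the classical Rao-Kupper extension of the Bradley-Terry model, applied to document scores; the concrete function below is my illustration, not the authors' objective:

import math

def rao_kupper_nll(s_i, s_j, outcome, theta=1.5):
    """Negative log-likelihood of one judgment under a Rao-Kupper model.
    s_i, s_j: real-valued document scores; theta > 1 controls tie propensity;
    outcome: 'i>j', 'j>i', or 'tie'."""
    p_i, p_j = math.exp(s_i), math.exp(s_j)
    if outcome == 'i>j':
        prob = p_i / (p_i + theta * p_j)
    elif outcome == 'j>i':
        prob = p_j / (p_j + theta * p_i)
    else:   # a tie takes the remaining probability mass
        prob = (theta**2 - 1) * p_i * p_j / ((p_i + theta * p_j) * (p_j + theta * p_i))
    return -math.log(prob)

# A tie between near-equal scores is cheap; one between far-apart scores is costly.
print(rao_kupper_nll(0.1, 0.0, 'tie'), rao_kupper_nll(2.0, 0.0, 'tie'))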

45 citations


Cites methods from "Learning Rankings via Convex Hull S..."

  • ...Other methods for learning ranking functions are proposed in [4, 12]. Another approach to learning a ranking function addresses the problem of optimizing a loss function directly related to the performance measures of information retrieval, such as precision, mean average precision, and Normalized Discounted Cumulative Gain [6, 17, 28, 29]....


Proceedings Article
21 Jun 2014
TL;DR: Theoretical analysis demonstrates that the proposed large margin optimization criteria can strengthen and improve the robustness and generalization performance of preference learning algorithms on the obtained low-dimensional subspace.
Abstract: This paper studies dimensionality reduction in a weakly supervised setting, in which the preference relationship between examples is indicated by weak cues. A novel framework is proposed that integrates two aspects of the large margin principle (angle and distance), which simultaneously encourage angle consistency between preference pairs and maximize the distance between examples in preference pairs. Two specific algorithms are developed: an alternating direction method to learn a linear transformation matrix and a gradient boosting technique to optimize a non-linear transformation directly in the function space. Theoretical analysis demonstrates that the proposed large margin optimization criteria can strengthen and improve the robustness and generalization performance of preference learning algorithms on the obtained low-dimensional subspace. Experimental results on real-world datasets demonstrate the significance of studying dimensionality reduction in the weakly supervised setting and the effectiveness of the proposed framework.
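Reading the abstract's two large-margin criteria literally, a toy loss might couple a hinge on the pairwise angles of transformed preference directions with a reward for separating pair members. This is speculation on my part (the loss shape, names, and data are all assumptions, not the authors' formulation):

import numpy as np

def pair_margin_loss(L, pairs, alpha=0.1):
    """Toy objective over preference pairs (preferred, other) under linear map L:
    term 1 pushes transformed difference vectors to point the same way (angle
    consistency); term 2 rewards distance between the members of each pair."""
    diffs = np.array([L @ (a - b) for a, b in pairs])
    norms = np.linalg.norm(diffs, axis=1, keepdims=True) + 1e-12
    units = diffs / norms
    cos = units @ units.T                               # pairwise cosines
    angle_term = np.mean(np.maximum(0.0, 1.0 - cos))    # hinge on misalignment
    dist_term = -np.mean(norms)                         # larger separation is better
    return angle_term + alpha * dist_term

rng = np.random.default_rng(0)
pairs = [(rng.normal(size=5) + 1.0, rng.normal(size=5)) for _ in range(10)]
L = rng.normal(size=(2, 5))    # map 5 input dimensions down to 2
print(pair_margin_loss(L, pairs))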

43 citations


Cites methods from "Learning Rankings via Convex Hull S..."

  • ...In this paper we study the weakly supervised dimensionality reduction problem, where the preference relationships between examples are provided rather than explicit class labels....


  • ...Three datasets (Table 1) were used that have previously been used to evaluate ranking and ordinal regression (Fung et al., 2006)....


References
Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations


"Learning Rankings via Convex Hull S..." refers background in this paper

  • ...Note that enforcing the constraints defined above indeed implies the desired ordering, since we have: Aw + y ≥ −γ ≥ γ̂ + 1 ≥ γ̂ ≥ Aw − y. It is also important to note the connection with the Support Vector Machine (SVM) formulation [10, 14] for the binary case....

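To make the ordering argument in the quoted passage concrete, the following toy check (my own construction, ignoring the slack variables y of the quoted formulation) verifies that a linear scoring function with per-boundary thresholds separates consecutive classes by a unit margin, which is what chains of inequalities of this shape guarantee:

import numpy as np

def ordering_satisfied(X, y, w, gammas, margin=1.0):
    """True if, for every boundary j, class-j scores stay below gamma_j and
    class-(j+1) scores stay above gamma_j + margin.
    X: samples; y: ordinal labels 0..K-1; gammas: K-1 boundary thresholds."""
    scores = X @ w
    for j, gamma in enumerate(gammas):
        if scores[y == j].max() > gamma or scores[y == j + 1].min() < gamma + margin:
            return False
    return True

X = np.array([[0.0], [0.5], [2.0], [2.5]])
y = np.array([0, 0, 1, 1])
print(ordering_satisfied(X, y, np.array([1.0]), gammas=[0.6]))   # True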

Book
01 Jan 1983
TL;DR: In this paper, a generalization of the analysis of variance is given for these models using log-likelihoods, illustrated by examples relating to four distributions: the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables), and gamma (variance components).
Abstract: The technique of iterative weighted linear regression can be used to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation. A generalization of the analysis of variance is given for these models using log-likelihoods. These generalized linear models are illustrated by examples relating to four distributions: the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables), and gamma (variance components).
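The "iterative weighted linear regression" the abstract opens with is what is now called iteratively reweighted least squares (IRLS). As a reminder of the mechanics, here is a minimal sketch for the Poisson case with log link (my own compact implementation, not from the book):

import numpy as np

def irls_poisson(X, y, iters=25):
    """Fit a Poisson GLM with log link by iteratively reweighted least squares."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = np.exp(X @ w)             # current mean under the log link
        z = X @ w + (y - mu) / mu      # working response
        W = mu                         # weights: (dmu/deta)^2 / Var(y) = mu
        WX = X * W[:, None]
        w = np.linalg.solve(X.T @ WX, X.T @ (W * z))
    return w

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = rng.poisson(np.exp(0.5 + 0.8 * X[:, 1]))
print(irls_poisson(X, y))   # approximately [0.5, 0.8]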

23,215 citations


"Learning Rankings via Convex Hull S..." refers methods in this paper

  • ...Ordinal regression and methods for handling structured output classes: For a classic description of generalized linear models for ordinal regression, see [11]....


01 Feb 1977

5,933 citations


"Learning Rankings via Convex Hull S..." refers background in this paper

  • ...B′u − w′[A⁻′ − A⁺′] = 0, b′u ≤ −1, u ≥ 0, (7) where the second equivalent form of the constraints was obtained by negation (as before), and the third equivalent form results from our third key insight: the application of Farkas' theorem of alternatives [9]....

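For reference, the theorem of alternatives invoked in the quote is Farkas' lemma. One standard statement (generic notation, not matched to the paper's A, B, b):

$$
\text{For } A \in \mathbb{R}^{m \times n},\ b \in \mathbb{R}^m, \text{ exactly one holds:}\quad
(\mathrm{i})\ \exists\, x \ge 0 \text{ with } Ax = b; \qquad
(\mathrm{ii})\ \exists\, y \text{ with } A^{\top} y \ge 0,\ b^{\top} y < 0.
$$

Such a statement lets the infeasibility of one linear system be rewritten as the feasibility of a dual system, which is the role it plays in deriving the equivalent form (7) quoted above.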

Proceedings Article
23 Jul 2002
TL;DR: The goal of this paper is to develop a method that utilizes clickthrough data for training, namely the query-log of the search engine in connection with the log of links the users clicked on in the presented ranking.
Abstract: This paper presents an approach to automatically optimizing the retrieval quality of search engines using clickthrough data. Intuitively, a good information retrieval system should present relevant documents high in the ranking, with less relevant documents following below. While previous approaches to learning retrieval functions from examples exist, they typically require training data generated from relevance judgments by experts. This makes them difficult and expensive to apply. The goal of this paper is to develop a method that utilizes clickthrough data for training, namely the query-log of the search engine in connection with the log of links the users clicked on in the presented ranking. Such clickthrough data is available in abundance and can be recorded at very low cost. Taking a Support Vector Machine (SVM) approach, this paper presents a method for learning retrieval functions. From a theoretical perspective, this method is shown to be well-founded in a risk minimization framework. Furthermore, it is shown to be feasible even for large sets of queries and features. The theoretical results are verified in a controlled experiment. It shows that the method can effectively adapt the retrieval function of a meta-search engine to a particular group of users, outperforming Google in terms of retrieval quality after only a couple of hundred training examples.
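The core trick of this approach is often summarized as a pairwise transform: each clickthrough-derived preference "document a over document b for query q" becomes a binary classification example on the feature difference, which a linear SVM can then separate. A toy rendering with scikit-learn (data and names are mine; the paper's actual formulation optimizes over pairs within each query):

import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(X_pref, X_other):
    """Turn preference pairs into a balanced binary problem on feature
    differences: +1 for (preferred - other), -1 for the reverse."""
    diffs = X_pref - X_other
    X = np.vstack([diffs, -diffs])
    y = np.hstack([np.ones(len(diffs)), -np.ones(len(diffs))])
    return X, y

rng = np.random.default_rng(0)
X_other = rng.normal(size=(50, 4))
X_pref = X_other + np.array([1.0, 0.5, 0.0, 0.0])   # preferred docs score higher
X, y = pairwise_transform(X_pref, X_other)
model = LinearSVC(C=1.0).fit(X, y)                  # hinge loss on the pairs
new_docs = rng.normal(size=(3, 4))
print(np.argsort(-(new_docs @ model.coef_[0])))     # rank new docs by w.x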

4,453 citations

Book
01 Jan 1969
TL;DR: It is shown that the set A is closed: if x_k → x and y_k → y with (x_k, y_k) ∈ A, then (x, y) ∈ A.
Abstract: Part 1 (if): Assume that Z is closed. We must show that if x_k → x and y_k → y, where (x_k, y_k) ∈ A, then (x, y) ∈ A. By the definition of Z being closed, we know that all points arbitrarily close to Z are in Z. Let x_k → x, y_k → y, and (x_k, y_k) ∈ A. Now, for any ε > 0, there exists an N such that for all k ≥ N we have ||x_k − x|| < ε and ||y_k − y|| < ε, which implies that (x, y) is arbitrarily close to Z, so (x, y) ∈ Z and (x, y) ∈ A. Thus, A is closed.

2,146 citations


"Learning Rankings via Convex Hull S..." refers background in this paper

  • ...B′u − w′[A⁻′ − A⁺′] = 0, b′u ≤ −1, u ≥ 0, (7) where the second equivalent form of the constraints was obtained by negation (as before), and the third equivalent form results from our third key insight: the application of Farkas' theorem of alternatives [9]....
