
Showing papers on "Ranking (information retrieval) published in 2009"


Proceedings Article
18 Jun 2009
TL;DR: In this article, the authors proposed BPR-Opt, a generic optimization criterion for personalized ranking that is the maximum posterior estimator derived from a Bayesian analysis of the problem, together with a generic learning algorithm based on stochastic gradient descent with bootstrap sampling.
Abstract: Item recommendation is the task of predicting a personalized ranking on a set of items (e.g. websites, movies, products). In this paper, we investigate the most common scenario with implicit feedback (e.g. clicks, purchases). There are many methods for item recommendation from implicit feedback like matrix factorization (MF) or adaptive k-nearest-neighbor (kNN). Even though these methods are designed for the item prediction task of personalized ranking, none of them is directly optimized for ranking. In this paper we present a generic optimization criterion BPR-Opt for personalized ranking that is the maximum posterior estimator derived from a Bayesian analysis of the problem. We also provide a generic learning algorithm for optimizing models with respect to BPR-Opt. The learning method is based on stochastic gradient descent with bootstrap sampling. We show how to apply our method to two state-of-the-art recommender models: matrix factorization and adaptive kNN. Our experiments indicate that for the task of personalized ranking our optimization method outperforms the standard learning techniques for MF and kNN. The results show the importance of optimizing models for the right criterion.
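
A minimal sketch of what BPR-style training can look like for the matrix factorization case, under toy data and illustrative hyperparameters (learning rate, regularization, and factor dimension are assumptions, not the paper's values):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 100, 200, 16
W = rng.normal(scale=0.1, size=(n_users, k))   # user factors
H = rng.normal(scale=0.1, size=(n_items, k))   # item factors

# toy implicit feedback: one observed (user, item) interaction per user
observed = {(u, int(rng.integers(n_items))) for u in range(n_users)}
pos_by_user = {}
for u, i in observed:
    pos_by_user.setdefault(u, []).append(i)

lr, reg = 0.05, 0.01
for _ in range(20000):
    # bootstrap sampling of a (user, positive item, negative item) triple
    u = int(rng.integers(n_users))
    i = int(rng.choice(pos_by_user[u]))
    j = int(rng.integers(n_items))
    if (u, j) in observed:
        continue
    x_uij = W[u] @ (H[i] - H[j])        # pairwise score difference
    g = 1.0 / (1.0 + np.exp(x_uij))     # d/dx ln sigmoid(x)
    wu, hi, hj = W[u].copy(), H[i].copy(), H[j].copy()
    W[u] += lr * (g * (hi - hj) - reg * wu)   # gradient ascent on BPR-Opt
    H[i] += lr * (g * wu - reg * hi)
    H[j] += lr * (-g * wu - reg * hj)
```

Each step draws a triple by bootstrap sampling and ascends the log-sigmoid of the score difference, which is the core of optimizing a BPR-style criterion.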

3,429 citations


Book
Tie-Yan Liu
27 Jun 2009
TL;DR: Three major approaches to learning to rank are introduced, i.e., the pointwise, pairwise, and listwise approaches, the relationship between the loss functions used in these approaches and the widely-used IR evaluation measures are analyzed, and the performance of these approaches on the LETOR benchmark datasets is evaluated.
Abstract: This tutorial is concerned with a comprehensive introduction to the research area of learning to rank for information retrieval. In the first part of the tutorial, we will introduce three major approaches to learning to rank, i.e., the pointwise, pairwise, and listwise approaches, analyze the relationship between the loss functions used in these approaches and the widely-used IR evaluation measures, evaluate the performance of these approaches on the LETOR benchmark datasets, and demonstrate how to use these approaches to solve real ranking applications. In the second part of the tutorial, we will discuss some advanced topics regarding learning to rank, such as relational ranking, diverse ranking, semi-supervised ranking, transfer ranking, query-dependent ranking, and training data preprocessing. In the third part, we will briefly mention the recent advances on statistical learning theory for ranking, which explain the generalization ability and statistical consistency of different ranking methods. In the last part, we will conclude the tutorial and show several future research directions.
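
To make the three approaches concrete, here is an illustrative toy implementation (not from the tutorial) of one representative loss per family, for a single query's scores s and relevance labels y:

```python
import numpy as np

def pointwise_loss(s, y):
    # pointwise: regress scores onto relevance grades
    return np.mean((s - y) ** 2)

def pairwise_loss(s, y):
    # pairwise: logistic loss over mis-ordered document pairs (RankNet-style)
    total, n = 0.0, 0
    for i in range(len(s)):
        for j in range(len(s)):
            if y[i] > y[j]:
                total += np.log1p(np.exp(-(s[i] - s[j])))
                n += 1
    return total / max(n, 1)

def listwise_loss(s, y):
    # listwise: cross entropy between top-one probabilities (ListNet-style)
    p_y = np.exp(y) / np.exp(y).sum()
    p_s = np.exp(s) / np.exp(s).sum()
    return float(-np.sum(p_y * np.log(p_s)))

s = np.array([2.0, 0.5, 1.0])   # model scores for one query's documents
y = np.array([2.0, 1.0, 0.0])   # relevance labels
print(pointwise_loss(s, y), pairwise_loss(s, y), listwise_loss(s, y))
```

The pointwise loss treats relevance as a per-document regression target, the pairwise loss penalizes mis-ordered pairs, and the listwise loss compares distributions over the whole list.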

2,515 citations


Journal ArticleDOI
Tie-Yan Liu
TL;DR: A statistical ranking theory is introduced, which can describe different learning-to-rank algorithms, and be used to analyze their query-level generalization abilities.
Abstract: Learning to rank for Information Retrieval (IR) is a task to automatically construct a ranking model using training data, such that the model can sort new objects according to their degrees of relevance, preference, or importance. Many IR problems are by nature ranking problems, and many IR technologies can be potentially enhanced by using learning-to-rank techniques. The objective of this tutorial is to give an introduction to this research direction. Specifically, the existing learning-to-rank algorithms are reviewed and categorized into three approaches: the pointwise, pairwise, and listwise approaches. The advantages and disadvantages of each approach are analyzed, and the relationships between the loss functions used in these approaches and IR evaluation measures are discussed. Then the empirical evaluations on typical learning-to-rank methods are shown, with the LETOR collection as a benchmark dataset, which seem to suggest that the listwise approach is the most effective one among all the approaches. After that, a statistical ranking theory is introduced, which can describe different learning-to-rank algorithms, and be used to analyze their query-level generalization abilities. At the end of the tutorial, we provide a summary and discuss potential future work on learning to rank.

591 citations


Journal ArticleDOI
TL;DR: In this paper, the authors use a hierarchical Bayesian modeling framework and estimate the model using Markov Chain Monte Carlo methods to quantify the relationship between various keyword characteristics, position of the advertisement, and landing page quality score on consumer search and purchase behavior as well as on advertiser's cost per click and the search engine's ranking decision.
Abstract: The phenomenon of sponsored search advertising---where advertisers pay a fee to Internet search engines to be displayed alongside organic (nonsponsored) Web search results---is gaining ground as the largest source of revenues for search engines. Using a unique six-month panel data set of several hundred keywords collected from a large nationwide retailer that advertises on Google, we empirically model the relationship between different sponsored search metrics such as click-through rates, conversion rates, cost per click, and ranking of advertisements. Our paper proposes a novel framework to better understand the factors that drive differences in these metrics. We use a hierarchical Bayesian modeling framework and estimate the model using Markov Chain Monte Carlo methods. Using a simultaneous equations model, we quantify the relationship between various keyword characteristics, position of the advertisement, and the landing page quality score on consumer search and purchase behavior as well as on advertiser's cost per click and the search engine's ranking decision. Specifically, we find that the monetary value of a click is not uniform across all positions because conversion rates are highest at the top and decrease with rank as one goes down the search engine results page. Though search engines take into account the current period's bid as well as prior click-through rates before deciding the final rank of an advertisement in the current period, the current bid has a larger effect than prior click-through rates. We also find that an increase in landing page quality scores is associated with an increase in conversion rates and a decrease in advertiser's cost per click. Furthermore, our analysis shows that keywords that have more prominent positions on the search engine results page, and thus experience higher click-through or conversion rates, are not necessarily the most profitable ones---profits are often higher at the middle positions than at the top or the bottom ones. Besides providing managerial insights into search engine advertising, these results shed light on some key assumptions made in the theoretical modeling literature in sponsored search.

482 citations


Proceedings ArticleDOI
20 Apr 2009
TL;DR: This paper proposes a tag ranking scheme, aiming to automatically rank the tags associated with a given image according to their relevance to the image content, and applies tag ranking into three applications: tag-based image search, tag recommendation, and group recommendation.
Abstract: Social media sharing web sites like Flickr allow users to annotate images with free tags, which significantly facilitate Web image search and organization. However, the tags associated with an image generally are in a random order without any importance or relevance information, which limits the effectiveness of these tags in search and other applications. In this paper, we propose a tag ranking scheme, aiming to automatically rank the tags associated with a given image according to their relevance to the image content. We first estimate initial relevance scores for the tags based on probability density estimation, and then perform a random walk over a tag similarity graph to refine the relevance scores. Experimental results on a 50,000-photo Flickr collection show that the proposed tag ranking method is both effective and efficient. We also apply tag ranking to three applications: (1) tag-based image search, (2) tag recommendation, and (3) group recommendation, which demonstrates that the proposed tag ranking approach boosts the performance of social-tagging-related applications.
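
A minimal sketch of the random-walk refinement step, assuming a precomputed tag-tag similarity matrix `sim` and initial density-based scores `v` (both toy inputs here; the damping value is an assumption):

```python
import numpy as np

def random_walk_refine(sim, v, alpha=0.85, iters=100):
    P = sim / sim.sum(axis=1, keepdims=True)   # row-stochastic transitions
    v = v / v.sum()                            # initial relevance scores
    r = v.copy()
    for _ in range(iters):
        r = alpha * P.T @ r + (1 - alpha) * v  # walk with restart to v
    return r

sim = np.array([[1.0, 0.8, 0.1],               # toy tag-tag similarities
                [0.8, 1.0, 0.2],
                [0.1, 0.2, 1.0]])
v = np.array([0.5, 0.3, 0.2])                  # density-based initial scores
print(np.argsort(-random_walk_refine(sim, v))) # tags, most relevant first
```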

469 citations


Book ChapterDOI
27 Aug 2009
TL;DR: In this paper, the problem of ranking all process models in a repository according to their similarity with respect to a given process model is investigated, and four graph matching algorithms, ranging from a greedy one to a relatively exhaustive one, are evaluated.
Abstract: We investigate the problem of ranking all process models in a repository according to their similarity with respect to a given process model. We focus specifically on the application of graph matching algorithms to this similarity search problem. Since the corresponding graph matching problem is NP-complete, we seek to find a compromise between computational complexity and quality of the computed ranking. Using a repository of 100 process models, we evaluate four graph matching algorithms, ranging from a greedy one to a relatively exhaustive one. The results show that the mean average precision obtained by a fast greedy algorithm is close to that obtained with the most exhaustive algorithm.
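
A sketch in the spirit of the paper's fastest variant: greedily match nodes by label similarity and normalize the matched similarity mass into a model-similarity score. The scoring details (string similarity, cutoff, normalization) are assumptions, not the paper's exact algorithm:

```python
from difflib import SequenceMatcher

def label_sim(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def greedy_similarity(nodes1, nodes2, cutoff=0.5):
    # consider all node pairs, best first
    pairs = sorted(((label_sim(a, b), a, b) for a in nodes1 for b in nodes2),
                   reverse=True)
    used1, used2, score = set(), set(), 0.0
    for s, a, b in pairs:                # greedily fix the best open pair
        if s < cutoff:
            break
        if a not in used1 and b not in used2:
            used1.add(a); used2.add(b); score += s
    return score / max(len(nodes1), len(nodes2))

m1 = ["receive order", "check stock", "ship goods"]
m2 = ["order received", "verify stock", "send invoice"]
print(greedy_similarity(m1, m2))         # similarity used to rank models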

406 citations


Proceedings ArticleDOI
28 Jun 2009
TL;DR: This paper proposes a method for tag recommendation based on tensor factorization (TF) and provides a gradient descent algorithm to solve the optimization problem and demonstrates that this method outperforms other state-of-the-art tag recommendation methods like FolkRank, PageRank and HOSVD both in quality and prediction runtime.
Abstract: Tag recommendation is the task of predicting a personalized list of tags for a user given an item. This is important for many websites with tagging capabilities like last.fm or delicious. In this paper, we propose a method for tag recommendation based on tensor factorization (TF). In contrast to other TF methods like higher order singular value decomposition (HOSVD), our method RTF ('ranking with tensor factorization') directly optimizes the factorization model for the best personalized ranking. RTF handles missing values and learns from pairwise ranking constraints. Our optimization criterion for TF is motivated by a detailed analysis of the problem and of interpretation schemes for the observed data in tagging systems. In all, RTF directly optimizes for the actual problem using a correct interpretation of the data. We provide a gradient descent algorithm to solve our optimization problem. We also provide an improved learning and prediction method with runtime complexity analysis for RTF. The prediction runtime of RTF is independent of the number of observations and only depends on the factorization dimensions. Besides the theoretical analysis, we empirically show that our method outperforms other state-of-the-art tag recommendation methods like FolkRank, PageRank and HOSVD both in quality and prediction runtime.
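
A heavily simplified sketch of the pairwise-ranking idea behind RTF, using a CP-style factorization for brevity (the paper's model is Tucker/HOSVD-style with a core tensor); all data and hyperparameters are toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, n_tags, k = 30, 40, 50, 8
U = rng.normal(scale=0.1, size=(n_users, k))
I = rng.normal(scale=0.1, size=(n_items, k))
T = rng.normal(scale=0.1, size=(n_tags, k))

# toy posts: each (user, item) pair carries one observed tag
posts = {(u, int(i)): {int(rng.integers(n_tags))}
         for u in range(n_users) for i in rng.integers(n_items, size=3)}
items = list(posts.items())

def score(u, i, t):
    return float(np.sum(U[u] * I[i] * T[t]))

lr, reg = 0.05, 0.01
for _ in range(20000):
    (u, i), tags = items[rng.integers(len(items))]
    tp = next(iter(tags))                   # observed (positive) tag
    tn = int(rng.integers(n_tags))          # sampled (negative) tag
    if tn in tags:
        continue
    g = 1.0 / (1.0 + np.exp(score(u, i, tp) - score(u, i, tn)))
    uu, ii, a, b = U[u].copy(), I[i].copy(), T[tp].copy(), T[tn].copy()
    U[u] += lr * (g * ii * (a - b) - reg * uu)   # ascend the pairwise
    I[i] += lr * (g * uu * (a - b) - reg * ii)   # log-sigmoid objective
    T[tp] += lr * (g * uu * ii - reg * a)
    T[tn] += lr * (-g * uu * ii - reg * b)
```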

399 citations


Journal ArticleDOI
TL;DR: A new ranking method and a new similarity measure for IT2 FSs are proposed and the results are useful in understanding the uncertainties associated with linguistic terms and hence how to use them effectively in survey design and linguistic information processing.

342 citations


Journal ArticleDOI
TL;DR: This work takes advantage of the entire Physical Review publication archive to construct authors' networks where weighted edges, as measured from suitably normalized citation counts, define a proxy for the mechanism of scientific credit transfer.
Abstract: Recently, the abundance of digital data is enabling the implementation of graph-based ranking algorithms that provide system level analysis for ranking publications and authors. Here, we take advantage of the entire Physical Review publication archive (1893-2006) to construct authors' networks where weighted edges, as measured from suitably normalized citation counts, define a proxy for the mechanism of scientific credit transfer. On this network, we define a ranking method based on a diffusion algorithm that mimics the spreading of scientific credits on the network. We compare the results obtained with our algorithm with those obtained by local measures such as the citation count and provide a statistical analysis of the assignment of major career awards in the area of physics. A website where the algorithm is made available to perform customized rank analysis can be found at the address http://www.physauthorsrank.org.
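
A compact sketch of a PageRank-style diffusion on a weighted author network; the damping value and the toy edge weights stand in for the paper's normalized citation counts:

```python
import numpy as np

def diffusion_rank(Wn, d=0.85, iters=200):
    P = Wn / Wn.sum(axis=0, keepdims=True)   # each author redistributes credit
    n = Wn.shape[0]
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - d) / n + d * P @ r          # diffuse credit along citations
    return r

Wn = np.array([[0.0, 2.0, 1.0],              # toy normalized citation weights
               [1.0, 0.0, 3.0],              # Wn[i, j]: credit j sends to i
               [0.5, 1.0, 0.0]])
print(np.argsort(-diffusion_rank(Wn)))       # authors, most credited first
```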

331 citations


Proceedings ArticleDOI
06 Aug 2009
TL;DR: This work proposes an unsupervised method for keyphrase extraction that outperforms state-of-the-art graph-based ranking methods (TextRank) by 9.5% in F1-measure and guarantees that the document is semantically covered by the selected exemplar terms.
Abstract: Keyphrases are widely used as a brief summary of documents. Since manual assignment is time-consuming, various unsupervised ranking methods based on importance scores are proposed for keyphrase extraction. In practice, the keyphrases of a document should not only be statistically important in the document, but also have a good coverage of the document. Based on this observation, we propose an unsupervised method for keyphrase extraction. Firstly, the method finds exemplar terms by leveraging clustering techniques, which guarantees the document to be semantically covered by these exemplar terms. Then the keyphrases are extracted from the document using the exemplar terms. Our method outperforms state-of-the-art graph-based ranking methods (TextRank) by 9.5% in F1-measure.
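
A rough sketch of the exemplar step, assuming toy term vectors and using k-means for clustering (the paper evaluates several clustering techniques); the term nearest each centroid serves as an exemplar:

```python
import numpy as np
from sklearn.cluster import KMeans

terms = ["ranking", "retrieval", "graph", "network", "cluster", "grouping"]
vecs = np.array([[1.0, 0.1], [0.9, 0.2],     # toy "semantic" coordinates
                 [0.1, 1.0], [0.2, 0.9],
                 [0.5, 0.5], [0.55, 0.45]])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(vecs)
exemplars = []
for c in range(3):
    members = np.where(km.labels_ == c)[0]
    d = np.linalg.norm(vecs[members] - km.cluster_centers_[c], axis=1)
    exemplars.append(terms[members[np.argmin(d)]])  # term nearest centroid
print(exemplars)  # keyphrases are then extracted around these exemplars
```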

331 citations


Book ChapterDOI
23 Sep 2009
TL;DR: “Background samples”, that is, examples which do not belong to any of the classes being learned, may provide a significant performance boost to such face recognition systems; the “Two-Shot Similarity” (TSS) score is defined and evaluated as an extension to the recently proposed “One-Shot Similarity” (OSS) measure.
Abstract: Evaluating the similarity of images and their descriptors by employing discriminative learners has proven itself to be an effective face recognition paradigm. In this paper we show how “background samples”, that is, examples which do not belong to any of the classes being learned, may provide a significant performance boost to such face recognition systems. In particular, we make the following contributions. First, we define and evaluate the “Two-Shot Similarity” (TSS) score as an extension to the recently proposed “One-Shot Similarity” (OSS) measure. Both these measures utilize background samples to facilitate better recognition rates. Second, we examine the ranking of images most similar to a query image and employ these as a descriptor for that image. Finally, we provide results underscoring the importance of proper face alignment in automatic face recognition systems. These contributions in concert allow us to obtain a success rate of 86.83% on the Labeled Faces in the Wild (LFW) benchmark, outperforming current state-of-the-art results.

Proceedings ArticleDOI
02 Nov 2009
TL;DR: This paper creates a taxonomy of query refinement strategies and builds a high precision rule-based classifier to detect each type of reformulation, finding that some reformulations are better suited to helping users when the current results are already fruitful, while other reformulations are more effective when the results are lacking.
Abstract: Users frequently modify a previous search query in hope of retrieving better results. These modifications are called query reformulations or query refinements. Existing research has studied how web search engines can propose reformulations, but has given less attention to how people perform query reformulations. In this paper, we aim to better understand how web searchers refine queries and form a theoretical foundation for query reformulation. We study users' reformulation strategies in the context of the AOL query logs. We create a taxonomy of query refinement strategies and build a high precision rule-based classifier to detect each type of reformulation. Effectiveness of reformulations is measured using user click behavior. Most reformulation strategies result in some benefit to the user. Certain strategies like add/remove words, word substitution, acronym expansion, and spelling correction are more likely to cause clicks, especially on higher ranked results. In contrast, users often click the same result as their previous query or select no results when forming acronyms and reordering words. Perhaps the most surprising finding is that some reformulations are better suited to helping users when the current results are already fruitful, while other reformulations are more effective when the results are lacking. Our findings inform the design of applications that can assist searchers; examples are described in this paper.
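
A toy rule-based detector for a few of the reformulation types such a taxonomy contains; the paper's classifier uses more elaborate, high-precision rules:

```python
def classify_reformulation(prev, curr):
    p, c = prev.lower().split(), curr.lower().split()
    sp, sc = set(p), set(c)
    if sp == sc and p != c:
        return "word reordering"
    if sp < sc:
        return "add words"
    if sc < sp:
        return "remove words"
    if len(c) == 1 and "".join(w[0] for w in p) == c[0]:
        return "forming acronym"
    if len(p) == len(c) and len(sp ^ sc) == 2:
        return "word substitution"
    return "other"

print(classify_reformulation("new york times", "york new times"))  # reordering
print(classify_reformulation("new york", "new york times"))        # add words
print(classify_reformulation("new york times", "nyt"))             # acronym
```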

Proceedings ArticleDOI
19 Jul 2009
TL;DR: A novel positional language model (PLM) is proposed which implements both heuristics in a unified language model; it is effective for passage retrieval and performs better than a state-of-the-art proximity-based retrieval model.
Abstract: Although many variants of language models have been proposed for information retrieval, there are two related retrieval heuristics remaining "external" to the language modeling approach: (1) proximity heuristic which rewards a document where the matched query terms occur close to each other; (2) passage retrieval which scores a document mainly based on the best matching passage. Existing studies have only attempted to use a standard language model as a "black box" to implement these heuristics, making it hard to optimize the combination parameters. In this paper, we propose a novel positional language model (PLM) which implements both heuristics in a unified language model. The key idea is to define a language model for each position of a document, and score a document based on the scores of its PLMs. The PLM is estimated based on propagated counts of words within a document through a proximity-based density function, which both captures proximity heuristics and achieves an effect of "soft" passage retrieval. We propose and study several representative density functions and several different PLM-based document ranking strategies. Experiment results on standard TREC test collections show that the PLM is effective for passage retrieval and performs better than a state-of-the-art proximity-based retrieval model.
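
A small sketch of the core PLM computation with a Gaussian propagation kernel; the kernel width, the additive smoothing, and the best-position scoring strategy are illustrative choices, and the paper studies several alternatives:

```python
import numpy as np
from collections import Counter

def plm_score(doc_terms, query_terms, sigma=25.0):
    N = len(doc_terms)
    positions = np.arange(N)
    best = -np.inf
    for i in range(N):                        # a language model per position
        kernel = np.exp(-((positions - i) ** 2) / (2 * sigma ** 2))
        counts = Counter()
        for j, w in enumerate(doc_terms):     # propagated term counts at i
            counts[w] += kernel[j]
        total = sum(counts.values())
        # query log-likelihood under the PLM at i (additive smoothing)
        s = sum(np.log((counts[q] + 0.01) / (total + 0.01 * len(counts)))
                for q in query_terms)
        best = max(best, s)                   # "soft best-passage" scoring
    return best

doc = "the quick brown fox jumps over the lazy dog by the quick river".split()
print(plm_score(doc, ["quick", "fox"]))
```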

Proceedings ArticleDOI
19 Jul 2009
TL;DR: An efficient document ranking algorithm is derived that generalizes the well-known probability ranking principle by considering both the uncertainty of relevance predictions and correlations between retrieved documents, and the benefit of diversification is mathematically quantified.
Abstract: This paper studies document ranking under uncertainty. It is tackled in a general situation where the relevance predictions of individual documents have uncertainty, and are dependent between each other. Inspired by the Modern Portfolio Theory, an economic theory dealing with investment in financial markets, we argue that ranking under uncertainty is not just about picking individual relevant documents, but about choosing the right combination of relevant documents. This motivates us to quantify a ranked list of documents on the basis of its expected overall relevance (mean) and its variance; the latter serves as a measure of risk, which was rarely studied for document ranking in the past. Through the analysis of the mean and variance, we show that an optimal rank order is the one that balances the overall relevance (mean) of the ranked list against its risk level (variance). Based on this principle, we then derive an efficient document ranking algorithm. It generalizes the well-known probability ranking principle (PRP) by considering both the uncertainty of relevance predictions and correlations between retrieved documents. Moreover, the benefit of diversification is mathematically quantified; we show that diversifying documents is an effective way to reduce the risk of document ranking. Experimental results in text retrieval confirm the effectiveness of the proposed approach.
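
A sketch of the resulting greedy selection rule: each position is filled by the document with the best trade-off between expected relevance and risk, where risk includes correlation with already-ranked documents. The risk weight b is an assumed parameter:

```python
import numpy as np

def portfolio_rank(mu, cov, b=4.0):
    ranked, remaining = [], list(range(len(mu)))
    while remaining:
        def objective(i):
            # expected relevance minus own variance and correlation
            # with documents already placed in the list
            risk = cov[i, i] + 2 * sum(cov[i, j] for j in ranked)
            return mu[i] - b * risk
        best = max(remaining, key=objective)
        ranked.append(best)
        remaining.remove(best)
    return ranked

mu = np.array([0.9, 0.85, 0.4])          # expected relevance per document
cov = np.array([[0.10, 0.09, 0.00],      # docs 0 and 1 are near-duplicates
                [0.09, 0.10, 0.00],
                [0.00, 0.00, 0.10]])
print(portfolio_rank(mu, cov))           # diversification demotes the duplicate
```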

Patent
19 Jun 2009
TL;DR: In this paper, a computer-implemented method and system is disclosed for ranking a user in an organization based on the user's information technology related activities and arriving at an end risk score, which is used to determine the risk involved in activities performed by the user and for other purposes.
Abstract: Disclosed herein is a computer implemented method and system for ranking a user in an organization based on the user's information technology related activities and arriving at an end risk score used for determining the risk involved in activities performed by the user and for other purposes. Group risk ranking profiles and security policies for usage of the organization's resources are created. The user is associated with one or more group risk ranking profiles. A security client application tracks the user's activities. Points are assigned to the user's tracked activities based on each of the associated group risk ranking profiles. The assigned points are aggregated to generate a first risk score. The assigned points of the user's tracked activities are modified at different levels based on predefined rules. The modified points are aggregated to generate the end risk score which is used for compliance and governance purposes, optimizing resources, etc.
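
A toy sketch of the scoring pipeline the patent describes: points per tracked activity under each associated group profile, aggregated and then adjusted by predefined rules into an end risk score. All profiles, point values, and rules here are invented for illustration:

```python
profiles = {
    "contractor": {"usb_copy": 5, "after_hours_login": 3},
    "finance":    {"usb_copy": 8, "bulk_download": 6},
}

def end_risk_score(user_profiles, activities, rules=()):
    # points per tracked activity under each associated group profile
    score = sum(profiles[p].get(a, 0)
                for p in user_profiles for a in activities)
    for rule in rules:                   # predefined modification rules
        score = rule(score, activities)
    return score

double_if_usb = lambda s, acts: s * 2 if "usb_copy" in acts else s
print(end_risk_score(["contractor", "finance"],
                     ["usb_copy", "bulk_download"], [double_if_usb]))  # 38
```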

Patent
09 Apr 2009
TL;DR: In this paper, the sentimental significance of a group of historical documents related to a topic is assessed with respect to change in an extrinsic metric for the topic and a unique sentiment binding label is included to the content of actions documents that are determined to have sentimental significance.
Abstract: The sentimental significance of a group of historical documents related to a topic is assessed with respect to change in an extrinsic metric for the topic. A unique sentiment binding label is added to the content of action documents that are determined to have sentimental significance, and the group of documents is inserted into a historical document sentiment vector space for the topic. Action areas in the vector space are defined from the locations of action documents, and a singular sentiment vector may be created that describes the cumulative action area. Newly published documents are sentiment-scored by semantically comparing them to documents in the space and/or to the singular sentiment vector. The sentiment scores for the newly published documents are supplemented by human sentiment assessment of the documents, and a sentiment time decay factor is applied to the supplemented sentiment score of each newly published document. User queries are received and a set of sentiment-ranked documents is returned with the highest age-adjusted sentiment scores.

Proceedings ArticleDOI
29 Mar 2009
TL;DR: This work is able to prove that, in contrast to all existing approaches, the expected rank satisfies all the required properties for a ranking query, and provides efficient solutions to compute this ranking across the major models of uncertain data, such as attribute-level and tuple-level uncertainty.
Abstract: When dealing with massive quantities of data, top-k queries are a powerful technique for returning only the k most relevant tuples for inspection, based on a scoring function. The problem of efficiently answering such ranking queries has been studied and analyzed extensively within traditional database settings. The importance of the top-k is perhaps even greater in probabilistic databases, where a relation can encode exponentially many possible worlds. There have been several recent attempts to propose definitions and algorithms for ranking queries over probabilistic data. However, these all lack many of the intuitive properties of a top-k over deterministic data. Specifically, we define a number of fundamental properties, including exact-k, containment, unique-rank, value-invariance, and stability, which are all satisfied by ranking queries on certain data. We argue that all these conditions should also be fulfilled by any reasonable definition for ranking uncertain data. Unfortunately, none of the existing definitions is able to achieve this. To remedy this shortcoming, this work proposes an intuitive new approach of expected rank. This uses the well-founded notion of the expected rank of each tuple across all possible worlds as the basis of the ranking. We are able to prove that, in contrast to all existing approaches, the expected rank satisfies all the required properties for a ranking query. We provide efficient solutions to compute this ranking across the major models of uncertain data, such as attribute-level and tuple-level uncertainty. For an uncertain relation of N tuples, the processing cost is O(N log N), no worse than simply sorting the relation. In settings where there is a high cost for generating each tuple in turn, we provide pruning techniques based on probabilistic tail bounds that can terminate the search early and guarantee that the top-k has been found. Finally, a comprehensive experimental study confirms the effectiveness of our approach.
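
A brute-force illustration of the expected-rank definition for tuple-level uncertainty, enumerating possible worlds (feasible only for tiny relations; the paper's algorithms avoid this enumeration). Placing absent tuples at the bottom of each world is the convention assumed here:

```python
from itertools import product

# (tuple id, score, existence probability)
tuples = [("a", 10.0, 0.9), ("b", 8.0, 0.6), ("c", 9.0, 0.3)]

def expected_ranks(tuples):
    exp = {tid: 0.0 for tid, _, _ in tuples}
    for world in product([0, 1], repeat=len(tuples)):
        p_world, present = 1.0, []
        for (tid, score, p), bit in zip(tuples, world):
            p_world *= p if bit else (1 - p)
            if bit:
                present.append((score, tid))
        present.sort(reverse=True)                 # rank by score, 0-based
        ranks = {tid: r for r, (_, tid) in enumerate(present)}
        for tid, _, _ in tuples:                   # absent => bottom of world
            exp[tid] += p_world * ranks.get(tid, len(present))
    return exp

print(expected_ranks(tuples))   # report tuples by ascending expected rank
```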

Proceedings ArticleDOI
20 Apr 2009
TL;DR: A multi-modality recommendation based on both tag and visual correlation is proposed; tag recommendation is formulated as a learning problem, and the RankBoost algorithm is applied to learn an optimal combination of ranking features from the different modalities.
Abstract: Social tagging provides valuable and crucial information for large-scale web image retrieval. It is ontology-free and easy to obtain; however, irrelevant tags frequently appear, and users typically will not tag all semantic objects in the image, which is also called semantic loss. To avoid noise and compensate for the semantic loss, tag recommendation is proposed in the literature. However, current recommendation simply ranks the related tags based on the single modality of tag co-occurrence on the whole dataset, which ignores other modalities, such as visual correlation. This paper proposes a multi-modality recommendation based on both tag and visual correlation, and formulates the tag recommendation as a learning problem. Each modality is used to generate a ranking feature, and the RankBoost algorithm is applied to learn an optimal combination of these ranking features from different modalities. Experiments on Flickr data demonstrate the effectiveness of this learning-based multi-modality recommendation strategy.

Proceedings Article
07 Dec 2009
TL;DR: This work proposes a probabilistic framework that addresses the challenge of directly optimizing NDCG by maximizing its expectation over all the possible permutations of documents, and an algorithm that outperforms state-of-the-art ranking algorithms on several benchmark data sets.
Abstract: Learning to rank is a relatively new field of study, aiming to learn a ranking function from a set of training data with relevancy labels. The ranking algorithms are often evaluated using information retrieval measures, such as Normalized Discounted Cumulative Gain (NDCG) [1] and Mean Average Precision (MAP) [2]. Until recently, most learning to rank algorithms were not using a loss function related to the above mentioned evaluation measures. The main difficulty in direct optimization of these measures is that they depend on the ranks of documents, not the numerical values output by the ranking function. We propose a probabilistic framework that addresses this challenge by optimizing the expectation of NDCG over all the possible permutations of documents. A relaxation strategy is used to approximate the average of NDCG over the space of permutations, and a bound optimization approach is proposed to make the computation efficient. Extensive experiments show that the proposed algorithm outperforms state-of-the-art ranking algorithms on several benchmark data sets.
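
A brute-force illustration of the objective: the expectation of NDCG under a distribution over permutations. A Plackett-Luce distribution induced by the scores is assumed here for concreteness; the paper uses its own relaxation to keep the computation tractable:

```python
import math
from itertools import permutations

def ndcg(perm, rels):
    dcg = sum((2 ** rels[d] - 1) / math.log2(i + 2) for i, d in enumerate(perm))
    idcg = sum((2 ** r - 1) / math.log2(i + 2)
               for i, r in enumerate(sorted(rels, reverse=True)))
    return dcg / idcg

def plackett_luce(perm, scores):
    # probability of drawing this permutation from the scores
    p, remaining = 1.0, list(perm)
    for d in perm:
        p *= math.exp(scores[d]) / sum(math.exp(scores[j]) for j in remaining)
        remaining.remove(d)
    return p

scores = [2.0, 1.0, 0.5]          # ranking function outputs
rels = [2, 0, 1]                  # relevance labels
e_ndcg = sum(plackett_luce(perm, scores) * ndcg(perm, rels)
             for perm in permutations(range(len(scores))))
print(e_ndcg)                     # the expectation being optimized
```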

Proceedings ArticleDOI
19 Jul 2009
TL;DR: This work uses various measures of query quality described in the literature as features to represent sub-queries, and trains a classifier to reduce long queries to more effective shorter ones that lack those extraneous terms.
Abstract: Long queries frequently contain many extraneous terms that hinder retrieval of relevant documents. We present techniques to reduce long queries to more effective shorter ones that lack those extraneous terms. Our work is motivated by the observation that perfectly reducing long TREC description queries can lead to an average improvement of 30% in mean average precision. Our approach involves transforming the reduction problem into a problem of learning to rank all subsets of the original query (sub-queries) based on their predicted quality, and selecting the top sub-query. We use various measures of query quality described in the literature as features to represent sub-queries, and train a classifier. Replacing the original long query with the top-ranked sub-query chosen by the ranker results in a statistically significant average improvement of 8% on our test sets. Analysis of the results shows that query reduction is well-suited for moderately-performing long queries, and a small set of query quality predictors are well-suited for the task of ranking sub-queries.
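
A sketch of the reduction-as-ranking recipe: enumerate sub-queries, score each with a query-quality predictor, and keep the best. The toy IDF-sum predictor below stands in for the trained ranker and the literature's quality features:

```python
from itertools import combinations

idf = {"find": 0.1, "papers": 0.3, "about": 0.1,
       "probabilistic": 2.1, "ranking": 1.8, "please": 0.05}

def quality(sub):
    # toy predictor: reward informative terms, penalize query length
    return sum(idf.get(w, 1.0) for w in sub) - 0.2 * len(sub)

def reduce_query(query, max_drop=3):
    words = query.split()
    candidates = [tuple(words)]
    for k in range(1, max_drop + 1):        # all sub-queries, order preserved
        candidates += list(combinations(words, len(words) - k))
    return " ".join(max(candidates, key=quality))

print(reduce_query("please find papers about probabilistic ranking"))
# -> "papers probabilistic ranking"
```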

Proceedings ArticleDOI
19 Oct 2009
TL;DR: This paper proposes a new query suggestion scheme named Visual Query Suggestion (VQS), which provides a more effective query interface to formulate an intent-specific query by joint text and image suggestions, and shows that VQS outperforms three popular image search engines in terms of both the quality of query suggestion and search performance.
Abstract: Query suggestion is an effective approach to improve the usability of image search. Most existing search engines are able to automatically suggest a list of textual query terms based on users' current query input, which can be called Textual Query Suggestion. This paper proposes a new query suggestion scheme named Visual Query Suggestion (VQS) which is dedicated to image search. It provides a more effective query interface to formulate an intent-specific query by joint text and image suggestions. We show that VQS is able to more precisely and more quickly help users specify and deliver their search intents. When a user submits a text query, VQS first provides a list of suggestions, each containing a keyword and a collection of representative images in a dropdown menu. If the user selects one of the suggestions, the corresponding keyword will be added to complement the initial text query as the new text query, while the image collection will be formulated as the visual query. VQS then performs image search based on the new text query using text search techniques, as well as content-based visual retrieval to refine the search results by using the corresponding images as query examples. We compare VQS with three popular image search engines, and show that VQS outperforms these engines in terms of both the quality of query suggestion and search performance.

Proceedings ArticleDOI
19 Oct 2009
TL;DR: A novel ranking algorithm, ranking with Local Regression and Global Alignment (LRGA), is proposed; it learns a robust Laplacian matrix for data ranking, using a unified objective function to globally align the local models from all the data points so that an optimal ranking value can be assigned to each data point.
Abstract: Rich multimedia content including images, audio and text is frequently used to describe the same semantics in E-Learning and E-business web pages, instructive slides, multimedia cyclopedias, and so on. In this paper, we present a framework for cross-media retrieval, where the query example and the retrieved result(s) can be of different media types. We first construct Multimedia Correlation Space (MMCS) by exploring the semantic correlation of different multimedia modalities, during which multimedia content and co-occurrence information is utilized. We propose a novel ranking algorithm, namely ranking with Local Regression and Global Alignment (LRGA), which learns a robust Laplacian matrix for data ranking. In LRGA, for each data point, a local linear regression model is used to predict the ranking values of its neighboring points. We propose a unified objective function to globally align the local models from all the data points so that an optimal ranking value can be assigned to each data point. LRGA is insensitive to parameters, making it particularly suitable for data ranking. A relevance feedback algorithm is proposed to improve the retrieval performance. Comprehensive experiments have demonstrated the effectiveness of our methods.

Journal ArticleDOI
TL;DR: This paper proposes a novel approach for designing and developing a QoS ontology and its QoS-based ranking algorithm for evaluating Web services; the approach can be used in various applications to facilitate automatic and dynamic discovery and selection of Web services.

Proceedings Article
Wei Chen, Tie-Yan Liu, Yanyan Lan, Zhi-Ming Ma, Hang Li
07 Dec 2009
TL;DR: In this article, the authors reveal the relationship between ranking measures and loss functions in learning-to-rank methods, such as Ranking SVM, RankBoost, RankNet, and ListMLE, and show that the loss functions of these methods are upper bounds of the measure-based ranking errors.
Abstract: Learning to rank has become an important research topic in machine learning. While most learning-to-rank methods learn the ranking functions by minimizing loss functions, it is the ranking measures (such as NDCG and MAP) that are used to evaluate the performance of the learned ranking functions. In this work, we reveal the relationship between ranking measures and loss functions in learning-to-rank methods, such as Ranking SVM, RankBoost, RankNet, and ListMLE. We show that the loss functions of these methods are upper bounds of the measure-based ranking errors. As a result, the minimization of these loss functions will lead to the maximization of the ranking measures. The key to obtaining this result is to model ranking as a sequence of classification tasks, and define a so-called essential loss for ranking as the weighted sum of the classification errors of individual tasks in the sequence. We have proved that the essential loss is both an upper bound of the measure-based ranking errors, and a lower bound of the loss functions in the aforementioned methods. Our proof technique also suggests a way to modify existing loss functions to make them tighter bounds of the measure-based ranking errors. Experimental results on benchmark datasets show that the modifications can lead to better ranking performances, demonstrating the correctness of our theoretical analysis.
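
One plausible reading of the decomposition, sketched under assumptions: at step t the model should place the t-th ground-truth document first among those still unranked, and the essential loss accumulates weighted classification errors over the steps. The weights here are illustrative DCG-style gains, not the paper's exact weighting:

```python
import math

def essential_loss(scores, true_order, weights=None):
    remaining = list(true_order)
    loss = 0.0
    for t, target in enumerate(true_order[:-1]):
        w = weights[t] if weights else 1.0 / math.log2(t + 2)
        predicted = max(remaining, key=lambda d: scores[d])
        if predicted != target:        # classification error at step t
            loss += w
        remaining.remove(target)       # continue with the unranked rest
    return loss

scores = {0: 1.2, 1: 2.0, 2: 0.3}      # doc id -> model score
true_order = [0, 1, 2]                 # ground truth, best first
print(essential_loss(scores, true_order))   # one weighted error at step 0
```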

Journal ArticleDOI
01 Aug 2009
TL;DR: This paper contends that a single, specific ranking function may not suffice for probabilistic databases, and proposes two parameterized ranking functions, called PRFω and PRFe, that generalize or can approximate many of the previously proposed ranking functions.
Abstract: The dramatic growth in the number of application domains that naturally generate probabilistic, uncertain data has resulted in a need for efficiently supporting complex querying and decision-making over such data. In this paper, we present a unified approach to ranking and top-k query processing in probabilistic databases by viewing it as a multi-criteria optimization problem, and by deriving a set of features that capture the key properties of a probabilistic dataset that dictate the ranked result. We contend that a single, specific ranking function may not suffice for probabilistic databases, and we instead propose two parameterized ranking functions, called PRFω and PRFe, that generalize or can approximate many of the previously proposed ranking functions. We present novel generating functions-based algorithms for efficiently ranking large datasets according to these ranking functions, even if the datasets exhibit complex correlations modeled using probabilistic and/xor trees or Markov networks. We further propose that the parameters of the ranking function be learned from user preferences, and we develop an approach to learn those parameters. Finally, we present a comprehensive experimental study that illustrates the effectiveness of our parameterized ranking functions, especially PRFe, at approximating other ranking functions and the scalability of our proposed algorithms for exact or approximate ranking.

Proceedings ArticleDOI
19 Jul 2009
TL;DR: Experimental results on three datasets show that Ranking SVM significantly outperforms the baseline methods of SVM and Naive Bayes, indicating that it is better to exploit learning to rank techniques in keyphrase extraction.
Abstract: This paper addresses the issue of automatically extracting keyphrases from a document. Previously, this problem was formalized as classification and learning methods for classification were utilized. This paper points out that it is more essential to cast the problem as ranking and employ a learning to rank method to perform the task. Specifically, it employs Ranking SVM, a state-of-art method of learning to rank, in keyphrase extraction. Experimental results on three datasets show that Ranking SVM significantly outperforms the baseline methods of SVM and Naive Bayes, indicating that it is better to exploit learning to rank techniques in keyphrase extraction.
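
A compact sketch of the Ranking SVM idea via the classic pairwise transform: train a linear classifier on feature differences of (higher-ranked, lower-ranked) candidate pairs, then rank candidates by the learned weight vector. The features and labels below are toy stand-ins for real keyphrase features:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# candidate phrases: rows of toy features [tf-idf, first position, length]
X = rng.random((40, 3))
relevance = (X[:, 0] > 0.6).astype(int)    # toy labels: high tf-idf wins

pairs, signs = [], []
for i in range(len(X)):                    # pairwise transform
    for j in range(len(X)):
        if relevance[i] > relevance[j]:
            pairs.append(X[i] - X[j]); signs.append(1)
            pairs.append(X[j] - X[i]); signs.append(-1)

clf = LinearSVC(C=1.0).fit(np.array(pairs), np.array(signs))
w = clf.coef_.ravel()                      # learned ranking weights
print(np.argsort(-(X @ w))[:5])            # top candidate keyphrases
```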

Proceedings ArticleDOI
24 Mar 2009
TL;DR: This paper presents Zerber+R -- a ranking model which allows for privacy-preserving top-k retrieval from an outsourced inverted index and proposes a relevance score transformation function which makes relevance scores of different terms indistinguishable, such that even if stored on an untrusted server they do not reveal information about the indexed data.
Abstract: Privacy-preserving document exchange among collaboration groups in an enterprise as well as across enterprises requires techniques for sharing and search of access-controlled information through largely untrusted servers. In these settings search systems need to provide confidentiality guarantees for shared information while offering IR properties comparable to the ordinary search engines. Top-k is a standard IR technique which enables fast query execution on very large indexes and makes systems highly scalable. However, indexing access-controlled information for top-k retrieval is a challenging task due to the sensitivity of the term statistics used for ranking. In this paper we present Zerber+R -- a ranking model which allows for privacy-preserving top-k retrieval from an outsourced inverted index. We propose a relevance score transformation function which makes relevance scores of different terms indistinguishable, such that even if stored on an untrusted server they do not reveal information about the indexed data. Experiments on two real-world data sets show that Zerber+R makes economical usage of bandwidth and offers retrieval properties comparable with an ordinary inverted index.

Proceedings Article
15 Apr 2009
TL;DR: This paper shows how to naturally derive a space of possible ranking criteria, and shows that several recent contributions in the feature selection literature are points within this continuous space, and that there exist many points that have never been explored.
Abstract: Feature Filters are among the simplest and fastest approaches to feature selection. A filter defines a statistical criterion, used to rank features on how useful they are expected to be for classification. The highest ranking features are retained, and the lowest ranking can be discarded. A common approach is to use the Mutual Information between the feature and class label. This area has seen a recent flurry of activity, resulting in a confusing variety of heuristic criteria all based on mutual information, and a lack of a principled way to understand or relate them. The contribution of this paper is a unifying theoretical understanding of such filters. In contrast to current methods which manually construct filter criteria with particular properties, we show how to naturally derive a space of possible ranking criteria. We will show that several recent contributions in the feature selection literature are points within this continuous space, and that there exist many points that have never been explored.
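
A sketch of the kind of unified criterion the paper derives, where a pair of coefficients picks a point in the space: J(Xk) = I(Xk;Y) - beta * sum_j I(Xk;Xj) + gamma * sum_j I(Xk;Xj|Y). Setting beta = gamma = 0 gives plain mutual-information ranking, while beta = gamma = 1/|S| gives a JMI-like point; the exact parameterization here is illustrative:

```python
import numpy as np
from collections import Counter

def mi(a, b):
    n = len(a)
    pab, pa, pb = Counter(zip(a, b)), Counter(a), Counter(b)
    return sum(c / n * np.log((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in pab.items())

def cond_mi(a, b, y):
    n, py = len(y), Counter(y)
    return sum(py[v] / n * mi([ai for ai, yi in zip(a, y) if yi == v],
                              [bi for bi, yi in zip(b, y) if yi == v])
               for v in py)

def criterion(k, selected, X, y, beta, gamma):
    # J(Xk) = relevancy - beta*redundancy + gamma*conditional redundancy
    return (mi(X[k], y)
            - beta * sum(mi(X[k], X[j]) for j in selected)
            + gamma * sum(cond_mi(X[k], X[j], y) for j in selected))

rng = np.random.default_rng(0)
X = [list(rng.integers(0, 2, 200)) for _ in range(5)]
y = [int(a) ^ int(b) for a, b in zip(X[0], X[1])]   # XOR: needs both features
selected = [0]
for _ in range(2):                                  # greedy forward selection
    beta = gamma = 1.0 / len(selected)              # a JMI-like point
    rest = [k for k in range(5) if k not in selected]
    selected.append(max(rest, key=lambda k:
                        criterion(k, selected, X, y, beta, gamma)))
print(selected)
```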

Book ChapterDOI
07 Dec 2009
TL;DR: The XER tasks and the evaluation procedure used at the XER track in 2009, where a new version of Wikipedia was used as the underlying collection, are described, and the approaches adopted by the participants are summarized.
Abstract: In some situations search engine users would prefer to retrieve entities instead of just documents. Example queries include "Italian Nobel prize winners", "Formula 1 drivers that won the Monaco Grand Prix", or "German spoken Swiss cantons". The XML Entity Ranking (XER) track at INEX creates a discussion forum aimed at standardizing evaluation procedures for entity retrieval. This paper describes the XER tasks and the evaluation procedure used at the XER track in 2009, where a new version of Wikipedia was used as underlying collection; and summarizes the approaches adopted by the participants.

Book ChapterDOI
30 Aug 2009
TL;DR: This paper presents the first automated unsupervised analysis of LDA models to identify junk topics from legitimate ones, and to rank the topic significance.
Abstract: Topic models, like Latent Dirichlet Allocation (LDA), have been recently used to automatically generate text corpora topics, and to subdivide the corpus words among those topics. However, not all the estimated topics are of equal importance or correspond to genuine themes of the domain. Some of the topics can be a collection of irrelevant words, or represent insignificant themes. Current approaches to topic modeling perform manual examination to find meaningful topics. This paper presents the first automated unsupervised analysis of LDA models to identify junk topics from legitimate ones, and to rank the topic significance. Basically, the distance between a topic distribution and three definitions of "junk distribution" is computed using a variety of measures, from which an expressive figure of the topic significance is derived using a 4-phase Weighted Combination approach. Our experiments on synthetic and benchmark datasets show the effectiveness of the proposed approach in ranking the topic significance.
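
A sketch of one ingredient of the approach: the distance from a topic's word distribution to a "junk" reference, here the uniform distribution over the vocabulary (one of several junk definitions; the weighting and 4-phase combination are omitted):

```python
import numpy as np

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

vocab_size = 5
uniform = np.full(vocab_size, 1.0 / vocab_size)   # a "junk" reference
topics = np.array([
    [0.70, 0.20, 0.05, 0.03, 0.02],   # peaked topic: likely meaningful
    [0.22, 0.20, 0.20, 0.19, 0.19],   # near-uniform topic: likely junk
])
significance = [kl(t, uniform) for t in topics]
print(np.argsort(significance)[::-1])  # topics, most significant first
```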