Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.


Papers
Proceedings Article
01 Jan 2018
TL;DR: The authors study ranking under fairness and diversity constraints on sensitive attributes (such as gender, race, and political opinion) and show that the problem is hard to approximate even with simple constraints.
Abstract: Ranking algorithms are deployed widely to order a set of items in applications such as search engines, news feeds, and recommendation systems. Recent studies, however, have shown that, left unchecked, the output of ranking algorithms can result in decreased diversity in the type of content presented, promote stereotypes, and polarize opinions. In order to address such issues, we study the following variant of the traditional ranking problem when, in addition, there are fairness or diversity constraints. Given a collection of items along with 1) the value of placing an item in a particular position in the ranking, 2) the collection of sensitive attributes (such as gender, race, political opinion) of each item and 3) a collection of fairness constraints that, for each k, bound the number of items with each attribute that are allowed to appear in the top k positions of the ranking, the goal is to output a ranking that maximizes the value with respect to the original rank quality metric while respecting the constraints. This problem encapsulates various well-studied problems related to bipartite and hypergraph matching as special cases and turns out to be hard to approximate even with simple constraints. Our main technical contributions are fast exact and approximation algorithms along with complementary hardness results that, together, come close to settling the approximability of this constrained ranking maximization problem. Unlike prior work on the approximability of constrained matching problems, our algorithm runs in linear time, even when the number of constraints is (polynomially) large, its approximation ratio does not depend on the number of constraints, and it produces solutions with small constraint violations. 
Our results rely on insights about the constrained matching problem when the objective function satisfies certain properties that appear in common ranking metrics such as discounted cumulative gain (DCG), Spearman's rho or Bradley-Terry, along with the nested structure of fairness constraints.

142 citations
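
To make the constrained ranking setup above concrete, here is a minimal sketch of a greedy heuristic. It is not the algorithm from the paper: the function name, the single-score value model, and the single group label per item are all assumptions made for illustration. It also assumes the upper bounds are non-decreasing in k, so that an item placed early cannot violate a later prefix constraint.

```python
from collections import Counter

def greedy_constrained_ranking(scores, groups, upper_bound, n_positions):
    """Fill positions 1..n_positions greedily (illustrative heuristic only).

    scores: dict item -> relevance score (hypothetical value model)
    groups: dict item -> single group label (hypothetical attribute model)
    upper_bound: callable (k, group) -> max items of `group` allowed in the
        top k positions; assumed non-decreasing in k
    """
    remaining = sorted(scores, key=scores.get, reverse=True)
    counts = Counter()
    ranking = []
    for k in range(1, n_positions + 1):
        for item in remaining:
            g = groups[item]
            # Placing `item` at position k puts counts[g] + 1 items of g in the top k.
            if counts[g] + 1 <= upper_bound(k, g):
                ranking.append(item)
                counts[g] += 1
                remaining.remove(item)
                break
        else:
            break  # no remaining item is feasible for position k
    return ranking

# Toy data: at most half (rounded up) of every prefix may come from one group.
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
groups = {"a": "x", "b": "x", "c": "x", "d": "y"}
half = lambda k, g: (k + 1) // 2
print(greedy_constrained_ranking(scores, groups, half, 3))
```

A greedy pass like this can get stuck or return a suboptimal ranking, which is consistent with the abstract's point that the general problem is hard to approximate.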

Journal Article
TL;DR: This article demonstrates that, to estimate the mathematical expectations of Average Precision (AP) and Normalized Discounted Cumulative Gain (NDCG), one only needs to predict the relevance probability of each image.
Abstract: This article studies a novel problem in image search. Given a text query and the image ranking list returned by an image search system, we propose an approach to automatically predict the search performance. We demonstrate that, in order to estimate the mathematical expectations of Average Precision (AP) and Normalized Discounted Cumulative Gain (NDCG), we only need to predict the relevance probability of each image. We accomplish the task with a query-adaptive graph-based learning based on the images’ ranking order and visual content. We validate our approach with a large-scale dataset that contains the image search results of 1,165 queries from 4 popular image search engines. Empirical studies demonstrate that our approach is able to generate predictions that are highly correlated with the real search performance. Based on the proposed image search performance prediction scheme, we introduce three applications: image metasearch, multilingual image search, and Boolean image search. Comprehensive experiments are conducted to validate our approach.

142 citations
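
As a simplified illustration of why relevance probabilities can suffice, the sketch below computes the expected value of unnormalized DCG with binary relevance, which follows directly from linearity of expectation. This is an assumption-laden stand-in, not the article's estimator for AP or NDCG; the function name `expected_dcg` is hypothetical.

```python
import math

def expected_dcg(rel_probs, k=None):
    """Expected DCG@k under binary relevance.

    rel_probs[i] is the predicted probability that the result at rank i + 1
    is relevant. With gain 1 for relevant and 0 otherwise,
    E[DCG@k] = sum_i p_i / log2(i + 1) over ranks i = 1..k.
    """
    k = len(rel_probs) if k is None else min(k, len(rel_probs))
    # enumerate is 0-based, so rank = i + 1 and the discount is log2(rank + 1).
    return sum(p / math.log2(i + 2) for i, p in enumerate(rel_probs[:k]))

print(expected_dcg([0.9, 0.6, 0.2], k=3))
```

Normalized DCG and AP involve ratios of random quantities, so their expectations are not obtained by this one-line sum; the sketch only covers the linear case.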

Journal Article
TL;DR: A ranking algorithm is described that exploits the entire network structure of similarity relationships among proteins in a sequence database by performing a diffusion operation on a precomputed, weighted network.
Abstract: Biologists regularly search databases of DNA or protein sequences for evolutionary or functional relationships to a given query sequence. We describe a ranking algorithm that exploits the entire network structure of similarity relationships among proteins in a sequence database by performing a diffusion operation on a precomputed, weighted network. The resulting ranking algorithm, evaluated by using a human-curated database of protein structures, is efficient and provides significantly better rankings than a local network search algorithm such as PSI-BLAST.

142 citations
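
One common way to realize a diffusion over a weighted similarity network is a random walk with restart from the query node. The sketch below uses that technique as an assumption; the abstract does not specify the diffusion operator, so this should not be read as the authors' algorithm. The function name, the `alpha` restart parameter, and the toy graph are all hypothetical.

```python
def diffuse(weights, query, alpha=0.85, iters=100):
    """Random walk with restart on a weighted graph (illustrative only).

    weights: dict node -> dict neighbor -> positive edge weight; every
        neighbor is assumed to also appear as a key of `weights`
    query: node where the walk starts and restarts
    alpha: probability of following an edge instead of restarting
    """
    nodes = list(weights)
    rank = {v: 0.0 for v in nodes}
    rank[query] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        nxt[query] += 1.0 - alpha  # restart mass returns to the query
        for u in nodes:
            total = sum(weights[u].values())
            if total == 0:
                nxt[query] += alpha * rank[u]  # dangling node: send mass back to the query
                continue
            for v, w in weights[u].items():
                nxt[v] += alpha * rank[u] * w / total
        rank = nxt
    order = sorted(nodes, key=rank.get, reverse=True)
    return order, rank

# Toy network: "a" is strongly similar to the query "q", "b" only weakly.
toy = {"q": {"a": 0.9, "b": 0.1}, "a": {"q": 0.9}, "b": {"q": 0.1}}
order, rank = diffuse(toy, "q")
print(order)
```

Because the scores depend on all paths through the network rather than only on direct neighbors of the query, a node can rank highly through indirect similarity, which is the intuition the abstract describes.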

Journal Article
TL;DR: This paper investigates the multiple attribute decision making (MADM) problem with fuzzy preference information on alternatives, proposes an eigenvector method to rank them, and introduces three optimization models that integrate subjective fuzzy preference relations and objective information in different ways.

141 citations

Book
25 Nov 2008
TL;DR: This book makes two major contributions to the field of information retrieval: first, a new way to look at topical relevance that complements the two dominant models (the classical probabilistic model and the language modeling approach) and explicitly combines documents, queries, and relevance in a single formalism; second, a new method for modeling exchangeable sequences of discrete random variables.
Abstract: A modern information retrieval system must have the capability to find, organize and present very different manifestations of information such as text, pictures, videos or database records any of which may be of relevance to the user. However, the concept of relevance, while seemingly intuitive, is actually hard to define, and it's even harder to model in a formal way. Lavrenko does not attempt to bring forth a new definition of relevance, nor provide arguments as to why any particular definition might be theoretically superior or more complete. Instead, he takes a widely accepted, albeit somewhat conservative definition, makes several assumptions, and from them develops a new probabilistic model that explicitly captures that notion of relevance. With this book, he makes two major contributions to the field of information retrieval: first, a new way to look at topical relevance, complementing the two dominant models, i.e., the classical probabilistic model and the language modeling approach, and which explicitly combines documents, queries, and relevance in a single formalism; second, a new method for modeling exchangeable sequences of discrete random variables which does not make any structural assumptions about the data and which can also handle rare events. Thus his book is of major interest to researchers and graduate students in information retrieval who specialize in relevance modeling, ranking algorithms, and language modeling.

141 citations


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 83% related
Ontology (information science): 57K papers, 869.1K citations, 82% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 82% related
Feature learning: 15.5K papers, 684.7K citations, 81% related
Supervised learning: 20.8K papers, 710.5K citations, 81% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    3,112
2022    6,541
2021    1,105
2020    1,082
2019    1,168