Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.

Papers

Proceedings Article

Ranking queries on uncertain data: a probabilistic threshold approach

Ming Hua¹, Jian Pei¹, Wenjie Zhang², Xuemin Lin²

Simon Fraser University¹, University of New South Wales²

09 Jun 2008

TL;DR: An efficient exact algorithm, a fast sampling algorithm, and a Poisson approximation based algorithm are presented for answering probabilistic threshold top-k queries on uncertain data, which return the uncertain records whose probability of being in the top-k list is at least p.

Abstract: Uncertain data is inherent in a few important applications such as environmental surveillance and mobile object tracking. Top-k queries (also known as ranking queries) are often natural and useful in analyzing uncertain data in those applications. In this paper, we study the problem of answering probabilistic threshold top-k queries on uncertain data, which computes uncertain records taking a probability of at least p to be in the top-k list where p is a user specified probability threshold. We present an efficient exact algorithm, a fast sampling algorithm, and a Poisson approximation based algorithm. An empirical study using real and synthetic data sets verifies the effectiveness of probabilistic threshold top-k queries and the efficiency of our methods.
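To make the query definition concrete, here is a minimal sketch of a probabilistic threshold top-k query. It is not the paper's algorithm: it assumes a simplified model in which every record exists independently with a known probability and all scores are distinct, and the names `topk_probabilities` and `pt_topk` are hypothetical. Under those assumptions, a record is in the top-k exactly when it exists and fewer than k higher-scored records exist, which a small dynamic program can compute.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, float, float]  # (id, score, existence probability)


def topk_probabilities(records: List[Record], k: int) -> Dict[str, float]:
    """Probability that each record appears in the top-k list.

    Assumes independent records with distinct scores (a simplification).
    """
    ordered = sorted(records, key=lambda r: r[1], reverse=True)
    # dist[j] = probability that exactly j higher-scored records exist.
    # Mass for j >= k is dropped because it can never contribute.
    dist = [1.0] + [0.0] * (k - 1)
    result: Dict[str, float] = {}
    for rid, _score, p in ordered:
        # In the top-k iff this record exists and fewer than k better ones do.
        result[rid] = p * sum(dist)
        # Fold this record into the distribution for the records below it.
        nxt = [0.0] * k
        for j in range(k):
            nxt[j] += dist[j] * (1.0 - p)
            if j + 1 < k:
                nxt[j + 1] += dist[j] * p
        dist = nxt
    return result


def pt_topk(records: List[Record], k: int, p_threshold: float) -> List[str]:
    """Ids of records whose top-k probability is at least p_threshold."""
    probs = topk_probabilities(records, k)
    return [rid for rid, pr in probs.items() if pr >= p_threshold]


readings = [("a", 90.0, 0.5), ("b", 80.0, 0.9), ("c", 70.0, 0.8)]
answer = pt_topk(readings, k=2, p_threshold=0.45)
```

With k = 2, record "c" has the lowest score and makes the top-2 only when at most one of "a" and "b" exists, so it falls below the 0.45 threshold while "a" and "b" pass.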

291 citations

Proceedings Article

FA*IR: A Fair Top-k Ranking Algorithm

Meike Zehlike¹, Francesco Bonchi², Carlos Castillo³, Sara Hajian, Mohamed Megahed¹, Ricardo Baeza-Yates³

Technical University of Berlin¹, Institute for Scientific Interchange², Pompeu Fabra University³

20 Jun 2017-arXiv: Computers and Society

TL;DR: FA*IR is an efficient algorithm for the Fair Top-k Ranking problem; to the best of the authors' knowledge, it is the first algorithm grounded in statistical tests that can mitigate biases in the representation of an under-represented group along a ranked list.

Abstract: In this work, we define and solve the Fair Top-k Ranking problem, in which we want to determine a subset of k candidates from a large pool of n >> k candidates, maximizing utility (i.e., select the "best" candidates) subject to group fairness criteria. Our ranked group fairness definition extends group fairness using the standard notion of protected groups and is based on ensuring that the proportion of protected candidates in every prefix of the top-k ranking remains statistically above or indistinguishable from a given minimum. Utility is operationalized in two ways: (i) every candidate included in the top-k should be more qualified than every candidate not included; and (ii) for every pair of candidates in the top-k, the more qualified candidate should be ranked above. An efficient algorithm is presented for producing the Fair Top-k Ranking, and tested experimentally on existing datasets as well as new datasets released with this paper, showing that our approach yields small distortions with respect to rankings that maximize utility without considering fairness criteria. To the best of our knowledge, this is the first algorithm grounded in statistical tests that can mitigate biases in the representation of an under-represented group along a ranked list.
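The prefix condition described in the abstract can be illustrated with a short sketch. This is a simplified reading, not the authors' implementation: it assumes the "statistically above or indistinguishable" test is a one-sided binomial test with target proportion `p` and significance `alpha`, it omits any correction of the significance level for testing many prefixes, and the function names are hypothetical. For each prefix length it computes the minimum number of protected candidates required, then greedily fills the ranking by score, taking the best protected candidate whenever the prefix would otherwise fall short.

```python
from math import comb
from typing import List, Tuple

Candidate = Tuple[str, float]  # (id, score)


def binom_cdf(t: int, n: int, p: float) -> float:
    """P[X <= t] for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(t + 1))


def min_protected(prefix_len: int, p: float, alpha: float) -> int:
    """Smallest protected count a prefix may hold without failing the test."""
    t = 0
    while t < prefix_len and binom_cdf(t, prefix_len, p) <= alpha:
        t += 1
    return t


def fair_topk(protected: List[Candidate], non_protected: List[Candidate],
              k: int, p: float, alpha: float) -> List[str]:
    prot = sorted(protected, key=lambda c: c[1], reverse=True)
    nonp = sorted(non_protected, key=lambda c: c[1], reverse=True)
    ranking: List[str] = []
    n_prot, i, j = 0, 0, 0
    for pos in range(1, k + 1):
        need = min_protected(pos, p, alpha)
        have_prot, have_nonp = i < len(prot), j < len(nonp)
        # Take a protected candidate if the prefix requires one, if no
        # non-protected candidate is left, or if it simply scores highest.
        if have_prot and (n_prot < need or not have_nonp
                          or prot[i][1] >= nonp[j][1]):
            ranking.append(prot[i][0])
            i += 1
            n_prot += 1
        elif have_nonp:
            ranking.append(nonp[j][0])
            j += 1
        else:
            break  # both pools exhausted
    return ranking


non_protected = [("n1", 9), ("n2", 8), ("n3", 7), ("n4", 6), ("n5", 5)]
protected = [("p1", 4), ("p2", 3)]
ranking = fair_topk(protected, non_protected, k=5, p=0.5, alpha=0.1)
```

In this toy run the first three positions go to the highest scorers; at position four the test requires at least one protected candidate, so "p1" is placed ahead of "n4".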

291 citations

Proceedings Article

Multi-label learning with millions of labels: recommending advertiser bid phrases for web pages

Rahul Agrawal¹, Archit Gupta, Yashoteja Prabhu, Manik Varma

Microsoft¹

13 May 2013

TL;DR: It is demonstrated that it is possible to efficiently predict the relevant subset of queries from a large set of monetizable ones by posing the problem as a multi-label learning task with each query being represented by a separate label.

Abstract: Recommending phrases from web pages for advertisers to bid on against search engine queries is an important research problem with direct commercial impact. Most approaches have found it infeasible to determine the relevance of all possible queries to a given ad landing page and have focussed on making recommendations from a small set of phrases extracted (and expanded) from the page using NLP and ranking based techniques. In this paper, we eschew this paradigm, and demonstrate that it is possible to efficiently predict the relevant subset of queries from a large set of monetizable ones by posing the problem as a multi-label learning task with each query being represented by a separate label. We develop Multi-label Random Forests to tackle problems with millions of labels. Our proposed classifier has prediction costs that are logarithmic in the number of labels and can make predictions in a few milliseconds using 10 Gb of RAM. We demonstrate that it is possible to generate training data for our classifier automatically from click logs without any human annotation or intervention. We train our classifier on tens of millions of labels, features and training points in less than two days on a thousand node cluster. We develop a sparse semi-supervised multi-label learning formulation to deal with training set biases and noisy labels harvested automatically from the click logs. This formulation is used to infer a belief in the state of each label for each training ad and the random forest classifier is extended to train on these beliefs rather than the given labels. Experiments reveal significant gains over ranking and NLP based techniques on a large test set of 5 million ads using multiple metrics.

290 citations

Journal Article

Learning Context-Sensitive Shape Similarity by Graph Transduction

Xiang Bai¹, Xingwei Yang², Longin Jan Latecki², Wenyu Liu¹, Zhuowen Tu³

Huazhong University of Science and Technology¹, Temple University², University of California, Los Angeles³

01 May 2010-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A new perspective on shape similarity is provided by considering the existing shapes as a group and studying their similarity measures to the query shape in a graph structure; the learned similarity also achieves promising improvements on both shape classification and shape clustering.

Abstract: Shape similarity and shape retrieval are very important topics in computer vision. The recent progress in this domain has been mostly driven by designing smart shape descriptors for providing better similarity measure between pairs of shapes. In this paper, we provide a new perspective to this problem by considering the existing shapes as a group, and study their similarity measures to the query shape in a graph structure. Our method is general and can be built on top of any existing shape similarity measure. For a given similarity measure, a new similarity is learned through graph transduction. The new similarity is learned iteratively so that the neighbors of a given shape influence its final similarity to the query. The basic idea here is related to PageRank ranking, which forms a foundation of Google Web search. The presented experimental results demonstrate that the proposed approach yields significant improvements over the state-of-art shape matching algorithms. We obtained a retrieval rate of 91.61 percent on the MPEG-7 data set, which is the highest ever reported in the literature. Moreover, the learned similarity by the proposed method also achieves promising improvements on both shape classification and shape clustering.
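The idea that neighbors of a shape influence its final similarity to the query can be sketched as a generic label-propagation loop. This is an illustration under assumptions, not the authors' exact procedure: the affinity matrix, the row normalisation, the fixed iteration count, and the name `propagate_similarity` are all hypothetical choices. The query's own score is clamped to 1 and every other shape repeatedly takes the weighted average of its neighbors' scores. The loop is deliberately stopped after a few iterations, because with a clamped query and a connected graph the scores would otherwise all drift toward 1.

```python
from typing import List


def propagate_similarity(affinity: List[List[float]], query: int,
                         iterations: int = 5) -> List[float]:
    """Learn a context-sensitive similarity to the query shape.

    affinity[i][j] is any pairwise shape similarity (assumed non-negative).
    """
    n = len(affinity)
    # Row-normalise so each row is a probability distribution over neighbors.
    transition = [[w / sum(row) for w in row] for row in affinity]
    scores = [0.0] * n
    scores[query] = 1.0
    for _ in range(iterations):
        updated = [sum(transition[i][j] * scores[j] for j in range(n))
                   for i in range(n)]
        updated[query] = 1.0  # keep the query clamped to itself
        scores = updated
    return scores


# Shape 2 is barely similar to the query (0.1) but close to shape 1 (0.9),
# which in turn is close to the query.
affinity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.9],
    [0.1, 0.9, 1.0],
]
learned = propagate_similarity(affinity, query=0)
```

After propagation, shape 2 still ranks below shape 1, but its learned similarity to the query is far higher than the direct pairwise value, because it inherits support through its neighbor.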

287 citations

Patent

Full-text relevancy ranking

Theodore George Diamond, Daniel Allen Hendrick, Eric Rehm, Melissa Anne Riesland

04 Jan 2005

TL;DR: A method and system for ranking the relevancy of metadata associated with media on a computer network, such as multimedia and streaming media, by categorizing the metadata into sets, assigning a weight to each set, and calculating a score for each set from its weight and category.

Abstract: A method and system for ranking relevancy of metadata associated with media on a computer network, such as multimedia and streaming media, include categorizing the metadata into sets of metadata. The categories are broad categories relating to areas such as who, what, when, and where, such as artist, media type, and creation date, creation location. Weights are assigned to each set of metadata. Weights are related to technical information such as bit rate, duration, sampling rate, frequency of occurrence of a specific term, etc. A score is calculated for ranking the relevancy of each set of metadata. The score is calculated in accordance with the assigned weight and category. This score is available for search systems (e.g., search engines) and/or users to determine the relative ranking of search results.
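The categorize, weight, and score steps in the abstract can be sketched in a few lines. Everything specific here is hypothetical: the category names follow the who/what/when/where examples in the abstract, but the weight values, the exact-match rule, and the function name `relevancy_score` are assumptions made for illustration and are not taken from the patent.

```python
from typing import Dict, Iterable, List

# Hypothetical per-category weights; the patent does not give these values.
CATEGORY_WEIGHTS: Dict[str, float] = {
    "who": 3.0,    # e.g. artist
    "what": 2.0,   # e.g. media type
    "when": 1.0,   # e.g. creation date
    "where": 1.0,  # e.g. creation location
}


def relevancy_score(metadata: Dict[str, List[str]],
                    query_terms: Iterable[str],
                    weights: Dict[str, float] = CATEGORY_WEIGHTS) -> float:
    """Sum, over categories, of weight times the number of matching values."""
    terms = {t.lower() for t in query_terms}
    score = 0.0
    for category, values in metadata.items():
        hits = sum(1 for v in values if v.lower() in terms)
        score += weights.get(category, 0.0) * hits
    return score


media = {
    "clip-1": {"who": ["Miles Davis"], "what": ["audio"]},
    "clip-2": {"what": ["audio"], "where": ["Paris"]},
}
query = ["miles davis", "audio"]
ranked = sorted(media, key=lambda m: relevancy_score(media[m], query),
                reverse=True)
```

A search system could then present `ranked` as the result order, with "clip-1" first because it matches in both the "who" and "what" categories.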

286 citations

Performance

Metrics

Papers: 30,908

Citations: 494,240

No. of papers in the topic in previous years
Year	Papers
2024	1
2023	3,112
2022	6,541
2021	1,105
2020	1,082
2019	1,168
