scispace - formally typeset
Search or ask a question
Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over the lifetime, 21109 publications have been published within this topic receiving 435130 citations.


Papers
More filters
Proceedings Article
01 Jan 2003
TL;DR: Studies conducted by the two participating groups compared a search engine using automatic topic distillation features with the same engine with those features disabled in order to determine whether the automatic topic Distillation features assisted the users in the performance of their tasks and whether humans could achieve better results than the automatic system.
Abstract: The TREC 2003 web track consisted of both a non-interactive stream and an interactive stream. Both streams worked with the .GOV test collection. The non-interactive stream continued an investigation into the importance of homepages in Web ranking, via both a Topic Distillation task and a Navigational task. In the topic distillation task, systems were expected to return a list of the homepages of sites relevant to each of a series of broad queries. This differs from previous homepage experiments in that queries may have multiple correct answers. The navigational task required systems to return a particular desired web page as early as possible in the ranking in response to queries. In half of the queries, the target answer was the homepage of a site and the query was derived from the name of the site (Homepage finding) while in the other half, the target answers were not homepages and the queries were derived from the name of the page (Named page finding). The two types of query were arbitrarily mixed and not identified. The interactive stream focused on human participation in a topic distillation task over the .GOV collection. Studies conducted by the two participating groups compared a search engine using automatic topic distillation features with the same engine with those features disabled in order to determine whether the automatic topic distillation features assisted the users in the performance of their tasks and whether humans could achieve better results than the automatic system.

195 citations

Proceedings ArticleDOI
Milad Shokouhi1
28 Jul 2013
TL;DR: The results suggest that supervised rankers enhanced by personalization features can significantly outperform the existing popularity-based base-lines, in terms of mean reciprocal rank (MRR) by up to 9%.
Abstract: Query auto-completion (QAC) is one of the most prominent features of modern search engines. The list of query candidates is generated according to the prefix entered by the user in the search box and is updated on each new key stroke. Query prefixes tend to be short and ambiguous, and existing models mostly rely on the past popularity of matching candidates for ranking. However, the popularity of certain queries may vary drastically across different demographics and users. For instance, while instagram and imdb have comparable popularities overall and are both legitimate candidates to show for prefix i, the former is noticeably more popular among young female users, and the latter is more likely to be issued by men. In this paper, we present a supervised framework for personalizing auto-completion ranking. We introduce a novel labelling strategy for generating offline training labels that can be used for learning personalized rankers. We compare the effectiveness of several user-specific and demographic-based features and show that among them, the user's long-term search history and location are the most effective for personalizing auto-completion rankers. We perform our experiments on the publicly available AOL query logs, and also on the larger-scale logs of Bing. The results suggest that supervised rankers enhanced by personalization features can significantly outperform the existing popularity-based base-lines, in terms of mean reciprocal rank (MRR) by up to 9%.

195 citations

31 Jul 2011
TL;DR: This paper presents a novel approach for automatic detection of semantic change of words based on distributional similarity models and shows that the method obtains good results with respect to a reference ranking produced by human raters.
Abstract: This paper presents a novel approach for automatic detection of semantic change of words based on distributional similarity models. We show that the method obtains good results with respect to a reference ranking produced by human raters. The evaluation also analyzes the performance of frequency-based methods, comparing them to the similarity method proposed.

195 citations

Proceedings Article
23 Sep 2007
TL;DR: The top-k dominating query as mentioned in this paper returns k data objects which dominate the highest number of objects in a dataset, which is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects.
Abstract: The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales at different dimensions. Despite their importance, top-k dominating queries have not received adequate attention from the research community. In this paper, we design specialized algorithms that apply on indexed multi-dimensional data and fully exploit the characteristics of the problem. Experiments on synthetic datasets demonstrate that our algorithms significantly outperform a previous skyline-based approach, while our results on real datasets show the meaningfulness of top-k dominating queries.

195 citations

Proceedings Article
07 Dec 2009
TL;DR: This work proposes a probabilistic framework that addresses the challenge of evaluating ranking algorithms using the expectation of NDCG over all the possible permutations of documents, and proposes an algorithm that outperforms state-of-the-art ranking algorithms on several benchmark data sets.
Abstract: Learning to rank is a relatively new field of study, aiming to learn a ranking function from a set of training data with relevancy labels. The ranking algorithms are often evaluated using information retrieval measures, such as Normalized Discounted Cumulative Gain (NDCG) [1] and Mean Average Precision (MAP) [2]. Until recently, most learning to rank algorithms were not using a loss function related to the above mentioned evaluation measures. The main difficulty in direct optimization of these measures is that they depend on the ranks of documents, not the numerical values output by the ranking function. We propose a probabilistic framework that addresses this challenge by optimizing the expectation of NDCG over all the possible permutations of documents. A relaxation strategy is used to approximate the average of NDCG over the space of permutation, and a bound optimization approach is proposed to make the computation efficient. Extensive experiments show that the proposed algorithm outperforms state-of-the-art ranking algorithms on several benchmark data sets.

194 citations


Network Information
Related Topics (5)
Web page
50.3K papers, 975.1K citations
83% related
Ontology (information science)
57K papers, 869.1K citations
82% related
Graph (abstract data type)
69.9K papers, 1.2M citations
82% related
Feature learning
15.5K papers, 684.7K citations
81% related
Supervised learning
20.8K papers, 710.5K citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
20241
20233,112
20226,541
20211,105
20201,082
20191,168