scispace - formally typeset
Search or ask a question
Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over the lifetime, 21109 publications have been published within this topic receiving 435130 citations.


Papers
More filters
Journal ArticleDOI
TL;DR: A user-centered investigation of interactive query expansion within the context of a relevance feedback system is presented, providing evidence for the effectiveness of interactive querying and highlighting the need for more research on.
Abstract: A user-centered investigation of interactive query expansion within the context of a relevance feedback system is presented in this article. Data were collected from 25 searches using the INSPEC database. The data collection mechanisms included questionnaires, transaction logs, and relevance evaluations. The results discuss issues that relate to query expansion, retrieval effectiveness, the correspondence of the on-line-to-off-line relevance judgments, and the selection of terms for query expansion by users (interactive query expansion). The main conclusions drawn from the results of the study are that: (1) one-third of the terms presented to users in a list of candidate terms for query expansion was identified by the users as potentially useful for query expansion. (2) These terms were mainly judged as either variant expressions (synonyms) or alternative (related) terms to the initial query terms. However, a substantial portion of the selected terms were identified as representing new ideas. (3) The relationships identified between the five best terms selected by the users for query expansion and the initial query terms were that: (a) 34% of the query expansion terms have no relationship or other type of correspondence with a query term; (b) 66% of the remaining query expansion terms have a relationship to the query terms. These relationships were: narrower term (46%), broader term (3%), related term (17%). (4) The results provide evidence for the effectiveness of interactive query expansion. The initial search produced on average three highly relevant documents; the query expansion search produced on average nine further highly relevant documents. The conclusions highlight the need for more research on: interactive query expansion, the comparative evaluation of automatic vs. interactive query expansion, the study of weighted Web-based or Web-accessible retrieval systems in operational environments, and for user studies in searching ranked retrieval systems in general.

122 citations

Proceedings ArticleDOI
02 Nov 2009
TL;DR: Two novel models for faceted topic retrieval are introduced, one based on pruning a set of retrieved documents and onebased on retrieving sets of documents through direct optimization of evaluation measures.
Abstract: Traditional models of information retrieval assume documents are independently relevant. But when the goal is retrieving diverse or novel information about a topic, retrieval models need to capture dependencies between documents. Such tasks require alternative evaluation and optimization methods that operate on different types of relevance judgments. We define faceted topic retrieval as a particular novelty-driven task with the goal of finding a set of documents that cover the different facets of an information need. A faceted topic retrieval system must be able to cover as many facets as possible with the smallest number of documents. We introduce two novel models for faceted topic retrieval, one based on pruning a set of retrieved documents and one based on retrieving sets of documents through direct optimization of evaluation measures. We compare the performance of our models to MMR and the probabilistic model due to Zhai et al. on a set of 60 topics annotated with facets, showing that our models are competitive.

122 citations

Posted Content
TL;DR: Key to the approach is that during training the ranking problem can be viewed as a linear assignment problem, which can be solved by the Hungarian Marriage algorithm, and a sort operation is sufficient, as the algorithm assigns a relevance score to every document, query pair.
Abstract: Web page ranking and collaborative filtering require the optimization of sophisticated performance measures. Current Support Vector approaches are unable to optimize them directly and focus on pairwise comparisons instead. We present a new approach which allows direct optimization of the relevant loss functions. This is achieved via structured estimation in Hilbert spaces. It is most related to Max-Margin-Markov networks optimization of multivariate performance measures. Key to our approach is that during training the ranking problem can be viewed as a linear assignment problem, which can be solved by the Hungarian Marriage algorithm. At test time, a sort operation is sufficient, as our algorithm assigns a relevance score to every (document, query) pair. Experiments show that the our algorithm is fast and that it works very well.

122 citations

Proceedings ArticleDOI
18 Mar 2013
TL;DR: A scalable integrated inverted index, named I3, which adopts the Quadtree structure to hierarchically partition the data space into cells, which captures the spatial locality of a keyword and designs a new storage mechanism for efficient retrieval of keyword cell.
Abstract: In this big data era, huge amounts of spatial documents have been generated everyday through various location based services. Top-k spatial keyword search is an important approach to exploring useful information from a spatial database. It retrieves k documents based on a ranking function that takes into account both textual relevance (similarity between the query and document keywords) and spatial relevance (distance between the query and document locations). Various hybrid indexes have been proposed in recent years which mainly combine the R-tree and the inverted index so that spatial pruning and textual pruning can be executed simultaneously. However, the rapid growth in data volume poses significant challenges to existing methods in terms of the index maintenance cost and query processing time.In this paper, we propose a scalable integrated inverted index, named I3, which adopts the Quadtree structure to hierarchically partition the data space into cells. The basic unit of I3 is the keyword cell, which captures the spatial locality of a keyword. Moreover, we design a new storage mechanism for efficient retrieval of keyword cell and preserve additional summary information to facilitate pruning. Experiments conducted on real spatial datasets (Twitter and Wikipedia) demonstrate the superiority of I3 over existing schemes such as IR-tree and S2I in various aspects: it incurs shorter construction time to build the index, it has lower index storage cost, it is order of magnitude faster in updates, and it is highly scalable and answers top-k spatial keyword queries efficiently.

122 citations


Network Information
Related Topics (5)
Web page
50.3K papers, 975.1K citations
83% related
Ontology (information science)
57K papers, 869.1K citations
82% related
Graph (abstract data type)
69.9K papers, 1.2M citations
82% related
Feature learning
15.5K papers, 684.7K citations
81% related
Supervised learning
20.8K papers, 710.5K citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
20241
20233,112
20226,541
20211,105
20201,082
20191,168