Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.


Papers
Proceedings Article
13 Jun 2004
TL;DR: This work hypothesizes the existence of a hidden syntax that guides the creation of query interfaces, albeit from different sources, and develops a 2P grammar and a best-effort parser, which together realize a parsing mechanism for a hypothetical syntax.
Abstract: Recently, the Web has been rapidly "deepened" by many searchable databases online, where data are hidden behind query forms. For modelling and integrating Web databases, the very first challenge is to understand what a query interface says, or what query capabilities a source supports. Such automatic extraction of interface semantics is challenging, as query forms are created autonomously. Our approach builds on the observation that, across myriad sources, query forms seem to reveal some "concerted structure," by sharing common building blocks. Toward this insight, we hypothesize the existence of a hidden syntax that guides the creation of query interfaces, albeit from different sources. This hypothesis effectively transforms query interfaces into a visual language with a non-prescribed grammar and, thus, their semantic understanding into a parsing problem. Such a paradigm enables principled solutions for both declaratively representing common patterns, by a derived grammar, and systematically interpreting query forms, by a global parsing mechanism. To realize this paradigm, we must address the challenges of a hypothetical syntax: that it is to be derived, and that it is secondary to the input. At the heart of our form extractor, we thus develop a 2P grammar and a best-effort parser, which together realize a parsing mechanism for a hypothetical syntax. Our experiments show the promise of this approach: it achieves above 85% accuracy for extracting query conditions across random sources.

223 citations

Book Chapter
21 Sep 1998
TL;DR: The paper shows that the new probabilistic interpretation of tf×idf term weighting might lead to better understanding of statistical ranking mechanisms, for example by explaining how they relate to coordination level ranking.
Abstract: This paper presents a new probabilistic model of information retrieval. The most important modeling assumption made is that documents and queries are defined by an ordered sequence of single terms. This assumption is not made in well-known existing models of information retrieval, but is essential in the field of statistical natural language processing. Advances already made in statistical natural language processing will be used in this paper to formulate a probabilistic justification for using tf×idf term weighting. The paper shows that the new probabilistic interpretation of tf×idf term weighting might lead to better understanding of statistical ranking mechanisms, for example by explaining how they relate to coordination level ranking. A pilot experiment on the Cranfield test collection indicates that the presented model outperforms the vector space model with classical tf×idf and cosine length normalisation.

222 citations
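
For orientation, the baseline named in the abstract above (the vector space model with classical tf×idf and cosine length normalisation) can be sketched as follows. This is a minimal illustration, not the paper's probabilistic model: the function names `tfidf_vectors` and `rank`, the toy documents, and the choice of idf = log(N / df) are assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return cosine-normalized {term: weight} vectors and the idf table.

    `docs` is a list of tokenized documents (lists of terms).
    idf = log(N / df) is one common variant, assumed here.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {term: freq * idf[term] for term, freq in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        # Cosine length normalisation; an all-zero vector is left as is.
        vectors.append({t: w / norm for t, w in vec.items()} if norm else vec)
    return vectors, idf


def rank(query, docs):
    """Return document indices ordered by cosine similarity to the query."""
    vectors, idf = tfidf_vectors(docs)
    q_tf = Counter(t for t in query if t in idf)
    q_vec = {t: f * idf[t] for t, f in q_tf.items()}
    q_norm = math.sqrt(sum(w * w for w in q_vec.values()))
    scores = []
    for i, vec in enumerate(vectors):
        dot = sum(w * vec.get(t, 0.0) for t, w in q_vec.items())
        scores.append((dot / q_norm if q_norm else 0.0, i))
    # Highest score first; ties are broken by document index.
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]


if __name__ == "__main__":
    toy_docs = [
        ["wing", "flow", "pressure"],
        ["boundary", "layer", "flow"],
        ["heat", "transfer"],
    ]
    print(rank(["boundary", "layer"], toy_docs))
```

Only the second toy document shares terms with the query, so it is ranked first; the remaining documents score zero and keep their original order.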

Journal Article
TL;DR: A set of principles and a novel rank-by-feature framework, implemented in the Hierarchical Clustering Explorer, that could enable users to better understand distributions in one (1D) or two dimensions (2D) and discover relationships, clusters, gaps, outliers, and other features.
Abstract: Interactive exploration of multidimensional data sets is challenging because: (1) it is difficult to comprehend patterns in more than three dimensions, and (2) current systems often are a patchwork of graphical and statistical methods leaving many researchers uncertain about how to explore their data in an orderly manner. We offer a set of principles and a novel rank-by-feature framework that could enable users to better understand distributions in one (1D) or two dimensions (2D), and then discover relationships, clusters, gaps, outliers, and other features. Users of our framework can view graphical presentations (histograms, boxplots, and scatterplots), and then choose a feature detection criterion to rank 1D or 2D axis-parallel projections. By combining information visualization techniques (overview, coordination, and dynamic query) with summaries and statistical methods, users can systematically examine the most important 1D and 2D axis-parallel projections. We summarize our Graphics, Ranking, and Interaction for Discovery (GRID) principles as: (1) study 1D, study 2D, then find features; (2) ranking guides insight, statistics confirm. We implemented the rank-by-feature framework in the Hierarchical Clustering Explorer, but the same data exploration principles could enable users to organize their discovery process so as to produce more thorough analyses and extract deeper insights in any multidimensional data application, such as spreadsheets, statistical packages, or information visualization tools.

222 citations
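
The idea of ranking 1D and 2D axis-parallel projections by a chosen feature detection criterion, as described in the abstract above, might look roughly like this. It is a sketch under stated assumptions, not the Hierarchical Clustering Explorer's implementation: the criteria (`largest_gap` for 1D, absolute Pearson correlation for 2D) and all function names are hypothetical choices for illustration.

```python
import math
from itertools import combinations


def largest_gap(values):
    """1D criterion (assumed): biggest gap between sorted neighbours, relative to the range."""
    ordered = sorted(values)
    span = ordered[-1] - ordered[0]
    if span == 0:
        return 0.0
    return max(b - a for a, b in zip(ordered, ordered[1:])) / span


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_1d(columns, criterion=largest_gap):
    """Order column names by the 1D criterion, highest score first."""
    scored = [(criterion(vals), name) for name, vals in columns.items()]
    return [name for _, name in sorted(scored, key=lambda s: (-s[0], s[1]))]


def rank_2d(columns, criterion=lambda xs, ys: abs(pearson(xs, ys))):
    """Order column pairs by the 2D criterion, highest score first."""
    scored = [
        (criterion(columns[a], columns[b]), (a, b))
        for a, b in combinations(sorted(columns), 2)
    ]
    return [pair for _, pair in sorted(scored, key=lambda s: (-s[0], s[1]))]


if __name__ == "__main__":
    data = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, 1, 1, 10]}
    print(rank_1d(data))
    print(rank_2d(data))
```

Passing the criterion as a function mirrors the abstract's notion of the user choosing what to rank by; swapping in a different criterion changes the ordering without touching the ranking code.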

Patent
Carl J. Kraenzel, Paul B. Moody, Joann Ruvolo, Thomas P. Moran, Justin Lessler
26 Aug 2004
TL;DR: A method of generating a context-inferenced search query and sorting a result of the query is described. It includes analyzing an event associated with the user to determine a contextual setting, dynamically generating a search query based on the contextual setting, and searching at least one information source using the search query to generate a search result.
Abstract: A method of generating a context-inferenced search query and of sorting a result of the query is described. The method includes analyzing an event associated with the user to determine a contextual setting, dynamically generating a search query based on the contextual setting, and searching at least one information source using the search query to generate a search result. Additionally, the method includes calculating an importance value for each item of the search result, sorting the items of the search result according to the importance value, and displaying the sorted search result to the user.

222 citations

Proceedings Article
24 Jul 2011
TL;DR: A novel cascade ranking model is formulated and developed, which, unlike previous approaches, can simultaneously improve both top-k ranked effectiveness and retrieval efficiency; a novel boosting algorithm is presented for learning such cascades to directly optimize the tradeoff between effectiveness and efficiency.
Abstract: There is a fundamental tradeoff between effectiveness and efficiency when designing retrieval models for large-scale document collections. Effectiveness tends to derive from sophisticated ranking functions, such as those constructed using learning to rank, while efficiency gains tend to arise from improvements in query evaluation and caching strategies. Given their inherently disjoint nature, it is difficult to jointly optimize effectiveness and efficiency in end-to-end systems. To address this problem, we formulate and develop a novel cascade ranking model, which unlike previous approaches, can simultaneously improve both top k ranked effectiveness and retrieval efficiency. The model constructs a cascade of increasingly complex ranking functions that progressively prunes and refines the set of candidate documents to minimize retrieval latency and maximize result set quality. We present a novel boosting algorithm for learning such cascades to directly optimize the tradeoff between effectiveness and efficiency. Experimental results show that our cascades are faster and return higher quality results than comparable ranking models.

222 citations
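
The cascade structure described in the abstract above, a sequence of increasingly complex ranking functions that progressively prunes the candidate set, can be illustrated with a short sketch. This shows only how a fixed cascade is applied; it does not implement the paper's boosting algorithm for learning cascades. The function name `cascade_rank`, the `(score_fn, keep)` stage format, and the toy feature names `cheap` and `costly` are assumptions for this example.

```python
def cascade_rank(candidates, stages):
    """Apply a fixed cascade of ranking stages.

    `stages` is a list of (score_fn, keep) pairs ordered from cheapest to
    costliest. Each stage re-ranks the surviving candidates by its score
    and keeps only the top `keep`, so costly scorers see fewer documents.
    """
    survivors = list(candidates)
    for score_fn, keep in stages:
        survivors.sort(key=score_fn, reverse=True)
        survivors = survivors[:keep]
    return survivors


if __name__ == "__main__":
    # Toy documents with precomputed scores standing in for real rankers.
    docs = [
        {"id": "d1", "cheap": 0.9, "costly": 0.2},
        {"id": "d2", "cheap": 0.8, "costly": 0.9},
        {"id": "d3", "cheap": 0.1, "costly": 1.0},
        {"id": "d4", "cheap": 0.5, "costly": 0.5},
    ]
    stages = [
        (lambda d: d["cheap"], 3),   # cheap first pass keeps 3 candidates
        (lambda d: d["costly"], 2),  # costlier pass keeps the final top 2
    ]
    print([d["id"] for d in cascade_rank(docs, stages)])
```

In the toy data, `d3` has the best costly score but is pruned by the cheap first stage, which is exactly the effectiveness versus efficiency tradeoff that the paper proposes to optimize when learning the cascade.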


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 83% related
Ontology (information science): 57K papers, 869.1K citations, 82% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 82% related
Feature learning: 15.5K papers, 684.7K citations, 81% related
Supervised learning: 20.8K papers, 710.5K citations, 81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    3,112
2022    6,541
2021    1,105
2020    1,082
2019    1,168