Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.


Papers
Proceedings Article
13 Jun 2004
TL;DR: This paper introduces a new technique, called adaptive data partitioning (ADP), which is based on the idea of dividing the source data into regions, each executed by different, complementary plans, and shows how this model can be applied in novel ways to correct for underestimated selectivity and cardinality values.
Abstract: An effective query optimizer finds a query plan that exploits the characteristics of the source data. In data integration, little is known in advance about sources' properties, which necessitates the use of adaptive query processing techniques to adjust query processing on-the-fly. Prior work in adaptive query processing has focused on compensating for delays and adjusting for mis-estimated cardinality or selectivity values. In this paper, we present a generalized architecture for adaptive query processing and introduce a new technique, called adaptive data partitioning (ADP), which is based on the idea of dividing the source data into regions, each executed by different, complementary plans. We show how this model can be applied in novel ways to not only correct for underestimated selectivity and cardinality values, but also to discover and exploit order in the source data, and to detect and exploit source data that can be effectively pre-aggregated. We experimentally compare a number of alternative strategies and show that our approach is effective.

116 citations

Patent
14 May 2003
TL;DR: In this article, a search and retrieval system allows a user to search free text within sections of schema independent documents, which may include structured, semi-structured, and unstructured documents.
Abstract: A search and retrieval system permits a user to search free text within sections of schema-independent documents. The documents, which may include structured, semi-structured, and unstructured documents, contain text organized into a plurality of sections, such as XML tags. The repository of documents is schema independent, such that the search system does not require pre-defined fields for the sections. To execute a search, the search system receives a query that specifies at least one section and at least one free text query construct for text within the section. In general, the free text query construct specifies at least one free text search condition. The search system identifies sections in the repository of documents as specified in the query, and evaluates the free text query construct for the text within sections to determine whether the free text search condition is met.

115 citations
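
The section-scoped search described in the abstract above can be illustrated with a minimal sketch. It is not the patented system: it assumes the documents are small XML strings, treats each tag name as a "section", and reduces the free text condition to "every term occurs in the section text". The function name `search_sections` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def search_sections(documents, section, terms):
    """Return (doc_index, section_text) pairs where every term occurs
    in a section with the given tag name (case-insensitive match)."""
    wanted = [t.lower() for t in terms]
    hits = []
    for i, xml_text in enumerate(documents):
        root = ET.fromstring(xml_text)
        # iter() visits every element with this tag wherever it appears,
        # so no schema or pre-defined field list is required.
        for elem in root.iter(section):
            text = "".join(elem.itertext()).strip()
            if all(t in text.lower() for t in wanted):
                hits.append((i, text))
    return hits


# Hypothetical documents with different structures (no shared schema).
docs = [
    "<report><summary>Engine failure during test</summary>"
    "<body>No issues.</body></report>",
    "<memo><body>Engine failure noted</body></memo>",
]

print(search_sections(docs, "body", ["engine", "failure"]))
```

Here the query names one section (`body`) and one free text condition; the first document is skipped because the matching words appear only in its `summary` section.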

Journal Article
TL;DR: This research proposes a framework based on expert opinion elicitation, developed to select the software engineering measures which are the best software reliability indicators, based on the top 30 measures identified in an earlier study conducted by Lawrence Livermore National Laboratory.
Abstract: This research proposes a framework based on expert opinion elicitation, developed to select the software engineering measures which are the best software reliability indicators. The current research is based on the top 30 measures identified in an earlier study conducted by Lawrence Livermore National Laboratory. A set of ranking criteria and their levels were identified. The score of each measure for each ranking criterion was elicited through expert opinion and then aggregated into a single score using multiattribute utility theory. The basic aggregation scheme selected was a linear additive scheme. A comprehensive sensitivity analysis was carried out. The sensitivity analysis included: variation of the ranking criteria levels, variation of the weights, variation of the aggregation schemes. The top-ranked measures were identified. Use of these measures in each software development phase can lead to a more reliable quantitative prediction of software reliability.

115 citations
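
The linear additive aggregation mentioned in the abstract above can be sketched as a weighted sum of per-criterion scores. This is only an illustration under stated assumptions: the criterion names, weights, measure names, and score values below are invented for the example and do not come from the study.

```python
def additive_score(levels, weights):
    """Linear additive utility: weighted sum of criterion scores,
    normalized by the total weight."""
    total = sum(weights.values())
    return sum(weights[c] * levels[c] for c in weights) / total


def rank_measures(scores, weights):
    """Order measure names from highest to lowest aggregated score."""
    return sorted(
        scores,
        key=lambda name: additive_score(scores[name], weights),
        reverse=True,
    )


# Hypothetical criteria weights and elicited scores on a 0..1 scale.
weights = {"cost": 0.25, "relevance": 0.5, "credibility": 0.25}
scores = {
    "defect density": {"cost": 0.5, "relevance": 1.0, "credibility": 0.5},
    "code size": {"cost": 1.0, "relevance": 0.25, "credibility": 0.5},
}

print(rank_measures(scores, weights))
```

Changing the entries of `weights` and re-running the ranking is a simple way to mimic the weight-variation part of a sensitivity analysis.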

Proceedings Article
07 Aug 2017
TL;DR: Both query expansion experiments on four TREC collections and query classification experiments on the KDD Cup 2005 dataset suggest that the relevance-based word embedding models significantly outperform state-of-the-art proximity-based embedding models, such as word2vec and GloVe.
Abstract: Learning a high-dimensional dense representation for vocabulary terms, also known as a word embedding, has recently attracted much attention in natural language processing and information retrieval tasks. The embedding vectors are typically learned based on term proximity in a large corpus. This means that the objective in well-known word embedding algorithms, e.g., word2vec, is to accurately predict adjacent word(s) for a given word or context. However, this objective is not necessarily equivalent to the goal of many information retrieval (IR) tasks. The primary objective in various IR tasks is to capture relevance instead of term proximity, syntactic, or even semantic similarity. This is the motivation for developing unsupervised relevance-based word embedding models that learn word representations based on query-document relevance information. In this paper, we propose two learning models with different objective functions; one learns a relevance distribution over the vocabulary set for each query, and the other classifies each term as belonging to the relevant or non-relevant class for each query. To train our models, we used over six million unique queries and the top ranked documents retrieved in response to each query, which are assumed to be relevant to the query. We extrinsically evaluate our learned word representation models using two IR tasks: query expansion and query classification. Both query expansion experiments on four TREC collections and query classification experiments on the KDD Cup 2005 dataset suggest that the relevance-based word embedding models significantly outperform state-of-the-art proximity-based embedding models, such as word2vec and GloVe.

115 citations

Journal Article
TL;DR: In particular, the authors analyze the impact of the Academic Ranking of World Universities (ARWU) on the performance of universities around the world.

115 citations


Network Information
Related Topics (5)
Web page
50.3K papers, 975.1K citations
83% related
Ontology (information science)
57K papers, 869.1K citations
82% related
Graph (abstract data type)
69.9K papers, 1.2M citations
82% related
Feature learning
15.5K papers, 684.7K citations
81% related
Supervised learning
20.8K papers, 710.5K citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    3,112
2022    6,541
2021    1,105
2020    1,082
2019    1,168