Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over the lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.


Papers
Proceedings ArticleDOI
Zaiqing Nie, Yunxiao Ma, Shuming Shi, Ji-Rong Wen, Wei-Ying Ma
08 May 2007
TL;DR: This paper proposes several language models for Web object retrieval, namely an unstructured object retrieval model, a structured object retrieval model, and a hybrid model with both structured and unstructured retrieval features, and concludes that the hybrid model is superior when extraction errors at varying levels are taken into account.
Abstract: The primary function of current Web search engines is essentially relevance ranking at the document level. However, a wealth of structured information about real-world objects is embedded in static Web pages and online Web databases. Document-level information retrieval can unfortunately lead to highly inaccurate relevance ranking in answering object-oriented queries. In this paper, we propose a paradigm shift to enable searching at the object level. In traditional information retrieval models, documents are taken as the retrieval units and the content of a document is considered reliable. However, this reliability assumption is no longer valid in the object retrieval context, where multiple copies of information about the same object typically exist. These copies may be inconsistent because of the diversity of Web site qualities and the limited performance of current information extraction techniques. If we simply combine the noisy and inaccurate attribute information extracted from different sources, we may not be able to achieve satisfactory retrieval performance. In this paper, we propose several language models for Web object retrieval, namely an unstructured object retrieval model, a structured object retrieval model, and a hybrid model with both structured and unstructured retrieval features. We test these models on a paper search engine and compare their performance. We conclude that the hybrid model is superior when extraction errors at varying levels are taken into account.
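
The contrast between the three models is easiest to see in code. The sketch below is a minimal illustration of the idea, not the paper's exact estimation: a standard query-likelihood language model with Dirichlet smoothing is applied once to the object's concatenated text (the unstructured model) and once per attribute field, with per-field weights standing in for extraction confidence (the structured model); the hybrid interpolates the two. The field weights, the smoothing parameter mu, and the interpolation weight alpha are all placeholder assumptions.

```python
import math
from collections import Counter

def lm_score(query_terms, text, coll_tf, coll_len, mu=2000):
    """Query log-likelihood of `text` under a Dirichlet-smoothed language model."""
    tf = Counter(text.split())
    dlen = sum(tf.values())
    score = 0.0
    for t in query_terms:
        p_bg = coll_tf.get(t, 0.5) / coll_len      # background (collection) probability
        score += math.log((tf[t] + mu * p_bg) / (dlen + mu))
    return score

def hybrid_object_score(query, fields, field_weights, coll_tf, coll_len, alpha=0.5):
    """Interpolate an unstructured score over the whole object with a structured
    score that weights each extracted field by its (assumed) reliability."""
    terms = query.split()
    unstructured = lm_score(terms, " ".join(fields.values()), coll_tf, coll_len)
    structured = sum(field_weights[f] * lm_score(terms, txt, coll_tf, coll_len)
                     for f, txt in fields.items())
    return alpha * unstructured + (1 - alpha) * structured
```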

129 citations

Journal ArticleDOI
TL;DR: A new scheme that implements a recursive HSV-space segmentation technique to identify perceptually prominent color areas, providing robust retrieval results for a wide range of gamma nonlinearity values; this robustness is of great importance since, in general, the image acquisition source is unknown.
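
As a rough intuition for what recursive segmentation in HSV space can look like, the toy sketch below (a stand-in for illustration, not the paper's actual scheme) builds a hue histogram and recursively splits a hue range at its deepest valley, keeping segments that cover a meaningful share of the pixels. All thresholds (`bins`, `min_width`, `min_mass`, the 0.5 valley test) are invented for this example.

```python
import colorsys

def hue_histogram(pixels, bins=36):
    """Histogram of hues for RGB pixels with channels in [0, 1]."""
    hist = [0] * bins
    for r, g, b in pixels:
        h, _, _ = colorsys.rgb_to_hsv(r, g, b)
        hist[min(int(h * bins), bins - 1)] += 1
    return hist

def split_segments(hist, lo, hi, total, min_width=3, min_mass=0.05):
    """Recursively split hue range [lo, hi) at its deepest interior valley;
    keep segments holding at least `min_mass` of all pixels."""
    mass = sum(hist[lo:hi])
    if hi - lo <= min_width:
        return [(lo, hi)] if mass >= min_mass * total else []
    valley = min(range(lo + 1, hi - 1), key=lambda i: hist[i])
    if hist[valley] > 0.5 * max(hist[lo:hi]):       # no pronounced valley: stop
        return [(lo, hi)] if mass >= min_mass * total else []
    return (split_segments(hist, lo, valley, total, min_width, min_mass)
            + split_segments(hist, valley, hi, total, min_width, min_mass))

# pixels = [(r, g, b), ...]; segments come back as (low_bin, high_bin) hue ranges
# hist = hue_histogram(pixels)
# segments = split_segments(hist, 0, len(hist), sum(hist))
```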

129 citations

Journal ArticleDOI
TL;DR: This article defines a class of nonlinear (quadratic) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score.
Abstract: In this article we present Supervised Semantic Indexing, which defines a class of nonlinear (quadratic) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). However, unlike LSI, our models are trained from a supervised signal directly on the ranking task of interest, which we argue is the reason for our superior results. As the query and target texts are modeled separately, our approach is easily generalized to different retrieval tasks, such as cross-language retrieval or online advertising placement. Dealing with models over all pairs of word features is computationally challenging. We propose several improvements to our basic model for addressing this issue, including low-rank (but diagonal-preserving) representations, correlated feature hashing, and sparsification. We provide an empirical study of all these methods on retrieval tasks based on Wikipedia documents as well as an Internet advertisement task. We obtain state-of-the-art performance while providing realistically scalable methods.
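
The core of such a model is a bilinear scoring function f(q, d) = qᵀWd over word vectors, with W factored as UᵀV + I so that it is low rank while preserving the exact word-match (identity) component. The sketch below is a toy dense-vector version under stated assumptions: the paper works with sparse tf-idf bags and adds hashing and sparsification, all omitted here, and the dimensions and learning rate are arbitrary. It shows the scoring function and one margin-ranking gradient step on a (query, relevant, irrelevant) triplet.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, rank = 10_000, 50                  # toy dimensions
U = 0.01 * rng.standard_normal((rank, vocab))
V = 0.01 * rng.standard_normal((rank, vocab))

def score(q, d):
    """f(q, d) = q^T (U^T V + I) d, computed without materializing the full W."""
    return (U @ q) @ (V @ d) + q @ d

def train_step(q, d_pos, d_neg, lr=0.1, margin=1.0):
    """Margin-ranking update: push f(q, d+) above f(q, d-) by at least `margin`."""
    global U, V
    if margin - score(q, d_pos) + score(q, d_neg) > 0:   # update only on violations
        diff = d_pos - d_neg
        Uq, Vdiff = U @ q, V @ diff
        U += lr * np.outer(Vdiff, q)      # gradient of f(q, d+) - f(q, d-) w.r.t. U
        V += lr * np.outer(Uq, diff)      # gradient of f(q, d+) - f(q, d-) w.r.t. V

# One toy step on random bag-of-words-style vectors
q, d_pos, d_neg = (rng.random(vocab) for _ in range(3))
train_step(q, d_pos, d_neg)
```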

128 citations

Journal ArticleDOI
01 Jan 2019
TL;DR: The results show that the credibility of information in the D-AHP method only slightly affects the ranking of alternatives, but influences their priority weights to a noticeable extent.
Abstract: Multi-criteria decision making (MCDM) has attracted wide interest due to its extensive applications in practice. In our previous study, a method called D-AHP (the AHP method extended by a D numbers preference relation) was proposed to study MCDM problems based on a D numbers extended fuzzy preference relation, and a solution for the D-AHP method was given to obtain the weights and ranking of alternatives from the decision data; the results obtained with the D-AHP method are influenced by the credibility of the information. However, that study did not sufficiently investigate the impact of the information's credibility on the results, which remained an unsolved issue in D-AHP. In this paper, we focus on the credibility of information within the D-AHP method and study its impact on the results of an MCDM problem. Information with high, medium, and low credibility is taken into consideration. The results show that the credibility of information in the D-AHP method only slightly affects the ranking of alternatives, but influences their priority weights to a noticeable extent.
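
To make the finding concrete, here is a deliberately simplified numerical illustration; it is a generic fuzzy-preference-relation toy, not the paper's D numbers machinery. Entries of a pairwise preference matrix are shrunk toward indifference (0.5) as credibility drops, and priority weights are read off as normalized row sums. Shrinking toward a constant preserves the ordering of row sums, so the ranking stays stable while the weights flatten toward uniform, mirroring the qualitative behavior reported above.

```python
import numpy as np

def priority_weights(P, credibility=1.0):
    """Shrink a fuzzy preference relation toward indifference (0.5) as credibility
    drops, then take normalized row sums as priority weights.
    (A generic illustration only; the actual D-AHP solution differs.)"""
    P = credibility * np.asarray(P, dtype=float) + (1.0 - credibility) * 0.5
    np.fill_diagonal(P, 0.5)              # an alternative is indifferent to itself
    row = P.sum(axis=1)
    return row / row.sum()

# P[i][j]: degree to which alternative i is preferred over alternative j
P = [[0.5, 0.7, 0.9],
     [0.3, 0.5, 0.6],
     [0.1, 0.4, 0.5]]
for c in (1.0, 0.6, 0.2):                 # high, medium, low credibility
    print(f"credibility={c}: weights={priority_weights(P, c).round(3)}")
```

Running this, the ranking of the three alternatives never changes, but the weight vector drifts toward uniform as credibility falls.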

128 citations

Proceedings Article
01 Mar 2020
TL;DR: The Deep Learning Track, new for TREC 2019, is the first track with large human-labeled training sets, introducing two such sets corresponding to two tasks, each with rigorous TREC-style blind evaluation and reusable test sets.
Abstract: The Deep Learning Track is a new track for TREC 2019, with the goal of studying ad hoc ranking in a large data regime. It is the first track with large human-labeled training sets, introducing two sets corresponding to two tasks, each with rigorous TREC-style blind evaluation and reusable test sets. The document retrieval task has a corpus of 3.2 million documents with 367 thousand training queries, for which we generate a reusable test set of 43 queries. The passage retrieval task has a corpus of 8.8 million passages with 503 thousand training queries, for which we generate a reusable test set of 43 queries. This year 15 groups submitted a total of 75 runs, using various combinations of deep learning, transfer learning and traditional IR ranking methods. Deep learning runs significantly outperformed traditional IR runs. Possible explanations for this result are that we introduced large training data and we included deep models trained on such data in our judging pools, whereas some past studies did not have such training data or pooling.
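
For readers unfamiliar with the evaluation setup, the sketch below shows a TREC-style scoring loop: parse a run file in the standard six-column format, rank documents per query by score, and compute NDCG@10 against graded qrels. This uses one common NDCG formulation (exponential gain); the track's official numbers come from trec_eval, whose exact variant may differ, and the file names here are hypothetical.

```python
import math
from collections import defaultdict

def ndcg_at_k(ranked_docs, rels, k=10):
    """NDCG@k with graded relevance and exponential gain 2^rel - 1."""
    dcg = sum((2 ** rels.get(d, 0) - 1) / math.log2(i + 2)
              for i, d in enumerate(ranked_docs[:k]))
    ideal = sorted(rels.values(), reverse=True)[:k]
    idcg = sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

def read_run(path):
    """Parse the standard TREC run format: qid Q0 docid rank score tag."""
    run = defaultdict(list)
    with open(path) as f:
        for line in f:
            qid, _, docid, _, score, _ = line.split()
            run[qid].append((float(score), docid))
    return {q: [d for _, d in sorted(pairs, reverse=True)]
            for q, pairs in run.items()}

def read_qrels(path):
    """Parse qrels lines: qid 0 docid grade."""
    qrels = defaultdict(dict)
    with open(path) as f:
        for line in f:
            qid, _, docid, grade = line.split()
            qrels[qid][docid] = int(grade)
    return qrels

# Hypothetical file names for a document-retrieval run
run, qrels = read_run("myrun.trec"), read_qrels("qrels-docs.txt")
scores = [ndcg_at_k(run[q], qrels[q]) for q in qrels if q in run]
print(f"mean NDCG@10 = {sum(scores) / len(scores):.4f}")
```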

128 citations


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 83% related
Ontology (information science): 57K papers, 869.1K citations, 82% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 82% related
Feature learning: 15.5K papers, 684.7K citations, 81% related
Supervised learning: 20.8K papers, 710.5K citations, 81% related
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2024    1
2023    3,112
2022    6,541
2021    1,105
2020    1,082
2019    1,168