Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published on this topic, receiving 435,130 citations.


Papers
Patent
10 Aug 2010
TL;DR: A computer-implemented method that receives a search query comprising at least one search term, derives at least one synonym for at least one search term, expands the query with the synonym, searches documents with the expanded query, and ranks the retrieved results by the context in which the search terms occur.
Abstract: The invention relates to data searching and translation. In particular, the invention relates to searching documents from the Internet or databases. Even further, the invention also relates to translating words in documents, WebPages, images or speech from one language to the next. A computer implemented method comprising at least one computer in accordance with the invention is characterised by the following steps: receiving a search query comprising at least one search term, deriving at least one synonym for at least one search term, expanding the received search query with the at least one synonym, searching at least one document using the said expanded search query, retrieving the search results obtained with the said expanded query, ranking the said search results based on context of occurrence of at least one search term. The best mode of the invention is considered to be an Internet search engine that delivers better search results.

139 citations
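
The query-expansion steps listed in the patent abstract above (receive a query, derive synonyms, expand, search, rank) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the `SYNONYMS` table, the function names, and the match-count scoring are all assumptions made for the example, and the match count merely stands in for the context-based ranking the abstract mentions.

```python
from collections import Counter

# Hypothetical synonym table; the abstract does not say where synonyms come from.
SYNONYMS = {"car": ["automobile"], "fast": ["quick"]}


def expand_query(terms, synonyms=SYNONYMS):
    """Return the query terms plus any known synonyms, without duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term] + synonyms.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def search_and_rank(query_terms, documents, synonyms=SYNONYMS):
    """Retrieve documents matching the expanded query, ranked by match count.

    The match count is a placeholder score chosen for this sketch only.
    """
    expanded = set(expand_query(query_terms, synonyms))
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(text.lower().split())
        score = sum(counts[t] for t in expanded)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]
```

With this sketch, a query for "car" also retrieves a document that only mentions "automobile", which is the effect the expansion step is meant to have.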

Book Chapter
20 Sep 2010
TL;DR: A new criterion to overcome "double" distribution shift is formulated and a practical approach "Transfer Cross Validation" (TrCV) is presented to select both models and data in a cross validation framework, optimized for transfer learning.
Abstract: One solution to the lack-of-labels problem is to exploit transfer learning, whereby one acquires knowledge from source domains to improve learning performance in the target domain. The main challenge is that the source and target domains may have different distributions. An open problem is how to select among the available models (including algorithms and parameters) and, importantly, among the abundant source-domain data, through statistically reliable methods, thus making transfer learning practical and easy to use for real-world applications. To address this challenge, one needs to take into account the differences in both the marginal and the conditional distributions at the same time, not just one of them. In this paper, we formulate a new criterion to overcome "double" distribution shift and present a practical approach, "Transfer Cross Validation" (TrCV), to select both models and data in a cross-validation framework optimized for transfer learning. The idea is to use density ratio weighting to overcome the difference in marginal distributions and to propose a "reverse validation" procedure to quantify how well a model approximates the true conditional distribution of the target domain. The usefulness of TrCV is demonstrated on different cross-domain tasks, including wine quality evaluation, web-user ranking and text categorization. The experimental results show that the proposed method outperforms both traditional cross-validation and one state-of-the-art method that considers only marginal distribution shift. The software and datasets are available from the authors.

139 citations
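
The density ratio weighting mentioned in the TrCV abstract above can be illustrated with a small sketch. It covers only the weighting of held-out source examples by w(x) = p_target(x) / p_source(x); the "reverse validation" step is not reproduced. The 1-D Gaussian marginals, their parameters, and all function names are assumptions for illustration, since in practice the ratio would have to be estimated from data.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def density_ratio(x, source=(0.0, 1.0), target=(1.0, 1.0)):
    """w(x) = p_target(x) / p_source(x) for assumed Gaussian marginals.

    The (mean, std) pairs are hypothetical values chosen for this sketch.
    """
    return gaussian_pdf(x, *target) / gaussian_pdf(x, *source)


def weighted_validation_error(model, xs, ys, weight_fn=density_ratio):
    """Importance-weighted mean squared error on held-out source data."""
    weights = [weight_fn(x) for x in xs]
    losses = [(model(x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)
```

Source examples that look more like the target domain receive larger weights, so errors on them count for more when comparing candidate models.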

Book Chapter
09 Sep 2003
TL;DR: This work studies pruning techniques for query execution in large search engines in the case where a global ranking of pages, as provided by Pagerank or any other method, is available in addition to the standard term-based approach, and shows that such techniques have significant potential benefit.
Abstract: Large web search engines have to answer thousands of queries per second with interactive response times. A major factor in the cost of executing a query is given by the lengths of the inverted lists for the query terms, which increase with the size of the document collection and are often in the range of many megabytes. To address this issue, IR and database researchers have proposed pruning techniques that compute or approximate term-based ranking functions without scanning over the full inverted lists. Over the last few years, search engines have incorporated new types of ranking techniques that exploit aspects such as the hyperlink structure of the web or the popularity of a page to obtain improved results. We focus on the question of how such techniques can be efficiently integrated into query processing. In particular, we study pruning techniques for query execution in large engines in the case where we have a global ranking of pages, as provided by Pagerank or any other method, in addition to the standard term-based approach. We describe pruning schemes for this case and evaluate their efficiency on an experimental cluster-based search engine with million web pages. Our results show that there is significant potential benefit in such techniques.

138 citations
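
One generic way to prune query execution when a global page ranking is available, as discussed in the abstract above, is to scan candidates in descending global-score order and stop once no unseen page can enter the top k. The sketch below shows that early-termination idea only; it is not claimed to be one of the paper's pruning schemes. The additive scoring (global score plus term score), the `max_term_score` cap, and the function name are assumptions for illustration.

```python
import heapq


def top_k_with_early_termination(postings, k, max_term_score):
    """Return (top-k results, number of postings scanned).

    postings: (doc_id, global_score, term_score) tuples sorted by
    descending global_score. max_term_score is an assumed upper bound
    on any term_score.
    """
    heap = []  # min-heap of (total_score, doc_id) holding the best k so far
    scanned = 0
    for doc_id, global_score, term_score in postings:
        # Global scores only decrease from here and term scores are capped,
        # so no remaining document can exceed this bound.
        upper_bound = global_score + max_term_score
        if len(heap) == k and upper_bound <= heap[0][0]:
            break
        scanned += 1
        total = global_score + term_score
        if len(heap) < k:
            heapq.heappush(heap, (total, doc_id))
        elif total > heap[0][0]:
            heapq.heapreplace(heap, (total, doc_id))
    return sorted(heap, reverse=True), scanned
```

The saving comes from the postings that are never read: the longer the list and the more skewed the global scores, the earlier the scan can stop.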

Proceedings Article
Ke Yan, Yonghong Tian, Yaowei Wang, Wei Zeng, Tiejun Huang
01 Oct 2017
TL;DR: This paper models the relationship of vehicle images as multiple grains and proposes two approaches to alleviate the precise vehicle search problem by exploiting multi-grain ranking constraints, which achieve state-of-the-art performance on both datasets.
Abstract: Precise search of visually similar vehicles poses a great challenge in computer vision: for a given query image, it requires finding exactly the same vehicle among a massive number of vehicles with visually similar appearances. In this paper, we model the relationship of vehicle images as multiple grains. Following this, we propose two approaches to alleviate the precise vehicle search problem by exploiting multi-grain ranking constraints. One is Generalized Pairwise Ranking, which generalizes conventional pairwise ranking from considering only binary similar/dissimilar relations to multiple relations. The other is Multi-Grain based List Ranking, which introduces permutation probability to score a permutation of a multi-grain list, and further optimizes the ranking by the likelihood loss function. We implement the two approaches with multi-attribute classification in a multi-task deep learning framework. To further facilitate research on precise vehicle search, we also contribute two high-quality and well-annotated vehicle datasets, named VD1 and VD2, which are collected from two different cities with diverse annotated attributes. As two of the largest publicly available precise vehicle search datasets, they contain 1,097,649 and 807,260 vehicle images respectively. Experimental results show that our approaches achieve state-of-the-art performance on both datasets.

138 citations
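
The abstract above scores a ranked list with a permutation probability and optimizes a likelihood loss. A common formulation of that idea is the Plackett-Luce model, sketched below. This is an assumption about the general form, not the paper's multi-grain variant, and the function names are chosen for illustration.

```python
import math


def permutation_log_probability(scores_in_order):
    """Plackett-Luce log-probability of an ordering.

    scores_in_order lists model scores in the order being evaluated;
    position i contributes s_i - log(sum of exp(s_j) for j >= i).
    """
    log_prob = 0.0
    for i in range(len(scores_in_order)):
        tail = scores_in_order[i:]
        m = max(tail)  # subtract the max for numerical stability
        log_norm = m + math.log(sum(math.exp(s - m) for s in tail))
        log_prob += scores_in_order[i] - log_norm
    return log_prob


def likelihood_loss(scores, target_order):
    """Negative log-likelihood of the target ordering under the scores.

    target_order lists item indices from most to least relevant.
    """
    ordered = [scores[i] for i in target_order]
    return -permutation_log_probability(ordered)
```

The loss is small when items that should rank higher also receive higher scores, and with equal scores every permutation of n items is equally likely, so the loss equals log(n!).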


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 83% related
Ontology (information science): 57K papers, 869.1K citations, 82% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 82% related
Feature learning: 15.5K papers, 684.7K citations, 81% related
Supervised learning: 20.8K papers, 710.5K citations, 81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    3,112
2022    6,541
2021    1,105
2020    1,082
2019    1,168