Topic
Ranking (information retrieval)
About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.
Papers
••
29 Sep 2007 TL;DR: This paper proposes several new approaches for query expansion, in which textual keywords, visual examples, or initial retrieval results are analyzed to identify the most relevant visual concepts for the given query, and develops both lexical and statistical approaches.
Abstract: We study the problem of semantic concept-based query expansion and re-ranking for multimedia retrieval. In particular, we explore the utility of a fixed lexicon of visual semantic concepts for automatic multimedia retrieval and re-ranking purposes. In this paper, we propose several new approaches for query expansion, in which textual keywords, visual examples, or initial retrieval results are analyzed to identify the most relevant visual concepts for the given query. These concepts are then used to generate additional query results and/or to re-rank an existing set of results. We develop both lexical and statistical approaches for text query expansion, as well as content-based approaches for visual query expansion. In addition, we study several other recently proposed methods for concept-based query expansion. In total, we compare 7 different approaches for expanding queries with visual semantic concepts. They are evaluated using a large video corpus and 39 concept detectors from the TRECVID-2006 video retrieval benchmark. We observe consistent improvement over the baselines for all 7 approaches, leading to an overall performance gain of 77% relative to a text retrieval baseline, and a 31% improvement relative to a state-of-the-art multimodal retrieval baseline.
220 citations
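The re-ranking step described in this abstract can be illustrated with a small sketch. It assumes a simple linear fusion of a text retrieval score with weighted concept-detector scores. The function name `rerank`, the `alpha` mixing weight, and the input layout are all hypothetical, and this is not one of the paper's seven approaches, only a generic illustration of re-ranking with concept scores.

```python
def rerank(results, concept_weights, alpha=0.5):
    """Re-rank results by mixing text scores with concept-detector scores.

    results: list of (doc_id, text_score, {concept: detector_score}).
    concept_weights: {concept: weight}, the concepts judged relevant to the query.
    alpha: hypothetical mixing weight between text and concept evidence.
    """
    rescored = []
    for doc_id, text_score, detections in results:
        # Weighted sum over the query's concepts; a missing detection counts as 0.
        concept_score = sum(
            weight * detections.get(concept, 0.0)
            for concept, weight in concept_weights.items()
        )
        fused = (1 - alpha) * text_score + alpha * concept_score
        rescored.append((doc_id, fused))
    # Highest fused score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


shots = [
    ("v1", 0.9, {"boat": 0.1}),
    ("v2", 0.6, {"boat": 0.9}),
]
ranked = rerank(shots, {"boat": 1.0}, alpha=0.5)
```

In this toy input, the shot with weaker text evidence but a strong "boat" detection moves ahead of the one with stronger text evidence.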
••
TL;DR: The results show that meta-features associated with a query can be combined with text retrieval techniques to improve the understanding and treatment of text search on documents with timestamps.
Abstract: Documents with timestamps, such as email and news, can be placed along a timeline. The timeline for a set of documents returned in response to a query gives an indication of how documents relevant to that query are distributed in time. Examining the timeline of a query result set allows us to characterize both how temporally dependent the topic is, as well as how relevant the results are likely to be. We outline characteristic patterns in query result set timelines, and show experimentally that we can automatically classify documents into these classes. We also show that properties of the query result set timeline can help predict the mean average precision of a query. These results show that meta-features associated with a query can be combined with text retrieval techniques to improve our understanding and treatment of text search on documents with timestamps.
220 citations
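The abstract above refers to predicting the mean average precision of a query. For reference, the metric itself can be sketched as follows, using the standard definition of average precision. The function names and the input layout are illustrative assumptions, not taken from the paper.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant documents."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant document appears.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


ap = average_precision(["a", "b", "c", "d"], {"a", "c"})
```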
••
TL;DR: This paper surveys QE techniques in IR from 1960 to 2017 with respect to core techniques, data sources used, weighting and ranking methodologies, user participation and applications – bringing out similarities and differences.
Abstract: With the ever-increasing size of the web, extracting relevant information from the Internet with a query formed of a few keywords has become a big challenge. Query Expansion (QE) plays a crucial role in improving searches on the Internet. Here, the user’s initial query is reformulated by adding meaningful terms with similar significance. QE – as part of information retrieval (IR) – has long attracted researchers’ attention. It has become very influential in the fields of personalized social documents, question answering, cross-language IR, information filtering and multimedia IR. Research in QE has gained further prominence because of IR-dedicated conferences such as TREC (Text REtrieval Conference) and CLEF (Conference and Labs of the Evaluation Forum). This paper surveys QE techniques in IR from 1960 to 2017 with respect to core techniques, data sources used, weighting and ranking methodologies, user participation and applications – bringing out similarities and differences.
219 citations
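As a concrete illustration of reformulating a query by adding terms, here is a minimal sketch of one simple QE variant: taking the most frequent non-query terms from the top initially retrieved documents (a basic form of pseudo-relevance feedback). The function name, the tiny stopword list, and whitespace tokenization are assumptions made for the example. Real systems use more careful term weighting.

```python
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "in", "to"})


def expand_query(query, top_docs, num_terms=2):
    """Append the most frequent new terms from top-ranked documents to the query."""
    query_terms = query.lower().split()
    seen = set(query_terms)
    counts = Counter()
    for doc in top_docs:
        for term in doc.lower().split():
            # Skip terms already in the query and uninformative stopwords.
            if term not in seen and term not in STOPWORDS:
                counts[term] += 1
    expansion = [term for term, _ in counts.most_common(num_terms)]
    return query_terms + expansion


expanded = expand_query(
    "jaguar speed",
    ["the jaguar cat runs fast", "jaguar cat habitat", "cat speed records"],
    num_terms=1,
)
```

Here "cat" occurs in all three feedback documents, so it is the single term added to the original query.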
••
TL;DR: A probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed, and a novel technique for estimating parameters that does not require human relevance judgments is described.
Abstract: We present a probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed. Whether or not a document is about a particular topic is computed from term frequencies, modeled as Poisson distributions. Unlike previous probabilistic retrieval models, we do not attempt to estimate relevance, but rather our focus is "relatedness", the probability that a user would want to examine a particular document given known interest in another. We also describe a novel technique for estimating parameters that does not require human relevance judgments; instead, the process is based on the existence of MeSH® in MEDLINE®. The pmra retrieval model was compared against bm25, a competitive probabilistic model that shares theoretical similarities. Experiments using the test collection from the TREC 2005 genomics track show a small but statistically significant improvement of pmra over bm25 in terms of precision. Our experiments suggest that the pmra model provides an effective ranking algorithm for related article search.
219 citations
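The bm25 baseline named in this abstract can be sketched as below. This is the commonly used Okapi BM25 form with a smoothed, non-negative IDF. The parameter defaults `k1=1.2` and `b=0.75` and the exact IDF variant are assumptions, since the abstract does not state which configuration was used. The pmra model itself is not reproduced here.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with BM25."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))
    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            df = doc_freq[term]
            # Smoothed IDF variant that stays non-negative (an assumption).
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = term_freq[term]
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


corpus = [
    ["gene", "expression", "gene"],
    ["protein", "folding"],
    ["gene", "protein"],
]
scores = bm25_scores(["gene"], corpus)
```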
••
06 Nov 2017 TL;DR: Experiments on both the benchmark LETOR dataset and large-scale clickthrough data show that DeepRank can significantly outperform learning-to-rank methods and existing deep learning methods.
Abstract: This paper concerns a deep learning approach to relevance ranking in information retrieval (IR). Existing deep IR models such as DSSM and CDSSM directly apply neural networks to generate ranking scores, without an explicit understanding of relevance. According to the human judgment process, a relevance label is generated by the following three steps: 1) relevant locations are detected; 2) local relevances are determined; 3) local relevances are aggregated to output the relevance label. In this paper we propose a new deep learning architecture, namely DeepRank, to simulate the above human judgment process. Firstly, a detection strategy is designed to extract the relevant contexts. Then, a measure network is applied to determine the local relevances by utilizing a convolutional neural network (CNN) or two-dimensional gated recurrent units (2D-GRU). Finally, an aggregation network with sequential integration and term gating mechanism is used to produce a global relevance score. DeepRank captures important IR characteristics well, including exact/semantic matching signals, proximity heuristics, query term importance, and diverse relevance requirements. Experiments on both the benchmark LETOR dataset and large-scale clickthrough data show that DeepRank can significantly outperform learning-to-rank methods and existing deep learning methods.
218 citations
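The three-step judgment process in this abstract (detect relevant locations, determine local relevances, aggregate) can be made concrete with a toy, non-neural sketch. Everything here is a stand-in: the window-overlap measure replaces the CNN or 2D-GRU measure network, and the position-discounted sum replaces the aggregation network. Function names and the `window` parameter are hypothetical, and this is not the DeepRank model.

```python
def detect_contexts(query_terms, doc_tokens, window=2):
    """Step 1: collect (position, context window) around each query-term occurrence."""
    query = set(query_terms)
    contexts = []
    for pos, token in enumerate(doc_tokens):
        if token in query:
            start = max(0, pos - window)
            contexts.append((pos, doc_tokens[start:pos + window + 1]))
    return contexts


def local_relevance(query_terms, context):
    """Step 2 (stand-in measure): fraction of distinct query terms inside the window."""
    query = set(query_terms)
    if not query:
        return 0.0
    return len(query & set(context)) / len(query)


def global_relevance(query_terms, doc_tokens, window=2):
    """Step 3 (stand-in aggregation): sum local scores, discounting later positions."""
    contexts = detect_contexts(query_terms, doc_tokens, window)
    return sum(
        local_relevance(query_terms, context) / (1 + pos)
        for pos, context in contexts
    )


query = ["deep", "rank"]
score_a = global_relevance(query, "deep rank models learn".split())
score_b = global_relevance(query, "models learn other things".split())
```

A document with both query terms close together near its start scores higher than one with no matches, which loosely mirrors the exact-matching and proximity signals the abstract mentions.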