Topic
Ranking (information retrieval)
About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.
Papers
•
26 Jul 2004
TL;DR: A method of personalizing a search of a document collection to a user comprises monitoring documents accessed by the user, identifying first phrases present in one or more of the accessed documents, identifying corresponding first related phrases, selecting search results comprising documents responsive to a query, and identifying, by operation of a processor, one or more second phrases that are related to the query and present in a user model.
Abstract: A method of personalizing a search of a document collection to a user comprises monitoring documents accessed by a user, identifying first phrases present in one or more of the accessed documents, identifying one or more corresponding first related phrases related to the corresponding identified first phrase, receiving a query including one or more second phrases from the user, selecting search results comprising documents responsive to the query, identifying by operation of a processor configured to manipulate data within a computer system, one or more second phrases related to one or more second phrases of the query and that are present in a user model, weighting scores of corresponding search results according to the identified one or more second related phrases, ranking the search results according to their weighted scores to provide personalized search results and presenting them to the user.
123 citations
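The weighting and ranking steps described in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the `Result` type, the `personalize` function, the multiplicative `boost` factor, and plain lowercase substring matching are all assumptions made for illustration, and the user model is assumed to be a simple set of lowercase phrases.

```python
from dataclasses import dataclass


@dataclass
class Result:
    doc_id: str
    text: str
    score: float  # base relevance score from the underlying search engine


def personalize(results, query_phrases, related, user_model, boost=0.5):
    """Re-rank results, boosting documents that contain user-model phrases
    related to the query phrases (hypothetical scoring rule)."""
    # Phrases related to the query that are also present in the user model.
    active = {
        p
        for q in query_phrases
        for p in related.get(q, ())
        if p in user_model
    }
    reranked = []
    for r in results:
        text = r.text.lower()
        # Count how many active phrases occur in the document text.
        hits = sum(1 for p in active if p in text)
        reranked.append(Result(r.doc_id, r.text, r.score * (1.0 + boost * hits)))
    # Highest weighted score first.
    return sorted(reranked, key=lambda r: r.score, reverse=True)


results = [
    Result("a", "Jaguar car dealership", 1.0),
    Result("b", "Jaguar habitat in the rain forest", 0.9),
]
related = {"jaguar": ["rain forest", "sports car"]}
user_model = {"rain forest"}  # phrases seen in documents the user accessed
ranked = personalize(results, ["jaguar"], related, user_model)
```

In this toy example the user has previously read about rain forests, so the habitat document overtakes the dealership document despite its lower base score.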
••
TL;DR: Essie is a phrase-based search engine with term and concept query expansion and probabilistic relevancy ranking, showing that a judicious combination of exploiting document structure, phrase searching, and concept-based query expansion is a useful approach for information retrieval in the biomedical domain.
123 citations
••
01 Jun 2005
TL;DR: This thesis investigates search and retrieval in collections of images and video, where video is defined as a sequence of still images, and focuses on retrieval from generic, heterogeneous multimedia collections.
Abstract: This thesis discusses information retrieval from multimedia archives, focusing on documents containing visual material. We investigate search and retrieval in collections of images and video, where video is defined as a sequence of still images. No assumptions are made with respect to the content of the documents; we concentrate on retrieval from generic, heterogeneous multimedia collections. In this research area a user's query typically consists of one or more example images and the implicit request is: "Find images similar to this one." In addition the query may contain a textual description of the information need. The research presented here addresses three issues within this area.
123 citations
••
TL;DR: This paper proposes a more general framework for handling time-sensitive queries, automatically identifies the important time intervals that are likely to be of interest for a query, and builds scoring techniques that seamlessly integrate the temporal aspect into the overall ranking mechanism.
Abstract: Time is an important dimension of relevance for a large number of searches, such as over blogs and news archives. So far, research on searching over such collections has largely focused on locating topically similar documents for a query. Unfortunately, topic similarity alone is not always sufficient for document ranking. In this paper, we observe that, for an important class of queries that we call time-sensitive queries, the publication time of the documents in a news archive is important and should be considered in conjunction with the topic similarity to derive the final document ranking. Earlier work has focused on improving retrieval for “recency” queries that target recent documents. We propose a more general framework for handling time-sensitive queries and we automatically identify the important time intervals that are likely to be of interest for a query. Then, we build scoring techniques that seamlessly integrate the temporal aspect into the overall ranking mechanism. We present an extensive experimental evaluation using a variety of news article data sets, including TREC data as well as real web data analyzed using the Amazon Mechanical Turk. We examine several techniques for detecting the important time intervals for a query over a news archive and for incorporating this information in the retrieval process. We show that our techniques are robust and significantly improve result quality for time-sensitive queries compared to state-of-the-art retrieval techniques.
123 citations
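The idea of combining topic similarity with publication time, as described in the abstract above, can be sketched as follows. This is a simplified illustration and not the paper's technique: it assumes that months in which many matching documents were published are the "important" intervals, and the functions `temporal_prior` and `time_sensitive_rank`, the monthly binning, and the `weight` parameter are all hypothetical.

```python
from collections import Counter
from datetime import date


def month_key(d):
    """Bin a publication date by (year, month)."""
    return (d.year, d.month)


def temporal_prior(pub_dates):
    """For each month, the share of matching documents published in it."""
    counts = Counter(month_key(d) for d in pub_dates)
    total = sum(counts.values())
    return {m: c / total for m, c in counts.items()}


def time_sensitive_rank(hits, weight=1.0):
    """hits: list of (doc_id, topical_score, pub_date) tuples.

    Boosts each topical score by how strongly the query's matches
    concentrate in the document's publication month.
    """
    prior = temporal_prior([d for _, _, d in hits])
    scored = [
        (doc_id, score * (1.0 + weight * prior[month_key(d)]))
        for doc_id, score, d in hits
    ]
    return sorted(scored, key=lambda x: x[1], reverse=True)


hits = [
    ("old", 1.0, date(2001, 1, 5)),
    ("e1", 0.9, date(2004, 12, 27)),
    ("e2", 0.8, date(2004, 12, 28)),
    ("e3", 0.7, date(2004, 12, 29)),
]
ranked = time_sensitive_rank(hits)
```

Here three of the four matches cluster in December 2004, so documents from that month are boosted relative to the topically strongest but temporally isolated match.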
•
01 Jan 2011
TL;DR: This talk focuses on two problems in decision making under uncertainty: ranking and top-k query processing over probabilistic databases, and utility maximization for stochastic combinatorial problems.
Abstract: Almost all (important) decision problems are inevitably subject to some level of uncertainty, whether about data measurements, parameters, or predictions describing future evolution. The significance of handling uncertainty is further amplified by the large volume of uncertain data automatically generated by modern data gathering or integration systems. Examples include imprecise sensor measurements in a sensor network, inconsistent information collected from different sources in a data integration application, noisy observation data in scientific domains, and so on. Various types of problems of decision making under uncertainty have been the subject of extensive research in computer science, economics, and social science. In this talk, I will focus on two important problems in this domain: (1) ranking and top-k query processing over probabilistic databases and (2) utility maximization for stochastic combinatorial problems. I will also briefly discuss some of my other research, such as stochastic matching and distributed multi-query processing, if time allows.
123 citations