
Showing papers presented at the International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005)


Proceedings ArticleDOI
15 Aug 2005
TL;DR: It is concluded that clicks are informative but biased, and while this makes the interpretation of clicks as absolute relevance judgments difficult, it is shown that relative preferences derived from clicks are reasonably accurate on average.
Abstract: This paper examines the reliability of implicit feedback generated from clickthrough data in WWW search. Analyzing the users' decision process using eyetracking and comparing implicit feedback against manual relevance judgments, we conclude that clicks are informative but biased. While this makes the interpretation of clicks as absolute relevance judgments difficult, we show that relative preferences derived from clicks are reasonably accurate on average.

1,484 citations
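
A minimal sketch of how relative preferences can be derived from clicks, assuming a "clicked result is preferred over skipped results ranked above it" strategy; the function and variable names are illustrative, not taken from the paper.

    # Derive pairwise relative preferences from a ranked result list and the
    # set of clicked ranks: a clicked result is preferred over every
    # higher-ranked result that was skipped.
    def click_skip_above_preferences(ranking, clicked_ranks):
        """ranking: doc ids in rank order; clicked_ranks: set of 0-based ranks."""
        prefs = []
        for i in sorted(clicked_ranks):
            for j in range(i):                 # every result ranked above the click
                if j not in clicked_ranks:     # ...that was skipped
                    prefs.append((ranking[i], ranking[j]))
        return prefs

    # Example: results d0..d3, clicks on ranks 1 and 3.
    print(click_skip_above_preferences(["d0", "d1", "d2", "d3"], {1, 3}))
    # -> [('d1', 'd0'), ('d3', 'd0'), ('d3', 'd2')]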


Proceedings ArticleDOI
15 Aug 2005
TL;DR: A novel approach is developed to train the model that directly maximizes the mean average precision rather than maximizing the likelihood of the training data, and significant improvements are possible by modeling dependencies, especially on the larger web collections.
Abstract: This paper develops a general, formal framework for modeling term dependencies via Markov random fields. The model allows for arbitrary text features to be incorporated as evidence. In particular, we make use of features based on occurrences of single terms, ordered phrases, and unordered phrases. We explore full independence, sequential dependence, and full dependence variants of the model. A novel approach is developed to train the model that directly maximizes the mean average precision rather than maximizing the likelihood of the training data. Ad hoc retrieval experiments are presented on several newswire and web collections, including the GOV2 collection used at the TREC 2004 Terabyte Track. The results show significant improvements are possible by modeling dependencies, especially on the larger web collections.

996 citations
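
For reference, the sequential dependence variant of such a model is often summarized as a linear combination of single-term, exact ordered-phrase, and unordered-window features, with the weights tuned to maximize mean average precision (the notation below is a common shorthand rather than the paper's exact formulation):

    P(D \mid Q) \;\overset{\text{rank}}{=}\;
      \lambda_T \sum_{q_i \in Q} f_T(q_i, D)
      \;+\; \lambda_O \sum_{i=1}^{|Q|-1} f_O(q_i, q_{i+1}, D)
      \;+\; \lambda_U \sum_{i=1}^{|Q|-1} f_U(q_i, q_{i+1}, D),
      \qquad \lambda_T + \lambda_O + \lambda_U = 1

Here f_T, f_O and f_U are smoothed language-model log scores for single terms, exact adjacent-term phrases, and unordered windows containing both terms, respectively.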


Proceedings ArticleDOI
15 Aug 2005
TL;DR: This research suggests that rich representations of the user and the corpus are important for personalization, but that it is possible to approximate these representations and provide efficient client-side algorithms for personalizing search.
Abstract: We formulate and study search algorithms that consider a user's prior interactions with a wide variety of content to personalize that user's current Web search. Rather than relying on the unrealistic assumption that people will precisely specify their intent when searching, we pursue techniques that leverage implicit information about the user's interests. This information is used to re-rank Web search results within a relevance feedback framework. We explore rich models of user interests, built from both search-related information, such as previously issued queries and previously visited Web pages, and other information about the user such as documents and email the user has read and created. Our research suggests that rich representations of the user and the corpus are important for personalization, but that it is possible to approximate these representations and provide efficient client-side algorithms for personalizing search. We show that such personalization algorithms can significantly improve on current Web search.

928 citations
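
A rough client-side re-ranking sketch in the spirit of this approach: build a term profile from the user's previously seen content and combine each result's (normalized) engine score with its similarity to that profile. All names and the mixing parameter are illustrative assumptions, not the authors' implementation.

    from collections import Counter
    import math

    def build_profile(texts):
        """Term-frequency profile from prior queries, visited pages, documents, email."""
        profile = Counter()
        for t in texts:
            profile.update(t.lower().split())
        return profile

    def cosine(a, b):
        dot = sum(a[w] * b[w] for w in a if w in b)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def rerank(results, profile, alpha=0.5):
        """results: list of (doc_id, engine_score in [0, 1], snippet_text)."""
        scored = []
        for doc_id, score, snippet in results:
            personal = cosine(profile, Counter(snippet.lower().split()))
            scored.append((alpha * score + (1 - alpha) * personal, doc_id))
        return [d for _, d in sorted(scored, reverse=True)]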


Proceedings ArticleDOI
15 Aug 2005
TL;DR: In this paper, clusters generated from the training data provide the basis for data smoothing and neighborhood selection, and the proposed approach is shown to consistently outperform other state-of-the-art collaborative filtering algorithms.
Abstract: Memory-based approaches for collaborative filtering identify the similarity between two users by comparing their ratings on a set of items. In the past, the memory-based approach has been shown to suffer from two fundamental problems: data sparsity and difficulty in scalability. Alternatively, the model-based approach has been proposed to alleviate these problems, but this approach tends to limit the range of users. In this paper, we present a novel approach that combines the advantages of these two approaches by introducing a smoothing-based method. In our approach, clusters generated from the training data provide the basis for data smoothing and neighborhood selection. As a result, we provide higher accuracy as well as increased efficiency in recommendations. Empirical studies on two datasets (EachMovie and MovieLens) show that our proposed approach consistently outperforms other state-of-the-art collaborative filtering algorithms.

706 citations
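
A rough sketch of the cluster-then-smooth idea, assuming k-means over the user-item rating matrix and per-cluster item means as the smoothing values; the clustering choice and parameter names are illustrative, not the paper's exact procedure.

    import numpy as np
    from sklearn.cluster import KMeans

    def smooth_ratings(R, n_clusters=10):
        """R: users x items matrix with np.nan for unrated items."""
        filled = np.nan_to_num(R, nan=0.0)              # missing treated as 0 for clustering
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(filled)
        S = R.copy()
        global_mean = np.nanmean(R)
        for c in range(n_clusters):
            members = R[labels == c]
            item_mean = np.nanmean(members, axis=0)     # per-item mean within the cluster
            item_mean = np.nan_to_num(item_mean, nan=global_mean)
            for u in np.where(labels == c)[0]:
                missing = np.isnan(R[u])
                S[u, missing] = item_mean[missing]      # smooth the unrated entries
        return S, labels                                # neighbors are then selected per cluster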


Proceedings ArticleDOI
15 Aug 2005
TL;DR: This paper proposes several context-sensitive retrieval algorithms based on statistical language models to combine the preceding queries and clicked document summaries with the current query for better ranking of documents.
Abstract: A major limitation of most existing retrieval models and systems is that the retrieval decision is made based solely on the query and document collection; information about the actual user and search context is largely ignored. In this paper, we study how to exploit implicit feedback information, including previous queries and clickthrough information, to improve retrieval accuracy in an interactive information retrieval setting. We propose several context-sensitive retrieval algorithms based on statistical language models to combine the preceding queries and clicked document summaries with the current query for better ranking of documents. We use the TREC AP data to create a test collection with search context information, and quantitatively evaluate our models using this test set. Experiment results show that using implicit feedback, especially the clicked document summaries, can improve retrieval performance substantially.

501 citations
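
One simple fixed-coefficient instantiation of this idea is to interpolate the current query model with models built from the search context before ranking documents; the symbols below are illustrative shorthand, not the paper's exact notation:

    \theta_k \;=\; \alpha\,\theta_{q_k}
      \;+\; (1-\alpha)\Big(\beta\,\theta_{H_Q} + (1-\beta)\,\theta_{H_C}\Big)

where θ_{q_k} is the language model of the current query, θ_{H_Q} averages the models of the preceding queries, θ_{H_C} averages the models of the clicked document summaries, and α, β control how much weight the search context receives.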


Proceedings ArticleDOI
15 Aug 2005
TL;DR: It is found that the t-test is highly reliable (more so than the sign or Wilcoxon test), and is far more reliable than simply showing a large percentage difference in effectiveness measures between IR systems.
Abstract: The effectiveness of information retrieval systems is measured by comparing performance on a common set of queries and documents. Significance tests are often used to evaluate the reliability of such comparisons. Previous work has examined such tests, but produced results with limited application. Other work established an alternative benchmark for significance, but the resulting test was too stringent. In this paper, we revisit the question of how such tests should be used. We find that the t-test is highly reliable (more so than the sign or Wilcoxon test), and is far more reliable than simply showing a large percentage difference in effectiveness measures between IR systems. Our results show that past empirical work on significance tests over-estimated the error of such tests. We also re-consider comparisons between the reliability of precision at rank 10 and mean average precision, arguing that past comparisons did not consider the assessor effort required to compute such measures. This investigation shows that assessor effort would be better spent building test collections with more topics, each assessed in less detail.

381 citations
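
A minimal example of the kind of paired comparison discussed here, using per-topic average precision scores for two systems (toy numbers; scipy's paired t-test and Wilcoxon signed-rank test stand in for the tests compared in the paper):

    from scipy import stats

    ap_system_a = [0.32, 0.11, 0.45, 0.27, 0.50, 0.08, 0.39]   # per-topic AP, system A
    ap_system_b = [0.28, 0.09, 0.41, 0.30, 0.44, 0.07, 0.35]   # per-topic AP, system B

    t_stat, p_t = stats.ttest_rel(ap_system_a, ap_system_b)    # paired (matched-topic) t-test
    w_stat, p_w = stats.wilcoxon(ap_system_a, ap_system_b)     # signed-rank alternative
    print(f"t-test p={p_t:.3f}, Wilcoxon p={p_w:.3f}")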


Proceedings ArticleDOI
15 Aug 2005
TL;DR: An additional criterion for web page ranking is introduced, namely the distance between a user profile defined using ODP topics and the sets of ODP topics covered by each URL returned in regular web search, and the boundaries of biasing PageRank on subtopics of the ODP are investigated.
Abstract: The Open Directory Project is clearly one of the largest collaborative efforts to manually annotate web pages. This effort involves over 65,000 editors and resulted in metadata specifying topic and importance for more than 4 million web pages. Still, given that this number is just about 0.05 percent of the Web pages indexed by Google, is this effort enough to make a difference? In this paper we discuss how these metadata can be exploited to achieve high quality personalized web search. First, we address this by introducing an additional criterion for web page ranking, namely the distance between a user profile defined using ODP topics and the sets of ODP topics covered by each URL returned in regular web search. We empirically show that this enhancement yields better results than current web search using Google. Then, in the second part of the paper, we investigate the boundaries of biasing PageRank on subtopics of the ODP in order to automatically extend these metadata to the whole web.

327 citations


Proceedings ArticleDOI
Eric Gaussier, Cyril Goutte
15 Aug 2005
TL;DR: It is shown that PLSA solves the problem of NMF with KL divergence, and the implications of this relationship are explored.
Abstract: Non-negative Matrix Factorization (NMF, [5]) and Probabilistic Latent Semantic Analysis (PLSA, [4]) have been successfully applied to a number of text analysis tasks such as document clustering. Despite their different inspirations, both methods are instances of multinomial PCA [1]. We further explore this relationship and first show that PLSA solves the problem of NMF with KL divergence, and then explore the implications of this relationship.

305 citations
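
The relationship can be paraphrased as follows (a summary in generic notation, not the paper's exact derivation): NMF under the generalized KL divergence minimizes

    D_{KL}(V \,\|\, WH) \;=\; \sum_{ij}\Big(V_{ij}\log\frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij}\Big)

while PLSA maximizes the multinomial log-likelihood

    \sum_{ij} V_{ij}\,\log \sum_{k} P(z_k)\,P(w_i \mid z_k)\,P(d_j \mid z_k)

Identifying the normalized factors, W_{ik} with P(w_i | z_k) and H_{kj} with P(z_k) P(d_j | z_k) (up to scaling so that WH sums to the total count in V), makes the two objectives coincide up to constants, which is why a PLSA solution is also a solution of NMF with KL divergence.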


Proceedings ArticleDOI
Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, Tat-Seng Chua
15 Aug 2005
TL;DR: This work presents two methods for learning relation mapping scores from past QA pairs, one based on mutual information and the other on expectation maximization; the resulting fuzzy relation matching significantly outperforms state-of-the-art density-based passage retrieval methods.
Abstract: State-of-the-art question answering (QA) systems employ term-density ranking to retrieve answer passages. Such methods often retrieve incorrect passages as relationships among question terms are not considered. Previous studies attempted to address this problem by matching dependency relations between questions and answers. They used strict matching, which fails when semantically equivalent relationships are phrased differently. We propose fuzzy relation matching based on statistical models. We present two methods for learning relation mapping scores from past QA pairs: one based on mutual information and the other on expectation maximization. Experimental results show that our method significantly outperforms state-of-the-art density-based passage retrieval methods by up to 78% in mean reciprocal rank. Relation matching also brings about a 50% improvement in a system enhanced by query expansion.

264 citations


Proceedings ArticleDOI
15 Aug 2005
TL;DR: This paper introduces the multi-label informed latent semantic indexing (MLSI) algorithm which preserves the information of inputs and meanwhile captures the correlations between the multiple outputs and incorporates the human-annotated category information.
Abstract: Latent semantic indexing (LSI) is a well-known unsupervised approach for dimensionality reduction in information retrieval. However if the output information (i.e. category labels) is available, it is often beneficial to derive the indexing not only based on the inputs but also on the target values in the training data set. This is of particular importance in applications with multiple labels, in which each document can belong to several categories simultaneously. In this paper we introduce the multi-label informed latent semantic indexing (MLSI) algorithm which preserves the information of inputs and meanwhile captures the correlations between the multiple outputs. The recovered "latent semantics" thus incorporate the human-annotated category information and can be used to greatly improve the prediction accuracy. Empirical study based on two data sets, Reuters-21578 and RCV1, demonstrates very encouraging results.

253 citations


Proceedings ArticleDOI
15 Aug 2005
TL;DR: Novel learning methods are presented for estimating the quality of results returned by a search engine in response to a query, and the usefulness of quality estimation is demonstrated for several applications, among them improvement of retrieval, detecting queries for which no relevant content exists in the document collection, and distributed information retrieval.
Abstract: In this article we present novel learning methods for estimating the quality of results returned by a search engine in response to a query. Estimation is based on the agreement between the top results of the full query and the top results of its sub-queries. We demonstrate the usefulness of quality estimation for several applications, among them improvement of retrieval, detecting queries for which no relevant content exists in the document collection, and distributed information retrieval. Experiments on TREC data demonstrate the robustness and the effectiveness of our learning algorithms.
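
A sketch of the agreement idea: score the query by how much the top results of the full query overlap with the top results of its sub-queries (here, the query with one term dropped). The search function and the uniform weighting are placeholders for illustration.

    def overlap_at_k(full_top, sub_top, k=10):
        return len(set(full_top[:k]) & set(sub_top[:k])) / float(k)

    def estimate_query_quality(query_terms, search, k=10):
        """search(terms) -> ranked list of doc ids (stand-in for a real engine)."""
        full_top = search(query_terms)
        scores = []
        for i in range(len(query_terms)):
            sub_query = query_terms[:i] + query_terms[i + 1:]   # drop one term
            if sub_query:
                scores.append(overlap_at_k(full_top, search(sub_query), k))
        return sum(scores) / len(scores) if scores else 0.0     # higher = more agreement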

Proceedings ArticleDOI
15 Aug 2005
TL;DR: This paper explores correlations among categories with a maximum entropy method and derives a classification algorithm for multi-labelled documents that significantly outperforms a combination of single-label classifiers.
Abstract: Many classification problems require classifiers to assign each single document into more than one category, which is called multi-labelled classification. The categories in such problems usually are neither conditionally independent from each other nor mutually exclusive; therefore it is not trivial to directly employ state-of-the-art classification algorithms without losing information about relations among categories. In this paper, we explore correlations among categories with a maximum entropy method and derive a classification algorithm for multi-labelled documents. Our experiments show that this method significantly outperforms the combination of single-label approaches.

Proceedings ArticleDOI
15 Aug 2005
TL;DR: This work proposes ten strategies for solving the problem of associating ads with a Web page from a computer science perspective and suggests that great accuracy in content-targeted advertising can be attained with appropriate algorithms.
Abstract: The current boom of the Web is associated with the revenues originated from on-line advertising. While search-based advertising is dominant, the association of ads with a Web page (during user navigation) is becoming increasingly important. In this work, we study the problem of associating ads with a Web page, referred to as content-targeted advertising, from a computer science perspective. We assume that we have access to the text of the Web page, the keywords declared by an advertiser, and a text associated with the advertiser's business. Using no other information and operating in fully automatic fashion, we propose ten strategies for solving the problem and evaluate their effectiveness. Our methods indicate that a matching strategy that takes into account the semantics of the problem (referred to as AAK for "ads and keywords") can yield gains in average precision figures of 60% compared to a trivial vector-based strategy. Further, a more sophisticated impedance coupling strategy, which expands the text of the Web page to reduce vocabulary impedance with regard to an advertisement, can yield extra gains in average precision of 50%. These are first results. They suggest that great accuracy in content-targeted advertising can be attained with appropriate algorithms.
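
A toy version of the vector-space matching that serves as the baseline here, plus the AAK-style requirement that the ad's declared keywords actually occur on the page; the ad schema and field names are assumptions for illustration.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def rank_ads(page_text, ads):
        """ads: list of dicts with 'keywords' and 'text' fields (illustrative schema)."""
        ad_docs = [a["keywords"] + " " + a["text"] for a in ads]
        tfidf = TfidfVectorizer().fit(ad_docs + [page_text])
        sims = cosine_similarity(tfidf.transform([page_text]),
                                 tfidf.transform(ad_docs)).ravel()
        matches = [                       # keep only ads whose keywords appear on the page
            (s, a) for s, a in zip(sims, ads)
            if all(k in page_text.lower() for k in a["keywords"].lower().split())
        ]
        return [a for s, a in sorted(matches, key=lambda x: -x[0])]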

Proceedings ArticleDOI
15 Aug 2005
TL;DR: A novel ranking scheme named Affinity Ranking (AR) is proposed to re-rank search results by optimizing two metrics: diversity -- which indicates the variance of topics in a group of documents; and information richness -- which measures the coverage of a single document to its topic.
Abstract: In this paper, we propose a novel ranking scheme named Affinity Ranking (AR) to re-rank search results by optimizing two metrics: (1) diversity -- which indicates the variance of topics in a group of documents; (2) information richness -- which measures the coverage of a single document to its topic. Both of the two metrics are calculated from a directed link graph named Affinity Graph (AG). AG models the structure of a group of documents based on the asymmetric content similarities between each pair of documents. Experimental results in Yahoo! Directory, ODP Data, and Newsgroup data demonstrate that our proposed ranking algorithm significantly improves the search performance. Specifically, the algorithm achieves 31% improvement in diversity and 12% improvement in information richness relatively within the top 10 search results.
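
A sketch of the graph side of this scheme: asymmetric content similarities define a directed affinity graph, and a PageRank-style iteration over it yields an information-richness score per document (threshold and damping values are illustrative).

    import numpy as np

    def information_richness(sim, damping=0.85, iters=50, threshold=0.1):
        """sim[i, j]: asymmetric similarity of document i to document j."""
        A = np.where(sim > threshold, sim, 0.0)   # keep only strong affinity links
        np.fill_diagonal(A, 0.0)
        row_sums = A.sum(axis=1, keepdims=True)
        M = np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)
        n = A.shape[0]
        r = np.full(n, 1.0 / n)
        for _ in range(iters):                    # power iteration
            r = (1 - damping) / n + damping * (M.T @ r)
        return r                                  # diversity is then enforced in a re-ranking pass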


Proceedings ArticleDOI
15 Aug 2005
TL;DR: This work proposes a probabilistic model to incorporate both content and time information in a unified framework for retrospective news event detection and builds an interactive RED system, HISCOVERY, which provides additional functions to present events, Photo Story and Chronicle.
Abstract: Retrospective news event detection (RED) is defined as the discovery of previously unidentified events in a historical news corpus. Although both the contents and time information of news articles are helpful to RED, most research focuses on the utilization of the contents of news articles; little work has been carried out on finding better uses of time information. In this paper, we explore both directions based on the following two characteristics of news articles. On the one hand, news articles are always aroused by events; on the other hand, similar articles reporting the same event often redundantly appear on many news sources. The former hints at a generative model of news articles, and the latter provides a data-enriched environment in which to perform RED. With consideration of these characteristics, we propose a probabilistic model to incorporate both content and time information in a unified framework. This model gives new representations of both news articles and news events. Furthermore, based on this approach, we build an interactive RED system, HISCOVERY, which provides additional functions to present events, Photo Story and Chronicle.

Proceedings ArticleDOI
15 Aug 2005
TL;DR: The results show that the model achieves substantial and significant improvements with respect to the models without these relationships, and clearly shows the benefit of integrating word relationships into language models for IR.
Abstract: In this paper, we propose a novel dependency language modeling approach for information retrieval. The approach extends the existing language modeling approach by relaxing the independence assumption. Our goal is to build a language model in which various word relationships can be integrated. In this work, we integrate two types of relationship extracted from WordNet and co-occurrence relationships respectively. The integrated model has been tested on several TREC collections. The results show that our model achieves substantial and significant improvements with respect to the models without these relationships. These results clearly show the benefit of integrating word relationships into language models for IR.

Proceedings ArticleDOI
15 Aug 2005
TL;DR: This paper proposes a structural re-ranking approach to ad hoc information retrieval, which reorders the documents in an initially retrieved set by exploiting asymmetric relationships between them, and shows that integrating centrality into standard language-model-based retrieval is quite effective at improving precision at top ranks.
Abstract: Inspired by the PageRank and HITS (hubs and authorities) algorithms for Web search, we propose a structural re-ranking approach to ad hoc information retrieval: we reorder the documents in an initially retrieved set by exploiting asymmetric relationships between them. Specifically, we consider generation links, which indicate that the language model induced from one document assigns high probability to the text of another; in doing so, we take care to prevent bias against long documents. We study a number of re-ranking criteria based on measures of centrality in the graphs formed by generation links, and show that integrating centrality into standard language-model-based retrieval is quite effective at improving precision at top ranks.

Proceedings ArticleDOI
15 Aug 2005
TL;DR: This paper proposes a new axiomatic approach to developing retrieval models based on direct modeling of relevance with formalized retrieval constraints defined at the level of terms, and derives several new retrieval functions using this framework.
Abstract: Existing retrieval models generally do not offer any guarantee for optimal retrieval performance. Indeed, it is even difficult, if not impossible, to predict a model's empirical performance analytically. This limitation is at least partly caused by the way existing retrieval models are developed where relevance is only coarsely modeled at the level of documents and queries as opposed to a finer granularity level of terms. In this paper, we present a new axiomatic approach to developing retrieval models based on direct modeling of relevance with formalized retrieval constraints defined at the level of terms. The basic idea of this axiomatic approach is to search in a space of candidate retrieval functions for one that can satisfy a set of reasonable retrieval constraints. To constrain the search space, we propose to define a retrieval function inductively and decompose a retrieval function into three component functions. Inspired by the analysis of the existing retrieval functions with the inductive definition, we derive several new retrieval functions using the axiomatic retrieval framework. Experiment results show that the derived new retrieval functions are more robust and less sensitive to parameter settings than the existing retrieval functions with comparable optimal performance.

Proceedings ArticleDOI
15 Aug 2005
TL;DR: FLOE is presented, a simple density analysis method for modelling the shape of the transformation required, based on training data and without assuming independence between feature and baseline, for a new query independent feature.
Abstract: A query independent feature, relating perhaps to document content, linkage or usage, can be transformed into a static, per-document relevance weight for use in ranking. The challenge is to find a good function to transform feature values into relevance scores. This paper presents FLOE, a simple density analysis method for modelling the shape of the transformation required, based on training data and without assuming independence between feature and baseline. For a new query independent feature, it addresses the questions: is it required for ranking, what sort of transformation is appropriate and, after adding it, how successful was the chosen transformation? Based on this we apply sigmoid transformations to PageRank, indegree, URL Length and ClickDistance, tested in combination with a BM25 baseline.
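
The static-score transforms in this line of work typically take a saturating form added to the text score, along the lines of (symbols are generic, not necessarily the paper's notation):

    score(d, q) \;=\; \mathrm{BM25}(d, q) \;+\; \frac{w\,S^{a}}{k^{a} + S^{a}}

where S is the query-independent feature value (e.g., PageRank or indegree) and w, k, a are fitted on training data; for features where smaller is better (URL length, ClickDistance), the saturating term is inverted so that small values receive the boost.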

Proceedings ArticleDOI
15 Aug 2005
TL;DR: This paper proposes a dependency-network based collective classification method, in which the local classifiers are maximum entropy models based on words and certain relational features, exploiting the sequential correlation among email messages in the same thread.
Abstract: We consider classification of email messages as to whether or not they contain certain "email acts", such as a request or a commitment. We show that exploiting the sequential correlation among email messages in the same thread can improve email-act classification. More specifically, we describe a new text-classification algorithm based on a dependency-network based collective classification method, in which the local classifiers are maximum entropy models based on words and certain relational features. We show that statistically significant improvements over a bag-of-words baseline classifier can be obtained for some, but not all, email-act classes. Performance improvements obtained by collective classification appear to be consistent across many email acts suggested by prior speech-act theory.

Proceedings ArticleDOI
15 Aug 2005
TL;DR: This paper presents eight different methods of generating MDS and evaluates each of these methods on a large set of topics used in past DUC workshops, showing a significant improvement in the quality of summaries based on topic themes over MDS methods that use other alternative topic representations.
Abstract: The problem of using topic representations for multi-document summarization (MDS) has received considerable attention recently. In this paper, we describe five different topic representations and introduce a novel representation of topics based on topic themes. We present eight different methods of generating MDS and evaluate each of these methods on a large set of topics used in past DUC workshops. Our evaluation results show a significant improvement in the quality of summaries based on topic themes over MDS methods that use other alternative topic representations.

Journal ArticleDOI
01 Jun 2005
TL;DR: The World Wide Web is a context in which traditional Information Retrieval methods are challenged, and given the volume of the Web and its speed of change, the coverage of modern search engines is relatively small.
Abstract: The key factors for the success of the World Wide Web are its large size and the lack of a centralized control over its contents. Both issues are also the most important source of problems for locating information. The Web is a context in which traditional Information Retrieval methods are challenged, and given the volume of the Web and its speed of change, the coverage of modern search engines is relatively small. Moreover, the distribution of quality is very skewed, and interesting pages are scarce in comparison with the rest of the content.

Proceedings ArticleDOI
15 Aug 2005
TL;DR: The algorithms used to discover a number of other instances of large-scale phrase-level replication within the two data sets collected in December 2002 and June 2004 are described.
Abstract: Two years ago, we conducted a study on the evolution of web pages over time. In the course of that study, we discovered a large number of machine-generated "spam" web pages emanating from a handful of web servers in Germany. These spam web pages were dynamically assembled by stitching together grammatically well-formed German sentences drawn from a large collection of sentences. This discovery motivated us to develop techniques for finding other instances of such "slice and dice" generation of web pages, where pages are automatically generated by stitching together phrases drawn from a limited corpus. We applied these techniques to two data sets, a set of 151 million web pages collected in December 2002 and a set of 96 million web pages collected in June 2004. We found a number of other instances of large-scale phrase-level replication within the two data sets. This paper describes the algorithms we used to discover this type of replication, and highlights the results of our data mining.

Journal ArticleDOI
01 Jun 2005
TL;DR: The robust retrieval track explores methods for improving the consistency of retrieval technology by focusing on poorly performing topics, and investigates appropriate evaluation measures to support that focus.
Abstract: The robust retrieval track explores methods for improving the consistency of retrieval technology by focusing on poorly performing topics. The retrieval task in the track is a traditional ad hoc retrieval task where the evaluation methodology emphasizes a system's least effective topics. The most promising approach to improving poorly performing topics is exploiting text collections other than the target collection, such as the web. The track has also investigated appropriate evaluation measures to support the focus on ineffective topics. Traditional measures are dominated by the better-performing topics, and the first two measures used in the track that do emphasize the poorly performing topics are unstable in practice. A third measure, a variant of the traditional MAP measure that uses a geometric mean rather than an arithmetic mean to average individual topic results, shows promise of giving appropriate emphasis to poorly performing topics while being more stable at equal topic set sizes.
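
A minimal illustration of the geometric-mean variant mentioned above, with a small epsilon so that zero-AP topics do not collapse the product (the epsilon value shown is a common convention, used here for illustration):

    import math

    def gmap(ap_scores, eps=1e-5):
        return math.exp(sum(math.log(ap + eps) for ap in ap_scores) / len(ap_scores))

    aps = [0.40, 0.35, 0.02, 0.50]        # toy per-topic AP values
    print(f"MAP={sum(aps)/len(aps):.3f}  GMAP={gmap(aps):.3f}")  # GMAP penalizes the weak topic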

Proceedings ArticleDOI
15 Aug 2005
TL;DR: It is claimed that iterative computations over the URM can help overcome the data sparseness problem and detect latent relationships among heterogeneous data objects, thus, can improve the quality of information applications that require combination of information from heterogeneous sources.
Abstract: In this paper we use a Unified Relationship Matrix (URM) to represent a set of heterogeneous data objects (e.g., web pages, queries) and their interrelationships (e.g., hyperlinks, user click-through sequences). We claim that iterative computations over the URM can help overcome the data sparseness problem and detect latent relationships among heterogeneous data objects, thus, can improve the quality of information applications that require combination of information from heterogeneous sources. To support our claim, we present a unified similarity-calculating algorithm, SimFusion. By iteratively computing over the URM, SimFusion can effectively integrate relationships from heterogeneous sources when measuring the similarity of two data objects. Experiments based on a web search engine query log and a web page collection demonstrate that SimFusion can improve similarity measurement of web objects over both traditional content based algorithms and the cutting edge SimRank algorithm.
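
A rough sketch conveying the flavor of iterative similarity reinforcement over a unified relationship matrix: similarities are repeatedly propagated through the row-normalized relationships between all objects. This is an illustration of the general idea, not the exact SimFusion update rule.

    import numpy as np

    def iterative_similarity(urm, iters=5):
        """urm: square nonnegative matrix of relationship strengths between all objects."""
        urm = np.asarray(urm, dtype=float)
        row_sums = urm.sum(axis=1, keepdims=True)
        L = np.divide(urm, row_sums, out=np.zeros_like(urm), where=row_sums > 0)
        S = np.eye(urm.shape[0])          # start: each object similar only to itself
        for _ in range(iters):
            S = L @ S @ L.T               # propagate similarity through relationships
            np.fill_diagonal(S, 1.0)      # keep self-similarity maximal
        return S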

Proceedings ArticleDOI
15 Aug 2005
TL;DR: This paper defines a search query's dominant location (QDL), proposes a solution to correctly detect it, and shows that the query location detection solution has consistently high accuracy for all query frequency ranges.
Abstract: Accurately and effectively detecting the locations that search queries are truly about has huge potential impact on increasing search relevance. In this paper, we define a search query's dominant location (QDL) and propose a solution to correctly detect it. QDL is the geographical location(s) associated with a query in collective human knowledge, i.e., one or a few prominent locations agreed upon by the majority of people who know the answer to the query. QDL is a subjective and collective attribute of search queries, and we are able to detect QDLs both from queries containing geographical location names and from queries that do not contain them. The key challenges to QDL detection include false positive suppression (not every location name contained in a query refers to a geographical location) and detecting locations implied by the context of the query. In our solution, a query is recursively broken into atomic tokens according to its most popular web usage in order to reduce false positives. If we do not find a dominant location in this step, we mine the top search results and/or query logs (with different approaches discussed in this paper) to discover implicit query locations. Our large-scale experiments on recent MSN Search queries show that our query location detection solution has consistently high accuracy for all query frequency ranges.

Journal ArticleDOI
01 Jun 2005
TL;DR: This thesis investigates search and retrieval in collections of images and video, where video is defined as a sequence of still images, and focuses on retrieval from generic, heterogeneous multimedia collections.
Abstract: This thesis discusses information retrieval from multimedia archives, focusing on documents containing visual material. We investigate search and retrieval in collections of images and video, where video is defined as a sequence of still images. No assumptions are made with respect to the content of the documents; we concentrate on retrieval from generic, heterogeneous multimedia collections. In this research area a user's query typically consists of one or more example images and the implicit request is: "Find images similar to this one." In addition the query may contain a textual description of the information need. The research presented here addresses three issues within this area.

Proceedings ArticleDOI
15 Aug 2005
TL;DR: This paper studies how a retrieval system can perform active feedback, i.e., how to choose documents for relevance feedback so that the system can learn most from the feedback information.
Abstract: Information retrieval is, in general, an iterative search process, in which the user often has several interactions with a retrieval system for an information need. The retrieval system can actively probe a user with questions to clarify the information need instead of just passively responding to user queries. A basic question is thus how a retrieval system should propose questions to the user so that it can obtain maximum benefits from the feedback on these questions. In this paper, we study how a retrieval system can perform active feedback, i.e., how to choose documents for relevance feedback so that the system can learn most from the feedback information. We present a general framework for such an active feedback problem, and derive several practical algorithms as special cases. Empirical evaluation of these algorithms shows that the performance of traditional relevance feedback (presenting the top K documents) is consistently worse than that of presenting documents with more diversity. With a diversity-based selection algorithm, we obtain fewer relevant documents, however, these fewer documents have more learning benefits.
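
A minimal example of one diversity-oriented selection of feedback documents, in the spirit of the algorithms studied here: instead of presenting the top K documents, present every (gap+1)-th document from the ranking, trading some rank quality for more varied feedback examples. Names and parameters are illustrative.

    def gapped_selection(ranked_docs, k=5, gap=2):
        selected, i = [], 0
        while len(selected) < k and i < len(ranked_docs):
            selected.append(ranked_docs[i])
            i += gap + 1                  # skip `gap` documents between picks
        return selected

    print(gapped_selection(list("ABCDEFGHIJ"), k=3, gap=2))  # ['A', 'D', 'G']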

Proceedings ArticleDOI
15 Aug 2005
TL;DR: An investigation of how the use and effectiveness of IRF is affected by three factors (search task complexity, the search experience of the user, and the stage in the search) suggests that all three factors contribute to the utility of IRF.
Abstract: Implicit relevance feedback (IRF) is the process by which a search system unobtrusively gathers evidence on searcher interests from their interaction with the system. IRF is a new method of gathering information on user interest and, if IRF is to be used in operational IR systems, it is important to establish when it performs well and when it performs poorly. In this paper we investigate how the use and effectiveness of IRF is affected by three factors: search task complexity, the search experience of the user and the stage in the search. Our findings suggest that all three of these factors contribute to the utility of IRF.