
Showing papers on "Ranking (information retrieval) published in 2008"


Proceedings ArticleDOI
20 Jul 2008
TL;DR: This paper develops a framework for evaluation that systematically rewards novelty and diversity, refines it into a specific evaluation measure based on cumulative gain, and demonstrates the feasibility of the approach using a test collection based on the TREC question answering track.
Abstract: Evaluation measures act as objective functions to be optimized by information retrieval systems. Such objective functions must accurately reflect user requirements, particularly when tuning IR systems and learning ranking functions. Ambiguity in queries and redundancy in retrieved documents are poorly reflected by current evaluation measures. In this paper, we present a framework for evaluation that systematically rewards novelty and diversity. We develop this framework into a specific evaluation measure, based on cumulative gain. We demonstrate the feasibility of our approach using a test collection based on the TREC question answering track.
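
The cumulative-gain measure this abstract describes was published as α-nDCG: a document's gain from each information nugget decays geometrically with the number of earlier results already covering that nugget. A minimal sketch, assuming per-document nugget judgments are available (the function names and default α are illustrative):

```python
import math

def alpha_dcg(ranking, doc_nuggets, alpha=0.5):
    """Discounted cumulative gain with novelty decay (alpha-nDCG style).

    ranking: list of doc ids, best first.
    doc_nuggets: dict doc id -> set of information nuggets it covers."""
    seen = {}  # nugget -> number of earlier results covering it
    score = 0.0
    for rank, doc in enumerate(ranking, start=1):
        # each nugget's contribution decays by (1 - alpha) per prior coverage
        gain = sum((1 - alpha) ** seen.get(n, 0)
                   for n in doc_nuggets.get(doc, set()))
        score += gain / math.log2(rank + 1)
        for n in doc_nuggets.get(doc, set()):
            seen[n] = seen.get(n, 0) + 1
    return score
```

A redundant second result (covering the same nuggets as the first) earns half the credit of an equally judged novel one, which is exactly the reward-for-diversity behavior the framework formalizes.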

988 citations


Journal ArticleDOI
TL;DR: A novel probabilistic retrieval model is presented that forms a basis for interpreting the TF-IDF term weights as making relevance decisions, and it is shown that the term-frequency factor of its ranking formula can be rendered into the term-frequency factors of existing retrieval systems.
Abstract: A novel probabilistic retrieval model is presented. It forms a basis to interpret the TF-IDF term weights as making relevance decisions. It simulates the local relevance decision-making for every location of a document, and combines all of these “local” relevance decisions as the “document-wide” relevance decision for the document. The significance of interpreting TF-IDF in this way is the potential to: (1) establish a unifying perspective about information retrieval as relevance decision-making; and (2) develop advanced TF-IDF-related term weights for future elaborate retrieval models. Our novel retrieval model is simplified to a basic ranking formula that directly corresponds to the TF-IDF term weights. In general, we show that the term-frequency factor of the ranking formula can be rendered into different term-frequency factors of existing retrieval systems. In the basic ranking formula, the remaining quantity -log p(r̄ | t ∈ d) is interpreted as the probability of randomly picking a nonrelevant usage (denoted by r̄) of term t. Mathematically, we show that this quantity can be approximated by the inverse document frequency (IDF). Empirically, we show that this quantity is related to IDF, using four reference TREC ad hoc retrieval data collections.
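
A conventional log-scaled TF-IDF scorer, the weighting scheme the model re-interprets, can be sketched as follows (this is the textbook variant, not the paper's exact ranking formula):

```python
import math

def tf_idf_score(query_terms, doc, collection):
    """Score a document by summed TF-IDF weights of the matching query terms.

    doc: list of terms; collection: list of documents (each a list of terms)."""
    n_docs = len(collection)
    score = 0.0
    for t in query_terms:
        tf = doc.count(t)                              # term frequency in doc
        df = sum(1 for d in collection if t in d)      # document frequency
        if tf == 0 or df == 0:
            continue
        idf = math.log(n_docs / df)                    # inverse document freq.
        score += (1 + math.log(tf)) * idf              # log-scaled TF x IDF
    return score
```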

752 citations


Journal ArticleDOI
TL;DR: This work almost settles a long-standing conjecture of Bang-Jensen and Thomassen and shows that unless NP⊆BPP, there is no polynomial time algorithm for the problem of minimum feedback arc set in tournaments.
Abstract: We address optimization problems in which we are given contradictory pieces of input information and the goal is to find a globally consistent solution that minimizes the extent of disagreement with the respective inputs. Specifically, the problems we address are rank aggregation, the feedback arc set problem on tournaments, and correlation and consensus clustering. We show that for all these problems (and various weighted versions of them), we can obtain improved approximation factors using essentially the same remarkably simple algorithm. Additionally, we almost settle a long-standing conjecture of Bang-Jensen and Thomassen and show that unless NP⊆BPP, there is no polynomial time algorithm for the problem of minimum feedback arc set in tournaments.
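
The "remarkably simple algorithm" is a quicksort-style pivoting scheme (often called FAS-Pivot or KwikSort): pick a random pivot, place every other item before or after it according to the tournament edge, and recurse. A sketch, with `beats` standing in for the tournament relation:

```python
import random

def pivot_rank(items, beats, rng=random.Random(0)):
    """Order a tournament by recursive random pivoting (FAS-Pivot style).

    beats(u, v) -> True iff u should precede v; for a tournament, exactly
    one of beats(u, v), beats(v, u) holds for each pair."""
    if len(items) <= 1:
        return list(items)
    pivot = rng.choice(items)
    left = [x for x in items if x != pivot and beats(x, pivot)]
    right = [x for x in items if x != pivot and beats(pivot, x)]
    return pivot_rank(left, beats, rng) + [pivot] + pivot_rank(right, beats, rng)
```

On a consistent tournament the procedure reduces to quicksort and recovers the underlying order exactly; on inconsistent input it yields the constant-factor approximation analyzed in the paper.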

740 citations


Proceedings ArticleDOI
Fen Xia1, Tie-Yan Liu2, Jue Wang1, Wensheng Zhang1, Hang Li 
05 Jul 2008
TL;DR: A sufficient condition on consistency for ranking is given, which appears to be the first such result in this line of research, and analysis is conducted on three loss functions: likelihood loss, cosine loss, and cross-entropy loss.
Abstract: This paper aims to conduct a study on the listwise approach to learning to rank. The listwise approach learns a ranking function by taking individual lists as instances and minimizing a loss function defined on the predicted list and the ground-truth list. Existing work on the approach mainly focused on the development of new algorithms; methods such as RankCosine and ListNet have been proposed and good performances by them have been observed. Unfortunately, the underlying theory was not sufficiently studied so far. To amend the problem, this paper proposes conducting theoretical analysis of learning to rank algorithms through investigations on the properties of the loss functions, including consistency, soundness, continuity, differentiability, convexity, and efficiency. A sufficient condition on consistency for ranking is given, which seems to be the first such result obtained in related research. The paper then conducts analysis on three loss functions: likelihood loss, cosine loss, and cross entropy loss. The latter two were used in RankCosine and ListNet. The use of the likelihood loss leads to the development of a new listwise method called ListMLE, whose loss function offers better properties, and also leads to better experimental results.
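
The likelihood loss behind ListMLE is the negative log-likelihood of the ground-truth permutation under the Plackett-Luce model of the predicted scores. A minimal sketch (the dictionary-based score representation is illustrative):

```python
import math

def listmle_loss(scores, truth_order):
    """Negative Plackett-Luce log-likelihood of the ground-truth permutation.

    scores: dict item -> model score; truth_order: items from best to worst."""
    loss = 0.0
    remaining = list(truth_order)
    while remaining:
        top = remaining[0]
        # probability that the true best of the remaining items is picked first
        denom = sum(math.exp(scores[i]) for i in remaining)
        loss -= math.log(math.exp(scores[top]) / denom)
        remaining.pop(0)
    return loss
```

Scores that agree with the ground-truth list give a strictly lower loss than scores that reverse it, which is what training on this loss exploits.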

699 citations


Journal ArticleDOI
TL;DR: A new effectiveness metric, rank-biased precision, is introduced that is derived from a simple model of user behavior, is robust if answer rankings are extended to greater depths, and allows accurate quantification of experimental uncertainty, even when only partial relevance judgments are available.
Abstract: A range of methods for measuring the effectiveness of information retrieval systems has been proposed. These are typically intended to provide a quantitative single-value summary of a document ranking relative to a query. However, many of these measures have failings. For example, recall is not well founded as a measure of satisfaction, since the user of an actual system cannot judge recall. Average precision is derived from recall, and suffers from the same problem. In addition, average precision lacks key stability properties that are needed for robust experiments. In this article, we introduce a new effectiveness metric, rank-biased precision, that avoids these problems. Rank-biased pre-cision is derived from a simple model of user behavior, is robust if answer rankings are extended to greater depths, and allows accurate quantification of experimental uncertainty, even when only partial relevance judgments are available.
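
Rank-biased precision has a closed form: a user inspects result i+1 with persistence probability p, so RBP = (1 - p) · Σ r_i · p^(i-1). A direct transcription (p = 0.8 is a commonly used setting, not a prescription):

```python
def rank_biased_precision(relevances, p=0.8):
    """Rank-biased precision: expected rate of gain for a user who moves
    from each result to the next with persistence probability p.

    relevances: per-rank relevance values, top result first."""
    return (1 - p) * sum(r * p ** i for i, r in enumerate(relevances))
```

Because the weights p^(i-1) form a convergent series, extending the ranking to greater depth can only tighten the score within known bounds, which is the robustness property the abstract highlights.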

584 citations


Proceedings ArticleDOI
24 Aug 2008
TL;DR: This paper proposes a novel two-step context-aware query suggestion approach, which outperforms two baseline methods in both coverage and quality of suggestions.
Abstract: Query suggestion plays an important role in improving the usability of search engines. Although some recently proposed methods can make meaningful query suggestions by mining query patterns from search logs, none of them are context-aware - they do not take into account the immediately preceding queries as context in query suggestion. In this paper, we propose a novel context-aware query suggestion approach that works in two steps. In the offline model-learning step, to address data sparseness, queries are summarized into concepts by clustering a click-through bipartite. Then, from session data a concept sequence suffix tree is constructed as the query suggestion model. In the online query suggestion step, a user's search context is captured by mapping the query sequence submitted by the user to a sequence of concepts. By looking up the context in the concept sequence suffix tree, our approach suggests queries to the user in a context-aware manner. We test our approach on a large-scale search log of a commercial search engine containing 1.8 billion search queries, 2.6 billion clicks, and 840 million query sessions. The experimental results clearly show that our approach outperforms two baseline methods in both coverage and quality of suggestions.

545 citations


Journal ArticleDOI
TL;DR: This work shows that a simple (weighted) voting strategy minimizes risk with respect to the well-known Spearman rank correlation and compares RPC to existing label ranking methods, which are based on scoring individual labels instead of comparing pairs of labels.

538 citations


Proceedings ArticleDOI
05 Jul 2008
TL;DR: This work presents two online learning algorithms that directly learn a diverse ranking of documents based on users' clicking behavior and shows that these algorithms minimize abandonment, or alternatively, maximize the probability that a relevant document is found in the top k positions of a ranking.
Abstract: Algorithms for learning to rank Web documents usually assume a document's relevance is independent of other documents. This leads to learned ranking functions that produce rankings with redundant results. In contrast, user studies have shown that diversity at high ranks is often preferred. We present two online learning algorithms that directly learn a diverse ranking of documents based on users' clicking behavior. We show that these algorithms minimize abandonment, or alternatively, maximize the probability that a relevant document is found in the top k positions of a ranking. Moreover, one of our algorithms asymptotically achieves optimal worst-case performance even if users' interests change.
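
The online learners in this line of work (published as ranked bandits) run one bandit per rank position and credit only the position whose document receives the click, so lower positions learn to cover interests the top positions miss. A heavily simplified epsilon-greedy sketch; the statistics layout and the click simulation are illustrative, not the paper's algorithm:

```python
import random

def ranked_bandit_step(rank_stats, docs, user_interests, k, rng, eps=0.1):
    """One round: each of k positions picks a document (epsilon-greedy on its
    own click counts), the simulated user clicks the first relevant document,
    and only the clicked position is credited. Assumes k <= len(docs)."""
    ranking = []
    for pos in range(k):
        stats = rank_stats[pos]
        unseen = [d for d in docs if d not in ranking]
        if rng.random() < eps or not stats:
            choice = rng.choice(unseen)          # explore
        else:
            avail = {d: c for d, c in stats.items() if d not in ranking}
            choice = max(avail, key=avail.get) if avail else rng.choice(unseen)
        ranking.append(choice)
    for pos, doc in enumerate(ranking):          # user clicks first relevant doc
        if doc in user_interests:
            rank_stats[pos][doc] = rank_stats[pos].get(doc, 0) + 1
            break                                # only that position is credited
    return ranking
```

Run over many rounds with users whose interests differ, the per-position credit pushes the learned top-k toward covering distinct interests, i.e. minimizing abandonment.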

514 citations


Proceedings ArticleDOI
20 Jul 2008
TL;DR: A retrieval model that combines a translation-based language model for the question part with a query likelihood approach for the answer part and incorporates word-to-word translation probabilities learned through exploiting different sources of information is proposed.
Abstract: Retrieval in a question and answer archive involves finding good answers for a user's question. In contrast to typical document retrieval, a retrieval model for this task can exploit question similarity as well as ranking the associated answers. In this paper, we propose a retrieval model that combines a translation-based language model for the question part with a query likelihood approach for the answer part. The proposed model incorporates word-to-word translation probabilities learned through exploiting different sources of information. Experiments show that the proposed translation based language model for the question part outperforms baseline methods significantly. By combining with the query likelihood language model for the answer part, substantial additional effectiveness improvements are obtained.
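
The question-part score mixes a word-to-word translation model into a smoothed query likelihood. A simplified sketch, assuming a given translation table and background collection model (both data structures, and the linear smoothing, are illustrative):

```python
import math

def translation_lm_score(query, question, trans_prob, collection_prob, lam=0.5):
    """Log query likelihood with word-to-word translation smoothing (sketch).

    trans_prob[(w, t)] = P(w | t), probability that question term t
    'translates' into query word w; collection_prob[w] = background prob."""
    score = 0.0
    n = len(question)
    for w in query:
        # translation-expanded maximum-likelihood estimate over question terms
        p_trans = sum(trans_prob.get((w, t), 0.0) * question.count(t) / n
                      for t in set(question))
        p = (1 - lam) * p_trans + lam * collection_prob.get(w, 1e-9)
        score += math.log(p)
    return score
```

The translation table is what lets a query word ("kitten") match a lexically different but related question term ("cat"), which plain query likelihood cannot do.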

406 citations


Proceedings ArticleDOI
07 Apr 2008
TL;DR: The paper reports on empirical studies that elicit key properties of SpaceTwist and suggest that the framework offers very good performance and high privacy, at low communication cost.
Abstract: In a mobile service scenario, users query a server for nearby points of interest but they may not want to disclose their locations to the service. Intuitively, location privacy may be obtained at the cost of query performance and query accuracy. The challenge addressed is how to obtain the best possible performance, subject to given requirements for location privacy and query accuracy. Existing privacy solutions that use spatial cloaking employ complex server query processing techniques and entail the transmission of large quantities of intermediate results. Solutions that use transformation-based matching generally fall short in offering practical query accuracy guarantees. Our proposed framework, called SpaceTwist, rectifies these shortcomings for k nearest neighbor (kNN) queries. Starting with a location different from the user's actual location, nearest neighbors are retrieved incrementally until the query is answered correctly by the mobile terminal. This approach is flexible, needs no trusted middleware, and requires only well-known incremental NN query processing on the server. The framework also includes a server-side granular search technique that exploits relaxed query accuracy guarantees for obtaining better performance. The paper reports on empirical studies that elicit key properties of SpaceTwist and suggest that the framework offers very good performance and high privacy, at low communication cost.

361 citations


Proceedings ArticleDOI
11 Feb 2008
TL;DR: This work presents a new family of training objectives that are derived from the rank distributions of documents, induced by smoothed scores, called SoftRank, and focuses on a smoothed approximation to Normalized Discounted Cumulative Gain (NDCG), called SoftNDCG.
Abstract: We address the problem of learning large complex ranking functions. Most IR applications use evaluation metrics that depend only upon the ranks of documents. However, most ranking functions generate document scores, which are sorted to produce a ranking. Hence IR metrics are innately non-smooth with respect to the scores, due to the sort. Unfortunately, many machine learning algorithms require the gradient of a training objective in order to perform the optimization of the model parameters, and because IR metrics are non-smooth, we need to find a smooth proxy objective that can be used for training. We present a new family of training objectives that are derived from the rank distributions of documents, induced by smoothed scores. We call this approach SoftRank. We focus on a smoothed approximation to Normalized Discounted Cumulative Gain (NDCG), called SoftNDCG, and we compare it with three other training objectives in the recent literature. We present two main results. First, SoftRank yields a very good way of optimizing NDCG. Second, we show that it is possible to achieve state-of-the-art test set NDCG results by optimizing a soft NDCG objective on the training set with a different discount function.
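
SoftRank derives the rank distribution analytically from Gaussian-smoothed scores; the same expectation can be illustrated by Monte Carlo sampling, which is what this sketch does (sampling is for intuition only, not the paper's method):

```python
import math
import random

def dcg(gains):
    """Discounted cumulative gain of a gain vector, top result first."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def soft_ndcg_mc(scores, gains, sigma=1.0, n_samples=2000,
                 rng=random.Random(0)):
    """Monte Carlo estimate of expected NDCG under Gaussian-smoothed scores:
    perturb each score with N(0, sigma^2) noise, sort, and average NDCG."""
    ideal = dcg(sorted(gains, reverse=True))
    total = 0.0
    for _ in range(n_samples):
        noisy = [s + rng.gauss(0, sigma) for s in scores]
        order = sorted(range(len(scores)), key=lambda i: -noisy[i])
        total += dcg([gains[i] for i in order]) / ideal
    return total / n_samples
```

Because the expectation varies smoothly with the scores, it admits gradients, which is precisely what the sort-based NDCG denies; SoftRank obtains the same smoothness in closed form.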

Proceedings ArticleDOI
26 Oct 2008
TL;DR: A novel query suggestion algorithm ranks queries by their hitting time on a large scale bipartite graph; it can successfully boost long-tail queries, accommodate personalized query suggestion, and find related authors in research.
Abstract: Generating alternative queries, also known as query suggestion, has long been proved useful to help a user explore and express his information need. In many scenarios, such suggestions can be generated from a large scale graph of queries and other accessory information, such as the clickthrough. However, how to generate suggestions while ensuring their semantic consistency with the original query remains a challenging problem. In this work, we propose a novel query suggestion algorithm based on ranking queries with the hitting time on a large scale bipartite graph. Without involvement of twisted heuristics or heavy tuning of parameters, this method clearly captures the semantic consistency between the suggested query and the original query. Empirical experiments on a large scale query log of a commercial search engine and a scientific literature collection show that hitting time is effective to generate semantically consistent query suggestions. The proposed algorithm and its variations can successfully boost long-tail queries, accommodate personalized query suggestion, and find related authors in research.
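
The core quantity is the expected number of random-walk steps needed to reach the original query's node; candidate queries with small hitting time are ranked as semantically consistent suggestions. A generic value-iteration sketch on an unweighted graph (the paper works on a weighted query-click bipartite graph with a truncated walk):

```python
def hitting_times(adj, target, iters=200):
    """Expected steps for a uniform random walk to first reach `target`.

    adj: dict node -> list of neighbours (every node has at least one).
    Value iteration on h(u) = 1 + mean(h(v) for v in adj[u]), h(target) = 0."""
    h = {u: 0.0 for u in adj}
    for _ in range(iters):
        new = {}
        for u in adj:
            if u == target:
                new[u] = 0.0
            else:
                new[u] = 1.0 + sum(h[v] for v in adj[u]) / len(adj[u])
        h = new
    return h
```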

Proceedings ArticleDOI
26 Oct 2008
TL;DR: A sequence of studies investigating the relationship between observable user behavior and retrieval quality for an operational search engine on the arXiv.org e-print archive finds that paired experiment designs adapted from sensory analysis produce accurate and reliable statements about the relative quality of two retrieval functions.
Abstract: Automatically judging the quality of retrieval functions based on observable user behavior holds promise for making retrieval evaluation faster, cheaper, and more user centered. However, the relationship between observable user behavior and retrieval quality is not yet fully understood. We present a sequence of studies investigating this relationship for an operational search engine on the arXiv.org e-print archive. We find that none of the eight absolute usage metrics we explore (e.g., number of clicks, frequency of query reformulations, abandonment) reliably reflect retrieval quality for the sample sizes we consider. However, we find that paired experiment designs adapted from sensory analysis produce accurate and reliable statements about the relative quality of two retrieval functions. In particular, we investigate two paired comparison tests that analyze clickthrough data from an interleaved presentation of ranking pairs, and we find that both give accurate and consistent results. We conclude that both paired comparison tests give substantially more accurate and sensitive evaluation results than absolute usage metrics in our domain.
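
A paired comparison test of this kind merges the two rankings so that each "team" contributes results alternately and a click is credited to the team that supplied the clicked result. A simplified sketch of team-draft-style interleaving (one scheme of this family; details here are an assumption, not the paper's exact procedure):

```python
import random

def team_draft_interleave(ranking_a, ranking_b, rng=random.Random(0)):
    """Interleave two rankings; returns (merged list, team per document)."""
    merged, team = [], {}
    ia = ib = 0
    count_a = count_b = 0

    def next_new(ranking, i):
        # skip documents already placed by the other team
        while i < len(ranking) and ranking[i] in team:
            i += 1
        return i

    while True:
        ia, ib = next_new(ranking_a, ia), next_new(ranking_b, ib)
        if ia >= len(ranking_a) and ib >= len(ranking_b):
            break
        # the team with fewer picks goes next; ties broken by coin flip
        a_turn = (count_a < count_b) or (count_a == count_b and rng.random() < 0.5)
        if a_turn and ia < len(ranking_a):
            doc = ranking_a[ia]; team[doc] = 'A'; count_a += 1
        elif ib < len(ranking_b):
            doc = ranking_b[ib]; team[doc] = 'B'; count_b += 1
        else:
            doc = ranking_a[ia]; team[doc] = 'A'; count_a += 1
        merged.append(doc)
    return merged, team
```

Counting which team's documents attract more clicks over many queries yields the relative-quality judgment the studies found accurate and reliable.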

Proceedings ArticleDOI
21 Apr 2008
TL;DR: A general ranking framework for factual information retrieval from social media is presented and results of a large scale evaluation demonstrate that the method is highly effective at retrieving well-formed, factual answers to questions, as evaluated on a standard factoid QA benchmark.
Abstract: Community Question Answering has emerged as a popular and effective paradigm for a wide range of information needs. For example, to find out an obscure piece of trivia, it is now possible and even very effective to post a question on a popular community QA site such as Yahoo! Answers, and to rely on other users to provide answers, often within minutes. The importance of such community QA sites is magnified as they create archives of millions of questions and hundreds of millions of answers, many of which are invaluable for the information needs of other searchers. However, to make this immense body of knowledge accessible, effective answer retrieval is required. In particular, as any user can contribute an answer to a question, the majority of the content reflects personal, often unsubstantiated opinions. A ranking that combines both relevance and quality is required to make such archives usable for factual information retrieval. This task is challenging, as the structure and the contents of community QA archives differ significantly from the web setting. To address this problem we present a general ranking framework for factual information retrieval from social media. Results of a large scale evaluation demonstrate that our method is highly effective at retrieving well-formed, factual answers to questions, as evaluated on a standard factoid QA benchmark. We also show that our learning framework can be tuned with the minimum of manual labeling. Finally, we provide result analysis to gain deeper understanding of which features are significant for social media search and retrieval. Our system can be used as a crucial building block for combining results from a variety of social media content with general web search results, and to better integrate social media content for effective information access.

Proceedings ArticleDOI
09 Jun 2008
TL;DR: An efficient exact algorithm, a fast sampling algorithm, and a Poisson approximation based algorithm are presented for answering probabilistic threshold top-k queries on uncertain data, which return the uncertain records whose probability of being in the top-k list is at least p.
Abstract: Uncertain data is inherent in a few important applications such as environmental surveillance and mobile object tracking. Top-k queries (also known as ranking queries) are often natural and useful in analyzing uncertain data in those applications. In this paper, we study the problem of answering probabilistic threshold top-k queries on uncertain data, which computes uncertain records taking a probability of at least p to be in the top-k list where p is a user specified probability threshold. We present an efficient exact algorithm, a fast sampling algorithm, and a Poisson approximation based algorithm. An empirical study using real and synthetic data sets verifies the effectiveness of probabilistic threshold top-k queries and the efficiency of our methods.
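
The heart of the exact algorithm is a Poisson-binomial dynamic program: a record is in the top-k iff it exists and fewer than k higher-ranked records exist. A sketch assuming independent tuples (the paper also handles dependencies via generation rules):

```python
def topk_probability(p_self, p_higher, k):
    """Probability a record is in the top-k list: it must exist, and fewer
    than k of the records ranked above it may exist.

    p_self: existence probability of the record itself.
    p_higher: existence probabilities of all higher-ranked records."""
    # dp[j] = probability that exactly j of the higher-ranked records exist
    dp = [1.0]
    for p in p_higher:
        new = [0.0] * (len(dp) + 1)
        for j, prob in enumerate(dp):
            new[j] += prob * (1 - p)      # this record absent
            new[j + 1] += prob * p        # this record present
        dp = new
    return p_self * sum(dp[:k])
```

The threshold query then simply keeps the records whose value exceeds the user-specified probability p.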

Journal ArticleDOI
TL;DR: In this article, the authors formulate the ranking problem in a rigorous statistical framework, where the goal is to learn a ranking rule for deciding, among two instances, which one is "better" with minimum ranking risk.
Abstract: The problem of ranking/ordering instances, instead of simply classifying them, has recently gained much attention in machine learning. In this paper we formulate the ranking problem in a rigorous statistical framework. The goal is to learn a ranking rule for deciding, among two instances, which one is "better," with minimum ranking risk. Since the natural estimates of the risk are of the form of a U-statistic, results of the theory of U-processes are required for investigating the consistency of empirical risk minimizers. We establish in particular a tail inequality for degenerate U-processes, and apply it for showing that fast rates of convergence may be achieved under specific noise assumptions, just like in classification. Convex risk minimization methods are also studied.
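
The natural empirical risk counts misordered pairs, which is exactly the U-statistic form the analysis builds on. A direct sketch:

```python
def empirical_ranking_risk(rule, labeled):
    """Fraction of ordered pairs the ranking rule misorders (a U-statistic).

    rule(x, y) -> True iff x is predicted better than y.
    labeled: list of (instance, true_quality) pairs; ties are skipped."""
    errors = total = 0
    for i, (x, yx) in enumerate(labeled):
        for xj, yj in labeled[i + 1:]:
            if yx == yj:
                continue
            total += 1
            better_first = yx > yj
            if rule(x, xj) != better_first:
                errors += 1
    return errors / total if total else 0.0
```

Because every term involves a pair of instances, standard i.i.d. concentration bounds do not apply directly; that is why the paper develops tail inequalities for (degenerate) U-processes.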

Proceedings Article
01 Dec 2008
TL;DR: An answer ranking engine for non-factoid questions built using a large online community-generated question-answer collection (Yahoo! Answers) is described and it is demonstrated that using them in combination leads to considerable improvements in accuracy.
Abstract: This work describes an answer ranking engine for non-factoid questions built using a large online community-generated question-answer collection (Yahoo! Answers). We show how such collections may be used to effectively set up large supervised learning experiments. Furthermore we investigate a wide range of feature types, some exploiting NLP processors, and demonstrate that using them in combination leads to considerable improvements in accuracy.

Journal ArticleDOI
TL;DR: A revised method is proposed that avoids the problems of Chu and Tsao's method for ranking fuzzy numbers; since it is based on the original method, fuzzy numbers are ranked in a similar way.
Abstract: In 2002, Chu and Tsao proposed a method to rank fuzzy numbers. They employed the area between the centroid and original points to rank fuzzy numbers; however, there were some problems with the ranking method. In this paper, we point out these problems of Chu and Tsao's method, and then propose a revised method that avoids them when ranking fuzzy numbers. Since the revised method is based on Chu and Tsao's method, it is easy to rank fuzzy numbers in a way similar to the original method.
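
For intuition, the original Chu-Tsao index for a triangular fuzzy number (a, b, c) is the area x̄·ȳ between the centroid and the origin, where x̄ = (a + b + c)/3 and, for a triangular membership function, ȳ = 1/3. An illustrative sketch of that baseline only (not the revised method proposed here):

```python
def centroid_index(a, b, c):
    """Chu-Tsao-style ranking index for a triangular fuzzy number (a, b, c):
    the area between the centroid point and the origin, x_bar * y_bar."""
    x_bar = (a + b + c) / 3.0   # centroid abscissa of the triangle
    y_bar = 1.0 / 3.0           # centroid ordinate for triangular membership
    return x_bar * y_bar
```

Fuzzy numbers are then ranked by comparing their index values; the problems the paper identifies arise in boundary cases of this construction.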

Patent
14 Feb 2008
TL;DR: In this patent, the authors propose a method to retrieve pages according to the quality of the individual pages, where the rank of a page for a keyword is a combination of intrinsic and extrinsic ranks.
Abstract: The present invention provides systems and methods of retrieving pages according to the quality of the individual pages. The rank of a page for a keyword is a combination of intrinsic and extrinsic ranks. Intrinsic rank is the measure of the relevancy of a page to a given keyword as claimed by the author of the page, while extrinsic rank is a measure of the relevancy of a page on a given keyword as indicated by other pages. The former is obtained from the analysis of the keyword matching in various parts of the page, while the latter is obtained from the context-sensitive connectivity analysis of the links connecting the entire Web. The present invention also provides methods to solve the self-consistent equation satisfied by the page weights iteratively in a very efficient way. The ranking mechanism for multi-word queries is also described. Finally, the present invention provides a method to obtain more relevant page weights by dividing the entire set of hypertext pages into a number of distinct groups.

Proceedings ArticleDOI
20 Jul 2008
TL;DR: This paper proposes a K-Nearest Neighbor (KNN) method for query-dependent ranking, and proves a result indicating that the offline approximations are accurate in terms of difference in prediction loss, provided the learning algorithm used is stable with respect to minor changes in training examples.
Abstract: Many ranking models have been proposed in information retrieval, and recently machine learning techniques have also been applied to ranking model construction. Most of the existing methods do not take into consideration the fact that significant differences exist between queries, and only resort to a single function in ranking of documents. In this paper, we argue that it is necessary to employ different ranking models for different queries and conduct what we call query-dependent ranking. As the first such attempt, we propose a K-Nearest Neighbor (KNN) method for query-dependent ranking. We first consider an online method which creates a ranking model for a given query by using the labeled neighbors of the query in the query feature space, and then ranks the documents with respect to the query using the created model. Next, we give two offline approximations of the method, which create the ranking models in advance to enhance the efficiency of ranking. We then prove a result indicating that the approximations are accurate in terms of difference in prediction loss, provided the learning algorithm used is stable with respect to minor changes in training examples. Our experimental results show that the proposed online and offline methods both outperform the baseline method of using a single ranking function.
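
The neighbor-selection step of the online method can be sketched as plain KNN in the query feature space; training the per-query ranking model on the neighbors' labelled documents is omitted here:

```python
def query_dependent_neighbors(query_feat, train_queries, k=3):
    """Return indices of the k training queries nearest to query_feat
    (Euclidean distance in query feature space); a ranking model would
    then be trained on their labelled documents (omitted in this sketch)."""
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    order = sorted(range(len(train_queries)),
                   key=lambda i: dist(query_feat, train_queries[i]))
    return order[:k]
```

The offline approximations precompute models for regions of this feature space so the neighbor lookup at query time is the only remaining cost.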

Proceedings ArticleDOI
21 Apr 2008
TL;DR: It is demonstrated that users' post-search browsing activity strongly reflects implicit endorsement of visited pages, which allows estimating topical relevance of Web resources by mining large-scale datasets of search trails.
Abstract: The paper proposes identifying relevant information sources from the history of combined searching and browsing behavior of many Web users. While it has been previously shown that user interactions with search engines can be employed to improve document ranking, browsing behavior that occurs beyond search result pages has been largely overlooked in prior work. The paper demonstrates that users' post-search browsing activity strongly reflects implicit endorsement of visited pages, which allows estimating topical relevance of Web resources by mining large-scale datasets of search trails. We present heuristic and probabilistic algorithms that rely on such datasets for suggesting authoritative websites for search queries. Experimental evaluation shows that exploiting complete post-search browsing trails outperforms alternatives in isolation (e.g., clickthrough logs), and yields accuracy improvements when employed as a feature in learning to rank for Web search.

Journal ArticleDOI
TL;DR: A linear goal programming model is constructed to integrate fuzzy assessment information and to directly compute the collective ranking values of alternatives without information transformation, in order to solve group decision making (GDM) problems with multi-granularity linguistic assessment information.

Proceedings ArticleDOI
05 Jul 2008
TL;DR: This work formulate the learning problem of predicting diverse subsets and derive a training method based on structural SVMs that explicitly trains to diversify results.
Abstract: In many retrieval tasks, one important goal involves retrieving a diverse set of results (e.g., documents covering a wide range of topics for a search query). First of all, this reduces redundancy, effectively showing more information with the presented results. Secondly, queries are often ambiguous at some level. For example, the query "Jaguar" can refer to many different topics (such as the car or feline). A set of documents with high topic diversity ensures that fewer users abandon the query because no results are relevant to them. Unlike existing approaches to learning retrieval functions, we present a method that explicitly trains to diversify results. In particular, we formulate the learning problem of predicting diverse subsets and derive a training method based on structural SVMs.
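
At prediction time, diverse-subset models of this kind typically select documents greedily by marginal coverage of (weighted) words; a minimal unweighted sketch of that prediction step:

```python
def greedy_diverse_subset(docs, k):
    """Greedily pick k documents maximizing marginal word coverage,
    the kind of diversity objective a structural SVM can be trained
    to predict (feature weights omitted; coverage is unweighted here).

    docs: dict doc_id -> set of words."""
    chosen, covered = [], set()
    for _ in range(min(k, len(docs))):
        best = max((d for d in docs if d not in chosen),
                   key=lambda d: len(docs[d] - covered))
        chosen.append(best)
        covered |= docs[best]
    return chosen
```

The learned model replaces raw word counts with learned importance weights, so training "explicitly trains to diversify results" rather than relying on a fixed redundancy heuristic.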

Patent
06 Mar 2008
TL;DR: In this article, a solution for evaluating a plurality of entities includes assigning an attribute score to each entity for each of a multitude of attributes, which can be further processed to identify a set of suspicious entities.
Abstract: A solution for evaluating a plurality of entities includes assigning an attribute score to each entity for each of a multitude of attributes. For one or more of the attributes, the corresponding attribute score is assigned based on a ranking of each entity with respect to the other entities for the attribute. A composite score is generated for each entity based on the attribute scores for the attributes, which can be further processed to, for example, identify a set of suspicious entities.

Journal ArticleDOI
D. Cossock1, Tong Zhang
TL;DR: This work considers a formulation of the statistical ranking problem which it calls subset ranking, and focuses on the discounted cumulated gain (DCG) criterion that measures the quality of items near the top of the rank-list.
Abstract: The ranking problem has become increasingly important in modern applications of statistical methods in automated decision making systems. In particular, we consider a formulation of the statistical ranking problem which we call subset ranking, and focus on the discounted cumulated gain (DCG) criterion that measures the quality of items near the top of the rank-list. Similar to error minimization for binary classification, direct optimization of natural ranking criteria such as DCG leads to a nonconvex optimization problems that can be NP-hard. Therefore, a computationally more tractable approach is needed. We present bounds that relate the approximate optimization of DCG to the approximate minimization of certain regression errors. These bounds justify the use of convex learning formulations for solving the subset ranking problem. The resulting estimation methods are not conventional, in that we focus on the estimation quality in the top-portion of the rank-list. We further investigate the asymptotic statistical behavior of these formulations. Under appropriate conditions, the consistency of the estimation schemes with respect to the DCG metric can be derived.

Proceedings ArticleDOI
20 Jul 2008
TL;DR: This paper presents a cluster-based resampling method to select better pseudo-relevant documents based on the relevance model, and shows higher relevance density than the baseline relevance model on all collections, resulting in better retrieval accuracy in pseudo-relevance feedback.
Abstract: Typical pseudo-relevance feedback methods assume the top-retrieved documents are relevant and use these pseudo-relevant documents to expand terms. The initial retrieval set can, however, contain a great deal of noise. In this paper, we present a cluster-based resampling method to select better pseudo-relevant documents based on the relevance model. The main idea is to use document clusters to find dominant documents for the initial retrieval set, and to repeatedly feed the documents to emphasize the core topics of a query. Experimental results on large-scale web TREC collections show significant improvements over the relevance model. For justification of the resampling approach, we examine relevance density of feedback documents. A higher relevance density will result in greater retrieval accuracy, ultimately approaching true relevance feedback. The resampling approach shows higher relevance density than the baseline relevance model on all collections, resulting in better retrieval accuracy in pseudo-relevance feedback. This result indicates that the proposed method is effective for pseudo-relevance feedback.

Proceedings ArticleDOI
20 Jul 2008
TL;DR: A value profile based approach for ranking program statements according to their likelihood of being faulty, which outperforms Tarantula, the most effective prior approach to statement-ranking-based fault localization, on the benchmark programs the authors studied.
Abstract: We present a value profile based approach for ranking program statements according to their likelihood of being faulty. The key idea is to see which program statements exercised during a failing run use values that can be altered so that the execution instead produces correct output. Our approach is effective in locating statements that are either faulty or directly linked to a faulty statement. We present experimental results showing the effectiveness and efficiency of our approach. Our approach outperforms Tarantula which, to our knowledge, is the most effective prior approach to statement-ranking-based fault localization, on the benchmark programs we studied.
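For context, Tarantula, the baseline the authors compare against, ranks statements by a coverage-based suspiciousness score: a statement executed mostly by failing runs scores close to 1. A minimal sketch of that baseline formula:

```python
def tarantula_suspiciousness(passed, failed, total_passed, total_failed):
    """Tarantula suspiciousness of a single statement.

    passed/failed: number of passing/failing test runs executing the statement.
    total_passed/total_failed: totals over the whole test suite.
    Returns a value in [0, 1]; higher means more suspicious.
    """
    fail_ratio = failed / total_failed if total_failed else 0.0
    pass_ratio = passed / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0
```

The value-profile approach in this paper goes further than coverage alone: it asks whether altering the values a statement uses would flip a failing run to a correct output, which is why it can outperform purely coverage-based ranking.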

Proceedings ArticleDOI
23 Oct 2008
TL;DR: This paper proposes Social Ranking, a method that exploits recommender system techniques to increase the efficiency of searches within Web 2.0. It answers a user's query by ranking (recommending) content based on the inferred semantic distance of the query to the tags associated with that content, weighted by the similarity of the querying user to the users who created those tags.
Abstract: Social (or folksonomic) tagging has become a very popular way to describe, categorise, search, discover and navigate content within Web 2.0 websites. Unlike taxonomies, which impose a hierarchical categorisation on content, folksonomies empower end users by enabling them to freely create and choose the categories (in this case, tags) that best describe some content. However, as tags are informally defined, continually changing, and ungoverned, social tagging has often been criticised for lowering, rather than increasing, the efficiency of searching, due to the proliferation of synonyms, homonyms, and polysemous tags, as well as the heterogeneity of users and the noise they introduce. In this paper, we propose Social Ranking, a method that exploits recommender system techniques to increase the efficiency of searches within Web 2.0. We measure users' similarity based on their past tag activity. We infer tags' relationships based on their association with content. We then propose a mechanism to answer a user's query that ranks (recommends) content based on the inferred semantic distance of the query to the tags associated with such content, weighted by the similarity of the querying user to the users who created those tags. A thorough evaluation conducted on the CiteULike dataset demonstrates that Social Ranking neatly improves coverage, while not compromising on accuracy.
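The two ingredients of the abstract, user similarity from past tag activity and similarity-weighted ranking, can be sketched as follows. The cosine-over-tag-counts similarity and the `overlap * (1 + sim)` scoring are illustrative choices, not the paper's exact formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two users' tag-count vectors (dicts)."""
    dot = sum(u[t] * v.get(t, 0) for t in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def social_rank(query_tags, querying_user_tags, items):
    """Rank items by tag overlap with the query, weighted by tagger similarity.

    items: list of (item_id, tagger_tag_vector, item_tags) tuples.
    Items tagged by users similar to the querying user are boosted.
    """
    scored = []
    for item_id, tagger_tags, item_tags in items:
        overlap = len(set(query_tags) & set(item_tags))
        sim = cosine(querying_user_tags, tagger_tags)
        scored.append((item_id, overlap * (1 + sim)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The effect on coverage comes from the similarity weighting: content tagged with related (not identical) tags by like-minded users can still surface, which exact tag matching would miss.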

Proceedings ArticleDOI
26 Oct 2008
TL;DR: Two new methods are proposed in this paper to measure the ranking distance based on disagreement in terms of pair-wise orders, where the ranking distance represents the disagreement between the objective ranking list and the initial text-based one.
Abstract: Content-based video search reranking can be regarded as a process that uses visual content to recover the "true" ranking list from the noisy one generated based on textual information. This paper explicitly formulates this problem in the Bayesian framework, i.e., maximizing the ranking score consistency among visually similar video shots while minimizing the ranking distance, which represents the disagreement between the objective ranking list and the initial text-based one. Different from existing point-wise ranking distance measures, which compute the distance in terms of the individual scores, two new methods are proposed in this paper to measure the ranking distance based on the disagreement in terms of pair-wise orders. Specifically, hinge distance penalizes the pairs with reversed order according to the degree of the reverse, while preference strength distance further considers the preference degree. By incorporating the proposed distances into the optimization objective, two reranking methods are developed, which are solved using quadratic programming and matrix computation respectively. Evaluation on the TRECVID video search benchmark shows that performance improvements of up to 21% on TRECVID 2006 and 61.11% on TRECVID 2007 are achieved relative to the text search baseline.
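The hinge distance idea, penalizing reversed pairs in proportion to the degree of the reversal, can be sketched as below. This is a plain interpretation of the description in the abstract, not the paper's exact objective.

```python
def hinge_ranking_distance(initial_scores, new_scores):
    """Pair-wise hinge distance between two score lists over the same items.

    For every pair ordered i-before-j by the initial (text-based) scores,
    a penalty accrues only when the new scores reverse that order, and the
    penalty grows with the size of the reversal (a hinge on the score gap).
    """
    n = len(initial_scores)
    distance = 0.0
    for i in range(n):
        for j in range(n):
            if initial_scores[i] > initial_scores[j]:
                # zero when the new ranking agrees; otherwise the reversal gap
                distance += max(0.0, new_scores[j] - new_scores[i])
    return distance
```

Unlike a point-wise distance, this is invariant to uniform score shifts: only pair-wise order disagreements contribute, matching the motivation given in the abstract.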

Proceedings ArticleDOI
26 Oct 2008
TL;DR: It is found that searchers are more likely to be successful when the frequencies of the query and destination URL are similar, and it is shown that the benefits obtained by search and navigation actions depend on the frequency of the information goal.
Abstract: We describe results from Web search log studies aimed at elucidating user behaviors associated with queries and destination URLs that appear with different frequencies. We note the diversity of information goals that searchers have and the differing ways that goals are specified. We examine rare and common information goals that are specified using rare or common queries. We identify several significant differences in user behavior depending on the rarity of the query and the destination URL. We find that searchers are more likely to be successful when the frequencies of the query and destination URL are similar. We also establish that the behavioral differences observed for queries and goals of varying rarity persist even after accounting for potential confounding variables, including query length, search engine ranking, session duration, and task difficulty. Finally, using an information-theoretic measure of search difficulty, we show that the benefits obtained by search and navigation actions depend on the frequency of the information goal.
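One natural way to instantiate an information-theoretic measure of search difficulty is the entropy of the destination URLs observed for a query: clicks concentrated on one URL suggest an easy, navigational goal, while clicks spread over many URLs suggest a harder one. This is a plausible sketch of such a measure, not necessarily the one used in the study.

```python
import math
from collections import Counter

def goal_entropy(destinations):
    """Shannon entropy (in bits) of the destination URLs logged for a query.

    destinations: list of destination URLs, one per observed search session.
    Low entropy = clicks concentrate on one URL (an easy, navigational goal);
    high entropy = clicks spread widely (a harder, more ambiguous goal).
    """
    counts = Counter(destinations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())
```

Under such a measure, the finding above reads naturally: when query frequency and destination frequency match, the query carries most of the bits needed to pin down the goal, so searchers succeed more often.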