
Showing papers on "Ranking (information retrieval)" published in 2012


Posted Content
TL;DR: This paper presents a generic optimization criterion BPR-Opt for personalized ranking, the maximum posterior estimator derived from a Bayesian analysis of the problem, and provides a generic learning algorithm for optimizing models with respect to BPR-Opt.
Abstract: Item recommendation is the task of predicting a personalized ranking on a set of items (e.g. websites, movies, products). In this paper, we investigate the most common scenario with implicit feedback (e.g. clicks, purchases). There are many methods for item recommendation from implicit feedback like matrix factorization (MF) or adaptive k-nearest-neighbor (kNN). Even though these methods are designed for the item prediction task of personalized ranking, none of them is directly optimized for ranking. In this paper we present a generic optimization criterion BPR-Opt for personalized ranking that is the maximum posterior estimator derived from a Bayesian analysis of the problem. We also provide a generic learning algorithm for optimizing models with respect to BPR-Opt. The learning method is based on stochastic gradient descent with bootstrap sampling. We show how to apply our method to two state-of-the-art recommender models: matrix factorization and adaptive kNN. Our experiments indicate that for the task of personalized ranking our optimization method outperforms the standard learning techniques for MF and kNN. The results show the importance of optimizing models for the right criterion.
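To make the criterion concrete, below is a minimal sketch of one BPR-Opt stochastic gradient step for the matrix factorization model, where a bootstrap-sampled triple (u, i, j) encodes "user u prefers item i over item j". The dimensions, learning rate, and regularization constant are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Minimal sketch of one BPR-Opt SGD step for matrix factorization.
# Sizes and hyperparameters are illustrative assumptions.
rng = np.random.default_rng(0)
n_users, n_items, n_factors = 100, 500, 16
W = 0.01 * rng.standard_normal((n_users, n_factors))   # user factors
H = 0.01 * rng.standard_normal((n_items, n_factors))   # item factors
lr, reg = 0.05, 0.002

def bpr_step(u, i, j):
    """Ascend ln sigmoid(x_uij) - reg * ||params||^2 for one triple."""
    wu, hi, hj = W[u].copy(), H[i].copy(), H[j].copy()
    x_uij = wu @ (hi - hj)                 # score difference
    g = 1.0 / (1.0 + np.exp(x_uij))        # sigmoid(-x_uij)
    W[u] += lr * (g * (hi - hj) - reg * wu)
    H[i] += lr * (g * wu - reg * hi)
    H[j] += lr * (-g * wu - reg * hj)

# usage: sample an observed (u, i) and an unobserved item j, then update
bpr_step(u=3, i=42, j=7)
```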

1,134 citations


Journal ArticleDOI
TL;DR: This survey presents a unified view of a large number of recent approaches to AQE that leverage various data sources and employ very different principles and techniques.
Abstract: The relative ineffectiveness of information retrieval systems is largely caused by the inaccuracy with which a query formed by a few keywords models the actual user information need. One well known method to overcome this limitation is automatic query expansion (AQE), whereby the user’s original query is augmented by new features with a similar meaning. AQE has a long history in the information retrieval community, but only in recent years has it reached a level of scientific and experimental maturity, especially in laboratory settings such as TREC. This survey presents a unified view of a large number of recent approaches to AQE that leverage various data sources and employ very different principles and techniques. The following questions are addressed. Why is query expansion so important to improve search effectiveness? What are the main steps involved in the design and implementation of an AQE component? What approaches to AQE are available and how do they compare? Which issues must still be resolved before AQE becomes a standard component of large operational information retrieval systems (e.g., search engines)?
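As a concrete illustration of one classic AQE strategy the survey covers, pseudo-relevance feedback, here is a minimal sketch that treats the top-ranked documents of an initial retrieval as relevant and appends their most frequent non-query terms. The `search` callable, stopword list, and parameters are assumptions for illustration.

```python
from collections import Counter

# Pseudo-relevance feedback sketch: expand the query with frequent terms
# from the top-k initially retrieved documents. All names are assumptions.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for", "with"}

def expand_query(query, search, k_docs=10, k_terms=5):
    top_docs = search(query)[:k_docs]      # initial retrieval
    q_terms = set(query.lower().split())
    counts = Counter(
        term
        for doc in top_docs
        for term in doc.lower().split()
        if term not in STOPWORDS and term not in q_terms
    )
    expansion = [t for t, _ in counts.most_common(k_terms)]
    return query + " " + " ".join(expansion)

corpus = ["deep learning for ranking",
          "learning to rank with gradient descent"]

def toy_search(q):  # stand-in retrieval: naive term overlap
    qs = set(q.lower().split())
    return sorted(corpus, key=lambda d: -len(qs & set(d.split())))

print(expand_query("learning rank", toy_search, k_docs=2, k_terms=2))
```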

1,058 citations


Proceedings Article
03 Dec 2012
TL;DR: A new loss-augmented inference algorithm that is quadratic in the code length is developed, with optimization via a piecewise-smooth upper bound inspired by latent structural SVMs; strong retrieval performance is shown on CIFAR-10 and MNIST, with promising classification results using no more than kNN on the binary codes.
Abstract: Motivated by large-scale multimedia applications we propose to learn mappings from high-dimensional data to binary codes that preserve semantic similarity. Binary codes are well suited to large-scale applications as they are storage efficient and permit exact sub-linear kNN search. The framework is applicable to broad families of mappings, and uses a flexible form of triplet ranking loss. We overcome discontinuous optimization of the discrete mappings by minimizing a piecewise-smooth upper bound on empirical loss, inspired by latent structural SVMs. We develop a new loss-augmented inference algorithm that is quadratic in the code length. We show strong retrieval performance on CIFAR-10 and MNIST, with promising classification results using no more than kNN on the binary codes.
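A minimal sketch of the triplet ranking loss idea on binary codes: for a triple (anchor, positive, negative), the loss penalizes the positive code for not being closer in Hamming distance than the negative by a margin. The sign-of-linear-projection encoder and the margin value are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Triplet ranking loss on binary codes; encoder and margin are
# illustrative assumptions, not the paper's exact formulation.
def encode(x, Wc):
    """Map a real vector to a binary code via the sign of a projection."""
    return (x @ Wc > 0).astype(np.uint8)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

def triplet_ranking_loss(h_a, h_pos, h_neg, margin=1):
    return max(0, hamming(h_a, h_pos) - hamming(h_a, h_neg) + margin)

rng = np.random.default_rng(0)
Wc = rng.standard_normal((32, 16))            # 32-dim input -> 16-bit codes
x_a, x_p, x_n = rng.standard_normal((3, 32))
print(triplet_ranking_loss(encode(x_a, Wc), encode(x_p, Wc), encode(x_n, Wc)))
```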

562 citations


Journal ArticleDOI
TL;DR: A new semi-supervised algorithm called ranking with Local Regression and Global Alignment (LRGA) to learn a robust Laplacian matrix for data ranking and a semi- supervised long-term RF algorithm to refine the multimedia data representation.
Abstract: We present a new framework for multimedia content analysis and retrieval which consists of two independent algorithms. First, we propose a new semi-supervised algorithm called ranking with Local Regression and Global Alignment (LRGA) to learn a robust Laplacian matrix for data ranking. In LRGA, for each data point, a local linear regression model is used to predict the ranking scores of its neighboring points. A unified objective function is then proposed to globally align the local models from all the data points so that an optimal ranking score can be assigned to each data point. Second, we propose a semi-supervised long-term Relevance Feedback (RF) algorithm to refine the multimedia data representation. The proposed long-term RF algorithm utilizes both the multimedia data distribution in multimedia feature space and the history RF information provided by users. A trace ratio optimization problem is then formulated and solved by an efficient algorithm. The algorithms have been applied to several content-based multimedia retrieval applications, including cross-media retrieval, image retrieval, and 3D motion/pose data retrieval. Comprehensive experiments on four data sets have demonstrated its advantages in precision, robustness, scalability, and computational efficiency.

405 citations


Journal ArticleDOI
TL;DR: The Leiden Ranking 2011/2012 as discussed by the authors is a ranking of universities based on bibliometric indicators of publication output, citation impact, and scientific collaboration, which includes 500 major universities from 41 different countries.
Abstract: The Leiden Ranking 2011/2012 is a ranking of universities based on bibliometric indicators of publication output, citation impact, and scientific collaboration. The ranking includes 500 major universities from 41 different countries. This paper provides an extensive discussion of the Leiden Ranking 2011/2012. The ranking is compared with other global university rankings, in particular the Academic Ranking of World Universities (commonly known as the Shanghai Ranking) and the Times Higher Education World University Rankings. The comparison focuses on the methodological choices underlying the different rankings. Also, a detailed description is offered of the data collection methodology of the Leiden Ranking 2011/2012 and of the indicators used in the ranking. Various innovations in the Leiden Ranking 2011/2012 are presented. These innovations include (1) an indicator based on counting a university's highly cited publications, (2) indicators based on fractional rather than full counting of collaborative publications, (3) the possibility of excluding non-English language publications, and (4) the use of stability intervals. Finally, some comments are made on the interpretation of the ranking and a number of limitations of the ranking are pointed out. © 2012 Wiley Periodicals, Inc.
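A toy illustration of innovation (2), fractional versus full counting: a paper co-authored by n universities contributes 1/n to each under fractional counting but 1 to each under full counting. The data below is made up.

```python
# Fractional vs full counting of collaborative publications; toy data.
papers = [
    {"universities": ["A", "B"]},        # two-university collaboration
    {"universities": ["A"]},
    {"universities": ["A", "B", "C"]},
]

full, fractional = {}, {}
for p in papers:
    unis = p["universities"]
    for u in unis:
        full[u] = full.get(u, 0) + 1
        fractional[u] = fractional.get(u, 0) + 1 / len(unis)

print(full)        # {'A': 3, 'B': 2, 'C': 1}
print(fractional)  # {'A': 1.83, 'B': 0.83, 'C': 0.33} (approximately)
```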

376 citations


Posted Content
TL;DR: The Leiden Ranking is compared with other global university rankings, in particular the Academic Ranking of World Universities (commonly known as the Shanghai Ranking) and the Times Higher Education World University Rankings, and the comparison focuses on the methodological choices underlying the different rankings.
Abstract: The Leiden Ranking 2011/2012 is a ranking of universities based on bibliometric indicators of publication output, citation impact, and scientific collaboration. The ranking includes 500 major universities from 41 different countries. This paper provides an extensive discussion of the Leiden Ranking 2011/2012. The ranking is compared with other global university rankings, in particular the Academic Ranking of World Universities (commonly known as the Shanghai Ranking) and the Times Higher Education World University Rankings. Also, a detailed description is offered of the data collection methodology of the Leiden Ranking 2011/2012 and of the indicators used in the ranking. Various innovations in the Leiden Ranking 2011/2012 are presented. These innovations include (1) an indicator based on counting a university's highly cited publications, (2) indicators based on fractional rather than full counting of collaborative publications, (3) the possibility of excluding non-English language publications, and (4) the use of stability intervals. Finally, some comments are made on the interpretation of the ranking, and a number of limitations of the ranking are pointed out.

338 citations


Patent
07 Jan 2012
TL;DR: Systems and methods are described for implementing searches using contextual information associated with a Web page (or other document) that a user is viewing when a query is entered.
Abstract: Systems and methods, including user interfaces, are provided for implementing searches using contextual information associated with a Web page (or other document) that a user is viewing when a query is entered. The page includes a contextual search interface that has an associated context vector representing content of the page. When the user submits a search query via the contextual search interface, the query and the context vector are both provided to the query processor and used in responding to the query.
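A minimal sketch of the scoring idea, under the assumption that the query and the page's context vector share one term space: documents are cosine-scored against a weighted blend of the two vectors. The mixing weight alpha is an illustrative assumption, not a value from the patent.

```python
import numpy as np

# Blend query and page-context vectors, then cosine-score documents.
# The mixing weight alpha is an illustrative assumption.
def blended_scores(query_vec, context_vec, doc_matrix, alpha=0.7):
    q = alpha * query_vec + (1 - alpha) * context_vec
    q = q / (np.linalg.norm(q) + 1e-12)
    D = doc_matrix / (np.linalg.norm(doc_matrix, axis=1, keepdims=True) + 1e-12)
    return D @ q          # one score per document, higher is better

rng = np.random.default_rng(0)
print(blended_scores(rng.random(6), rng.random(6), rng.random((4, 6))))
```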

331 citations


Proceedings ArticleDOI
09 Sep 2012
TL;DR: This paper proposes a new CF approach, Collaborative Less-is-More Filtering (CLiMF), where the model parameters are learned by directly maximizing the Mean Reciprocal Rank (MRR), which is a well-known information retrieval metric for measuring the performance of top-k recommendations.
Abstract: In this paper we tackle the problem of recommendation in scenarios with binary relevance data, when only a few (k) items are recommended to individual users. Past work on Collaborative Filtering (CF) has either not addressed the ranking problem for binary relevance datasets, or not specifically focused on improving top-k recommendations. To solve the problem, we propose a new CF approach, Collaborative Less-is-More Filtering (CLiMF). In CLiMF the model parameters are learned by directly maximizing the Mean Reciprocal Rank (MRR), which is a well-known information retrieval metric for measuring the performance of top-k recommendations. We achieve linear computational complexity by introducing a lower bound of the smoothed reciprocal rank metric. Experiments on two social network datasets demonstrate the effectiveness and the scalability of CLiMF, and show that CLiMF significantly outperforms a naive baseline and two state-of-the-art CF methods.
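A sketch of how a CLiMF-style smoothed reciprocal-rank lower bound can be evaluated for one user: each relevant item j is rewarded for a high score f[j] and penalized for relevant competitors k that outrank it. This follows the paper's objective only in spirit; the naive evaluation below is quadratic in the number of relevant items, and the scores and relevance sets are toy assumptions.

```python
import numpy as np

# CLiMF-style smoothed reciprocal-rank lower bound for one user.
# Toy scores and relevance sets; follows the paper in spirit only.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def smoothed_rr_lower_bound(f, relevant):
    total = 0.0
    for j in relevant:
        total += np.log(sigmoid(f[j]))           # reward a high score
        for k in relevant:
            if k != j:
                # penalize relevant competitors that outrank item j
                total += np.log(1.0 - sigmoid(f[k] - f[j]))
    return total

f = np.array([1.2, -0.3, 0.8, 0.1])    # predicted scores for all items
print(smoothed_rr_lower_bound(f, relevant=[0, 2]))
```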

323 citations


Book ChapterDOI
01 Jan 2012
TL;DR: In fuzzy multi-criteria decision-making problems, the ranking of alternatives must take into account their fuzzy scores in all criteria, the weights assigned to each decision criterion, the possible difficulty of comparing two alternatives when one is significantly better on a subset of criteria but much worse on at least one criterion from the complementary subset, and the decision maker's attitude towards the risk associated with evaluation.
Abstract: In fuzzy multi-criteria decision-making problems, the ranking of alternatives must take into account their fuzzy scores in all criteria, the weights assigned to each decision criterion, the possible difficulties of comparing two alternatives when one is significantly better than the other on a subset of criteria, but much worse on at least one criterion from the complementary subset of criteria, and the decision maker’s attitude towards the risk associated with evaluation.
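The aggregation problem can be illustrated with a generic textbook scheme (not necessarily the chapter's own method): score each alternative by a crisp-weighted sum of defuzzified triangular fuzzy scores, then rank. All numbers below are made up.

```python
# Generic fuzzy MCDM sketch: weighted sum of defuzzified triangular
# fuzzy scores. Not necessarily the chapter's method; toy numbers.
def centroid(tfn):
    """Centroid defuzzification of a triangular fuzzy number (a, b, c)."""
    a, b, c = tfn
    return (a + b + c) / 3.0

def rank_alternatives(scores, weights):
    crisp = {
        alt: sum(w * centroid(t) for w, t in zip(weights, tfns))
        for alt, tfns in scores.items()
    }
    return sorted(crisp, key=crisp.get, reverse=True)

scores = {
    "A1": [(0.6, 0.7, 0.9), (0.3, 0.5, 0.6)],   # one TFN per criterion
    "A2": [(0.4, 0.6, 0.8), (0.5, 0.7, 0.9)],
}
print(rank_alternatives(scores, weights=[0.6, 0.4]))   # ['A2', 'A1']
```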

236 citations


Journal ArticleDOI
TL;DR: An algorithm is presented which not only analyzes the overall sentiment of a document/review, but also identifies the semantic orientation of specific components of the review that lead to a particular sentiment.

214 citations


Patent
24 Feb 2012
TL;DR: In this paper, the authors propose a system to identify a source with which each of the links is associated and rank the list of links based at least in part on the quality of the identified sources.
Abstract: A system ranks results. The system may receive a list of links. The system may identify a source with which each of the links is associated and rank the list of links based at least in part on a quality of the identified sources.

Proceedings ArticleDOI
09 Sep 2012
TL;DR: This paper presents a computationally effective approach for the direct minimization of a ranking objective function, without sampling, and demonstrates by experiments on the Y!Music and Netflix data sets that the proposed method outperforms other implicit feedback recommenders in many cases in terms of the ErrorRate, ARP and Recall evaluation metrics.
Abstract: Two flavors of the recommendation problem are the explicit and the implicit feedback settings. In the explicit feedback case, users rate items and the user-item preference relationship can be modelled on the basis of the ratings. In the harder but more common implicit feedback case, the system has to infer user preferences from indirect information: the presence or absence of events, such as whether a user viewed an item. One approach for handling implicit feedback is to minimize a ranking objective function instead of the conventional prediction mean squared error. The naive minimization of a ranking objective function is typically expensive. This difficulty is usually overcome by a trade-off: sacrificing the accuracy to some extent for computational efficiency by sampling the objective function. In this paper, we present a computationally effective approach for the direct minimization of a ranking objective function, without sampling. We demonstrate by experiments on the Y!Music and Netflix data sets that the proposed method outperforms other implicit feedback recommenders in many cases in terms of the ErrorRate, ARP and Recall evaluation metrics.

Journal ArticleDOI
TL;DR: A mobile recommendation system answers popular location-related queries in daily life; three collaborative-filtering-based algorithms are proposed that consistently outperform the competing baselines, with the newly proposed third algorithm also outperforming the authors' two earlier algorithms.

Patent
01 Jun 2012
TL;DR: In this article, a query from a first user regarding a proposed transaction was sent to the plurality of potential entities for the proposed transaction based on the at least one affinity between the first user and the potential entities.
Abstract: A method includes: receiving information regarding a plurality of completed transactions from a plurality of users; receiving a query from a first user regarding a proposed transaction; determining at least one affinity between the first user and the plurality of users based on the information; determining a ranking or expectation of success for each of a plurality of potential entities for the proposed transaction based on the at least one affinity; selecting a plurality of selected entities based on the ranking or expectation of success for each of the potential entities; and sending, in response to the query, the plurality of selected entities to the first user.

Book
26 Feb 2012
TL;DR: A comprehensive overview of the mathematical algorithms and methods used to rate and rank sports teams, political candidates, products, Web pages, and more can be found in Who's #1? as discussed by the authors.
Abstract: A website's ranking on Google can spell the difference between success and failure for a new business. NCAA football ratings determine which schools get to play for the big money in postseason bowl games. Product ratings influence everything from the clothes we wear to the movies we select on Netflix. Ratings and rankings are everywhere, but how exactly do they work? Who's #1? offers an engaging and accessible account of how scientific rating and ranking methods are created and applied to a variety of uses. Amy Langville and Carl Meyer provide the first comprehensive overview of the mathematical algorithms and methods used to rate and rank sports teams, political candidates, products, Web pages, and more. In a series of interesting asides, Langville and Meyer provide fascinating insights into the ingenious contributions of many of the field's pioneers. They survey and compare the different methods employed today, showing why their strengths and weaknesses depend on the underlying goal, and explaining why and when a given method should be considered. Langville and Meyer also describe what can and can't be expected from the most widely used systems. The science of rating and ranking touches virtually every facet of our lives, and now you don't need to be an expert to understand how it really works. Who's #1? is the definitive introduction to the subject. It features easy-to-understand examples and interesting trivia and historical facts, and much of the required mathematics is included.
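As a taste of the book's subject matter, here is a sketch of Massey's method, one of the classic rating systems in this literature: ratings are the least-squares solution of point-differential equations, with one equation replaced by the constraint that ratings sum to zero so the solution is unique. The game data is invented.

```python
import numpy as np

# Massey's method on invented game data: solve ratings from point
# differentials, pinning the solution with a sum-to-zero constraint.
teams = ["A", "B", "C"]
idx = {t: i for i, t in enumerate(teams)}
games = [("A", "B", 7), ("B", "C", 3), ("A", "C", 10)]  # (winner, loser, margin)

n = len(teams)
M = np.zeros((n, n))      # Massey matrix
p = np.zeros(n)           # cumulative point differentials
for w, l, margin in games:
    i, j = idx[w], idx[l]
    M[i, i] += 1; M[j, j] += 1
    M[i, j] -= 1; M[j, i] -= 1
    p[i] += margin; p[j] -= margin

M[-1, :] = 1.0            # replace last row with the sum-to-zero constraint
p[-1] = 0.0
ratings = np.linalg.solve(M, p)
print(sorted(zip(teams, ratings), key=lambda t: -t[1]))
```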

01 Jan 2012
TL;DR: This chapter presents the SVMs for binary classification in Section 2, SVR in Section 3, ranking SVM in Section 4, and another recently developed method for learning ranking SVM, called Ranking Vector Machine (RVM), in Section 5.
Abstract: Support Vector Machines (SVMs) have been extensively researched in the data mining and machine learning communities for the last decade and actively applied to applications in various domains. SVMs are typically used for learning classification, regression, or ranking functions, for which they are called classifying SVM, support vector regression (SVR), or ranking SVM (or RankSVM), respectively. Two special properties of SVMs are that they achieve (1) high generalization by maximizing the margin and (2) efficient learning of nonlinear functions via the kernel trick. This chapter introduces these general concepts and techniques of SVMs for learning classification, regression, and ranking functions. In particular, we first present the SVMs for binary classification in Section 2, SVR in Section 3, ranking SVM in Section 4, and another recently developed method for learning ranking SVM, called Ranking Vector Machine (RVM), in Section 5.
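A common way to realize the ranking SVM idea in practice is the pairwise transform: each ordered pair (x_i preferred over x_j) becomes a difference vector labeled +1 (and its reverse, -1), and a linear classifier on the differences yields a scoring function. A minimal sketch with toy data and illustrative hyperparameters:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Pairwise transform behind ranking SVM; toy data, illustrative settings.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 5))
utility = X[:, 0] + 0.1 * rng.standard_normal(40)   # hidden relevance

pairs, labels = [], []
for i in range(len(X)):
    for j in range(len(X)):
        if utility[i] > utility[j] + 0.5:            # i clearly preferred
            pairs.append(X[i] - X[j]); labels.append(1)
            pairs.append(X[j] - X[i]); labels.append(-1)

clf = LinearSVC(C=1.0, fit_intercept=False).fit(np.array(pairs), labels)
scores = X @ clf.coef_.ravel()       # ranking scores induced by the weights
print(np.argsort(-scores)[:5])       # top-5 items by learned ranking
```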

Journal ArticleDOI
TL;DR: These methods use low-complexity relevance and redundancy criteria, are applicable to supervised, semi-supervised, and unsupervised learning, and can act as pre-processors for computationally intensive methods, focusing their attention on smaller subsets of promising features.

Proceedings ArticleDOI
09 Sep 2012
TL;DR: This paper focuses on analyzing social streams in real-time for personalized topic recommendation and discovery, and presents Stream Ranking Matrix Factorization (RMFX), which uses a pairwise approach to matrix factorization in order to optimize the personalized ranking of topics.
Abstract: The Social Web is successfully established, and steadily growing in terms of users, content and services. People generate and consume data in real-time within social networking services, such as Twitter, and increasingly rely upon continuous streams of messages for real-time access to fresh knowledge about current affairs. In this paper, we focus on analyzing social streams in real-time for personalized topic recommendation and discovery. We consider collaborative filtering as an online ranking problem and present Stream Ranking Matrix Factorization (RMFX), which uses a pairwise approach to matrix factorization in order to optimize the personalized ranking of topics. Our novel approach follows a selective sampling strategy to perform online model updates based on active learning principles, closely simulating the task of identifying relevant items from a pool of mostly uninteresting ones. RMFX is particularly suitable for large-scale applications. Experiments on the "476 million Twitter tweets" dataset show that our online approach largely outperforms recommendations based on Twitter's global trend, and that it delivers highly competitive Top-N recommendations faster, while using less space, than Weighted Regularized Matrix Factorization (WRMF), a state-of-the-art matrix factorization technique for Collaborative Filtering, demonstrating the efficacy of our approach.

Journal ArticleDOI
TL;DR: The results of two case studies show that the combined ranking of application descriptions and API documents yields the most-relevant search results from Exemplar.
Abstract: A fundamental problem of finding software applications that are highly relevant to development tasks is the mismatch between the high-level intent reflected in the descriptions of these tasks and low-level implementation details of applications. To reduce this mismatch we created an approach called EXEcutable exaMPLes ARchive (Exemplar) for finding highly relevant software projects from large archives of applications. After a programmer enters a natural-language query that contains high-level concepts (e.g., MIME, datasets), Exemplar retrieves applications that implement these concepts. Exemplar ranks applications in three ways. First, we consider the descriptions of applications. Second, we examine the Application Programming Interface (API) calls used by applications. Third, we analyze the dataflow among those API calls. We performed two case studies (with professional and student developers) to evaluate how these three rankings contribute to the quality of the search results from Exemplar. The results of our studies show that the combined ranking of application descriptions and API documents yields the most-relevant search results. We released Exemplar and our case study data to the public.

Proceedings ArticleDOI
16 Apr 2012
TL;DR: A temporal modeling framework adapted from physics and signal processing that can be used to predict time-varying user behavior using smoothing and trends and a novel learning algorithm that explicitly learns when to apply a given prediction model among a set of such models.
Abstract: User behavior on the Web changes over time. For example, the queries that people issue to search engines, and the underlying informational goals behind the queries vary over time. In this paper, we examine how to model and predict this temporal user behavior. We develop a temporal modeling framework adapted from physics and signal processing that can be used to predict time-varying user behavior using smoothing and trends. We also explore other dynamics of Web behaviors, such as the detection of periodicities and surprises. We develop a learning procedure that can be used to construct models of users' activities based on features of current and historical behaviors. The results of experiments indicate that by using our framework to predict user behavior, we can achieve significant improvements in prediction compared to baseline models that weight historical evidence the same for all queries. We also develop a novel learning algorithm that explicitly learns when to apply a given prediction model among a set of such models. Our improved temporal modeling of user behavior can be used to enhance query suggestions, crawling policies, and result ranking.
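One simple instance of the smoothing-and-trend modeling described above is Holt's linear exponential smoothing, shown below forecasting the next value of a behavioral time series such as a query's daily frequency. The smoothing constants and data are illustrative assumptions, not the paper's learned models.

```python
# Holt's linear exponential smoothing: level + trend one-step forecast.
# Constants and data are illustrative assumptions.
def holt_forecast(series, alpha=0.5, beta=0.3):
    """One-step-ahead forecast from level + trend smoothing."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        last_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
    return level + trend

daily_counts = [120, 130, 128, 140, 151, 149, 160]
print(holt_forecast(daily_counts))   # predicted count for the next day
```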

Proceedings ArticleDOI
12 Aug 2012
TL;DR: This paper proposes a time-sensitive approach for query auto-completion that applies time-series modeling and ranks candidates according to their forecasted frequencies, and suggests that modeling the temporal trends of queries can significantly improve the ranking of QAC candidates.
Abstract: Query auto-completion (QAC) is a common feature in modern search engines. High-quality QAC candidates enhance the search experience by saving users time that otherwise would be spent on typing each character or word sequentially. Current QAC methods rank suggestions according to their past popularity. However, query popularity changes over time, and the ranking of candidates must be adjusted accordingly. For instance, while halloween might be the right suggestion after typing ha in October, harry potter might be better any other time. Surprisingly, despite the importance of QAC as a key feature in most online search engines, its temporal dynamics have been under-studied. In this paper, we propose a time-sensitive approach for query auto-completion. Instead of ranking candidates according to their past popularity, we apply time-series models and rank candidates according to their forecasted frequencies. Our experiments on 846K queries and their daily frequencies sampled over a period of 4.5 years show that predicting the popularity of queries solely based on their past frequency can be misleading, and the forecasts obtained by time-series modeling are substantially more reliable. Our results also suggest that modeling the temporal trends of queries can significantly improve the ranking of QAC candidates.
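A toy sketch of the core idea: rank QAC candidates by forecasted frequency rather than cumulative past popularity. A trivial exponentially weighted forecast stands in for the paper's time-series models, and the candidate histories below are made up.

```python
# Rank QAC candidates by a simple EWMA forecast instead of raw popularity.
# The forecast model and histories are illustrative stand-ins.
def ewma_forecast(history, alpha=0.6):
    f = history[0]
    for y in history[1:]:
        f = alpha * y + (1 - alpha) * f
    return f

histories = {                        # daily frequencies, oldest -> newest
    "halloween":    [900, 700, 120, 60, 30],
    "harry potter": [300, 310, 305, 320, 330],
}

by_past = sorted(histories, key=lambda q: -sum(histories[q]))
by_forecast = sorted(histories, key=lambda q: -ewma_forecast(histories[q]))
print(by_past)       # ['halloween', 'harry potter']  (stale popularity)
print(by_forecast)   # ['harry potter', 'halloween']  (forecast-aware)
```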

Journal ArticleDOI
TL;DR: This article demonstrates that, in order to estimate the mathematical expectations of Average Precision (AP) and Normalized Discounted Cumulative Gain (NDCG), one only needs to predict the relevance probability of each image.
Abstract: This article studies a novel problem in image search. Given a text query and the image ranking list returned by an image search system, we propose an approach to automatically predict the search performance. We demonstrate that, in order to estimate the mathematical expectations of Average Precision (AP) and Normalized Discounted Cumulative Gain (NDCG), we only need to predict the relevance probability of each image. We accomplish the task with a query-adaptive graph-based learning approach based on the images’ ranking order and visual content. We validate our approach with a large-scale dataset that contains the image search results of 1,165 queries from 4 popular image search engines. Empirical studies demonstrate that our approach is able to generate predictions that are highly correlated with the real search performance. Based on the proposed image search performance prediction scheme, we introduce three applications: image metasearch, multilingual image search, and Boolean image search. Comprehensive experiments are conducted to validate our approach.
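The observation that expected AP (and, analogously, NDCG) is determined by per-image relevance probabilities can be illustrated with a simple Monte Carlo estimate over Bernoulli draws; the article's closed-form route is not reproduced here, and the probabilities below are illustrative.

```python
import numpy as np

# Monte Carlo estimate of expected AP from per-item relevance
# probabilities (illustrative values, in ranked order).
def average_precision(rels):
    rels = np.asarray(rels, dtype=float)
    if rels.sum() == 0:
        return 0.0
    prec_at_k = np.cumsum(rels) / np.arange(1, len(rels) + 1)
    return float((prec_at_k * rels).sum() / rels.sum())

def expected_ap(p_relevant, n_samples=10000, seed=0):
    rng = np.random.default_rng(seed)
    draws = rng.random((n_samples, len(p_relevant))) < p_relevant
    return float(np.mean([average_precision(d) for d in draws]))

p = np.array([0.9, 0.8, 0.4, 0.7, 0.1])   # relevance probs per rank
print(expected_ap(p))
```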

Journal ArticleDOI
TL;DR: A new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores (PADOG), designed to emphasize the genes appearing in few gene sets, versus genes that appear in many gene sets.
Abstract: The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless how specific they are to a given gene set. In this work we propose a new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores. The gene weights are designed to emphasize the genes appearing in few gene sets, versus genes that appear in many gene sets. We demonstrate the usefulness of the method when analyzing gene sets that correspond to the KEGG pathways, and hence we called our method Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). Unlike most gene set analysis methods which are validated through the analysis of 2-3 data sets followed by a human interpretation of the results, the validation employed here uses 24 different data sets and a completely objective assessment scheme that makes minimal assumptions and eliminates the need for possibly biased human assessments of the analysis results. PADOG significantly improves gene set ranking and boosts sensitivity of analysis using information already available in the gene expression profiles and the collection of gene sets to be analyzed. The advantages of PADOG over other existing approaches are shown to be stable to changes in the database of gene sets to be analyzed. PADOG was implemented as an R package available at: http://bioinformaticsprb.med.wayne.edu/PADOG/ or http://www.bioconductor.org.
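A sketch of the scoring idea: a gene set's score is the mean absolute moderated t-score, with each gene down-weighted by how many gene sets it appears in. The weight formula below (inverse frequency rescaled to [1, 2]) is an illustrative stand-in, not necessarily PADOG's exact definition.

```python
import numpy as np

# Down-weighting of overlapping genes; weight formula is an
# illustrative stand-in, not necessarily PADOG's exact definition.
def gene_weights(freq):
    """freq[g]: number of gene sets containing gene g."""
    f = np.asarray(freq, dtype=float)
    return 1.0 + np.sqrt((f.max() - f) / max(f.max() - f.min(), 1e-12))

def weighted_set_score(t_scores, freq):
    return float(np.mean(np.abs(t_scores) * gene_weights(freq)))

t_scores = np.array([2.1, -0.4, 3.0, 1.2])   # moderated gene t-scores
set_freq = np.array([1, 14, 2, 9])           # gene-set memberships per gene
print(weighted_set_score(t_scores, set_freq))
```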

Journal ArticleDOI
TL;DR: A new method for ranking intuitionistic fuzzy values (IFVs) by using the similarity measure and the accuracy degree of IFVs is proposed and applied to multi-attribute decision making.
Abstract: In this paper, we propose a new method for ranking intuitionistic fuzzy values (IFVs) by using the similarity measure and the accuracy degree of IFVs. Then we apply the proposed ranking method to multi-attribute decision making.
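A toy sketch of one way such a ranking can be organized: order IFVs (mu, nu) primarily by closeness to the ideal IFV (1, 0), breaking ties by the accuracy degree mu + nu. The distance-based closeness below is an illustrative proxy, not the paper's similarity measure.

```python
# Rank IFVs by closeness to the ideal (1, 0), tie-break by accuracy.
# The closeness function is an illustrative proxy, not the paper's measure.
def closeness_to_ideal(ifv):
    mu, nu = ifv
    return 1.0 - (abs(mu - 1.0) + abs(nu - 0.0)) / 2.0

def accuracy(ifv):
    mu, nu = ifv
    return mu + nu

ifvs = [(0.6, 0.2), (0.7, 0.3), (0.5, 0.3)]
ranked = sorted(ifvs, key=lambda v: (closeness_to_ideal(v), accuracy(v)),
                reverse=True)
print(ranked)   # (0.7, 0.3) ties (0.6, 0.2) on closeness, wins on accuracy
```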

Journal ArticleDOI
TL;DR: By tolerating faults of a small part of the most significant components, the reliability of cloud applications can be greatly improved, and an algorithm is proposed to automatically determine an optimal fault-tolerance strategy for the significant cloud components.
Abstract: Cloud computing is becoming a mainstream aspect of information technology. More and more enterprises deploy their software systems in the cloud environment. The cloud applications are usually large scale and include a lot of distributed cloud components. Building highly reliable cloud applications is a challenging and critical research problem. To attack this challenge, we propose a component ranking framework, named FTCloud, for building fault-tolerant cloud applications. FTCloud includes two ranking algorithms. The first algorithm employs component invocation structures and invocation frequencies for making significant component ranking. The second ranking algorithm systematically fuses the system structure information as well as the application designers' wisdom to identify the significant components in a cloud application. After the component ranking phase, an algorithm is proposed to automatically determine an optimal fault-tolerance strategy for the significant cloud components. The experimental results show that by tolerating faults of a small part of the most significant components, the reliability of cloud applications can be greatly improved.
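A sketch in the spirit of FTCloud's first ranking algorithm: a PageRank-style significance score over the component invocation graph, with transition weights proportional to invocation frequencies. The damping factor and toy graph are illustrative assumptions.

```python
import numpy as np

# PageRank-style component significance over an invocation graph;
# damping factor and graph are illustrative assumptions.
def rank_components(freq, d=0.85, iters=100):
    """freq[i][j]: how often component i invokes component j."""
    F = np.asarray(freq, dtype=float)
    n = F.shape[0]
    row_sums = F.sum(axis=1, keepdims=True)
    P = np.divide(F, row_sums, out=np.full_like(F, 1.0 / n),
                  where=row_sums > 0)                 # row-stochastic
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - d) / n + d * (P.T @ r)
    return r

freq = [[0, 5, 1],
        [0, 0, 9],
        [2, 0, 0]]
print(rank_components(freq))   # higher score = more significant component
```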

Proceedings ArticleDOI
08 Feb 2012
TL;DR: This work proposes a generative model of relevance which can be used to infer the relevance of a document to a specific user for a search query, and shows how to learn these profiles from a user's long-term search history.
Abstract: We present a new approach for personalizing Web search results to a specific user. Ranking functions for Web search engines are typically trained by machine learning algorithms using either direct human relevance judgments or indirect judgments obtained from click-through data from millions of users. The rankings are thus optimized to this generic population of users, not to any specific user. We propose a generative model of relevance which can be used to infer the relevance of a document to a specific user for a search query. The user-specific parameters of this generative model constitute a compact user profile. We show how to learn these profiles from a user's long-term search history. Our algorithm for computing the personalized ranking is simple and has little computational overhead. We evaluate our personalization approach using historical search data from thousands of users of a major Web search engine. Our findings demonstrate gains in retrieval performance for queries with high ambiguity, with particularly large improvements for acronym queries.

Journal ArticleDOI
TL;DR: This paper proposes a more general framework for handling time-sensitive queries and automatically identifies the important time intervals that are likely to be of interest for a query and builds scoring techniques that seamlessly integrate the temporal aspect into the overall ranking mechanism.
Abstract: Time is an important dimension of relevance for a large number of searches, such as over blogs and news archives. So far, research on searching over such collections has largely focused on locating topically similar documents for a query. Unfortunately, topic similarity alone is not always sufficient for document ranking. In this paper, we observe that, for an important class of queries that we call time-sensitive queries, the publication time of the documents in a news archive is important and should be considered in conjunction with the topic similarity to derive the final document ranking. Earlier work has focused on improving retrieval for “recency” queries that target recent documents. We propose a more general framework for handling time-sensitive queries and we automatically identify the important time intervals that are likely to be of interest for a query. Then, we build scoring techniques that seamlessly integrate the temporal aspect into the overall ranking mechanism. We present an extensive experimental evaluation using a variety of news article data sets, including TREC data as well as real web data analyzed using the Amazon Mechanical Turk. We examine several techniques for detecting the important time intervals for a query over a news archive and for incorporating this information in the retrieval process. We show that our techniques are robust and significantly improve result quality for time-sensitive queries compared to state-of-the-art retrieval techniques.
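A minimal sketch of the general recipe: estimate how much of a query's temporal interest mass covers a document's publication time, then blend that temporal score with topical similarity. The interval weights and blending parameter are illustrative assumptions, not the paper's learned quantities.

```python
# Blend topical similarity with a temporal score from query-specific
# intervals of interest. Weights and lambda are illustrative assumptions.
def temporal_score(pub_date, intervals):
    """intervals: list of ((start, end), weight) pairs, weights sum to 1."""
    return sum(w for (start, end), w in intervals if start <= pub_date <= end)

def blended_score(topic_sim, pub_date, intervals, lam=0.6):
    return lam * topic_sim + (1 - lam) * temporal_score(pub_date, intervals)

intervals = [((20120101, 20120131), 0.7), ((20110601, 20110630), 0.3)]
print(blended_score(0.8, 20120115, intervals))   # doc inside a hot interval
print(blended_score(0.8, 20100101, intervals))   # same topic, stale date
```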

Journal ArticleDOI
TL;DR: This work proposes a method of Ranking based Multi-correlation Tensor Factorization (RMTF), to jointly model the ternary relations among user, image, and tag, and further to precisely reconstruct the user-aware image-tag associations as a result.
Abstract: Large-scale user contributed images with tags are easily available on photo sharing websites. However, the noisy or incomplete correspondence between the images and tags prohibits them from being leveraged for precise image retrieval and effective management. To tackle the problem of tag refinement, we propose a method of Ranking based Multi-correlation Tensor Factorization (RMTF), to jointly model the ternary relations among user, image, and tag, and further to precisely reconstruct the user-aware image-tag associations as a result. Since the user interest or background can be explored to eliminate the ambiguity of image tags, the proposed RMTF is believed to be superior to the traditional solutions, which only focus on the binary image-tag relations. During the model estimation, we employ a ranking based optimization scheme to interpret the tagging data, in which the pair-wise qualitative difference between positive and negative examples is used, instead of the point-wise 0/1 confidence. Specifically, the positive examples are directly decided by the observed user-image-tag interrelations, while the negative ones are collected with respect to the most semantically and contextually irrelevant tags. Extensive experiments on a benchmark Flickr dataset demonstrate the effectiveness of the proposed solution for tag refinement. We also show attractive performances on two potential applications as the by-products of the ternary relation analysis.

Journal ArticleDOI
TL;DR: It is found that the traditional approach leads to extremely distorted rankings and substantially distorted mappings of authors in this field when based on first- or all-author citation counting, whereas last-author-based citation ranking and cocitation mapping both appear relatively immune to the author name ambiguity problem.
Abstract: In this article, we explore how strongly author name disambiguation (AND) affects the results of an author-based citation analysis study, and identify conditions under which the traditional simplified approach of using surnames and first initials may suffice in practice. We compare author citation ranking and cocitation mapping results in the stem cell research field from 2004 to 2009 using two AND approaches: the traditional simplified approach of using author surname and first initial and a sophisticated algorithmic approach. We find that the traditional approach leads to extremely distorted rankings and substantially distorted mappings of authors in this field when based on first- or all-author citation counting, whereas last-author-based citation ranking and cocitation mapping both appear relatively immune to the author name ambiguity problem. This is largely because Romanized names of Chinese and Korean authors, who are very active in this field, are extremely ambiguous, but few of these researchers consistently publish as last authors in bylines. We conclude that a more earnest effort is required to deal with the author name ambiguity problem in both citation analysis and information retrieval, especially given the current trend toward globalization. In the stem cell research field, in which laboratory heads are traditionally listed as last authors in bylines, last-author-based citation ranking and cocitation mapping using the traditional approach to author name disambiguation may serve as a simple workaround, but likely at the price of largely filtering out Chinese and Korean contributions to the field as well as important contributions by young researchers. © 2012 Wiley Periodicals, Inc.

Journal ArticleDOI
TL;DR: This work describes the framework, explains the challenges, and evaluates the gain over a baseline machine learning approach, showing how this design can be used to implement solutions to particular challenges that arise in applying machine learning for evidence-based hypothesis evaluation.
Abstract: The final stage in the IBM DeepQA pipeline involves ranking all candidate answers according to their evidence scores and judging the likelihood that each candidate answer is correct. In DeepQA, this is done using a machine learning framework that is phase-based, providing capabilities for manipulating the data and applying machine learning in successive applications. We show how this design can be used to implement solutions to particular challenges that arise in applying machine learning for evidence-based hypothesis evaluation. Our approach facilitates an agile development environment for DeepQA; evidence scoring strategies can be easily introduced, revised, and reconfigured without the need for error-prone manual effort to determine how to combine the various evidence scores. We describe the framework, explain the challenges, and evaluate the gain over a baseline machine learning approach.