
Showing papers on "Pairwise comparison" published in 2013


Proceedings ArticleDOI
11 Aug 2013
TL;DR: This paper develops a novel, computationally efficient method called FAST for ranking all possible pairs of features as candidates for inclusion into the model, and shows the effectiveness of FAST in ranking candidate pairs of features.
Abstract: Standard generalized additive models (GAMs) usually model the dependent variable as a sum of univariate models. Although previous studies have shown that standard GAMs can be interpreted by users, their accuracy is significantly less than more complex models that permit interactions. In this paper, we suggest adding selected terms of interacting pairs of features to standard GAMs. The resulting models, which we call GA2M-models, for Generalized Additive Models plus Interactions, consist of univariate terms and a small number of pairwise interaction terms. Since these models only include one- and two-dimensional components, the components of GA2M-models can be visualized and interpreted by users. To explore the huge (quadratic) number of pairs of features, we develop a novel, computationally efficient method called FAST for ranking all possible pairs of features as candidates for inclusion into the model. In a large-scale empirical study, we show the effectiveness of FAST in ranking candidate pairs of features. In addition, we show the surprising result that GA2M-models have almost the same performance as the best full-complexity models on a number of real datasets. Thus this paper postulates that for many problems, GA2M-models can yield models that are both intelligible and accurate.

389 citations
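The pair-ranking idea can be illustrated with a toy residual-based screen: fit crude univariate components first, then score every feature pair by how much a two-dimensional lookup table fitted to the residuals reduces squared error. This is only a minimal sketch of the concept, not the authors' FAST algorithm; the binning scheme and synthetic data below are assumptions for illustration.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n, d, bins = 5000, 6, 8

# Synthetic data with one true interaction between features 0 and 1 (assumed for illustration).
X = rng.uniform(0, 1, size=(n, d))
y = np.sin(3 * X[:, 2]) + (X[:, 0] > 0.5) * (X[:, 1] > 0.5) + 0.1 * rng.normal(size=n)

def bin_index(x, bins):
    return np.minimum((x * bins).astype(int), bins - 1)

# Stage 1: univariate terms as per-bin means (a crude stand-in for shape functions).
residual = y - y.mean()
for j in range(d):
    b = bin_index(X[:, j], bins)
    means = np.array([residual[b == k].mean() if np.any(b == k) else 0.0 for k in range(bins)])
    residual = residual - means[b]

# Stage 2: rank every feature pair by the RSS reduction of a 2-D binned fit on the residuals.
def pair_gain(j, k):
    bj, bk = bin_index(X[:, j], bins), bin_index(X[:, k], bins)
    cell = bj * bins + bk
    fit = np.zeros(n)
    for c in np.unique(cell):
        mask = cell == c
        fit[mask] = residual[mask].mean()
    return np.sum(residual ** 2) - np.sum((residual - fit) ** 2)

ranking = sorted(combinations(range(d), 2), key=lambda p: -pair_gain(*p))
print("top-ranked candidate pairs:", ranking[:3])
```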


Journal ArticleDOI
TL;DR: In this paper, the authors focus on the primary role of effect modifiers, which are study and patient characteristics associated with treatment effects, and provide a basic explanation of when network meta-analysis is as valid as pairwise meta-analysis.
Abstract: In the last decade, network meta-analysis of randomized controlled trials has been introduced as an extension of pairwise meta-analysis. The advantage of network meta-analysis over standard pairwise meta-analysis is that it facilitates indirect comparisons of multiple interventions that have not been studied in a head-to-head fashion. Although assumptions underlying pairwise meta-analyses are well understood, those concerning network meta-analyses are perceived to be more complex and prone to misinterpretation. In this paper, we aim to provide a basic explanation of when network meta-analysis is as valid as pairwise meta-analysis. We focus on the primary role of effect modifiers, which are study and patient characteristics associated with treatment effects. Because network meta-analysis includes different trials comparing different interventions, the distribution of effect modifiers can not only vary across studies for a particular comparison (as with standard pairwise meta-analysis, causing heterogeneity), but also between comparisons (causing inconsistency). If there is an imbalance in the distribution of effect modifiers between different types of direct comparisons, the related indirect comparisons will be biased. If it can be assumed that this is not the case, network meta-analysis is as valid as pairwise meta-analysis. The validity of network meta-analysis is based on the underlying assumption that there is no imbalance in the distribution of effect modifiers across the different types of direct treatment comparisons, regardless of the structure of the evidence network.

379 citations
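The core idea of an indirect comparison can be written down in a few lines: if trials provide effect estimates of B versus A and of C versus A, the indirect estimate of C versus B is their difference, with the variances adding. This is the standard Bucher-style adjusted indirect comparison rather than anything specific to this paper, and the numbers below are made up for illustration.

```python
import numpy as np

# Hypothetical pooled log-odds-ratio estimates and standard errors (illustrative values).
d_AB, se_AB = -0.40, 0.12   # treatment B vs. A
d_AC, se_AC = -0.65, 0.15   # treatment C vs. A

# Indirect comparison of C vs. B via the common comparator A.
d_BC = d_AC - d_AB
se_BC = np.sqrt(se_AB ** 2 + se_AC ** 2)  # variances of independent estimates add

ci = (d_BC - 1.96 * se_BC, d_BC + 1.96 * se_BC)
print(f"indirect C vs. B: {d_BC:.3f}, 95% CI ({ci[0]:.3f}, {ci[1]:.3f})")
# Validity hinges on effect modifiers being similarly distributed across the AB and AC trials.
```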


Journal ArticleDOI
TL;DR: A line matching algorithm is presented which utilizes both the local appearance of lines and their geometric attributes to solve the problems of segment fragmentation and geometric variation, and which is accurate even for low-texture images because of its pairwise geometric consistency evaluation.

375 citations


Posted Content
TL;DR: The validity of network meta-analysis is based on the underlying assumption that there is no imbalance in the distribution of effect modifiers across the different types of direct treatment comparisons, regardless of the structure of the evidence network.
Abstract: Background In the last decade, network meta-analysis of randomized controlled trials has been introduced as an extension of pairwise meta-analysis. The advantage of network meta-analysis over standard pairwise meta-analysis is that it facilitates indirect comparisons of multiple interventions that have not been studied in a head-to-head fashion. Although assumptions underlying pairwise meta-analyses are well understood, those concerning network meta-analyses are perceived to be more complex and prone to misinterpretation. Discussion In this paper, we aim to provide a basic explanation of when network meta-analysis is as valid as pairwise meta-analysis. We focus on the primary role of effect modifiers, which are study and patient characteristics associated with treatment effects. Because network meta-analysis includes different trials comparing different interventions, the distribution of effect modifiers can not only vary across studies for a particular comparison (as with standard pairwise meta-analysis, causing heterogeneity), but also between comparisons (causing inconsistency). If there is an imbalance in the distribution of effect modifiers between different types of direct comparisons, the related indirect comparisons will be biased. If it can be assumed that this is not the case, network meta-analysis is as valid as pairwise meta-analysis. Summary The validity of network meta-analysis is based on the underlying assumption that there is no imbalance in the distribution of effect modifiers across the different types of direct treatment comparisons, regardless of the structure of the evidence network.

358 citations


Proceedings ArticleDOI
04 Feb 2013
TL;DR: This work proposes a new model to predict a gold-standard ranking that hinges on combining pairwise comparisons via crowdsourcing and formalizes this as an active learning strategy that incorporates an exploration-exploitation tradeoff and implements it using an efficient online Bayesian updating scheme.
Abstract: Inferring rankings over elements of a set of objects, such as documents or images, is a key learning problem for such important applications as Web search and recommender systems. Crowdsourcing services provide an inexpensive and efficient means to acquire preferences over objects via labeling by sets of annotators. We propose a new model to predict a gold-standard ranking that hinges on combining pairwise comparisons via crowdsourcing. In contrast to traditional ranking aggregation methods, the approach learns about and folds into consideration the quality of contributions of each annotator. In addition, we minimize the cost of assessment by introducing a generalization of the traditional active learning scenario to jointly select the annotator and pair to assess while taking into account the annotator quality, the uncertainty over ordering of the pair, and the current model uncertainty. We formalize this as an active learning strategy that incorporates an exploration-exploitation tradeoff and implement it using an efficient online Bayesian updating scheme. Using simulated and real-world data, we demonstrate that the active learning strategy achieves significant reductions in labeling cost while maintaining accuracy.

326 citations
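A much-simplified flavour of the approach: keep a score per item, repeatedly pick the pair whose scores are closest (the most uncertain ordering), ask a noisy annotator, and update the scores. The Elo-style update and the simulated annotator below are assumptions for illustration; the paper's model additionally learns per-annotator quality and uses Bayesian updating.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 8
true_quality = rng.normal(size=m)          # hidden gold-standard scores
scores = np.zeros(m)                       # running estimates

def annotator_says_i_beats_j(i, j, accuracy=0.85):
    correct = true_quality[i] > true_quality[j]
    return correct if rng.random() < accuracy else not correct

K = 0.2
for _ in range(300):
    # Pick the most uncertain pair: the two items whose current scores are closest together.
    i, j = min(((a, b) for a in range(m) for b in range(a + 1, m)),
               key=lambda p: abs(scores[p[0]] - scores[p[1]]))
    win = annotator_says_i_beats_j(i, j)
    p_i = 1.0 / (1.0 + np.exp(scores[j] - scores[i]))   # logistic win probability of i
    scores[i] += K * ((1.0 if win else 0.0) - p_i)
    scores[j] -= K * ((1.0 if win else 0.0) - p_i)

print("estimated ranking:", np.argsort(-scores))
print("true ranking     :", np.argsort(-true_quality))
```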


Proceedings Article
03 Aug 2013
TL;DR: A new and improved assumption, group Bayesian personalized ranking (GBPR), is proposed by introducing richer interactions among users, and a novel algorithm is designed correspondingly, which can recommend items more accurately as shown by various ranking-oriented evaluation metrics on four real-world datasets in the authors' experiments.
Abstract: One-class collaborative filtering or collaborative ranking with implicit feedback has been steadily receiving more attention, mostly due to the "one-class" characteristics of data in various services, e.g., "like" in Facebook and "bought" in Amazon. Previous works for solving this problem include pointwise regression methods based on absolute rating assumptions and pairwise ranking methods with relative score assumptions, where the latter was empirically found to perform much better because it models users' ranking-related preferences more directly. However, the two fundamental assumptions made in the pairwise ranking methods, (1) individual pairwise preference over two items and (2) independence between two users, may not always hold. As a response, we propose a new and improved assumption, group Bayesian personalized ranking (GBPR), by introducing richer interactions among users. In particular, we introduce group preference to relax the aforementioned individual and independence assumptions. We then design a novel algorithm correspondingly, which can recommend items more accurately, as shown by various ranking-oriented evaluation metrics on four real-world datasets in our experiments.

222 citations
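For context, the pairwise baseline that GBPR builds on (Bayesian personalized ranking, BPR) fits latent factors by stochastic gradient ascent on the log-sigmoid of the score difference between an observed item and an unobserved one. The sketch below is that vanilla BPR baseline on made-up implicit feedback, not the group-preference extension itself.

```python
import numpy as np

rng = np.random.default_rng(2)
n_users, n_items, k = 50, 100, 8
# Implicit feedback: the set of items each user has interacted with (synthetic).
positives = [set(rng.choice(n_items, size=5, replace=False)) for _ in range(n_users)]

U = 0.1 * rng.normal(size=(n_users, k))
V = 0.1 * rng.normal(size=(n_items, k))
lr, reg = 0.05, 0.01

for _ in range(20000):
    u = rng.integers(n_users)
    i = rng.choice(list(positives[u]))            # observed ("one-class") item
    j = rng.integers(n_items)                     # sampled unobserved item
    if j in positives[u]:
        continue
    x_uij = U[u] @ (V[i] - V[j])                  # pairwise preference score
    g = 1.0 / (1.0 + np.exp(x_uij))               # gradient weight, sigma(-x_uij)
    u_f = U[u].copy()
    U[u] += lr * (g * (V[i] - V[j]) - reg * U[u])
    V[i] += lr * (g * u_f - reg * V[i])
    V[j] += lr * (-g * u_f - reg * V[j])

# GBPR replaces the purely individual preference for item i with a blend of the user's
# score and the scores of a group of like-minded users (see the abstract above).
print("example top-5 recommendation for user 0:", np.argsort(-(U[0] @ V.T))[:5])
```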


Proceedings Article
16 Jun 2013
TL;DR: If an average of O(n log(n)) binary comparisons are measured, then one algorithm recovers the true ranking in a uniform sense, while the other predicts the ranking more accurately near the top than the bottom.
Abstract: The ranking of n objects based on pairwise comparisons is a core machine learning problem, arising in recommender systems, ad placement, player ranking, biological applications and others. In many practical situations the true pairwise comparisons cannot be actively measured, but a subset of all n(n-1)/2 comparisons is passively and noisily observed. Optimization algorithms (e.g., the SVM) could be used to predict a ranking with fixed expected Kendall tau distance, while achieving an Ω (n) lower bound on the corresponding sample complexity. However, due to their centralized structure they are difficult to extend to online or distributed settings. In this paper we show that much simpler algorithms can match the same Ω (n) lower bound in expectation. Furthermore, if an average of O(n log(n)) binary comparisons are measured, then one algorithm recovers the true ranking in a uniform sense, while the other predicts the ranking more accurately near the top than the bottom. We discuss extensions to online and distributed ranking, with benefits over traditional alternatives.

189 citations
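One of the "much simpler algorithms" alluded to can be as plain as counting wins: score every object by the fraction of its observed noisy comparisons that it wins, then sort. The generator of noisy comparisons below is an assumption for illustration, and this Borda-style count is meant only to convey the flavour of the result, not the exact estimators analysed in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
true_rank_score = np.arange(n)[::-1]        # object 0 is best, n-1 is worst (synthetic)

# Passively observe a random subset of pairs, each flipped with probability p_noise.
p_obs, p_noise = 0.3, 0.1
wins = np.zeros(n)
counts = np.zeros(n)
for i in range(n):
    for j in range(i + 1, n):
        if rng.random() > p_obs:
            continue
        better = i if true_rank_score[i] > true_rank_score[j] else j
        if rng.random() < p_noise:
            better = i + j - better         # flip the outcome of this comparison
        loser = i + j - better
        wins[better] += 1
        counts[better] += 1
        counts[loser] += 1

score = np.divide(wins, counts, out=np.zeros(n), where=counts > 0)
print("estimated ranking (best first):", np.argsort(-score)[:10])
```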


Journal ArticleDOI
TL;DR: The results of Genton et al. (2011) are generalized to the Brown-Resnick model, and it is shown that the efficiency gain is substantial only for very smooth processes, which are generally unrealistic in applications.
Abstract: SUMMARY Genton et al. (2011) investigated the gain in efficiency when triplewise, rather than pairwise, likelihood is used to fit the popular Smith max-stable model for spatial extremes. We generalize their results to the Brown–Resnick model and show that the efficiency gain is substantial only for very smooth processes, which are generally unrealistic in applications.

158 citations


Journal ArticleDOI
TL;DR: It is shown that a pairwise maximum entropy model, which takes into account region-specific activity rates and pairwise interactions, can be robustly and accurately fitted to resting-state human brain activities obtained by functional magnetic resonance imaging and reflects anatomical connexions more accurately than the conventional functional connectivity method.
Abstract: The resting-state human brain networks underlie fundamental cognitive functions and consist of complex interactions among brain regions. However, the level of complexity of the resting-state networks has not been quantified, which has prevented comprehensive descriptions of the brain activity as an integrative system. Here, we address this issue by demonstrating that a pairwise maximum entropy model, which takes into account region-specific activity rates and pairwise interactions, can be robustly and accurately fitted to resting-state human brain activities obtained by functional magnetic resonance imaging. Furthermore, to validate the approximation of the resting-state networks by the pairwise maximum entropy model, we show that the functional interactions estimated by the pairwise maximum entropy model reflect anatomical connexions more accurately than the conventional functional connectivity method. These findings indicate that a relatively simple statistical model not only captures the structure of the resting-state networks but also provides a possible method to derive physiological information about various large-scale brain networks.

155 citations
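For a handful of binarized regions, a pairwise maximum entropy (Ising-type) model can be fitted exactly by enumerating all 2^n activity patterns and matching the model's first and second moments to the data by gradient ascent. The tiny sketch below does that on synthetic binary data; it is an illustration of the model class, not the paper's fMRI pipeline, and the data and learning rate are assumptions.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(4)
n = 5                                                 # regions, kept tiny for exact enumeration
data = (rng.random((2000, n)) < 0.3).astype(float)    # synthetic binarized activity

states = np.array(list(product([0, 1], repeat=n)), dtype=float)   # all 2^n patterns
emp_mean = data.mean(axis=0)
emp_corr = data.T @ data / len(data)

h = np.zeros(n)                                # region-specific biases (activity rates)
J = np.zeros((n, n))                           # pairwise couplings
lr = 0.2
for _ in range(3000):
    # Model distribution P(s) proportional to exp(h.s + 0.5 * s'Js) over all enumerated states.
    energies = states @ h + 0.5 * np.einsum('si,ij,sj->s', states, J, states)
    p = np.exp(energies - energies.max())
    p /= p.sum()
    mod_mean = p @ states
    mod_corr = states.T @ (states * p[:, None])
    h += lr * (emp_mean - mod_mean)            # match first moments
    grad_J = emp_corr - mod_corr               # match second moments (off-diagonal)
    np.fill_diagonal(grad_J, 0.0)
    J += lr * grad_J

print("fitted biases:", np.round(h, 2))
print("fitted couplings:\n", np.round(J, 2))
```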


Proceedings ArticleDOI
11 Aug 2013
TL;DR: A method called Pairwise Tag enhAnced and featuRe-based Matrix factorIzation for Group recommendAtioN (PTARMIGAN) is proposed, which considers location features, social features, and implicit patterns simultaneously in a unified model to provide better group recommendations.
Abstract: Groups play an essential role in many social websites which promote users' interactions and accelerate the diffusion of information. Recommending groups that users are really interested in joining is significant for both users and social media. While the traditional group recommendation problem has been extensively studied, we focus on a new type of the problem, i.e., event-based group recommendation. Unlike the other forms of groups, users join this type of groups mainly for participating in offline events organized by group members or inviting other users to attend events sponsored by them. These characteristics determine that previously proposed approaches for group recommendation cannot be adapted to the new problem easily as they ignore the geographical influence and other explicit features of groups and users. In this paper, we propose a method called Pairwise Tag enhAnced and featuRe-based Matrix factorIzation for Group recommendAtioN (PTARMIGAN), which considers location features, social features, and implicit patterns simultaneously in a unified model. More specifically, we exploit matrix factorization to model interactions between users and groups. Meanwhile, we incorporate their profile information into pairwise enhanced latent factors respectively. We also utilize the linear model to capture explicit features. Due to the reinforcement between explicit features and implicit patterns, our approach can provide better group recommendations. We conducted a comprehensive performance evaluation on real-world data sets and the experimental results demonstrate the effectiveness of our method.

152 citations


Journal ArticleDOI
TL;DR: In conclusion, when the attribute of interest is the overall heterogeneity in a pool of sites (i.e. beta diversity) or its turnover or nestedness components, only multiple site dissimilarity measures are recommended.
Abstract: Several measures of multiple site dissimilarity have been proposed to quantify the overall heterogeneity in assemblage composition among any number of sites. It is also a common practice to quantify such overall heterogeneity by averaging pairwise dissimilarities between all pairs of sites in the pool. However, pairwise dissimilarities do not account for patterns of co-occurrence among more than two sites. In consequence, the average of pairwise dissimilarities may not accurately reflect the overall compositional heterogeneity within a pool of more than two sites. Here I use several idealized examples to illustrate why pairwise dissimilarity measures fail to properly quantify overall heterogeneity. Thereafter, the effect of this potential problem in empirical patterns is exemplified with data of world amphibians. In conclusion, when the attribute of interest is the overall heterogeneity in a pool of sites (i.e. beta diversity) or its turnover or nestedness components, only multiple site dissimilarity measures are recommended.
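The practice the paper criticizes is easy to state in code: build a presence/absence matrix, compute the Sørensen dissimilarity for every pair of sites, and average. The sketch below does exactly that (and nothing more) as a reference point; the toy data are assumptions, and the paper's argument is that this average misses co-occurrence structure among three or more sites, so a true multiple-site measure should be used instead.

```python
import numpy as np
from itertools import combinations

# Presence/absence matrix: rows are sites, columns are species (toy data).
sites = np.array([
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1, 1],
], dtype=bool)

def sorensen_dissimilarity(x, y):
    shared = np.sum(x & y)
    return 1.0 - 2.0 * shared / (x.sum() + y.sum())

pairwise = [sorensen_dissimilarity(sites[i], sites[j])
            for i, j in combinations(range(len(sites)), 2)]
print("mean pairwise Sørensen dissimilarity:", round(float(np.mean(pairwise)), 3))
# Caveat from the paper: this average ignores patterns of species co-occurrence across
# more than two sites, so it is not a substitute for a multiple-site dissimilarity measure.
```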

Journal ArticleDOI
TL;DR: This study developed a unique methodology which extends the AHP-SA model proposed by Chen et al. (2010) to a more comprehensive framework to analyze weight sensitivity caused by both direct and indirect weight changes using the one-at-a-time (OAT) technique.
Abstract: Criteria weights determined from pairwise comparisons are often the greatest contributor to the uncertainties in AHP-based multi-criteria decision making (MCDM). During an MCDM process, the weights can be changed directly by adjusting the output from a pairwise comparison matrix, or indirectly by recalculating the matrix after varying its input. The corresponding weight sensitivity of multi-criteria evaluation results is generally difficult to assess quantitatively and to visualize spatially. This study developed a unique methodology which extends the AHP-SA model proposed by Chen et al. (2010) to a more comprehensive framework to analyze weight sensitivity caused by both direct and indirect weight changes using the one-at-a-time (OAT) technique. With increased efficiency, improved flexibility and enhanced visualization capability, the spatial framework was developed as AHP-SA2 within a GIS platform. A case study with in-depth discussion is provided to demonstrate the new toolset. It assists stakeholders and researchers in better understanding weight sensitivity for characterising, reporting and minimising uncertainty in AHP-based spatial MCDM.
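The two kinds of weight change the framework varies can be sketched directly: derive criteria weights from a pairwise comparison matrix via the principal eigenvector, then either perturb one weight and re-normalize the rest (a direct change) or alter a single matrix entry and recompute (an indirect change), one at a time. The matrix and perturbation step below are illustrative assumptions, not the AHP-SA2 toolset itself.

```python
import numpy as np

# Illustrative 4-criteria Saaty-style pairwise comparison matrix (reciprocal by construction).
A = np.array([
    [1.0, 3.0, 5.0, 2.0],
    [1/3, 1.0, 3.0, 1/2],
    [1/5, 1/3, 1.0, 1/4],
    [1/2, 2.0, 4.0, 1.0],
])

def ahp_weights(M):
    vals, vecs = np.linalg.eig(M)
    w = np.real(vecs[:, np.argmax(np.real(vals))])   # principal right eigenvector
    return w / w.sum()

w = ahp_weights(A)
print("baseline weights:", np.round(w, 3))

# Direct weight change (OAT): bump criterion 0 by +0.05 and rescale the others proportionally.
delta = 0.05
w_direct = w.copy()
w_direct[0] += delta
w_direct[1:] *= (1.0 - w_direct[0]) / w[1:].sum()
print("direct OAT change:", np.round(w_direct, 3), "sum =", round(float(w_direct.sum()), 6))

# Indirect weight change (OAT): alter one pairwise judgment (and its reciprocal), recompute.
A2 = A.copy()
A2[0, 2], A2[2, 0] = 7.0, 1/7
print("indirect OAT change:", np.round(ahp_weights(A2), 3))
```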

Journal ArticleDOI
TL;DR: The results show that balancing exploration and exploitation can substantially and significantly improve the online retrieval performance of both listwise and pairwise approaches.
Abstract: As retrieval systems become more complex, learning to rank approaches are being developed to automatically tune their parameters. Using online learning to rank, retrieval systems can learn directly from implicit feedback inferred from user interactions. In such an online setting, algorithms must obtain feedback for effective learning while simultaneously utilizing what has already been learned to produce high quality results. We formulate this challenge as an exploration---exploitation dilemma and propose two methods for addressing it. By adding mechanisms for balancing exploration and exploitation during learning, each method extends a state-of-the-art learning to rank method, one based on listwise learning and the other on pairwise learning. Using a recently developed simulation framework that allows assessment of online performance, we empirically evaluate both methods. Our results show that balancing exploration and exploitation can substantially and significantly improve the online retrieval performance of both listwise and pairwise approaches. In addition, the results demonstrate that such a balance affects the two approaches in different ways, especially when user feedback is noisy, yielding new insights relevant to making online learning to rank effective in practice.

Journal ArticleDOI
TL;DR: A new extension of the ELECTRE (elimination and choice translating reality) method for multi-criteria group decision-making problems based on intuitionistic fuzzy sets is designed, and a new discordance intuitionistic index, extended from the concept of the fuzzy distance measure, is introduced.

Journal ArticleDOI
TL;DR: This paper surveys and analyzes ten inconsistency indices from the numerical point of view and investigates degrees of agreement between them to check how similar they are.
Abstract: Evaluating the level of inconsistency of pairwise comparisons is often a crucial step in multi criteria decision analysis. Several inconsistency indices have been proposed in the literature to estimate the deviation of expert’s judgments from a situation of full consistency. This paper surveys and analyzes ten indices from the numerical point of view. Specifically, we investigate degrees of agreement between them to check how similar they are. Results show a wide range of behaviors, ranging from very strong to very weak degrees of agreement.
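Two familiar indices of this kind can be computed in a few lines: Saaty's consistency index CI = (lambda_max - n)/(n - 1) and a geometric consistency index based on log-errors against geometric-mean weights. Whether these two are among the ten the paper surveys is not stated here, and the example matrix is made up; the sketch only shows what an inconsistency index operationally is.

```python
import numpy as np

A = np.array([              # illustrative reciprocal judgment matrix
    [1.0, 2.0, 6.0],
    [1/2, 1.0, 4.0],
    [1/6, 1/4, 1.0],
])
n = A.shape[0]

# Saaty's consistency index: CI = (lambda_max - n) / (n - 1).
lam_max = np.max(np.real(np.linalg.eigvals(A)))
CI = (lam_max - n) / (n - 1)

# Geometric consistency index (Crawford-Williams style), using geometric-mean weights.
w = np.prod(A, axis=1) ** (1.0 / n)
w /= w.sum()
log_err = np.log(A) - np.log(np.outer(w, 1.0 / w))   # log(a_ij * w_j / w_i)
GCI = 2.0 / ((n - 1) * (n - 2)) * np.sum(np.triu(log_err, k=1) ** 2)

print(f"CI  = {CI:.4f}")
print(f"GCI = {GCI:.4f}")
```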

Journal ArticleDOI
TL;DR: A simple case study that involved reproducing the relative area sizes of six provinces in China shows that the proposed Cloud Delphi hierarchical analysis method can effectively reduce mistakes and improve decision makers' judgments in situations that require subjective expertise and judgmental inputs.

Posted Content
TL;DR: A generic decoupling technique is presented that enables Rademacher complexity-based generalization error bounds, and a novel memory-efficient online learning algorithm for higher-order learning problems with bounded regret guarantees is proposed.
Abstract: In this paper, we study the generalization properties of online learning based stochastic methods for supervised learning problems where the loss function is dependent on more than one training sample (e.g., metric learning, ranking). We present a generic decoupling technique that enables us to provide Rademacher complexity-based generalization error bounds. Our bounds are in general tighter than those obtained by Wang et al. (COLT 2012) for the same problem. Using our decoupling technique, we are further able to obtain fast convergence rates for strongly convex pairwise loss functions. We are also able to analyze a class of memory efficient online learning algorithms for pairwise learning problems that use only a bounded subset of past training samples to update the hypothesis at each step. Finally, in order to complement our generalization bounds, we propose a novel memory efficient online learning algorithm for higher order learning problems with bounded regret guarantees.
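The "bounded subset of past training samples" idea can be illustrated with a tiny online pairwise learner: keep a fixed-size buffer of previously seen examples and, for each new example, take a gradient step on a pairwise logistic ranking loss against the buffered ones only. The loss, buffer policy and synthetic data below are illustrative assumptions, not the algorithm proposed in the paper.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(5)
d, T, buffer_size = 10, 2000, 25
w_true = rng.normal(size=d)
w = np.zeros(d)
buffer = deque(maxlen=buffer_size)          # bounded memory of past (x, score) samples
lr = 0.05

for t in range(T):
    x = rng.normal(size=d)
    s = w_true @ x + 0.1 * rng.normal()     # noisy relevance score of the new sample
    # Pairwise logistic loss of the new sample against every buffered past sample.
    grad = np.zeros(d)
    for x_old, s_old in buffer:
        y = 1.0 if s > s_old else -1.0      # which of the two should rank higher
        margin = y * (w @ (x - x_old))
        grad += -y * (x - x_old) / (1.0 + np.exp(margin))
    if buffer:
        w -= lr * grad / len(buffer)
    buffer.append((x, s))

cos = w @ w_true / (np.linalg.norm(w) * np.linalg.norm(w_true))
print("cosine similarity of the learned ranking direction to the true one:", round(float(cos), 3))
```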

Proceedings Article
16 Jun 2013
TL;DR: This work proposes and formally analyzes a general preference-based racing algorithm, instantiated with three specific ranking procedures and corresponding sampling schemes, assuming only that alternatives can be compared in terms of pairwise preferences.
Abstract: We consider the problem of reliably selecting an optimal subset of fixed size from a given set of choice alternatives, based on noisy information about the quality of these alternatives. Problems of similar kind have been tackled by means of adaptive sampling schemes called racing algorithms. However, in contrast to existing approaches, we do not assume that each alternative is characterized by a real-valued random variable, and that samples are taken from the corresponding distributions. Instead, we only assume that alternatives can be compared in terms of pairwise preferences. We propose and formally analyze a general preference-based racing algorithm that we instantiate with three specific ranking procedures and corresponding sampling schemes. Experiments with real and synthetic data are presented to show the efficiency of our approach.

Journal ArticleDOI
TL;DR: This paper presents a novel meta-feature generation method based on rules that compare the performance of individual base learners in a one-against-one manner, and introduces a new meta-learner called Approximate Ranking Tree Forests (ART Forests) that performs very competitively when compared with several state-of-the-art meta-learners.
Abstract: In this paper, we present a novel meta-feature generation method in the context of meta-learning, which is based on rules that compare the performance of individual base learners in a one-against-one manner. In addition to these new meta-features, we also introduce a new meta-learner called Approximate Ranking Tree Forests (ART Forests) that performs very competitively when compared with several state-of-the-art meta-learners. Our experimental results are based on a large collection of datasets and show that the proposed new techniques can improve the overall performance of meta-learning for algorithm ranking significantly. A key point in our approach is that each performance figure of any base learner for any specific dataset is generated by optimising the parameters of the base learner separately for each dataset.

Journal ArticleDOI
TL;DR: In this article, incomplete pairwise comparison matrices are applied in the area of sport tournaments, namely proposing alternative rankings for the 2010 Chess Olympiad Open tournament, and the proposed rankings in some cases give intuitively better outcomes than currently used lexicographical orders.
Abstract: Pairwise comparison matrices are widely used in multicriteria decision making. This article applies incomplete pairwise comparison matrices in the area of sport tournaments, namely proposing alternative rankings for the 2010 Chess Olympiad Open tournament. It is shown that results are robust regarding scaling technique. In order to compare different rankings, a distance function is introduced with the aim of taking into account the subjective nature of human perception. Analysis of the weight vectors implies that methods based on pairwise comparisons have common roots. Visualization of the results is provided by multidimensional scaling on the basis of the defined distance. The proposed rankings in some cases give intuitively better outcomes than currently used lexicographical orders.

Journal ArticleDOI
TL;DR: This work critically examines the spatial arrangement method (SpAM) proposed by Goldstone (1994a), in which similarity ratings are obtained by presenting many stimuli at once, and finds that the SpAM produces high-quality MDS solutions.
Abstract: Although traditional methods to collect similarity data (for multidimensional scaling [MDS]) are robust, they share a key shortcoming. Specifically, the possible pairwise comparisons in any set of objects grow rapidly as a function of set size. This leads to lengthy experimental protocols, or procedures that involve scaling stimulus subsets. We review existing methods of collecting similarity data, and critically examine the spatial arrangement method (SpAM) proposed by Goldstone (1994a), in which similarity ratings are obtained by presenting many stimuli at once. The participant moves stimuli around the computer screen, placing them at distances from one another that are proportional to subjective similarity. This provides a fast, efficient, and user-friendly method for obtaining MDS spaces. Participants gave similarity ratings to artificially constructed visual stimuli (comprising 2–3 perceptual dimensions) and nonvisual stimuli (animal names) with less-defined underlying dimensions. Ratings were obtained with 4 methods: pairwise comparisons, spatial arrangement, and 2 novel hybrid methods. We compared solutions from alternative methods to the pairwise method, finding that the SpAM produces high-quality MDS solutions. Monte Carlo simulations on degraded data suggest that the method is also robust to reductions in sample sizes and granularity. Moreover, coordinates derived from SpAM solutions accurately predicted discrimination among objects in same-different classification. We address the benefits of using a spatial medium to collect similarity measures.

Journal ArticleDOI
TL;DR: By embedding the geometric mean in a larger class of methods, this work sheds light on the choice between it and its traditional AHP competitor, the principal right eigenvector, and suggests how to assess the extent of inconsistency.
Abstract: We study properties of weight extraction methods for pairwise comparison matrices that minimize suitable measures of inconsistency, ‘average error gravity’ measures, including one that leads to the geometric row means. The measures share essential global properties with the AHP inconsistency measure. By embedding the geometric mean in a larger class of methods we shed light on the choice between it and its traditional AHP competitor, the principal right eigenvector. We also suggest how to assess the extent of inconsistency by developing an alternative to the Random Consistency Index, which is not based on random comparison matrices, but based on judgemental error distributions. We define and discuss natural invariance requirements and show that the minimizers of average error gravity generally satisfy them, except a requirement regarding the order in which matrices and weights are synthesized. Only the geometric row mean satisfies this requirement also. For weight extraction we recommend the geometric mean.
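The recommended extraction method is one line of arithmetic: take the geometric mean of each row of the comparison matrix and normalize. The snippet below computes it next to the principal-eigenvector weights for an illustrative matrix (an assumption for demonstration), just to show how the two usually compare on mildly inconsistent judgments.

```python
import numpy as np

A = np.array([              # illustrative reciprocal pairwise comparison matrix
    [1.0, 3.0, 1/2, 4.0],
    [1/3, 1.0, 1/4, 2.0],
    [2.0, 4.0, 1.0, 5.0],
    [1/4, 1/2, 1/5, 1.0],
])

# Geometric row means (the method recommended in the paper), normalized to sum to 1.
gm = np.prod(A, axis=1) ** (1.0 / A.shape[0])
gm /= gm.sum()

# Principal right eigenvector (the traditional AHP competitor), for comparison.
vals, vecs = np.linalg.eig(A)
ev = np.real(vecs[:, np.argmax(np.real(vals))])
ev /= ev.sum()

print("geometric row mean weights:", np.round(gm, 3))
print("eigenvector weights       :", np.round(ev, 3))
```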

Proceedings Article
03 Nov 2013
TL;DR: The proposed crowdranking framework is based on the theory of matrix completion, and it is shown that, on average, only O(r log m) pairwise queries are needed to accurately recover the ranking list of m items for the target user.
Abstract: Inferring user preferences over a set of items is an important problem that has found numerous applications. This work focuses on the scenario where the explicit feature representation of items is unavailable, a setup that is similar to collaborative filtering. In order to learn a user's preferences from his/her response to only a small number of pairwise comparisons, we propose to leverage the pairwise comparisons made by many crowd users, a problem we refer to as crowdranking. The proposed crowdranking framework is based on the theory of matrix completion, and we present efficient algorithms for solving the related optimization problem. Our theoretical analysis shows that, on average, only O(r log m) pairwise queries are needed to accurately recover the ranking list of m items for the target user, where r is the rank of the unknown rating matrix, r << m. Our empirical study with two real-world benchmark datasets for collaborative filtering and one crowdranking dataset we collected via Amazon Mechanical Turk shows the promising performance of the proposed algorithm compared to the state-of-the-art approaches.

Proceedings ArticleDOI
25 Aug 2013
TL;DR: Analysis shows that pre-clustering and a combination of heterogeneous features yield the best trade-off between number of clusters and their quality, demonstrating that a simple combination based on pairwise maximization of similarity is as effective as a non-trivial optimization of parameters.
Abstract: The increasing pervasiveness of social media creates new opportunities to study human social behavior, while challenging our capability to analyze their massive data streams. One of the emerging tasks is to distinguish between different kinds of activities, for example engineered misinformation campaigns versus spontaneous communication. Such detection problems require a formal definition of meme, or unit of information that can spread from person to person through the social network. Once a meme is identified, supervised learning methods can be applied to classify different types of communication. The appropriate granularity of a meme, however, is hardly captured from existing entities such as tags and keywords. Here we present a framework for the novel task of detecting memes by clustering messages from large streams of social data. We evaluate various similarity measures that leverage content, metadata, network features, and their combinations. We also explore the idea of pre-clustering on the basis of existing entities. A systematic evaluation is carried out using a manually curated dataset as ground truth. Our analysis shows that pre-clustering and a combination of heterogeneous features yield the best trade-off between number of clusters and their quality, demonstrating that a simple combination based on pairwise maximization of similarity is as effective as a non-trivial optimization of parameters. Our approach is fully automatic, unsupervised, and scalable for real-time detection of memes in streaming data.

Journal ArticleDOI
TL;DR: In this article, a statistical inference for max-stable space-time processes that are defined in an analogous fashion is proposed, where the pairwise density of the process is used to estimate the model parameters.
Abstract: Max-stable processes have proved to be useful for the statistical modelling of spatial extremes. Several families of max-stable random fields have been proposed in the literature. One such representation is based on a limit of normalized and rescaled pointwise maxima of stationary Gaussian processes that was first introduced by Kabluchko and co-workers. This paper deals with statistical inference for max-stable space–time processes that are defined in an analogous fashion. We describe pairwise likelihood estimation, where the pairwise density of the process is used to estimate the model parameters. For regular grid observations we prove strong consistency and asymptotic normality of the parameter estimates as the joint number of spatial locations and time points tends to ∞. Furthermore, we discuss extensions to irregularly spaced locations. A simulation study shows that the method proposed works well for these models.

Proceedings ArticleDOI
27 Oct 2013
TL;DR: This paper proposes a novel Metric Fusion technique via cross-view graph Random Walk, named MFRW, which operates on multi-view similarity graphs (one similarity graph constructed under each view) and seeks a high-order metric yielded by graph random walks over the constructed similarity graphs.
Abstract: Many real-world objects described by multiple attributes or features can be decomposed as multiple "views" (e.g., an image can be described by a color view or a shape view), which often provide complementary information to each other. Learning a metric (similarity measure) for multi-view data is of primary importance due to its wide applications in practice. However, leveraging multi-view information to produce a good metric is a great challenge, and existing techniques are concerned with pairwise similarities, leading to undesirable fusion metrics and high computational complexity. In this paper, we propose a novel Metric Fusion technique via cross-view graph Random Walk, named MFRW, which operates on multi-view similarity graphs (one similarity graph constructed under each view). Instead of using pairwise similarities, we seek a high-order metric yielded by graph random walks over the constructed similarity graphs. Observing that "outlier views" may exist in the fusion process, we incorporate the coefficient matrices representing the correlation strength between any two views into MFRW, yielding WMFRW. The principle of WMFRW is implemented by exploring the "common latent structure" between views. The empirical studies conducted on real-world databases demonstrate that our approach outperforms the state-of-the-art competitors in terms of effectiveness and efficiency.

Journal ArticleDOI
TL;DR: The results point out that the search is optimal (i.e., the mean first hitting time among searchers is minimum) at intermediate scales of communication, showing that both an excess and a lack of information may worsen it.
Abstract: We investigate the relationship between communication and search efficiency in a biological context by proposing a model of Brownian searchers with long-range pairwise interactions. After a general study of the properties of the model, we show an application to the particular case of acoustic communication among Mongolian gazelles, for which data are available, searching for good habitat areas. Using Monte Carlo simulations and density equations, our results point out that the search is optimal (i.e., the mean first hitting time among searchers is minimum) at intermediate scales of communication, showing that both an excess and a lack of information may worsen it.

Journal ArticleDOI
TL;DR: A class of global potentials defined over all variables in the CRF can be readily optimised using standard graph cut algorithms at little extra expense compared to a standard pairwise field and can be directly used for the problem of class based image segmentation.
Abstract: The Markov and Conditional random fields (CRFs) used in computer vision typically model only local interactions between variables, as this is generally thought to be the only case that is computationally tractable. In this paper we consider a class of global potentials defined over all variables in the CRF. We show how they can be readily optimised using standard graph cut algorithms at little extra expense compared to a standard pairwise field. This result can be directly used for the problem of class based image segmentation which has seen increasing recent interest within computer vision. Here the aim is to assign a label to each pixel of a given image from a set of possible object classes. Typically these methods use random fields to model local interactions between pixels or super-pixels. One of the cues that helps recognition is global object co-occurrence statistics, a measure of which classes (such as chair or motorbike) are likely to occur in the same image together. There have been several approaches proposed to exploit this property, but all of them suffer from different limitations and typically carry a high computational cost, preventing their application on large images. We find that the new model we propose produces a significant improvement in the labelling compared to just using a pairwise model and that this improvement increases as the number of labels increases.

Journal ArticleDOI
TL;DR: A new approach for finding overlapping clusters given pairwise similarities of objects is introduced, which relaxes the problem of correlation clustering by allowing an object to be assigned to more than one cluster.
Abstract: We introduce a new approach for finding overlapping clusters given pairwise similarities of objects. In particular, we relax the problem of correlation clustering by allowing an object to be assigned to more than one cluster. At the core of our approach is an optimization problem in which each data point is mapped to a small set of labels, representing membership in different clusters. The objective is to find a mapping so that the given similarities between objects agree as much as possible with similarities taken over their label sets. The number of labels can vary across objects. To define a similarity between label sets, we consider two measures: (i) a 0–1 function indicating whether the two label sets have non-zero intersection and (ii) the Jaccard coefficient between the two label sets. The algorithm we propose is an iterative local-search method. The definitions of label set similarity give rise to two non-trivial optimization problems, which, for the measures of set-intersection and Jaccard, we solve using a greedy strategy and non-negative least squares, respectively. We also develop a distributed version of our algorithm based on the BSP model and implement it using a Pregel framework. Our algorithm uses as input pairwise similarities of objects and can thus be applied when clustering structured objects for which feature vectors are not available. As a proof of concept, we apply our algorithms on three different and complex application domains: trajectories, amino-acid sequences, and textual documents.
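The objective is easy to write down once the two label-set similarities are fixed: for every pair of objects, compare the given similarity with either the 0-1 intersection indicator or the Jaccard coefficient of their label sets, and sum the disagreements. The sketch below evaluates such a cost for one candidate assignment; the toy similarities and the squared-error agreement measure are assumptions for illustration (the paper's local search then tries to lower this cost by re-labelling one object at a time).

```python
import numpy as np
from itertools import combinations

# Given pairwise similarities between 4 objects (toy values in [0, 1]).
S = np.array([
    [1.0, 0.9, 0.1, 0.4],
    [0.9, 1.0, 0.2, 0.5],
    [0.1, 0.2, 1.0, 0.8],
    [0.4, 0.5, 0.8, 1.0],
])

# Candidate overlapping assignment: each object gets a small set of cluster labels.
labels = [{0}, {0, 1}, {2}, {1, 2}]

def intersection_indicator(a, b):
    return 1.0 if a & b else 0.0

def jaccard(a, b):
    return len(a & b) / len(a | b)

def cost(labels, similarity):
    # Sum of squared disagreements between given and label-set similarities.
    return sum((S[i, j] - similarity(labels[i], labels[j])) ** 2
               for i, j in combinations(range(len(labels)), 2))

print("cost with 0-1 intersection:", round(cost(labels, intersection_indicator), 3))
print("cost with Jaccard         :", round(cost(labels, jaccard), 3))
```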

Journal ArticleDOI
TL;DR: Under the assumption that the time-frequency features of a simultaneous fault are similar to that of its constituent single faults, these issues can be effectively resolved using the proposed framework combining feature extraction, pairwise probabilistic multi-label classification, and decision threshold optimization.
Abstract: Simultaneous-fault diagnosis is a common problem in many applications and well-studied for time-independent patterns. However, most practical applications are of the type of time-dependent patterns. In our study of simultaneous-fault diagnosis for time-dependent patterns, two key issues are identified: 1) the features of the multiple single faults are mixed or combined into one pattern which makes accurate diagnosis difficult, 2) the acquisition of a large sample data set of simultaneous faults is costly because of high number of combinations of single faults, resulting in many possible classes of simultaneous-fault training patterns. Under the assumption that the time-frequency features of a simultaneous fault are similar to that of its constituent single faults, these issues can be effectively resolved using our proposed framework combining feature extraction, pairwise probabilistic multi-label classification, and decision threshold optimization. This framework has been applied and verified in automotive engine-ignition system diagnosis based on time-dependent ignition patterns as a test case. Experimental results show that the proposed framework can successfully resolve the issues.