
Showing papers on "Pairwise comparison published in 2012"


Journal ArticleDOI
TL;DR: In this paper, the authors consider what is being implicitly assumed about the mean–variance relationship in distance-based analyses and what the effect is of any misspecification of the mean–variance relationship.
Abstract: Summary 1. A critical property of count data is its mean–variance relationship, yet this is rarely considered in multivariate analysis in ecology. 2. This study considers what is being implicitly assumed about the mean–variance relationship in distance-based analyses – multivariate analyses based on a matrix of pairwise distances – and what the effect is of any misspecification of the mean–variance relationship. 3. It is shown that distance-based analyses make implicit assumptions that are typically out of step with what is observed in real data, which has major consequences. 4. Potential consequences of this mean–variance misspecification are: confounding location and dispersion effects in ordinations; misleading results when trying to identify taxa in which an effect is expressed; and failure to detect a multivariate effect unless it is expressed in high-variance taxa. 5. Data transformation does not solve the problem. 6. A solution is to use generalised linear models and their recent multivariate generalisations, which are shown here to have desirable properties.

883 citations


Proceedings ArticleDOI
16 Jun 2012
TL;DR: PCCA learns a projection into a low-dimensional space where the distance between pairs of data points respects the desired constraints, exhibiting good generalization properties in the presence of high-dimensional data.
Abstract: This paper introduces Pairwise Constrained Component Analysis (PCCA), a new algorithm for learning distance metrics from sparse pairwise similarity/dissimilarity constraints in a high-dimensional input space, a problem for which most existing distance metric learning approaches are not well suited. PCCA learns a projection into a low-dimensional space where the distance between pairs of data points respects the desired constraints, exhibiting good generalization properties in the presence of high-dimensional data. The paper also shows how to efficiently kernelize the approach. PCCA is experimentally validated on two challenging vision tasks, face verification and person re-identification, for which we obtain state-of-the-art results.

675 citations


Journal ArticleDOI
TL;DR: This paper introduces the concept of a grouping function, i.e., a specific type of aggregation function that combines two degrees of support (weak preference) into a degree of information or, say, a degree of comparability between two alternatives, and relates this new concept to that of incomparability.
Abstract: In this paper, we propose new aggregation functions for the pairwise comparison of alternatives in fuzzy preference modeling. More specifically, we introduce the concept of a grouping function, i.e., a specific type of aggregation function that combines two degrees of support (weak preference) into a degree of information or, say, a degree of comparability between two alternatives, and we relate this new concept to that of incomparability. Grouping functions of this type complement the existing concept of overlap functions in a natural way, since the latter can be used to turn two degrees of weak preference into a degree of indifference. We also define the so-called generalized bientropic functions that allow for a unified representation of overlap and grouping functions. Apart from analyzing mathematical properties of these types of functions and exploring relationships between them, we elaborate on their use in fuzzy preference modeling and decision making. We present an algorithm that constructs an alternative preference ranking, penalizing those alternatives for which the expert is not sure of his/her preference.

241 citations


Journal ArticleDOI
01 Feb 2012
TL;DR: This paper provides a decision support model to aid the group consensus process while keeping an acceptable individual consistency for each decision maker, ensuring that each individual multiplicative preference relation is of acceptable consistency when the predefined consensus level is achieved.
Abstract: In group decision making (GDM) with multiplicative preference relations (also known as pairwise comparison matrices in the Analytic Hierarchy Process), to come to a meaningful and reliable solution, it is preferable to consider individual consistency and group consensus in the decision process. This paper provides a decision support model to aid the group consensus process while keeping an acceptable individual consistency for each decision maker. The concepts of an individual consistency index and a group consensus index are introduced based on the Hadamard product of two matrices. Two algorithms are presented in the designed support model. The first algorithm is utilized to convert an unacceptable preference relation into an acceptable one. The second algorithm is designed to assist the group in achieving a predefined consensus level. The main characteristics of our model are that (1) it is independent of the prioritization method used in the consensus process, and (2) it ensures that each individual multiplicative preference relation is of acceptable consistency when the predefined consensus level is achieved. Finally, some numerical examples are given to verify the effectiveness of our model.

237 citations
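As a minimal illustration of what "acceptable consistency" of a multiplicative pairwise comparison matrix means in practice, the sketch below uses Saaty's classical eigenvalue-based consistency ratio rather than the Hadamard-product index the paper introduces; the 0.1 threshold is Saaty's conventional cutoff, not the paper's.

```python
# Sketch: checking "acceptable consistency" of a multiplicative pairwise
# comparison matrix via Saaty's classical consistency ratio (the paper's
# own index is based on the Hadamard product of two matrices; this is
# only the textbook variant of the same idea).

def lambda_max(A, iters=200):
    """Dominant eigenvalue of a positive matrix via power iteration."""
    n = len(A)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        v = [x / s for x in w]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(Av[i] / v[i] for i in range(n)) / n  # averaged Rayleigh ratio

def consistency_ratio(A):
    n = len(A)
    ci = (lambda_max(A) - n) / (n - 1)          # consistency index
    random_index = {3: 0.58, 4: 0.90, 5: 1.12}  # Saaty's random indices
    return ci / random_index[n]

# A perfectly consistent matrix (a_ij = w_i / w_j for weights 4:2:1):
A = [[1.0, 2.0, 4.0],
     [0.5, 1.0, 2.0],
     [0.25, 0.5, 1.0]]
print(consistency_ratio(A) < 0.1)  # True: within Saaty's 0.1 threshold
```

For a perfectly consistent reciprocal matrix the dominant eigenvalue equals n exactly, so the ratio is zero; real expert judgements yield small positive values.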


Proceedings ArticleDOI
16 Jun 2012
TL;DR: This paper converts each confidence score vector obtained from one model into a pairwise relationship matrix, in which each entry characterizes the comparative relationship of scores of two test samples, to fuse the predicted confidence scores of multiple models.
Abstract: In this paper, we propose a rank minimization method to fuse the predicted confidence scores of multiple models, each of which is obtained based on a certain kind of feature. Specifically, we convert each confidence score vector obtained from one model into a pairwise relationship matrix, in which each entry characterizes the comparative relationship of the scores of two test samples. Our hypothesis is that the relative score relations are consistent among component models up to certain sparse deviations, despite the large variations that may exist in the absolute values of the raw scores. We then formulate the score fusion problem as seeking a shared rank-2 pairwise relationship matrix, based on which each original score matrix from an individual model can be decomposed into the common rank-2 matrix plus sparse deviation errors. A robust score vector is then extracted to fit the recovered low-rank score relation matrix. We formulate the problem as a nuclear-norm and ℓ1-norm optimization objective and employ the Augmented Lagrange Multiplier (ALM) method for the optimization. Our method is isotonic (i.e., scale invariant) with respect to the numeric scales of the scores originating from different models. We experimentally show that the proposed method achieves significant performance gains on various tasks including object categorization and video event detection.

211 citations
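The rank-2 claim above follows from the structure of difference-based pairwise relation matrices; a small sketch (the paper's entries may be defined by other comparative relations, but the low-rank intuition is the same):

```python
# Sketch: why a pairwise score-relation matrix can have rank 2. If
# H[i][j] = s[i] - s[j], then H = s·1^T - 1·s^T, a sum of two rank-1
# matrices, so rank(H) <= 2 (and = 2 whenever scores are not all equal).

def rank(M, eps=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = max(range(r, rows), key=lambda i: abs(M[i][c]), default=None)
        if pivot is None or abs(M[pivot][c]) < eps:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(rows):
            if i != r:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

scores = [3.1, 0.7, 5.2, 1.9, 4.4]
H = [[si - sj for sj in scores] for si in scores]
print(rank(H))  # 2
```

This is what makes seeking a shared low-rank relation matrix (plus sparse deviations) a natural formulation for fusing scores whose absolute scales differ.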


Proceedings Article
03 Dec 2012
TL;DR: This paper proposes a novel iterative rank aggregation algorithm for discovering scores for objects from pairwise comparisons which performs as well as the Maximum Likelihood Estimator of the BTL model and outperforms a recently proposed algorithm by Ammar and Shah.
Abstract: The question of aggregating pairwise comparisons to obtain a global ranking over a collection of objects has been of interest for a very long time: be it ranking of online gamers (e.g. MSR's TrueSkill system) and chess players, aggregating social opinions, or deciding which product to sell based on transactions. In most settings, in addition to obtaining a ranking, finding 'scores' for each object (e.g. a player's rating) is of interest for understanding the intensity of the preferences. In this paper, we propose a novel iterative rank aggregation algorithm for discovering scores for objects from pairwise comparisons. The algorithm has a natural random walk interpretation over the graph of objects with edges present between two objects if they are compared; the scores turn out to be the stationary probability of this random walk. The algorithm is model independent. To establish the efficacy of our method, however, we consider the popular Bradley-Terry-Luce (BTL) model in which each object has an associated score which determines the probabilistic outcomes of pairwise comparisons between objects. We bound the finite sample error rates between the scores assumed by the BTL model and those estimated by our algorithm. This, in essence, leads to order-optimal dependence on the number of samples required to learn the scores well by our algorithm. Indeed, the experimental evaluation shows that our (model independent) algorithm performs as well as the Maximum Likelihood Estimator of the BTL model and outperforms a recently proposed algorithm by Ammar and Shah [1].

189 citations
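The random-walk interpretation described above can be sketched as follows; the normalization constant and the fixed iteration count here are simple placeholder choices, not the paper's exact construction:

```python
# Sketch of the random-walk scoring idea (not the paper's exact algorithm):
# from object i, jump to object j with probability proportional to the
# fraction of i-vs-j comparisons that j won; the stationary distribution
# of the walk is taken as the score vector.

def random_walk_scores(beats, iters=1000):
    """beats[i][j]: empirical probability that j is preferred to i
    (0.0 if the pair was never compared)."""
    n = len(beats)
    d = n  # any constant >= max out-degree keeps rows stochastic
    P = [[beats[i][j] / d for j in range(n)] for i in range(n)]
    for i in range(n):
        P[i][i] = 1.0 - sum(P[i][j] for j in range(n) if j != i)
    pi = [1.0 / n] * n  # start uniform, then iterate pi <- pi·P
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Three objects with pairwise preferences 2 > 1 > 0:
beats = [[0.0, 0.8, 0.9],
         [0.2, 0.0, 0.7],
         [0.1, 0.3, 0.0]]
scores = random_walk_scores(beats)
print(sorted(range(3), key=lambda i: scores[i]))  # [0, 1, 2]: object 2 on top
```

The walk drifts toward objects that win their comparisons, so stationary mass concentrates on stronger objects; under the BTL model this stationary vector recovers the underlying scores.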


Journal ArticleDOI
TL;DR: This work proposes a method for the inverse process: inferring the pairwise dissimilarities from multiple 2D arrangements of items, based on multiple arrangements of item subsets, designed by an adaptive algorithm that aims to provide optimal evidence for the dissimilarity estimates.
Abstract: The pairwise dissimilarities of a set of items can be intuitively visualized by a 2D arrangement of the items, in which the distances reflect the dissimilarities. Such an arrangement can be obtained by multidimensional scaling (MDS). We propose a method for the inverse process: inferring the pairwise dissimilarities from multiple 2-dimensional arrangements of items. Perceptual dissimilarities are classically measured using pairwise dissimilarity judgments. However, alternative methods including free sorting and 2D arrangements have previously been proposed. The present proposal is novel (a) in that the dissimilarity matrix is estimated by “inverse MDS” based on multiple arrangements of item subsets, and (b) in that the subsets are designed by an adaptive algorithm that aims to provide optimal evidence for the dissimilarity estimates. The subject arranges the items (represented as icons on a computer screen) by means of mouse drag-and-drop operations. The multi-arrangement method can be construed as a generalization of simpler methods: It reduces to pairwise dissimilarity judgments if each arrangement contains only two items, and to free sorting if the items are categorically arranged into discrete piles. Multi-arrangement combines the advantages of these methods. It is efficient (because the subject communicates many dissimilarity judgments with each mouse drag), psychologically attractive (because dissimilarities are judged in context), and can characterize continuous high-dimensional dissimilarity structures. We present two procedures for estimating the dissimilarity matrix: a simple weighted-aligned-average of the partial dissimilarity matrices and a computationally intensive algorithm, which estimates the dissimilarity matrix by iteratively minimizing the error of MDS-predictions of the subject’s arrangements. The Matlab code for interactive arrangement and dissimilarity estimation is available from the authors upon request.

177 citations


Journal ArticleDOI
TL;DR: The investigation demonstrates the usefulness and strength of multiple comparison statistical procedures for analysing and selecting machine learning algorithms.
Abstract: In this paper we present guidelines for the application of nonparametric statistical tests and post-hoc procedures devised to perform multiple comparisons of machine learning algorithms. We emphasize that it is necessary to distinguish between pairwise and multiple comparison tests. We show that the pairwise Wilcoxon test, when employed for multiple comparisons, leads to overoptimistic conclusions. We carry out an intensive normality examination employing ten different tests, showing that the output of machine learning algorithms for regression problems does not satisfy normality requirements. We conduct experiments on nonparametric statistical tests and post-hoc procedures designed for multiple 1×N and N×N comparisons with six different neural regression algorithms over 29 benchmark regression data sets. Our investigation demonstrates the usefulness and strength of multiple comparison statistical procedures for analysing and selecting machine learning algorithms.

154 citations
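A minimal sketch of the kind of 1×N nonparametric comparison such guidelines recommend: the Friedman statistic computed from per-data-set ranks (ties are ignored here, and a real analysis would follow up with a post-hoc procedure such as Nemenyi or Holm):

```python
# Sketch: the Friedman test, a standard nonparametric multiple comparison
# of k algorithms over N data sets. Ranks are assigned within each data
# set; a large statistic indicates the algorithms differ systematically.

def friedman_statistic(results):
    """results[d][a]: error of algorithm a on data set d (lower = better).
    Returns the (chi-square distributed) Friedman statistic."""
    N, k = len(results), len(results[0])
    avg_rank = [0.0] * k
    for row in results:
        order = sorted(range(k), key=lambda a: row[a])  # rank 1 = best
        for r, a in enumerate(order, start=1):
            avg_rank[a] += r / N
    return (12.0 * N / (k * (k + 1))) * (
        sum(R * R for R in avg_rank) - k * (k + 1) ** 2 / 4.0)

# Algorithm 0 consistently best, algorithm 2 consistently worst:
results = [[0.10, 0.20, 0.30],
           [0.12, 0.25, 0.31],
           [0.09, 0.18, 0.40],
           [0.11, 0.22, 0.35]]
print(friedman_statistic(results))  # 8.0, the maximum for k=3, N=4
```

Applying a pairwise Wilcoxon test to all pairs instead, without any correction, is exactly the overoptimistic practice the paper warns against.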


Journal ArticleDOI
TL;DR: The findings suggest that the proposed model provides more consistent and reliable results, in line with managers' rankings; implications of the study for theory, practice, and future research are outlined.

145 citations


Journal ArticleDOI
TL;DR: An updated overview of Thurstonian and Bradley-Terry extensions, including how to account for object- and subject-specific covariates and how to deal with ordinal paired comparison data is provided.
Abstract: Thurstonian and Bradley-Terry models are the most commonly applied models in the analysis of paired comparison data. Since their introduction, numerous developments have been proposed in different areas. This paper provides an updated overview of these extensions, including how to account for object- and subject-specific covariates and how to deal with ordinal paired comparison data. Special emphasis is given to models for dependent comparisons. Although these models are more realistic, their use is complicated by numerical difficulties. We therefore concentrate on implementation issues. In particular, a pairwise likelihood approach is explored for models for dependent paired comparison data, and a simulation study is carried out to compare the performance of maximum pairwise likelihood with other limited information estimation methods. The methodology is illustrated throughout using a real data set about university paired comparisons performed by students.

144 citations


Journal ArticleDOI
TL;DR: Easy-to-use flexible methods for estimating the ‘effective sample size’ in indirect comparison meta-analysis and networkMeta-analysis are developed and will be of high value to regulatory agencies and decision makers who must assess the strength of the evidence supporting comparative effectiveness estimates.
Abstract: Network meta-analysis is becoming increasingly popular for establishing comparative effectiveness among multiple interventions for the same disease. Network meta-analysis inherits all methodological challenges of standard pairwise meta-analysis, but with increased complexity due to the multitude of intervention comparisons. One issue that is now widely recognized in pairwise meta-analysis is the issue of sample size and statistical power. This issue, however, has so far only received little attention in network meta-analysis. To date, no approaches have been proposed for evaluating the adequacy of the sample size, and thus power, in a treatment network. In this article, we develop easy-to-use flexible methods for estimating the ‘effective sample size’ in indirect comparison meta-analysis and network meta-analysis. The effective sample size for a particular treatment comparison can be interpreted as the number of patients in a pairwise meta-analysis that would provide the same degree and strength of evidence as that which is provided in the indirect comparison or network meta-analysis. We further develop methods for retrospectively estimating the statistical power for each comparison in a network meta-analysis. We illustrate the performance of the proposed methods for estimating effective sample size and statistical power using data from a network meta-analysis on interventions for smoking cessation including over 100 trials. The proposed methods are easy to use and will be of high value to regulatory agencies and decision makers who must assess the strength of the evidence supporting comparative effectiveness estimates.

Journal ArticleDOI
TL;DR: In this article, a method for estimating time-varying spike interactions by means of a state-space analysis is proposed, extending analyses of discretized parallel spike sequences, previously limited to stationary data, to the nonstationary case.
Abstract: Precise spike coordination between the spiking activities of multiple neurons is suggested as an indication of coordinated network activity in active cell assemblies. Spike correlation analysis aims to identify such cooperative network activity by detecting excess spike synchrony in simultaneously recorded multiple neural spike sequences. Cooperative activity is expected to organize dynamically during behavior and cognition; therefore currently available analysis techniques must be extended to enable the estimation of multiple time-varying spike interactions between neurons simultaneously. In particular, new methods must take advantage of the simultaneous observations of multiple neurons by addressing their higher-order dependencies, which cannot be revealed by pairwise analyses alone. In this paper, we develop a method for estimating time-varying spike interactions by means of a state-space analysis. Discretized parallel spike sequences are modeled as multi-variate binary processes using a log-linear model that provides a well-defined measure of higher-order spike correlation in an information geometry framework. We construct a recursive Bayesian filter/smoother for the extraction of spike interaction parameters. This method can simultaneously estimate the dynamic pairwise spike interactions of multiple single neurons, thereby extending the Ising/spin-glass model analysis of multiple neural spike train data to a nonstationary analysis. Furthermore, the method can estimate dynamic higher-order spike interactions. To validate the inclusion of the higher-order terms in the model, we construct an approximation method to assess the goodness-of-fit to spike data. In addition, we formulate a test method for the presence of higher-order spike correlation even in nonstationary spike data, e.g., data from awake behaving animals. The utility of the proposed methods is tested using simulated spike data with known underlying correlation dynamics. Finally, we apply the methods to neural spike data simultaneously recorded from the motor cortex of an awake monkey and demonstrate that the higher-order spike correlation organizes dynamically in relation to a behavioral demand.

Journal ArticleDOI
TL;DR: This study proposes a new type of MSVM classifier that is designed to extend the binary SVMs by applying an ordinal pairwise partitioning (OPP) strategy and shows that the proposed model improves the performance of classification in comparison to other typical multi-class classification techniques and uses fewer computational resources.

Book ChapterDOI
24 Sep 2012
TL;DR: This paper presents a theoretical study of the effect of diversity on the generalization performance of voting in the PAC-learning framework, applies explicit diversity regularization to ensemble pruning, and proposes the Diversity Regularized Ensemble Pruning (DREP) method.
Abstract: Diversity among individual classifiers is recognized to play a key role in ensemble learning; however, few of its theoretical properties are known for classification. In this paper, focusing on the popular ensemble pruning setting (i.e., combining classifiers by voting and measuring diversity in a pairwise manner), we present a theoretical study of the effect of diversity on the generalization performance of voting in the PAC-learning framework. It is shown that diversity is closely related to the hypothesis space complexity, and that encouraging diversity can be regarded as applying regularization to ensemble methods. Guided by this analysis, we apply explicit diversity regularization to ensemble pruning and propose the Diversity Regularized Ensemble Pruning (DREP) method. Experimental results show the effectiveness of DREP.
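Measuring diversity "in a pairwise manner" can be illustrated with the common disagreement measure; DREP's actual selection criterion, which trades empirical error against such a diversity term, is not reproduced here:

```python
# Sketch: pairwise disagreement, one common way to measure ensemble
# diversity over classifiers' prediction vectors (DREP's criterion
# combines this kind of term with empirical error; only the measure
# itself is shown).

def disagreement(h1, h2):
    """Fraction of examples on which two classifiers' predictions differ."""
    return sum(a != b for a, b in zip(h1, h2)) / len(h1)

def avg_pairwise_diversity(preds):
    """Average disagreement over all classifier pairs in the ensemble."""
    pairs = [(i, j) for i in range(len(preds))
             for j in range(i + 1, len(preds))]
    return sum(disagreement(preds[i], preds[j]) for i, j in pairs) / len(pairs)

preds = [
    [1, 1, 0, 0, 1],  # classifier A
    [1, 0, 0, 1, 1],  # classifier B
    [0, 1, 0, 0, 1],  # classifier C
]
print(round(avg_pairwise_diversity(preds), 6))  # 0.4
```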

Journal ArticleDOI
TL;DR: This paper reports the authors' experience in applying t-wise techniques to SPLs with two independently developed toolsets, and derives useful insights for pairwise and t-wise testing of product lines.
Abstract: Software Product Lines (SPL) are difficult to validate due to the combinatorics induced by variability, which leads to a combinatorial explosion of the number of derivable products. Exhaustive testing in such a large product space is hardly feasible. Hence, one possible option is to test SPLs by generating test configurations that cover all possible t-way feature interactions (t-wise). This dramatically reduces the number of test products while ensuring reasonable SPL coverage. In this paper, we report our experience in applying t-wise techniques to SPLs with two independent toolsets developed by the authors. One focuses on generality and splits the generation problem according to strategies. The other emphasizes efficient generation. To evaluate the respective merits of the approaches, measures such as the number of generated test configurations and the similarity between them are provided. By applying these measures, we were able to derive useful insights for pairwise and t-wise testing of product lines.

Journal ArticleDOI
TL;DR: This paper presents an advanced version of the failure mode effects and criticality analysis (FMECA) whose capabilities are enhanced in that the criticality assessment takes into account possible interactions among the principal causes of failure.
Abstract: This paper presents an advanced version of the failure mode effects and criticality analysis (FMECA), whose capabilities are enhanced in that the criticality assessment takes into account possible interactions among the principal causes of failure. This is obtained by integrating FMECA and the Analytic Network Process, a multi-criteria decision-making technique. Severity, Occurrence and Detectability are split into sub-criteria and arranged in a hybrid (hierarchy/network) decision structure that, at the lowest level, contains the causes of failure. Starting from this decision structure, the Risk Priority Number is computed by making pairwise comparisons, so that qualitative judgements and reliable quantitative data can easily be included in the analysis, without using vague and unreliable linguistic conversion tables. Pairwise comparison also eases the effort of the design/maintenance team, since it is easier to make comparative rather than absolute judgements when quantifying the importance of the causes of failure. In order to clarify and make evident the rationale behind the final results, a graphical tool, similar to the House of Quality, is also presented. At the end of the paper, a case study is reported, which confirms the quality of the approach and shows its capability to perform robust and comprehensive criticality analyses. Copyright © 2011 John Wiley and Sons Ltd.


Proceedings Article
03 Dec 2012
TL;DR: The problem of finding a subset of data points, called representatives or exemplars, that can efficiently describe the data collection is formulated as a row-sparsity regularized trace minimization problem that can be solved efficiently using convex programming.
Abstract: Given pairwise dissimilarities between data points, we consider the problem of finding a subset of data points, called representatives or exemplars, that can efficiently describe the data collection. We formulate the problem as a row-sparsity regularized trace minimization problem that can be solved efficiently using convex programming. The solution of the proposed optimization program finds the representatives and the probability that each data point is associated with each one of the representatives. We obtain the range of the regularization parameter for which the solution of the proposed optimization program changes from selecting one representative for all data points to selecting all data points as representatives. When data points are distributed around multiple clusters according to the dissimilarities, we show that the data points in each cluster select representatives only from that cluster. Unlike metric-based methods, our algorithm can be applied to dissimilarities that are asymmetric or violate the triangle inequality, i.e., it does not require that the pairwise dissimilarities come from a metric. We demonstrate the effectiveness of the proposed algorithm on synthetic data as well as real-world image and text data.

Journal ArticleDOI
TL;DR: Chang’s fuzzy AHP-based multiple attribute decision-making (MADM) method is applied for selection of the best site of landfills based on a set of decision criteria and computational time required for ranking and selecting the suitable landfill site was significantly reduced.
Abstract: Landfill site selection is a complex and time-consuming process that requires the evaluation of several factors, with many different attributes taken into account. Decision makers often have difficulty making the right decision in multiple-attribute environments. After candidate sites are identified, they should be ranked using decision-making methods. This study applies Chang’s fuzzy AHP-based multiple attribute decision-making (MADM) method for selecting the best landfill site based on a set of decision criteria. The Fuzzy Analytic Hierarchy Process (FAHP) is used to make pairwise comparisons of the selected criteria, performed by domain experts, and to assign weights to the decision criteria. It is easier for a decision maker to describe the value of an alternative using linguistic terms and fuzzy numbers, so in the fuzzy-based AHP method the rating of each alternative is expressed with triangular fuzzy membership functions. Once the global weights of the criteria are calculated by AHP, they are incorporated into the decision matrices composed by the decision maker and passed to the fuzzy AHP method, which determines the preference order of siting alternatives. In this study, a computer program based on Chang’s fuzzy method was also developed in the MATLAB environment for ranking and selecting the landfill site. As an example of the proposed methodology, four different hypothetical areas were chosen to demonstrate the effectiveness of the program. With this program, precision was improved in comparison with traditional methods, and the computational time required for ranking and selecting a suitable landfill site was significantly reduced.
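One core step of Chang's extent analysis is the degree of possibility V(M2 ≥ M1) between triangular fuzzy numbers written as (l, m, u); a sketch of that single comparison (the full weighting pipeline with synthetic extent values is omitted):

```python
# Sketch: the "degree of possibility" comparison used in Chang's
# extent-analysis fuzzy AHP, for triangular fuzzy numbers (l, m, u).
# Criterion weights are then derived from, for each criterion, the
# minimum possibility of its fuzzy weight exceeding all the others.

def possibility_geq(M2, M1):
    """Degree of possibility V(M2 >= M1) for triangular fuzzy numbers."""
    l1, m1, u1 = M1
    l2, m2, u2 = M2
    if m2 >= m1:
        return 1.0
    if l1 >= u2:
        return 0.0
    # Height of the intersection point of the two membership functions:
    return (l1 - u2) / ((m2 - u2) - (m1 - l1))

# Two overlapping triangular judgements:
A = (1.0, 2.0, 3.0)
B = (2.0, 3.0, 4.0)
print(possibility_geq(B, A))           # 1.0: B's peak is at least A's
print(round(possibility_geq(A, B), 3)) # 0.5 for these symmetric numbers
```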

Journal ArticleDOI
TL;DR: A new pairwise Fermi updating rule is proposed that considers a social average payoff when an agent copies a neighbor's strategy; simulations show that the social average over a limited set of agents realizes more significant cooperation than that over the entire population.
Abstract: We propose a new pairwise Fermi updating rule that considers a social average payoff when an agent copies a neighbor's strategy. In the update rule, a focal agent compares her payoff with the social average payoff of the same strategy that her pairwise opponent has. This concept might be justified by the fact that people reference global and, to some extent, statistical information, rather than local information, when imitating social behaviors. We consider several possible ways of defining the social average. Simulation results show that the social average over a limited set of agents realizes more significant cooperation than that over the entire population.
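The modified imitation probability can be sketched as the standard pairwise Fermi function, but with the opponent's own payoff replaced by the social average payoff of the opponent's strategy. The noise parameter K = 0.1 and the toy population below are illustrative assumptions:

```python
import math

# Sketch: a pairwise Fermi copy probability in which the focal agent
# compares her payoff not with the neighbor's own payoff but with the
# social average payoff of the neighbor's strategy (the paper also
# explores averaging over limited subsets rather than the whole population).

def fermi_copy_prob(focal_payoff, social_avg_payoff, K=0.1):
    """Probability that the focal agent adopts the neighbor's strategy."""
    return 1.0 / (1.0 + math.exp((focal_payoff - social_avg_payoff) / K))

def social_average(payoffs, strategies, s):
    """Average payoff over agents currently playing strategy s."""
    pool = [p for p, st in zip(payoffs, strategies) if st == s]
    return sum(pool) / len(pool)

payoffs = [1.0, 3.0, 2.0, 0.5]
strategies = ["C", "D", "D", "C"]
# A focal cooperator (payoff 1.0) considers copying a defector neighbor:
p = fermi_copy_prob(1.0, social_average(payoffs, strategies, "D"))
print(round(p, 6))  # ~1.0: the defectors' average payoff (2.5) far exceeds 1.0
```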

Journal ArticleDOI
TL;DR: The study concludes that the AHP is sensitive to the level of fuzzification and decision-makers should be aware of this sensitivity while using the fuzzy AHP and the methodology described may serve as a guideline on how to perform a sensitivity analysis in spatial MCDA.

Journal ArticleDOI
TL;DR: It is shown that when intensity of preference represented by reciprocal pairwise comparisons is considered, it is always possible to construct an Arrowian social welfare function using a two-stage social choice process.
Abstract: Preferences in Arrow’s conditions are ordinal. Here we show that when intensity of preference represented by reciprocal pairwise comparisons is considered, it is always possible to construct an Arrowian social welfare function using a two-stage social choice process. In stage 1, the individual pairwise relations are mapped into a social pairwise relation. In stage 2, the social pairwise relation is used to generate a cardinal ranking and this ranking is then used to select a particular member of the choice set.
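The two-stage process can be sketched with one standard pair of operators: geometric-mean aggregation for stage 1 (which preserves reciprocity a_ij = 1/a_ji) and row geometric means for the stage-2 cardinal ranking. These are common choices, not necessarily the paper's exact construction:

```python
# Sketch of a two-stage social choice with reciprocal pairwise comparisons:
# stage 1 maps individual matrices into a social matrix (geometric mean);
# stage 2 derives a cardinal ranking from the social matrix (row geometric
# means). Operators here are illustrative assumptions.

def geometric_mean(xs):
    p = 1.0
    for x in xs:
        p *= x
    return p ** (1.0 / len(xs))

def social_matrix(matrices):
    """Stage 1: element-wise geometric mean preserves reciprocity."""
    n = len(matrices[0])
    return [[geometric_mean([M[i][j] for M in matrices])
             for j in range(n)] for i in range(n)]

def cardinal_ranking(A):
    """Stage 2: row geometric means give cardinal priorities."""
    return [geometric_mean(row) for row in A]

# Two individuals comparing three alternatives (reciprocal matrices):
M1 = [[1.0, 2.0, 4.0], [0.5, 1.0, 2.0], [0.25, 0.5, 1.0]]
M2 = [[1.0, 0.5, 2.0], [2.0, 1.0, 4.0], [0.5, 0.25, 1.0]]
S = social_matrix([M1, M2])
w = cardinal_ranking(S)
print(sorted(range(3), key=lambda i: -w[i]))  # [0, 1, 2]: alternative 2 last
```

Note that the social matrix remains reciprocal even though the two individuals disagree on the first two alternatives (who end up tied here).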

Journal ArticleDOI
TL;DR: The whole set of instances of a preference model that is compatible with preference information provided by the DM is considered, and the best and the worst attained ranks for each alternative are determined.
Abstract: We extend the principle of robust ordinal regression with an analysis of extreme ranking results. In our proposal, we consider the whole set of instances of a preference model that are compatible with the preference information provided by the DM. We refer both to the well-known UTAGMS method, which builds the set of general additive value functions compatible with the DM's preferences, and to PROMETHEEGKS, newly introduced in this paper, which constructs the set of compatible outranking models via robust ordinal regression. Then, we consider all complete rankings that follow from the use of the compatible preference models, and we determine the best and the worst attained ranks for each alternative. In this way, we are able to assess an alternative's position in the overall ranking, and not only in terms of pairwise comparisons, as is the case in the original robust ordinal regression methods. Additionally, we analyze the ranges of possible comprehensive scores (values or net outranking flows). We also discuss extensions of the presented approach to multiple criteria problems other than ranking. Finally, we show how the presented methodology can be applied in practical decision support, reporting the results of three illustrative studies.

Journal Article
TL;DR: In this paper, a learning-to-rank (from pairwise information) algorithm adaptively queries at most O(ε⁻⁶ n log⁵ n) preference labels for a regret of ε times the optimal loss, which is asymptotically better than standard (non-adaptive) learning bounds achievable for the same problem.
Abstract: Given a set V of n elements we wish to linearly order them given pairwise preference labels which may be non-transitive (due to irrationality or arbitrary noise). The goal is to linearly order the elements while disagreeing with as few pairwise preference labels as possible. Our performance is measured by two parameters: the number of disagreements (loss) and the query complexity (number of pairwise preference labels). Our algorithm adaptively queries at most O(ε⁻⁶ n log⁵ n) preference labels for a regret of ε times the optimal loss. As a function of n, this is asymptotically better than standard (non-adaptive) learning bounds achievable for the same problem. Our main result takes us a step closer toward settling an open problem posed by learning-to-rank (from pairwise information) theoreticians and practitioners: what is a provably correct way to sample preference labels? To further show the power and practicality of our solution, we analyze a typical test case in which a large-margin linear relaxation is used for efficiently solving the simpler learning problems in our decomposition.

Journal ArticleDOI
TL;DR: A heuristic algorithm is proposed to improve ordinal consistency by identifying and eliminating intransitivities in pairwise comparison matrices and it is shown that ordinal inconsistency does not necessarily decrease in the group aggregation process, in contrast with cardinal inconsistency.
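A minimal sketch of the identification step (the paper's elimination heuristic itself is not reproduced here): an ordinal intransitivity in a pairwise comparison matrix is a preference 3-cycle, which can be found by enumerating all triads.

```python
from itertools import combinations

def intransitive_triads(pref):
    """Return all preference 3-cycles x>y, y>z, z>x.
    `pref[i][j]` is True when alternative i is preferred to j."""
    n = len(pref)
    cycles = []
    for a, b, c in combinations(range(n), 3):
        for x, y, z in ((a, b, c), (a, c, b)):  # two cyclic orientations
            if pref[x][y] and pref[y][z] and pref[z][x]:
                cycles.append((x, y, z))
    return cycles

# 0>1, 1>2, 2>0: a single intransitive triad.
pref = [
    [False, True,  False],
    [False, False, True],
    [True,  False, False],
]
```

A heuristic like the one described above would repeatedly reverse a judgment that breaks the largest number of such cycles until none remain.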

Journal ArticleDOI
TL;DR: This paper investigates efficient methods for computing approximations to the state-of-the-art IsoRank solution for finding pairwise topological similarity between nodes in two networks (or within the same network) and presents a novel approach based on uncoupling and decomposing ranking calculations associated with the computation of similarity scores.
Abstract: As graph-structured data sets become commonplace, there is increasing need for efficient ways of analyzing such data sets. These analyses include conservation, alignment, differentiation, and discrimination, among others. When defined on general graphs, these problems are considerably harder than their well-studied counterparts on sets and sequences. In this paper, we study the problem of global alignment of large sparse graphs. Specifically, we investigate efficient methods for computing approximations to the state-of-the-art IsoRank solution for finding pairwise topological similarity between nodes in two networks (or within the same network). Pairs of nodes with high similarity can be used to seed global alignments. We present a novel approach to this computationally expensive problem based on uncoupling and decomposing ranking calculations associated with the computation of similarity scores. Uncoupling refers to independent preprocessing of each input graph. Decomposition implies that pairwise similarity scores can be explicitly broken down into contributions from different link patterns traced back to a low-rank approximation of the initial conditions for the computation. These two concepts result in significant improvements, in terms of computational cost, interpretability of similarity scores, and nature of supported queries. We show over two orders of magnitude improvement in performance over IsoRank/Random Walk formulations, and over an order of magnitude improvement over constrained matrix-triple-product formulations, in the context of real data sets.
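The computation being approximated can be sketched as the basic IsoRank power iteration, R ← α·Ā R B̄ᵀ + (1−α)·E with column-normalized adjacency matrices. This is a minimal illustration of the original formulation that the paper accelerates, not the authors' uncoupled/decomposed method, and it assumes graphs with no isolated nodes:

```python
import numpy as np

def isorank_similarity(A, B, alpha=0.85, tol=1e-9, max_iter=200):
    """Power iteration for IsoRank-style pairwise node similarity.
    A, B: 0/1 adjacency matrices of the two graphs (numpy arrays).
    Returns R with R[i, j] = similarity of node i in A to node j in B."""
    An = A / A.sum(axis=0, keepdims=True)  # column-normalize: A @ D_A^{-1}
    Bn = B / B.sum(axis=0, keepdims=True)
    E = np.ones((A.shape[0], B.shape[0]))
    E /= E.sum()                           # uniform prior over node pairs
    R = E.copy()
    for _ in range(max_iter):
        R_next = alpha * An @ R @ Bn.T + (1 - alpha) * E
        R_next /= R_next.sum()
        if np.abs(R_next - R).sum() < tol:
            break
        R = R_next
    return R
```

Pairs (i, j) with the highest entries of R would then seed a global alignment; the cost of iterating over all |V_A|·|V_B| pairs is what motivates the uncoupling and decomposition described in the abstract.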

Journal ArticleDOI
01 Apr 2012-Oikos
TL;DR: This article discusses extending the conventional pairwise concept and demonstrates a community module-based approach as an initial step for exploring community consequences of species-specific phenological shifts caused by climate change.
Abstract: Climate change has significant impacts on the phenology of various organisms in a species-specific manner. In the face of this problem, the match/mismatch hypothesis, which holds that phenological (a)synchrony with resource availability strongly influences the recruitment success of a consumer population, has recently received much attention. In this article, we discuss extending the conventional pairwise concept and demonstrate a community module-based approach as an initial step for exploring the community consequences of species-specific phenological shifts caused by climate change. Our multispecies match/mismatch perspective leads to the prediction that phenological (a)synchrony among interacting species critically affects not only the population recruitment of species but also key dynamical features of ecological communities such as trophic cascades, competitive hierarchies, and species coexistence. Explicit identification and consideration of species relationships is therefore desirable for a better understanding of seasonal community dynamics and thus of the community consequences of climate change-induced phenological shifts.

Journal ArticleDOI
TL;DR: In this paper, a pairwise maximum likelihood (PML) estimation method is developed for factor analysis models with ordinal data, fitted in both exploratory and confirmatory set-ups.

Journal ArticleDOI
01 May 2012-Genetics
TL;DR: This article proposes a new likelihood-based method that is computationally efficient enough to handle large data sets and is more accurate than pairwise likelihood and exclusion-based methods, but is slightly less accurate than the full-likelihood method.
Abstract: Quite a few methods have been proposed to infer sibship and parentage among individuals from their multilocus marker genotypes. They are all based on Mendelian laws, either qualitatively (exclusion methods) or quantitatively (likelihood methods), have different optimization criteria, and use different algorithms in searching for the optimal solution. The full-likelihood method assigns sibship and parentage relationships among all sampled individuals jointly. It is by far the most accurate method, but is computationally prohibitive for large data sets with many individuals and many loci. In this article I propose a new likelihood-based method that is computationally efficient enough to handle large data sets. The method uses the sum of the log likelihoods of pairwise relationships in a configuration as the score to measure its plausibility, where the log likelihoods of pairwise relationships are calculated only once and stored for repeated use. By analyzing several empirical and many simulated data sets, I show that the new method is more accurate than pairwise likelihood and exclusion-based methods, but is slightly less accurate than the full-likelihood method. However, the new method is computationally much more efficient than the full-likelihood method, and in cases where both sexes are polygamous and markers have genotyping errors, it can be several orders of magnitude faster. The new method can handle a large sample with thousands of individuals, with the number of markers limited only by computer memory.
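The pairwise score described above is straightforward to sketch. Assuming hypothetical precomputed log-likelihoods for just two relationship categories per pair (full-sib vs. unrelated; the actual method handles more categories and parentage as well), a configuration's score sums the appropriate pairwise log-likelihood for every pair of sampled individuals:

```python
from itertools import combinations

def configuration_score(groups, pair_loglik):
    """Score a sibship configuration as the sum of pairwise log
    likelihoods. `groups` is a list of sets of individuals;
    `pair_loglik[(i, j)]` gives (logL_fullsib, logL_unrelated)
    for pair i < j, precomputed once and reused across configurations."""
    members = sorted(set().union(*groups))
    group_of = {i: g for g, s in enumerate(groups) for i in s}
    score = 0.0
    for i, j in combinations(members, 2):
        ll_sib, ll_unrel = pair_loglik[(i, j)]
        score += ll_sib if group_of[i] == group_of[j] else ll_unrel
    return score

# Hypothetical pairwise log-likelihoods (full-sib, unrelated).
ll = {(0, 1): (-1.0, -3.0), (0, 2): (-4.0, -2.0), (1, 2): (-4.0, -2.0)}
s1 = configuration_score([{0, 1}, {2}], ll)  # 0 and 1 are sibs
s2 = configuration_score([{0, 1, 2}], ll)    # all three are sibs
```

Because the pairwise table is computed once, evaluating a candidate configuration during the search costs only a sum over pairs, which is what makes the method tractable for thousands of individuals.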

Journal ArticleDOI
TL;DR: In this article, the authors proposed a system that translates narrative text in the medical domain into structured representation, which consists of five steps: (1) preprocessing sentences, marking noun phrases (NPs) and adjective phrases (APs), (2) extracting concepts that use a dosage-unit dictionary to dynamically switch two models based on CRF, (3) classifying assertions based on voting of five classifiers, and (4) identifying relations using normalized sentences with a set of effective discriminating features.