
Showing papers on "Pairwise comparison published in 2008"


Journal ArticleDOI
TL;DR: The Analytic Hierarchy Process (AHP) is a theory of measurement through pairwise comparisons and relies on the judgements of experts to derive priority scales that measure intangibles in relative terms.
Abstract: Decisions involve many intangibles that need to be traded off. To do that, they have to be measured alongside tangibles whose measurements must also be evaluated as to how well they serve the objectives of the decision maker. The Analytic Hierarchy Process (AHP) is a theory of measurement through pairwise comparisons and relies on the judgements of experts to derive priority scales. It is these scales that measure intangibles in relative terms. The comparisons are made using a scale of absolute judgements that represents how much more one element dominates another with respect to a given attribute. The judgements may be inconsistent, and how to measure inconsistency and improve the judgements, when possible, to obtain better consistency is a concern of the AHP. The derived priority scales are synthesised by multiplying them by the priority of their parent nodes and adding for all such nodes. An illustration is included.

6,787 citations
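The priority-derivation and consistency-checking steps described above can be sketched in a few lines. This is a minimal illustration, not Saaty's implementation: it uses the geometric-mean approximation of the principal eigenvector, and the random-index table is abbreviated; the matrix values are illustrative.

```python
# Sketch of AHP priority derivation from a reciprocal pairwise comparison
# matrix, using the geometric-mean (row geometric mean) approximation of the
# principal-eigenvector method.
import math

def ahp_priorities(A):
    """Derive a normalized priority vector from a reciprocal matrix A."""
    n = len(A)
    # The geometric mean of each row approximates the principal eigenvector.
    gm = [math.prod(row) ** (1.0 / n) for row in A]
    total = sum(gm)
    return [g / total for g in gm]

def consistency_ratio(A, w):
    """Consistency ratio CR = CI / RI; CR < 0.1 is commonly deemed acceptable."""
    n = len(A)
    # Estimate lambda_max from A w ~= lambda w, averaged over components.
    lam = sum(sum(A[i][j] * w[j] for j in range(n)) / w[i] for i in range(n)) / n
    ci = (lam - n) / (n - 1)
    ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]  # random indices (abbreviated table)
    return ci / ri

# Three criteria compared on the 1-9 judgement scale.
A = [[1, 3, 5],
     [1 / 3, 1, 3],
     [1 / 5, 1 / 3, 1]]
w = ahp_priorities(A)
```

Synthesis over a hierarchy then multiplies each local vector by its parent's priority and sums, as the abstract describes.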


Journal ArticleDOI
TL;DR: In this article, the authors introduce a class of variance allocation models for pairwise measurements, called mixed membership stochastic blockmodels, which combine global parameters that instantiate dense patches of connectivity (blockmodel) with local parameters (mixed membership), and develop a general variational inference algorithm for fast approximate posterior inference.
Abstract: Consider data consisting of pairwise measurements, such as presence or absence of links between pairs of objects. These data arise, for instance, in the analysis of protein interactions and gene regulatory networks, collections of author-recipient email, and social networks. Analyzing pairwise measurements with probabilistic models requires special assumptions, since the usual independence or exchangeability assumptions no longer hold. Here we introduce a class of variance allocation models for pairwise measurements: mixed membership stochastic blockmodels. These models combine global parameters that instantiate dense patches of connectivity (blockmodel) with local parameters that instantiate node-specific variability in the connections (mixed membership). We develop a general variational inference algorithm for fast approximate posterior inference. We demonstrate the advantages of mixed membership stochastic blockmodels with applications to social networks and protein interaction networks.

1,803 citations
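The generative story behind the blockmodel can be sketched as follows. This is our own illustrative rendering, not the paper's notation: each node draws a membership distribution over K blocks, and each directed pair draws block indicators and then a Bernoulli link with block-to-block probability B.

```python
# Illustrative sampler for a mixed-membership blockmodel generative process.
import random

def sample_mmsb(n, K, alpha, B, rng):
    def dirichlet(a, k):
        xs = [rng.gammavariate(a, 1.0) for _ in range(k)]
        s = sum(xs)
        return [x / s for x in xs]
    def categorical(p):
        r, acc = rng.random(), 0.0
        for k, pk in enumerate(p):
            acc += pk
            if r < acc:
                return k
        return len(p) - 1
    # Node-specific (mixed-membership) vectors from a symmetric Dirichlet.
    pi = [dirichlet(alpha, K) for _ in range(n)]
    Y = [[0] * n for _ in range(n)]
    for p in range(n):
        for q in range(n):
            if p == q:
                continue
            zp = categorical(pi[p])   # sender's block for this interaction
            zq = categorical(pi[q])   # receiver's block
            Y[p][q] = 1 if rng.random() < B[zp][zq] else 0
    return pi, Y

rng = random.Random(0)
B = [[0.9, 0.05], [0.05, 0.9]]        # dense within blocks, sparse across
pi, Y = sample_mmsb(10, 2, 0.1, B, rng)
```

Inference in the paper inverts this process with a variational approximation; the sampler only shows how global (B) and local (pi) parameters interact.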


Journal Article
TL;DR: The paper correctly introduces the basic procedures and some of the most advanced ones when comparing a control method, but it does not deal with some advanced topics in depth.
Abstract: In a recently published paper in JMLR, Demšar (2006) recommends a set of non-parametric statistical tests and procedures which can be safely used for comparing the performance of classifiers over multiple data sets. After studying the paper, we realize that the paper correctly introduces the basic procedures and some of the most advanced ones when comparing a control method. However, it does not deal with some advanced topics in depth. Regarding these topics, we focus on more powerful proposals of statistical procedures for comparing n × n classifiers. Moreover, we illustrate an easy way of obtaining adjusted and comparable p-values in multiple comparison procedures.

1,312 citations
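Adjusted, comparable p-values of the kind the authors discuss can be obtained, for example, with the Holm step-down procedure. This sketch is one common choice for illustration, not necessarily the more powerful procedures the paper advocates.

```python
def holm_adjusted(pvals):
    """Holm step-down adjusted p-values for m simultaneous hypotheses."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adj = [0.0] * m
    running = 0.0
    for rank, i in enumerate(order):
        # Multiply the k-th smallest p-value by (m - k + 1), then enforce
        # monotonicity so adjusted values never decrease, and cap at 1.
        running = max(running, (m - rank) * pvals[i])
        adj[i] = min(1.0, running)
    return adj

adj = holm_adjusted([0.01, 0.04, 0.03, 0.005])
```

An adjusted p-value can be compared directly with the nominal significance level, which is what makes results across procedures comparable.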


Proceedings ArticleDOI
24 Aug 2008
TL;DR: This model generalizes several existing matrix factorization methods, and therefore yields new large-scale optimization algorithms for these problems, which can handle any pairwise relational schema and a wide variety of error models.
Abstract: Relational learning is concerned with predicting unknown values of a relation, given a database of entities and observed relations among entities. An example of relational learning is movie rating prediction, where entities could include users, movies, genres, and actors. Relations encode users' ratings of movies, movies' genres, and actors' roles in movies. A common prediction technique given one pairwise relation, for example a #users x #movies ratings matrix, is low-rank matrix factorization. In domains with multiple relations, represented as multiple matrices, we may improve predictive accuracy by exploiting information from one relation while predicting another. To this end, we propose a collective matrix factorization model: we simultaneously factor several matrices, sharing parameters among factors when an entity participates in multiple relations. Each relation can have a different value type and error distribution; so, we allow nonlinear relationships between the parameters and outputs, using Bregman divergences to measure error. We extend standard alternating projection algorithms to our model, and derive an efficient Newton update for the projection. Furthermore, we propose stochastic optimization methods to deal with large, sparse matrices. Our model generalizes several existing matrix factorization methods, and therefore yields new large-scale optimization algorithms for these problems. Our model can handle any pairwise relational schema and a wide variety of error models. We demonstrate its efficiency, as well as the benefit of sharing parameters among relations.

1,192 citations
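The parameter-sharing idea can be sketched with a toy gradient-descent version under squared loss: two relations X (users × movies) and Y (movies × genres) are factored jointly, sharing the movie factor V. The paper itself uses Bregman divergences, alternating Newton projections, and stochastic optimization; everything below (names, step size, iteration count) is our simplification.

```python
# Toy collective matrix factorization: X ~= U V^T and Y ~= V W^T share V.
def cmf(X, Y, k, steps=2000, lr=0.01, seed=0):
    import random
    rng = random.Random(seed)
    nu, nm, ng = len(X), len(X[0]), len(Y[0])
    U = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(nu)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(nm)]
    W = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(ng)]
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    for _ in range(steps):
        for i in range(nu):               # descend on ||X - U V^T||^2
            for j in range(nm):
                e = dot(U[i], V[j]) - X[i][j]
                for f in range(k):
                    gU, gV = e * V[j][f], e * U[i][f]
                    U[i][f] -= lr * gU
                    V[j][f] -= lr * gV
        for j in range(nm):               # descend on ||Y - V W^T||^2 (shares V)
            for g in range(ng):
                e = dot(V[j], W[g]) - Y[j][g]
                for f in range(k):
                    gV, gW = e * W[g][f], e * V[j][f]
                    V[j][f] -= lr * gV
                    W[g][f] -= lr * gW
    return U, V, W
```

Because V appears in both objectives, information in Y regularizes the prediction of X, which is the mechanism behind the reported accuracy gains.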


Journal ArticleDOI
TL;DR: In this article, the authors show how relative scales can be derived by making pairwise comparisons using numerical judgments from an absolute scale of numbers, when used to represent comparisons can be related and combined to define a cardinal scale of absolute numbers that is stronger than a ratio scale.
Abstract: According to the great mathematician Henri Lebesgue, making direct comparisons of objects with regard to a property is a fundamental mathematical process for deriving measurements. Measuring objects by using a known scale first then comparing the measurements works well for properties for which scales of measurement exist. The theme of this paper is that direct comparisons are necessary to establish measurements for intangible properties that have no scales of measurement. In that case the value derived for each element depends on what other elements it is compared with. We show how relative scales can be derived by making pairwise comparisons using numerical judgments from an absolute scale of numbers. Such measurements, when used to represent comparisons, can be related and combined to define a cardinal scale of absolute numbers that is stronger than a ratio scale. They are necessary to use when intangible factors need to be added and multiplied among themselves and with tangible factors. To derive and synthesize relative scales systematically, the factors are arranged in a hierarchic or a network structure and measured according to the criteria represented within these structures. The process of making comparisons to derive scales of measurement is illustrated in two practical real-life decisions: the Iran nuclear show-down with the West in this decade and building a Disney park in Hong Kong in 2005. It is then generalized to the case of making a continuum of comparisons by using Fredholm's equation of the second kind, whose solution gives rise to a functional equation. The Fourier transform of the solution of this equation in the complex domain is a sum of Dirac distributions, demonstrating that proportionate response to stimuli is a process of firing and synthesis of firings as neurons in the brain do. The Fourier transform of the solution of the equation in the real domain leads to nearly inverse square responses to natural influences. Various generalizations and critiques of the approach are included.

980 citations


Journal ArticleDOI
TL;DR: This work proposes a suitable extension of label ranking that incorporates the calibrated scenario and substantially extends the expressive power of existing approaches, and suggests a conceptually novel technique for extending the common learning by pairwise comparison approach to the multilabel scenario, a setting previously not amenable to the pairwise decomposition technique.
Abstract: Label ranking studies the problem of learning a mapping from instances to rankings over a predefined set of labels. Hitherto existing approaches to label ranking implicitly operate on an underlying (utility) scale which is not calibrated in the sense that it lacks a natural zero point. We propose a suitable extension of label ranking that incorporates the calibrated scenario and substantially extends the expressive power of these approaches. In particular, our extension suggests a conceptually novel technique for extending the common learning by pairwise comparison approach to the multilabel scenario, a setting previously not being amenable to the pairwise decomposition technique. The key idea of the approach is to introduce an artificial calibration label that, in each example, separates the relevant from the irrelevant labels. We show that this technique can be viewed as a combination of pairwise preference learning and the conventional relevance classification technique, where a separate classifier is trained to predict whether a label is relevant or not. Empirical results in the area of text categorization, image classification and gene analysis underscore the merits of the calibrated model in comparison to state-of-the-art multilabel learning methods.

825 citations


Journal ArticleDOI
TL;DR: This paper considers a general problem of learning from pairwise constraints in the form of must-links and cannot-links, and aims to learn a Mahalanobis distance metric.

541 citations


Journal ArticleDOI
TL;DR: This work shows that a simple (weighted) voting strategy minimizes risk with respect to the well-known Spearman rank correlation and compares RPC to existing label ranking methods, which are based on scoring individual labels instead of comparing pairs of labels.

538 citations


Journal ArticleDOI
TL;DR: Distinguishing necessary and possible consequences of preference information on the complete set of actions, UTA^GMS answers questions of robustness analysis and can support the decision maker when his/her preference statements cannot be represented in terms of an additive value function.

448 citations


Journal ArticleDOI
TL;DR: A fuzzy AHP approach is proposed to determine the level of faulty behavior risk (FBR) in work systems and faulty behavior is prevented before occurrence and work system safety is improved.

412 citations


Journal ArticleDOI
TL;DR: This study applies fuzzy linguistic preference relations (Fuzzy LinPreRa) to construct a pairwise comparison matrix with additive reciprocal property and consistency to alleviate inconsistencies in fuzzy AHP method.

Journal ArticleDOI
TL;DR: In this paper, the authors present a new method for determining the point values for additive multi-attribute value models with performance categories, which they refer to as PAPRIKA (Potentially All Pairwise RanKings of all possible Alternatives).
Abstract: We present a new method for determining the point values for additive multi-attribute value models with performance categories. The method, which we refer to as PAPRIKA (Potentially All Pairwise RanKings of all possible Alternatives), involves the decision-maker pairwise ranking potentially all undominated pairs of all possible alternatives represented by the value model. The number of pairs to be explicitly ranked is minimized by the method identifying all pairs implicitly ranked as corollaries of the explicitly ranked pairs. We report on simulations of the method's use and show that if the decision-maker explicitly ranks pairs defined on just two criteria at-a-time, the overall ranking of alternatives produced by the value model is very highly correlated with the true ranking. Therefore, for most practical purposes decision-makers are unlikely to need to rank pairs defined on more than two criteria, thereby reducing the elicitation burden. We also describe a successful real-world application involving the scoring of a value model for prioritizing patients for cardiac surgery in New Zealand. We conclude that although the new method entails more judgments than traditional scoring methods, the type of judgment (pairwise rankings of undominated pairs) is arguably simpler and might reasonably be expected to reflect the preferences of decision-makers more accurately. Copyright © 2009 John Wiley & Sons, Ltd.

Proceedings ArticleDOI
16 Jun 2008
TL;DR: This paper presents a MapReduce algorithm for computing pairwise document similarity in large document collections that exhibits linear growth in running time and space in terms of the number of documents.
Abstract: This paper presents a MapReduce algorithm for computing pairwise document similarity in large document collections. MapReduce is an attractive framework because it allows us to decompose the inner products involved in computing document similarity into separate multiplication and summation stages in a way that is well matched to efficient disk access patterns across several machines. On a collection consisting of approximately 900,000 newswire articles, our algorithm exhibits linear growth in running time and space in terms of the number of documents.
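The decomposition of inner products that makes the MapReduce formulation work can be shown in a single-machine sketch: the "map" stage emits (term, (doc, weight)) postings, and the "reduce" stage multiplies weights within each posting list and sums the partial products per document pair. A real MapReduce job shuffles these pairs across machines; here everything lives in dictionaries, and term weights are raw counts for simplicity.

```python
# In-memory sketch of term-at-a-time pairwise document similarity.
from collections import defaultdict
from itertools import combinations

def pairwise_similarity(docs):
    postings = defaultdict(list)           # "map" stage: term -> postings list
    for doc_id, text in docs.items():
        counts = defaultdict(int)
        for term in text.split():
            counts[term] += 1
        for term, w in counts.items():
            postings[term].append((doc_id, w))
    sims = defaultdict(float)              # "reduce" stage: sum partial products
    for term, plist in postings.items():
        for (d1, w1), (d2, w2) in combinations(sorted(plist), 2):
            sims[(d1, d2)] += w1 * w2
    return dict(sims)

docs = {"a": "cat sat on the mat", "b": "the cat sat", "c": "dogs bark"}
sims = pairwise_similarity(docs)
```

Because each term's posting list is processed independently, the work parallelizes naturally, which is what yields the linear scaling reported in the abstract.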

Journal IssueDOI
TL;DR: This procedure attempts to estimate the missing information in an expert's incomplete preference relation using only the preference values provided by that particular expert using the additive consistency property.
Abstract: In this paper, we present a procedure to estimate missing preference values when dealing with pairwise comparison and heterogeneous information. This procedure attempts to estimate the missing information in an expert's incomplete preference relation using only the preference values provided by that particular expert. Our procedure to estimate missing values can be applied to incomplete fuzzy, multiplicative, interval-valued, and linguistic preference relations. Clearly, it would be desirable to maintain experts' consistency levels. We make use of the additive consistency property to measure the level of consistency and to guide the procedure in the estimation of the missing values. Finally, conditions that guarantee the success of our procedure in the estimation of all the missing values of an incomplete preference relation are given. © 2008 Wiley Periodicals, Inc.
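One way to instantiate the additive-consistency estimate for a fuzzy preference relation (values in [0, 1], with p_ii = 0.5) is shown below. Under additive consistency, p_ij = p_ik + p_kj − 0.5, so a missing p_ij can be estimated by averaging that expression over all intermediate alternatives k for which both values are known. This covers only the fuzzy case; the paper's procedure also handles multiplicative, interval-valued, and linguistic relations.

```python
# Estimate missing entries of an incomplete fuzzy preference relation
# using the additive-consistency property p_ij = p_ik + p_kj - 0.5.
def estimate_missing(P):
    """P is a square matrix with None marking missing values."""
    n = len(P)
    Q = [row[:] for row in P]
    for i in range(n):
        for j in range(n):
            if i != j and Q[i][j] is None:
                cands = [P[i][k] + P[k][j] - 0.5
                         for k in range(n)
                         if k not in (i, j)
                         and P[i][k] is not None and P[k][j] is not None]
                if cands:
                    Q[i][j] = sum(cands) / len(cands)
    return Q

# A fully additive-consistent relation with p_13 removed: the estimate
# via k = 2 is 0.65 + 0.65 - 0.5 = 0.8, recovering the deleted value.
P = [[0.5, 0.65, None],
     [0.35, 0.5, 0.65],
     [0.2, 0.35, 0.5]]
Q = estimate_missing(P)
```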

Journal ArticleDOI
TL;DR: An integrated AHP-DEA methodology is proposed to evaluate the risks of hundreds or thousands of bridge structures, based on which their maintenance priorities can be decided.

Journal ArticleDOI
TL;DR: This paper proposes to use another form of supervision information for feature selection, i.e. pairwise constraints, which specifies whether a pair of data samples belong to the same class (must-link constraints) or different classes (cannot-link constraints).

Journal ArticleDOI
TL;DR: Taking a semiparametric theory perspective, this work proposes a broadly applicable approach to adjustment for auxiliary covariates to achieve more efficient estimators and tests for treatment parameters in the analysis of randomized clinical trials.
Abstract: The primary goal of a randomized clinical trial is to make comparisons among two or more treatments. For example, in a two-arm trial with continuous response, the focus may be on the difference in treatment means; with more than two treatments, the comparison may be based on pairwise differences. With binary outcomes, pairwise odds ratios or log odds ratios may be used. In general, comparisons may be based on meaningful parameters in a relevant statistical model. Standard analyses for estimation and testing in this context typically are based on the data collected on response and treatment assignment only. In many trials, auxiliary baseline covariate information may also be available, and it is of interest to exploit these data to improve the efficiency of inferences. Taking a semiparametric theory perspective, we propose a broadly applicable approach to adjustment for auxiliary covariates to achieve more efficient estimators and tests for treatment parameters in the analysis of randomized clinical trials. Simulations and applications demonstrate the performance of the methods.

Journal ArticleDOI
TL;DR: This paper presents two performance measure algorithms to evaluate the numerical scales and the prioritization methods, discusses the parameter of the geometrical scale, develops a new prioritization method, and constructs an optimization model to select the appropriate numerical scales for AHP decision makers.

Journal ArticleDOI
TL;DR: This review focuses on real-world problems and empirical results from applying freely available methods and tools for constructing large t-way combination test sets, converting covering arrays into executable tests, and automatically generating test oracles using model checking.
Abstract: With new algorithms and tools, developers can apply high-strength combinatorial testing to detect elusive failures that occur only when multiple components interact. In pairwise testing, all possible pairs of parameter values are covered by at least one test, and good tools are available to generate arrays with the value pairs. In the past few years, advances in covering-array algorithms, integrated with model checking or other testing approaches, have made it practical to extend combinatorial testing beyond pairwise tests. The US National Institute of Standards and Technology (NIST) and the University of Texas, Arlington, are now distributing freely available methods and tools for constructing large t-way combination test sets (known as covering arrays), converting covering arrays into executable tests, and automatically generating test oracles using model checking (http://csrc.nist.gov/acts). In this review, we focus on real-world problems and empirical results from applying these methods and tools.
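The pairwise coverage criterion itself is easy to state in code: every pair of values of every two parameters must appear together in at least one test. The greedy toy generator below illustrates the criterion only; production tools such as NIST's ACTS use far more scalable covering-array algorithms, and the parameter names here are our invention.

```python
# Greedy toy generator for a pairwise (2-way) covering test set.
from itertools import combinations, product

def all_pairs(params):
    """params: a list of value lists, one per parameter."""
    uncovered = set()
    for (i, vi), (j, vj) in combinations(enumerate(params), 2):
        for a, b in product(vi, vj):
            uncovered.add((i, a, j, b))
    tests = []
    candidates = list(product(*params))   # only feasible for small inputs
    while uncovered:
        # Pick the candidate test covering the most still-uncovered pairs.
        best = max(candidates,
                   key=lambda t: sum((i, t[i], j, t[j]) in uncovered
                                     for i, j in combinations(range(len(t)), 2)))
        tests.append(best)
        for i, j in combinations(range(len(best)), 2):
            uncovered.discard((i, best[i], j, best[j]))
    return tests

tests = all_pairs([["on", "off"], ["ipv4", "ipv6"], ["tcp", "udp"]])
```

For three binary parameters there are eight exhaustive tests but twelve value pairs, and a handful of tests suffices to cover them all, which is the economy pairwise testing exploits.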

Proceedings ArticleDOI
23 Jun 2008
TL;DR: Experiments show the proposed method outperforms state-of-the-art constrained clustering methods in getting good clusterings with fewer constraints, and yields good image segmentation with user-specified pairwise constraints.
Abstract: Pairwise constraints specify whether or not two samples should be in one cluster. Although it has been successful to incorporate them into traditional clustering methods, such as K-means, little progress has been made in combining them with spectral clustering. The major challenge in designing an effective constrained spectral clustering is a sensible combination of the scarce pairwise constraints with the original affinity matrix. We propose to combine the two sources of affinity by propagating the pairwise constraints information over the original affinity matrix. Our method has a Gaussian process interpretation and results in a closed-form expression for the new affinity matrix. Experiments show it outperforms state-of-the-art constrained clustering methods in getting good clusterings with fewer constraints, and yields good image segmentation with user-specified pairwise constraints.

Proceedings Article
01 Jan 2008
TL;DR: A novel matrix factorization based approach for semi-supervised clustering, which is extended to co-cluster the data sets of different types with constraints and shows the superiority of the proposed method.
Abstract: Recent years have witnessed a surge of interest in semi-supervised clustering methods, which aim to cluster the data set under the guidance of some supervisory information. Usually this supervisory information takes the form of pairwise constraints that indicate the similarity/dissimilarity between two points. In this paper, we propose a novel matrix factorization based approach for semi-supervised clustering. In addition, we extend our algorithm to co-cluster data sets of different types with constraints. Finally, experiments on UCI data sets and real world Bulletin Board Systems (BBS) data sets show the superiority of our proposed method.

Book ChapterDOI
15 Sep 2008
TL;DR: This paper applies multilabel classification algorithms to the EUR-Lex database of legal documents of the European Union and resorts to the dual representation of the perceptron, which makes the pairwise approach feasible for problems of this size.
Abstract: In this paper we applied multilabel classification algorithms to the EUR-Lex database of legal documents of the European Union. On this document collection, we studied three different multilabel classification problems, the largest being the categorization into the EUROVOC concept hierarchy with almost 4000 classes. We evaluated three algorithms: (i) the binary relevance approach, which independently trains one classifier per label; (ii) the multiclass multilabel perceptron algorithm, which respects dependencies between the base classifiers; and (iii) the multilabel pairwise perceptron algorithm, which trains one classifier for each pair of labels. All algorithms use the simple but very efficient perceptron algorithm as the underlying classifier, which makes them very suitable for large-scale multilabel classification problems. The main challenge we had to face was that the almost 8,000,000 perceptrons that had to be trained in the pairwise setting could no longer be stored in memory. We solve this problem by resorting to the dual representation of the perceptron, which makes the pairwise approach feasible for problems of this size. The results on the EUR-Lex database confirm the good predictive performance of the pairwise approach and demonstrate the feasibility of this approach for large-scale tasks.

Proceedings ArticleDOI
25 Oct 2008
TL;DR: This paper presents a framework that informs local decisions with two types of implicit global constraints: transitivity and time expression normalization, and shows how these constraints can be used to create a more densely-connected network of events.
Abstract: Previous work on ordering events in text has typically focused on local pairwise decisions, ignoring globally inconsistent labels. However, temporal ordering is the type of domain in which global constraints should be relatively easy to represent and reason over. This paper presents a framework that informs local decisions with two types of implicit global constraints: transitivity (A before B and B before C implies A before C) and time expression normalization (e.g. last month is before yesterday). We show how these constraints can be used to create a more densely-connected network of events, and how global consistency can be enforced by incorporating these constraints into an integer linear programming framework. We present results on two event ordering tasks, showing a 3.6% absolute increase in the accuracy of before/after classification over a pairwise model.
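The transitivity constraint the paper enforces globally can be illustrated on its own, without the integer linear program: closing a set of before(A, B) decisions under "A before B and B before C implies A before C" densifies the event network, and a resulting self-loop would reveal that some local pairwise decision must be wrong.

```python
# Naive transitive closure of a set of "before" relations.
def transitive_closure(before):
    closed = set(before)
    changed = True
    while changed:
        changed = False
        for a, b in list(closed):
            for c, d in list(closed):
                # (a before b) and (b before d) imply (a before d).
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed

links = {("A", "B"), ("B", "C"), ("C", "D")}
closed = transitive_closure(links)
# A pair (x, x) in the closure would signal an inconsistent labelling.
consistent = all(a != b for a, b in closed)
```

The paper goes further by encoding such implications as ILP constraints so that local classifier scores and global consistency are optimized jointly.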

Journal ArticleDOI
TL;DR: In this article, the authors present a curve-synchronization method that, in a first step, uses every trajectory in the sample as a reference to obtain pairwise warping functions. These initial pairwise warping functions are then used, in a second step, to create improved estimators of the underlying individual warping functions.
Abstract: Data collected by scientists are increasingly in the form of trajectories or curves. Often these can be viewed as realizations of a composite process driven by both amplitude and time variation. We consider the situation in which functional variation is dominated by time variation, and develop a curve-synchronization method that uses every trajectory in the sample as a reference to obtain pairwise warping functions in the first step. These initial pairwise warping functions are then used to create improved estimators of the underlying individual warping functions in the second step. A truncated averaging process is used to obtain robust estimation of individual warping functions. The method compares well with other available time-synchronization approaches and is illustrated with Berkeley growth data and gene expression data for multiple sclerosis.

Journal ArticleDOI
TL;DR: In this article, the authors proposed an adaptive AHP approach (A^3) that uses a soft computing scheme, Genetic Algorithms, to recover the real number weightings of the various criteria in AHP and provides a function for automatically improving the consistency ratio of pairwise comparisons.

Proceedings ArticleDOI
05 Jul 2008
TL;DR: The general problem of learning from both pairwise constraints and unlabeled data is considered, and it is proposed to learn a mapping that is smooth over the data graph and maps the data onto a unit hypersphere, where two must-link objects are mapped to the same point while two cannot-link objects are mapped to be orthogonal.
Abstract: We consider the general problem of learning from both pairwise constraints and unlabeled data. The pairwise constraints specify whether two objects belong to the same class or not, known as the must-link constraints and the cannot-link constraints. We propose to learn a mapping that is smooth over the data graph and maps the data onto a unit hypersphere, where two must-link objects are mapped to the same point while two cannot-link objects are mapped to be orthogonal. We show that such a mapping can be achieved by formulating a semidefinite programming problem, which is convex and can be solved globally. Our approach can effectively propagate pairwise constraints to the whole data set. It can be directly applied to multi-class classification and can handle data labels, pairwise constraints, or a mixture of them in a unified framework. Promising experimental results are presented for classification tasks on a variety of synthetic and real data sets.

Proceedings ArticleDOI
24 Aug 2008
TL;DR: This paper proposes new, almost-linear-time algorithms to optimize for two other criteria widely used to evaluate search systems: MRR (mean reciprocal rank) and NDCG (normalized discounted cumulative gain) in the max-margin structured learning framework.
Abstract: Learning to rank from relevance judgment is an active research area. Itemwise score regression, pairwise preference satisfaction, and listwise structured learning are the major techniques in use. Listwise structured learning has been applied recently to optimize important non-decomposable ranking criteria like AUC (area under ROC curve) and MAP (mean average precision). We propose new, almost-linear-time algorithms to optimize for two other criteria widely used to evaluate search systems: MRR (mean reciprocal rank) and NDCG (normalized discounted cumulative gain) in the max-margin structured learning framework. We also demonstrate that, for different ranking criteria, one may need to use different feature maps. Search applications should not be optimized in favor of a single criterion, because they need to cater to a variety of queries. E.g., MRR is best for navigational queries, while NDCG is best for informational queries. A key contribution of this paper is to fold multiple ranking loss functions into a multi-criteria max-margin optimization. The result is a single, robust ranking model that is close to the best accuracy of learners trained on individual criteria. In fact, experiments over the popular LETOR and TREC data sets show that, contrary to conventional wisdom, a test criterion is often not best served by training with the same individual criterion.
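The two evaluation criteria the paper optimizes can be computed directly from a ranked list of graded relevance judgments: MRR uses the reciprocal position of the first relevant result, while NDCG discounts gains logarithmically by rank and normalizes by the ideal ordering. The gain and discount functions below are one standard convention, not necessarily the exact variant used in the paper's experiments.

```python
# Direct computation of MRR and NDCG from graded relevance labels,
# listed in ranked order (index 0 = top result).
import math

def mrr(rels):
    for i, r in enumerate(rels, start=1):
        if r > 0:
            return 1.0 / i        # reciprocal rank of first relevant result
    return 0.0

def dcg(rels):
    # Exponential gain, logarithmic position discount.
    return sum((2 ** r - 1) / math.log2(i + 1) for i, r in enumerate(rels, start=1))

def ndcg(rels):
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal > 0 else 0.0
```

The abstract's point that MRR suits navigational queries while NDCG suits informational ones follows from these formulas: MRR ignores everything after the first hit, whereas NDCG rewards the whole graded ordering.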

Journal ArticleDOI
TL;DR: Fuzzy analytic network process (FANP) based methodology is discussed to tackle the different decision criteria involved in the selection of competitive priorities in the current business scenario, and can provide a hierarchical framework for an organization implementing cleaner production to select its competitive priorities.
Abstract: The aim of this paper is to identify and discuss some of the important and critical decision criteria, including cleaner production implementation, for an efficient system to prioritize competitive priorities. Fuzzy analytic network process (FANP) based methodology is discussed to tackle the different decision criteria involved in the selection of competitive priorities in the current business scenario. FANP is an efficient tool to handle the fuzziness of the data involved in deciding the preferences of different decision variables. The linguistic levels of comparison produced by the professionals and experts for each comparison are captured in the form of triangular fuzzy numbers to construct fuzzy pairwise comparison matrices. The implementation of the system is demonstrated by a problem having four stages of hierarchy which contains different criteria, attributes and alternatives at a wider perspective. The proposed model can provide a hierarchical framework for an organization implementing cleaner production to select its competitive priorities.

Journal Article
TL;DR: This article proposes a generic framework for computation of similarity measures for sequences, covering various kernel, distance and non-metric similarity functions, and provides linear-time algorithms of different complexity and capabilities using sorted arrays, tries and suffix trees as underlying data structures.
Abstract: Efficient and expressive comparison of sequences is an essential procedure for learning with sequential data. In this article we propose a generic framework for computation of similarity measures for sequences, covering various kernel, distance and non-metric similarity functions. The basis for comparison is embedding of sequences using a formal language, such as a set of natural words, k-grams or all contiguous subsequences. As realizations of the framework we provide linear-time algorithms of different complexity and capabilities using sorted arrays, tries and suffix trees as underlying data structures. Experiments on data sets from bioinformatics, text processing and computer security illustrate the efficiency of the proposed algorithms, enabling peak performances of up to 10^6 pairwise comparisons per second. The utility of distances and non-metric similarity measures for sequences as alternatives to string kernels is demonstrated in applications of text categorization, network intrusion detection and transcription site recognition in DNA.
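The embedding idea can be illustrated with the simplest member of the framework: map each string to the bag of its contiguous k-grams, and take the dot product of the two bags (the spectrum kernel). This hash-map sketch computes the same quantity as the paper's sorted-array, trie, and suffix-tree realizations, just less efficiently.

```python
# k-gram (spectrum) similarity of two sequences via bag-of-k-grams embedding.
from collections import Counter

def kgrams(s, k):
    """Bag of all contiguous k-grams of s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

def spectrum_kernel(s, t, k=3):
    a, b = kgrams(s, k), kgrams(t, k)
    # Iterate over the smaller bag; each shared k-gram contributes the
    # product of its counts in the two sequences.
    if len(b) < len(a):
        a, b = b, a
    return sum(c * b[g] for g, c in a.items())
```

Distances and non-metric measures in the framework replace this dot product with other per-k-gram comparison functions over the same embedding.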

ReportDOI
01 Sep 2008
TL;DR: A formula for the pairwise update of arbitrary-order centered statistical moments is presented, of particular interest to compute such moments in parallel for large-scale, distributed data sets.
Abstract: We present a formula for the pairwise update of arbitrary-order centered statistical moments. This formula is of particular interest to compute such moments in parallel for large-scale, distributed data sets. As a corollary, we indicate a specialization of this formula for incremental updates, of particular interest to streaming implementations. Finally, we provide pairwise and incremental update formulas for the covariance. Centered statistical moments are one of the most widely used tools in descriptive statistics. It is therefore essential for statistical analysis packages that robust and efficient algorithms be devised and implemented. However, robustness and speed of execution, in this context as well as in others, tend to be orthogonal. For instance, it is well known that algorithms for calculating centered statistical moments that utilize sums of powers for the sake of execution speed (one-pass algorithms) lead to unacceptable numerical instability.
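The order-2 special case of such a pairwise update can be written down directly: the count, mean, and second centered moment M2 of two disjoint partitions combine without revisiting the raw values, and setting one partition to a single observation gives the incremental (streaming) specialization. This sketch shows only that special case, not the report's arbitrary-order formula.

```python
# Pairwise update of (count, mean, second centered moment) for two
# disjoint data partitions A and B.
def combine(nA, meanA, M2A, nB, meanB, M2B):
    n = nA + nB
    delta = meanB - meanA
    mean = meanA + delta * nB / n
    # Cross term accounts for the shift between the two partition means.
    M2 = M2A + M2B + delta * delta * nA * nB / n
    return n, mean, M2

def summarize(xs):
    """Streaming summary via the incremental (n_B = 1) special case."""
    n, mean, M2 = 0, 0.0, 0.0
    for x in xs:
        n, mean, M2 = combine(n, mean, M2, 1, float(x), 0.0)
    return n, mean, M2

# Summaries of two halves combine to the summary of the whole data set.
left = summarize([1, 2, 3])
right = summarize([4, 5, 6])
n, mean, M2 = combine(*left, *right)
```

Unlike one-pass sum-of-powers formulas, this update never subtracts two large nearly-equal quantities, which is the numerical-stability point the abstract makes.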