
Showing papers on "Metric (mathematics)" published in 2009


Journal ArticleDOI
TL;DR: This paper shows how to learn a Mahalanobis distance metric for kNN classification from labeled examples in a globally integrated manner and finds that metrics trained in this way lead to significant improvements in kNN classification.
Abstract: The accuracy of k-nearest neighbor (kNN) classification depends significantly on the metric used to compute distances between different examples. In this paper, we show how to learn a Mahalanobis distance metric for kNN classification from labeled examples. The Mahalanobis metric can equivalently be viewed as a global linear transformation of the input space that precedes kNN classification using Euclidean distances. In our approach, the metric is trained with the goal that the k-nearest neighbors always belong to the same class while examples from different classes are separated by a large margin. As in support vector machines (SVMs), the margin criterion leads to a convex optimization based on the hinge loss. Unlike learning in SVMs, however, our approach requires no modification or extension for problems in multiway (as opposed to binary) classification. In our framework, the Mahalanobis distance metric is obtained as the solution to a semidefinite program. On several data sets of varying size and difficulty, we find that metrics trained in this way lead to significant improvements in kNN classification. Sometimes these results can be further improved by clustering the training examples and learning an individual metric within each cluster. We show how to learn and combine these local metrics in a globally integrated manner.
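A minimal sketch of the large-margin objective described above, assuming the Mahalanobis matrix is parameterized as M = LᵀL (the paper instead solves a semidefinite program over M directly); the function name, the `target_neighbors` structure, and the relative weighting `mu` are illustrative:

```python
import numpy as np

def lmnn_loss(L, X, y, target_neighbors, mu=0.5):
    """Hinge-loss objective for large-margin nearest neighbor metric
    learning (a sketch, not the paper's semidefinite program).

    Distances are computed under a linear map L, so the Mahalanobis
    matrix M = L.T @ L is positive semidefinite by construction.
    target_neighbors[i] lists the k same-class neighbors of example i.
    """
    def d(i, j):
        diff = L @ (X[i] - X[j])
        return float(diff @ diff)

    pull = push = 0.0
    for i, neighbors in enumerate(target_neighbors):
        for j in neighbors:
            pull += d(i, j)                      # pull target neighbors close
            for l in range(len(X)):
                if y[l] != y[i]:                 # push impostors a margin away
                    push += max(0.0, 1.0 + d(i, j) - d(i, l))
    return (1.0 - mu) * pull + mu * push         # relative weighting assumed
```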

4,157 citations


Proceedings ArticleDOI
29 Sep 2009
TL;DR: Two methods for learning robust distance measures are presented: a logistic discriminant approach which learns the metric from a set of labelled image pairs (LDML) and a nearest neighbour approach which computes the probability for two images to belong to the same class (MkNN).
Abstract: Face identification is the problem of determining whether two face images depict the same person or not. This is difficult due to variations in scale, pose, lighting, background, expression, hairstyle, and glasses. In this paper we present two methods for learning robust distance measures: (a) a logistic discriminant approach which learns the metric from a set of labelled image pairs (LDML) and (b) a nearest neighbour approach which computes the probability for two images to belong to the same class (MkNN). We evaluate our approaches on the Labeled Faces in the Wild data set, a large and very challenging data set of faces from Yahoo! News. The evaluation protocol for this data set defines a restricted setting, where a fixed set of positive and negative image pairs is given, as well as an unrestricted one, where faces are labelled by their identity. We are the first to present results for the unrestricted setting, and show that our methods benefit from this richer training data, much more so than the current state-of-the-art method. Our results of 79.3% and 87.5% correct for the restricted and unrestricted settings, respectively, significantly improve over the current state-of-the-art result of 78.5%. Confidence scores obtained for face identification can be used for many applications, e.g., clustering or recognition from a single training example. We show that our learned metrics also improve performance for these tasks.
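The logistic discriminant model reduces to logistic regression on pairwise Mahalanobis distances. A sketch under that reading, with gradient-based fitting of M and the bias b left out; names are illustrative:

```python
import numpy as np

def ldml_log_likelihood(M, b, pairs, labels, X):
    """Pairwise log-likelihood for logistic-discriminant metric learning.

    Models p(same | x_i, x_j) = sigmoid(b - d_M(x_i, x_j)) with
    d_M(x_i, x_j) = (x_i - x_j)^T M (x_i - x_j); M and the bias b would
    be fit by gradient ascent on this objective over labelled pairs.
    """
    ll = 0.0
    for (i, j), same in zip(pairs, labels):
        diff = X[i] - X[j]
        p = 1.0 / (1.0 + np.exp(-(b - diff @ M @ diff)))
        ll += np.log(p) if same else np.log(1.0 - p)
    return ll
```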

913 citations


Proceedings ArticleDOI
14 Jun 2009
TL;DR: It is demonstrated experimentally that commonly-used methods are unlikely to accurately estimate the probability of held-out documents, and two alternative methods that are both accurate and efficient are proposed.
Abstract: A natural evaluation metric for statistical topic models is the probability of held-out documents given a trained model. While exact computation of this probability is intractable, several estimators for this probability have been used in the topic modeling literature, including the harmonic mean method and empirical likelihood method. In this paper, we demonstrate experimentally that commonly-used methods are unlikely to accurately estimate the probability of held-out documents, and propose two alternative methods that are both accurate and efficient.

877 citations


Proceedings ArticleDOI
02 Nov 2009
TL;DR: This work presents a new editorial metric for graded relevance which overcomes this difficulty and implicitly discounts documents which are shown below very relevant documents and calls it Expected Reciprocal Rank (ERR).
Abstract: While numerous metrics for information retrieval are available in the case of binary relevance, there is only one commonly used metric for graded relevance, namely the Discounted Cumulative Gain (DCG). A drawback of DCG is its additive nature and the underlying independence assumption: a document in a given position always has the same gain and discount, independently of the documents shown above it. Inspired by the "cascade" user model, we present a new editorial metric for graded relevance which overcomes this difficulty and implicitly discounts documents which are shown below very relevant documents. More precisely, this new metric is defined as the expected reciprocal length of time that the user will take to find a relevant document. This can be seen as an extension of the classical reciprocal rank to the graded relevance case, and we call this metric Expected Reciprocal Rank (ERR). We conduct an extensive evaluation on the query logs of a commercial search engine and show that ERR correlates better with click metrics than other editorial metrics.
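The cascade-model computation behind ERR is short enough to state directly. This sketch follows the definition above, assuming relevance grades on a 0..g_max scale mapped to stop probabilities R = (2^g − 1) / 2^{g_max}:

```python
def expected_reciprocal_rank(grades, g_max=4):
    """Expected Reciprocal Rank for a ranked list of relevance grades.

    Each grade g becomes a stop probability R = (2**g - 1) / 2**g_max
    (the cascade user model); ERR is the expected reciprocal rank at
    which the user stops, so documents below a very relevant one are
    implicitly discounted.
    """
    err, p_continue = 0.0, 1.0
    for rank, g in enumerate(grades, start=1):
        r = (2 ** g - 1) / 2 ** g_max
        err += p_continue * r / rank
        p_continue *= 1.0 - r
    return err

# A highly relevant document in position 1 dominates the score:
# expected_reciprocal_rank([4, 0, 1])  ->  ~0.94
```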

831 citations


Proceedings ArticleDOI
01 Sep 2009
TL;DR: This work proposes TagProp, a discriminatively trained nearest neighbor model that allows the integration of metric learning by directly maximizing the log-likelihood of the tag predictions in the training set, and introduces a word specific sigmoidal modulation of the weighted neighbor tag predictions to boost the recall of rare words.
Abstract: Image auto-annotation is an important open problem in computer vision. For this task we propose TagProp, a discriminatively trained nearest neighbor model. Tags of test images are predicted using a weighted nearest-neighbor model to exploit labeled training images. Neighbor weights are based on neighbor rank or distance. TagProp allows the integration of metric learning by directly maximizing the log-likelihood of the tag predictions in the training set. In this manner, we can optimally combine a collection of image similarity metrics that cover different aspects of image content, such as local shape descriptors, or global color histograms. We also introduce a word specific sigmoidal modulation of the weighted neighbor tag predictions to boost the recall of rare words. We investigate the performance of different variants of our model and compare to existing work. We present experimental results for three challenging data sets. On all three, TagProp makes a marked improvement as compared to the current state-of-the-art.

739 citations


Journal ArticleDOI
TL;DR: This article defines a few intuitive formal constraints on such metrics which shed light on which aspects of the quality of a clustering are captured by different metric families, and proposes a modified version of BCubed that avoids the problems found with other metrics.
Abstract: There is a wide set of evaluation metrics available to compare the quality of text clustering algorithms. In this article, we define a few intuitive formal constraints on such metrics which shed light on which aspects of the quality of a clustering are captured by different metric families. These formal constraints are validated in an experiment involving human assessments, and compared with other constraints proposed in the literature. Our analysis of a wide range of metrics shows that only BCubed satisfies all formal constraints. We also extend the analysis to the problem of overlapping clustering, where items can simultaneously belong to more than one cluster. As BCubed cannot be directly applied to this task, we propose a modified version of BCubed that avoids the problems found with other metrics.
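For reference, a sketch of the standard (non-overlapping) BCubed computation the article analyzes; the paper's extension for overlapping clusters replaces these per-item counts with multiplicity-aware versions:

```python
def bcubed(clustering, gold):
    """BCubed precision and recall, non-overlapping case (a sketch).

    clustering[i] and gold[i] are the cluster id and gold label of item
    i.  For each item, precision is the fraction of its cluster sharing
    its gold label; recall is the symmetric quantity over its gold class.
    """
    n = len(clustering)
    precision = recall = 0.0
    for i in range(n):
        same_cluster = [j for j in range(n) if clustering[j] == clustering[i]]
        same_label = [j for j in range(n) if gold[j] == gold[i]]
        correct = sum(1 for j in same_cluster if gold[j] == gold[i])
        precision += correct / len(same_cluster)
        recall += correct / len(same_label)
    return precision / n, recall / n
```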

729 citations


Journal ArticleDOI
TL;DR: This work presents a perceptual-based no-reference objective image sharpness/blurriness metric by integrating the concept of just noticeable blur into a probability summation model that is able to predict with high accuracy the relative amount of blurriness in images with different content.
Abstract: This work presents a perceptual-based no-reference objective image sharpness/blurriness metric by integrating the concept of just noticeable blur into a probability summation model. Unlike existing objective no-reference image sharpness/blurriness metrics, the proposed metric is able to predict the relative amount of blurriness in images with different content. Results are provided to illustrate the performance of the proposed perceptual-based sharpness metric. These results show that the proposed sharpness metric correlates well with the perceived sharpness and is able to predict with high accuracy the relative amount of blurriness in images with different content.

718 citations


Journal Article
TL;DR: This paper reviews the proper construction of offline experiments for deciding on the most appropriate algorithm, discusses three important tasks of recommender systems, and classifies a set of appropriate, well-known evaluation metrics for each task.
Abstract: Recommender systems are now popular both commercially and in the research community, where many algorithms have been suggested for providing recommendations. These algorithms typically perform differently in various domains and tasks. Therefore, it is important from the research perspective, as well as from a practical view, to be able to decide on an algorithm that matches the domain and the task of interest. The standard way to make such decisions is by comparing a number of algorithms offline using some evaluation metric. Indeed, many evaluation metrics have been suggested for comparing recommendation algorithms. The decision on the proper evaluation metric is often critical, as each metric may favor a different algorithm. In this paper we review the proper construction of offline experiments for deciding on the most appropriate algorithm. We discuss three important tasks of recommender systems, and classify a set of appropriate, well-known evaluation metrics for each task. We demonstrate how using an improper evaluation metric can lead to the selection of an improper algorithm for the task of interest. We also discuss other important considerations when designing offline experiments.

580 citations


Journal ArticleDOI
TL;DR: This paper proposes multi-valued semantics for MTL formulas, which capture not only the usual Boolean satisfiability of the formula, but also topological information regarding the distance, ε, from unsatisfiability.

551 citations


Journal ArticleDOI
TL;DR: A new matrix learning scheme to extend relevance learning vector quantization (RLVQ), an efficient prototype-based classification algorithm, toward a general adaptive metric by introducing a full matrix of relevance factors in the distance measure.
Abstract: We propose a new matrix learning scheme to extend relevance learning vector quantization (RLVQ), an efficient prototype-based classification algorithm, toward a general adaptive metric. By introducing a full matrix of relevance factors in the distance measure, correlations between different features and their importance for the classification scheme can be taken into account and automated, and general metric adaptation takes place during training. In comparison to the weighted Euclidean metric used in RLVQ and its variations, a full matrix is more powerful for representing the internal structure of the data appropriately. Large margin generalization bounds can be transferred to this case, leading to bounds that are independent of the input dimensionality. This also holds for local metrics attached to each prototype, which correspond to piecewise quadratic decision boundaries. The algorithm is tested in comparison to alternative learning vector quantization schemes using an artificial data set, a benchmark multiclass problem from the UCI repository, and a problem from bioinformatics, the recognition of splice sites for C. elegans.
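The core of the matrix extension is the adaptive quadratic distance. A sketch, writing the relevance matrix as Λ = ΩᵀΩ so that positive semidefiniteness holds by construction; Ω is what gets adapted during training alongside the prototypes:

```python
import numpy as np

def matrix_lvq_distance(x, w, omega):
    """Adaptive quadratic distance used in matrix relevance LVQ.

    The relevance matrix Lambda = omega.T @ omega is positive
    semidefinite by construction; its off-diagonal entries capture
    correlations between features, generalizing the diagonal
    (weighted-Euclidean) relevances of RLVQ.
    """
    diff = omega @ (x - w)
    return float(diff @ diff)   # equals (x - w)^T Lambda (x - w)
```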

344 citations


Journal ArticleDOI
TL;DR: A framework for analyzing the results of a SLAM approach based on a metric for measuring the error of the corrected trajectory is proposed, which overcomes serious shortcomings of approaches using a global reference frame to compute the error.
Abstract: In this paper, we address the problem of creating an objective benchmark for evaluating SLAM approaches. We propose a framework for analyzing the results of a SLAM approach based on a metric for measuring the error of the corrected trajectory. This metric uses only relative relations between poses and does not rely on a global reference frame. This overcomes serious shortcomings of approaches using a global reference frame to compute the error. Our method furthermore allows us to compare SLAM approaches that use different estimation techniques or different sensor modalities since all computations are made based on the corrected trajectory of the robot. We provide sets of relative relations needed to compute our metric for an extensive set of datasets frequently used in the robotics community. The relations have been obtained by manually matching laser-range observations to avoid the errors caused by matching algorithms. Our benchmark framework allows the user to easily analyze and objectively compare different SLAM approaches.
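A sketch of the relative-pose error for planar SLAM, assuming poses are (x, y, θ) triples and the reference relations have been obtained as described; the exact weighting of translational against rotational terms is a design choice left open here:

```python
import numpy as np

def relative_pose_error(trajectory, relations):
    """Trajectory error from relative relations, no global frame needed.

    trajectory[i] is an estimated pose (x, y, theta); relations is a
    list of (i, j, dx, dy, dtheta) reference relative transformations
    between poses i and j.  Each estimated relative motion is compared
    against its reference relation.
    """
    def ominus(a, b):
        # pose b expressed in the frame of pose a
        dx, dy = b[0] - a[0], b[1] - a[1]
        c, s = np.cos(a[2]), np.sin(a[2])
        return (c * dx + s * dy, -s * dx + c * dy, b[2] - a[2])

    trans_err = rot_err = 0.0
    for i, j, dx, dy, dtheta in relations:
        ex, ey, eth = ominus(trajectory[i], trajectory[j])
        trans_err += (ex - dx) ** 2 + (ey - dy) ** 2
        err = eth - dtheta
        rot_err += np.arctan2(np.sin(err), np.cos(err)) ** 2  # wrap angle
    n = len(relations)
    return trans_err / n, rot_err / n
```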

Journal ArticleDOI
TL;DR: The Meteor Automatic Metric for Machine Translation evaluation, originally developed and released in 2004, was designed with the explicit goal of producing sentence-level scores which correlate well with human judgments of translation quality.
Abstract: The Meteor Automatic Metric for Machine Translation evaluation, originally developed and released in 2004, was designed with the explicit goal of producing sentence-level scores which correlate well with human judgments of translation quality. Several key design decisions were incorporated into Meteor in support of this goal. In contrast with IBM's Bleu, which uses only precision-based features, Meteor uses and emphasizes recall in addition to precision, a property that has been confirmed by several metrics as being critical for high correlation with human judgments. Meteor also addresses the problem of reference translation variability by utilizing flexible word matching, allowing for morphological variants and synonyms to be taken into account as legitimate correspondences. Furthermore, the feature ingredients within Meteor are parameterized, allowing for the tuning of the metric's free parameters in search of values that result in optimal correlation with human judgments. Optimal parameters can be separately tuned for different types of human judgments and for different languages. We discuss the initial design of the Meteor metric, subsequent improvements, and performance in several independent evaluations in recent years.
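A skeleton of the Meteor sentence score restricted to exact unigram matches; full Meteor additionally aligns stems and synonyms, and its free parameters are tuned per language and judgment type, so the parameter values below are only illustrative:

```python
def meteor_sketch(hypothesis, reference, alpha=0.9, beta=3.0, gamma=0.5):
    """Skeleton of the Meteor sentence score using exact matches only."""
    hyp, ref = hypothesis.split(), reference.split()
    # greedy one-to-one alignment of hypothesis words to reference positions
    positions, used = [], set()
    for w in hyp:
        for k, r in enumerate(ref):
            if r == w and k not in used:
                positions.append(k)
                used.add(k)
                break
    m = len(positions)
    if m == 0:
        return 0.0
    precision, recall = m / len(hyp), m / len(ref)
    # recall-weighted harmonic mean: alpha near 1 emphasizes recall
    f_mean = precision * recall / (alpha * precision + (1 - alpha) * recall)
    # fragmentation penalty: fewer, longer runs of adjacent matches are better
    chunks = 1 + sum(1 for a, b in zip(positions, positions[1:]) if b != a + 1)
    penalty = gamma * (chunks / m) ** beta
    return (1 - penalty) * f_mean
```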

Journal ArticleDOI
TL;DR: It is shown that the similarity provided by TWED is a potentially useful metric in time series retrieval applications since it could benefit from the triangular inequality property to speed up the retrieval process while tuning the parameters of the elastic measure.
Abstract: In a way similar to the string-to-string correction problem, we address discrete time series similarity in light of a time-series-to-time-series-correction problem for which the similarity between two time series is measured as the minimum cost sequence of edit operations needed to transform one time series into another. To define the edit operations, we use the paradigm of a graphical editing process and end up with a dynamic programming algorithm that we call time warp edit distance (TWED). TWED is slightly different in form from dynamic time warping (DTW), longest common subsequence (LCSS), or edit distance with real penalty (ERP) algorithms. In particular, it highlights a parameter that controls a kind of stiffness of the elastic measure along the time axis. We show that the similarity provided by TWED is a potentially useful metric in time series retrieval applications since it could benefit from the triangular inequality property to speed up the retrieval process while tuning the parameters of the elastic measure. In that context, a lower bound is derived to link the matching of time series into downsampled representation spaces to the matching into the original space. The empirical quality of the TWED distance is evaluated on a simple classification task. Compared to edit distance, DTW, LCSS, and ERP, TWED has proved to be quite effective on the considered experimental task.
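The TWED recurrence is a small dynamic program. A sketch for scalar series, following the edit operations described above; `nu` is the stiffness parameter penalizing warps along the time axis and `lam` the constant delete penalty:

```python
import numpy as np

def twed(a, ta, b, tb, nu=0.001, lam=1.0):
    """Time Warp Edit Distance between series (a, ta) and (b, tb).

    With nu > 0, TWED satisfies the triangle inequality, which is what
    allows pruning in retrieval applications.
    """
    a, b = np.concatenate(([0.0], a)), np.concatenate(([0.0], b))
    ta, tb = np.concatenate(([0.0], ta)), np.concatenate(([0.0], tb))
    n, m = len(a), len(b)
    D = np.full((n, m), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n):
        for j in range(1, m):
            D[i, j] = min(
                # match a_i with b_j
                D[i - 1, j - 1] + abs(a[i] - b[j]) + abs(a[i - 1] - b[j - 1])
                + nu * (abs(ta[i] - tb[j]) + abs(ta[i - 1] - tb[j - 1])),
                # delete in a
                D[i - 1, j] + abs(a[i] - a[i - 1]) + nu * (ta[i] - ta[i - 1]) + lam,
                # delete in b
                D[i, j - 1] + abs(b[j] - b[j - 1]) + nu * (tb[j] - tb[j - 1]) + lam,
            )
    return D[n - 1, m - 1]
```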

Journal ArticleDOI
TL;DR: In this paper, a generalized form of the Kurdyka-Lojasiewicz inequality is introduced for nonsmooth lower semicontinuous functions defined on a metric or a real Hilbert space.
Abstract: The classical Lojasiewicz inequality and its extensions for partial differential equation problems (Simon) and to o-minimal structures (Kurdyka) have a considerable impact on the analysis of gradient-like methods and related problems: minimization methods, complexity theory, asymptotic analysis of dissipative partial differential equations, tame geometry. This paper provides alternative characterizations of this type of inequalities for nonsmooth lower semicontinuous functions defined on a metric or a real Hilbert space. In a metric context, we show that a generalized form of the Lojasiewicz inequality (hereby called the Kurdyka-Lojasiewicz inequality) relates to metric regularity and to the Lipschitz continuity of the sublevel mapping, yielding applications to discrete methods (strong convergence of the proximal algorithm). In a Hilbert setting we further establish that asymptotic properties of the semiflow generated by $-\partial f$ are strongly linked to this inequality. This is done by introducing the notion of a piecewise subgradient curve: such curves have uniformly bounded lengths if and only if the Kurdyka-Lojasiewicz inequality is satisfied. Further characterizations in terms of talweg lines -a concept linked to the location of the less steep points at the level sets of $f$- and integrability conditions are given. In the convex case these results are significantly reinforced, in particular allowing us to establish the asymptotic equivalence of discrete gradient methods and continuous gradient curves. On the other hand, a counterexample of a convex C^2 function in the plane is constructed to illustrate the fact that, contrary to our intuition, and unless a specific growth condition is satisfied, convex functions may fail to fulfill the Kurdyka-Lojasiewicz inequality.
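For reference, one standard way to state the inequality the paper characterizes, in the nonsmooth setting:

```latex
% Kurdyka-Lojasiewicz inequality for a proper lower semicontinuous f
% around a critical point \bar{x}: there exist r > 0, a neighborhood U
% of \bar{x}, and a concave desingularizing function \varphi with
% \varphi(0) = 0 and \varphi' > 0 such that
\varphi'\bigl(f(x) - f(\bar{x})\bigr)\,
\operatorname{dist}\bigl(0, \partial f(x)\bigr) \;\ge\; 1
\qquad \text{for all } x \in U \text{ with }
f(\bar{x}) < f(x) < f(\bar{x}) + r .
% The classical Lojasiewicz case corresponds to
% \varphi(s) = c\, s^{1-\theta} with exponent \theta \in [1/2, 1).
```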

Journal ArticleDOI
15 Jul 2009
TL;DR: A family of signatures for finite metric spaces, possibly endowed with real valued functions, based on the persistence diagrams of suitable filtrations built on top of these spaces are introduced, and the stability of these signatures under Gromov‐Hausdorff perturbations of the spaces is proved.
Abstract: We introduce a family of signatures for finite metric spaces, possibly endowed with real valued functions, based on the persistence diagrams of suitable filtrations built on top of these spaces. We prove the stability of our signatures under Gromov-Hausdorff perturbations of the spaces. We also extend these results to metric spaces equipped with measures. Our signatures are well-suited for the study of unstructured point cloud data, which we illustrate through an application in shape classification.

Journal ArticleDOI
TL;DR: A method that enables scalable similarity search for learned metrics and an indirect solution that enables metric learning and hashing for vector spaces whose high dimensionality makes it infeasible to learn an explicit transformation over the feature dimensions.
Abstract: We introduce a method that enables scalable similarity search for learned metrics. Given pairwise similarity and dissimilarity constraints between some examples, we learn a Mahalanobis distance function that captures the examples' underlying relationships well. To allow sublinear time similarity search under the learned metric, we show how to encode the learned metric parameterization into randomized locality-sensitive hash functions. We further formulate an indirect solution that enables metric learning and hashing for vector spaces whose high dimensionality makes it infeasible to learn an explicit transformation over the feature dimensions. We demonstrate the approach applied to a variety of image data sets, as well as a systems data set. The learned metrics improve accuracy relative to commonly used metric baselines, while our hashing construction enables efficient indexing with learned distances and very large databases.
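In the explicit case, the construction reduces to hashing linearly transformed points. A sketch assuming the learned metric factors as M = GᵀG; the kernelized, implicit-G case is what the paper's indirect solution addresses:

```python
import numpy as np

def learned_metric_hashes(G, n_bits, X, seed=0):
    """Random-hyperplane hash bits that respect a learned metric.

    With M = G.T @ G, applying standard sign-random-projection LSH to
    the transformed points G x makes the collision probability of two
    points track their similarity under M, enabling sublinear search.
    """
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((n_bits, G.shape[0]))   # random hyperplanes
    return (R @ (G @ X.T) > 0).astype(np.uint8).T   # one bit-row per point
```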

Journal ArticleDOI
TL;DR: This work applies the force paradigm to create localized versions of MDS stress functions with a tuning parameter to adjust the strength of nonlocal repulsive forces and solves the problem of tuning parameter selection with a meta-criterion that measures how well the sets of K-nearest neighbors agree between the data and the embedding.
Abstract: In the past decade there has been a resurgence of interest in nonlinear dimension reduction. Among new proposals are “Local Linear Embedding,” “Isomap,” and Kernel Principal Components Analysis which all construct global low-dimensional embeddings from local affine or metric information. We introduce a competing method called “Local Multidimensional Scaling” (LMDS). Like LLE, Isomap, and KPCA, LMDS constructs its global embedding from local information, but it uses instead a combination of MDS and “force-directed” graph drawing. We apply the force paradigm to create localized versions of MDS stress functions with a tuning parameter to adjust the strength of nonlocal repulsive forces. We solve the problem of tuning parameter selection with a meta-criterion that measures how well the sets of K-nearest neighbors agree between the data and the embedding. Tuned LMDS seems to be able to outperform MDS, PCA, LLE, Isomap, and KPCA, as illustrated with two well-known image datasets. The meta-criterion can also be ...

Journal ArticleDOI
01 Nov 2009
TL;DR: Experimental evaluations using WordNet indicate that the proposed metric, coupled with the notion of intrinsic IC, yields results above the state of the art, and the intrinsic IC formulation also improves the accuracy of other IC-based metrics.
Abstract: In many research fields such as Psychology, Linguistics, Cognitive Science and Artificial Intelligence, computing semantic similarity between words is an important issue. In this paper a new semantic similarity metric, that exploits some notions of the feature-based theory of similarity and translates it into the information theoretic domain, which leverages the notion of Information Content (IC), is presented. In particular, the proposed metric exploits the notion of intrinsic IC which quantifies IC values by scrutinizing how concepts are arranged in an ontological structure. In order to evaluate this metric, an online experiment asking the community of researchers to rank a list of 65 word pairs has been conducted. The experiment's web setup allowed us to collect 101 similarity ratings and to differentiate native and non-native English speakers. Such a large and diverse dataset enables us to confidently evaluate similarity metrics by correlating them with human assessments. Experimental evaluations using WordNet indicate that the proposed metric, coupled with the notion of intrinsic IC, yields results above the state of the art. Moreover, the intrinsic IC formulation also improves the accuracy of other IC-based metrics. In order to investigate the generality of both the intrinsic IC formulation and the proposed similarity metric, a further evaluation using the MeSH biomedical ontology has been performed. Even in this case, significant results were obtained. The proposed metric and several others have been implemented in the Java WordNet Similarity Library.
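A sketch of intrinsic IC and a Tversky-style similarity expressed through it, in the spirit of the paper; the `taxonomy` object and its `descendants` / `most_specific_common_ancestor` methods are hypothetical stand-ins for a WordNet or MeSH interface:

```python
import math

def intrinsic_ic(concept, taxonomy, total_concepts):
    """Intrinsic Information Content from taxonomy structure alone.

    hypo(c) counts the hyponyms (descendants) of c; leaves get maximal
    IC and the root gets IC 0, with no corpus statistics required.
    """
    hypo = len(taxonomy.descendants(concept))        # assumed taxonomy API
    return 1.0 - math.log(hypo + 1) / math.log(total_concepts)

def feature_ic_similarity(c1, c2, taxonomy, total_concepts):
    """Feature-based similarity restated in IC terms (a sketch).

    Shared features are represented by the IC of the most specific
    common ancestor (msca); distinctive features by each concept's own IC.
    """
    ic = lambda c: intrinsic_ic(c, taxonomy, total_concepts)
    msca = taxonomy.most_specific_common_ancestor(c1, c2)  # assumed API
    return 1.0 if c1 == c2 else 3 * ic(msca) - ic(c1) - ic(c2)
```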

Journal ArticleDOI
TL;DR: In this paper, the authors consider the space of complete and separable metric spaces which are equipped with a probability measure and give a notion of convergence based on the philosophy that a sequence of metric measure spaces converges if and only if all finite subspaces sampled from these spaces converge.
Abstract: We consider the space of complete and separable metric spaces which are equipped with a probability measure. A notion of convergence is given based on the philosophy that a sequence of metric measure spaces converges if and only if all finite subspaces sampled from these spaces converge. This topology is metrized following Gromov’s idea of embedding two metric spaces isometrically into a common metric space combined with the Prohorov metric between probability measures on a fixed metric space. We show that for this topology convergence in distribution follows—provided the sequence is tight—from convergence of all randomly sampled finite subspaces. We give a characterization of tightness based on quantities which are reasonably easy to calculate. Subspaces of particular interest are the space of real trees and of ultra-metric spaces equipped with a probability measure. As an example we characterize convergence in distribution for the (ultra-)metric measure spaces given by the random genealogies of the Λ-coalescents. We show that the Λ-coalescent defines an infinite (random) metric measure space if and only if the so-called “dust-free”-property holds.

Journal ArticleDOI
TL;DR: Flaws in the variety metric are described and a new metric is proposed to evaluate the quality of design space exploration during concept generation, enabling application of a single metric to compare idea generation processes and methodologies.

Proceedings ArticleDOI
29 Jul 2009
TL;DR: It is shown that the proposed metric results in a very good correlation with subjective scores especially for images with varying foreground and background perceived blur qualities, and with a significantly lower computational complexity as compared to existing methods that take into account the visual attention information.
Abstract: In this paper, a no-reference objective sharpness metric based on a cumulative probability of blur detection is proposed. The metric is evaluated by taking into account the Human Visual System (HVS) response to blur distortions. The perceptual significance of the metric is validated through subjective experiments. It is shown that the proposed metric results in a very good correlation with subjective scores especially for images with varying foreground and background perceived blur qualities. This is accomplished with a significantly lower computational complexity as compared to existing methods that take into account the visual attention information.

Proceedings ArticleDOI
20 Apr 2009
TL;DR: With the metric introduced in this paper, coupling is no longer a one-dimensional concept with loose coupling found somewhere between tight coupling and no coupling; the metric is applied to real-world examples in order to support and improve the design process of service-oriented systems.
Abstract: Loose coupling is often quoted as a desirable property of systems architectures. One of the main goals of building systems using Web technologies is to achieve loose coupling. However, given the lack of a widely accepted definition of this term, it becomes hard to use coupling as a criterion to evaluate alternative Web technology choices, as all options may exhibit, and claim to provide, some kind of "loose" coupling effects. This paper presents a systematic study of the degree of coupling found in service-oriented systems based on a multi-faceted approach. Thanks to the metric introduced in this paper, coupling is no longer a one-dimensional concept with loose coupling found somewhere in between tight coupling and no coupling. The paper shows how the metric can be applied to real-world examples in order to support and improve the design process of service-oriented systems.

Patent
Ming Jin
31 Aug 2009
TL;DR: In this article, a disk drive is disclosed comprising a head actuated over a disk comprising a plurality of data tracks, and a quality metric is generated in response to the read signal.
Abstract: A disk drive is disclosed comprising a head actuated over a disk comprising a plurality of data tracks. Data is read from one of the data tracks to generate a read signal, and a quality metric is generated in response to the read signal. When the quality metric exceeds a first threshold, a defect is detected in at least part of the data track. When the quality metric exceeds a second threshold different than the first threshold, the data track is reread to regenerate the quality metric, and when the quality metric exceeds the second threshold at least twice, the defect is detected.

Journal ArticleDOI
TL;DR: It turns out that one should use the properties that most strongly determine the behavior of the amino acids, and that the use of an appropriate metric can help in dividing the amino acids into meaningful groups.

Posted Content
TL;DR: A novel interpretation is provided for IPMs by relating them to binary classification, where it is shown that the IPM between class-conditional distributions is the negative of the optimal risk associated with a binary classifier.
Abstract: A class of distance measures on probabilities -- the integral probability metrics (IPMs) -- is addressed: these include the Wasserstein distance, Dudley metric, and Maximum Mean Discrepancy. IPMs have thus far mostly been used in more abstract settings, for instance as theoretical tools in mass transportation problems, and in metrizing the weak topology on the set of all Borel probability measures defined on a metric space. Practical applications of IPMs are less common, with some exceptions in the kernel machines literature. The present work contributes a number of novel properties of IPMs, which should contribute to making IPMs more widely used in practice, for instance in areas where $\phi$-divergences are currently popular. First, to understand the relation between IPMs and $\phi$-divergences, the necessary and sufficient conditions under which these classes intersect are derived: the total variation distance is shown to be the only non-trivial $\phi$-divergence that is also an IPM. This shows that IPMs are essentially different from $\phi$-divergences. Second, empirical estimates of several IPMs from finite i.i.d. samples are obtained, and their consistency and convergence rates are analyzed. These estimators are shown to be easily computable, with better rates of convergence than estimators of $\phi$-divergences. Third, a novel interpretation is provided for IPMs by relating them to binary classification, where it is shown that the IPM between class-conditional distributions is the negative of the optimal risk associated with a binary classifier. In addition, the smoothness of an appropriate binary classifier is proved to be inversely related to the distance between the class-conditional distributions, measured in terms of an IPM.
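As a concrete instance, an unbiased finite-sample estimator of the (squared) Maximum Mean Discrepancy, one of the IPMs analyzed; the Gaussian kernel and its bandwidth are illustrative choices:

```python
import numpy as np

def mmd_squared_unbiased(X, Y, sigma=1.0):
    """Unbiased estimate of squared MMD between samples X and Y.

    MMD is the IPM whose generating class is the unit ball of an RKHS
    (here, with a Gaussian kernel).  Estimators of this kind are among
    those whose consistency and convergence rates the paper analyzes.
    """
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))

    m, n = len(X), len(Y)
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
            - 2 * Kxy.mean())
```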

Proceedings ArticleDOI
Elad Hazan, C. Seshadhri
14 Jun 2009
TL;DR: A different performance metric is proposed which strengthens the standard metric of regret and measures performance with respect to a changing comparator, and can be applied to various learning scenarios, e.g., online portfolio selection, for which experimental results show the advantage of adaptivity.
Abstract: We study online learning in an oblivious changing environment. The standard measure of regret bounds the difference between the cost of the online learner and the best decision in hindsight. Hence, regret minimizing algorithms tend to converge to the static best optimum, clearly a suboptimal behavior in changing environments. On the other hand, various metrics proposed to strengthen regret and allow for more dynamic algorithms produce inefficient algorithms. We propose a different performance metric which strengthens the standard metric of regret and measures performance with respect to a changing comparator. We then describe a series of data-streaming-based reductions which transform algorithms for minimizing (standard) regret into adaptive algorithms, incurring only poly-logarithmic computational overhead. Using this reduction, we obtain efficient low adaptive-regret algorithms for the problem of online convex optimization. This can be applied to various learning scenarios, e.g., online portfolio selection, for which we describe experimental results showing the advantage of adaptivity.
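The strengthened performance metric can be stated compactly: it is the worst static regret over any contiguous time interval, so an algorithm must track the best comparator of every segment of a changing environment. Here K denotes the decision set:

```latex
\operatorname{Adaptive\text{-}Regret}_T
  \;=\; \sup_{[r,s] \,\subseteq\, [1,T]}
  \left( \sum_{t=r}^{s} f_t(x_t)
         \;-\; \min_{x^\star \in \mathcal{K}} \sum_{t=r}^{s} f_t(x^\star) \right)
```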

Journal ArticleDOI
TL;DR: In this article, the problem of error correction in coherent and non-coherent network coding is considered under an adversarial model, and it is shown that universal network error correcting codes achieving the Singleton bound can be easily constructed and efficiently decoded.
Abstract: The problem of error correction in both coherent and noncoherent network coding is considered under an adversarial model. For coherent network coding, where knowledge of the network topology and network code is assumed at the source and destination nodes, the error correction capability of an (outer) code is succinctly described by the rank metric; as a consequence, it is shown that universal network error correcting codes achieving the Singleton bound can be easily constructed and efficiently decoded. For noncoherent network coding, where knowledge of the network topology and network code is not assumed, the error correction capability of a (subspace) code is given exactly by a new metric, called the injection metric, which is closely related to, but different from, the subspace metric of Kötter and Kschischang. In particular, in the case of a non-constant-dimension code, the decoder associated with the injection metric is shown to correct more errors than a minimum-subspace-distance decoder. All of these results are based on a general approach to adversarial error correction, which could be useful for other adversarial channels beyond network coding.
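A sketch of the injection metric between subspaces, computed here over the reals via matrix ranks for readability (network coding works over finite fields, where the same rank identities hold in form):

```python
import numpy as np

def injection_distance(U, V, tol=1e-9):
    """Injection metric between the row spaces of U and V.

    d_I(U, V) = max(dim U, dim V) - dim(U ∩ V), using
    dim(U ∩ V) = dim U + dim V - dim(U + V).
    """
    rank = lambda A: np.linalg.matrix_rank(A, tol=tol)
    du, dv = rank(U), rank(V)
    d_sum = rank(np.vstack([U, V]))   # dim(U + V)
    d_int = du + dv - d_sum           # dim(U ∩ V)
    return max(du, dv) - d_int
```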

Proceedings ArticleDOI
20 Jun 2009
TL;DR: This paper introduces a probabilistic variant of the K-nearest neighbor method for classification that can be seamlessly used for active learning in multi-class scenarios and uses this measure of uncertainty to actively sample training examples that maximize discriminating capabilities of the model.
Abstract: Scarcity and infeasibility of human supervision for large scale multi-class classification problems necessitates active learning. Unfortunately, existing active learning methods for multi-class problems are inherently binary methods and do not scale up to a large number of classes. In this paper, we introduce a probabilistic variant of the K-nearest neighbor method for classification that can be seamlessly used for active learning in multi-class scenarios. Given some labeled training data, our method learns an accurate metric/kernel function over the input space that can be used for classification and similarity search. Unlike existing metric/kernel learning methods, our scheme is highly scalable for classification problems and provides a natural notion of uncertainty over class labels. Further, we use this measure of uncertainty to actively sample training examples that maximize discriminating capabilities of the model. Experiments on benchmark datasets show that the proposed method learns appropriate distance metrics that lead to state-of-the-art performance for object categorization problems. Furthermore, our active learning method effectively samples training examples, resulting in significant accuracy gains over random sampling for multi-class problems involving a large number of classes.

Journal ArticleDOI
TL;DR: A beacon-based embedding algorithm is given that achieves constant distortion on a 1 − ε fraction of distances; this provides some theoretical justification for the success of the recent Global Network Positioning algorithm of Ng and Zhang.
Abstract: Concurrent with recent theoretical interest in the problem of metric embedding, a growing body of research in the networking community has studied the distance matrix defined by node-to-node latencies in the Internet, resulting in a number of recent approaches that approximately embed this distance matrix into low-dimensional Euclidean space. There is a fundamental distinction, however, between the theoretical approaches to the embedding problem and this recent Internet-related work: in addition to computational limitations, Internet measurement algorithms operate under the constraint that it is only feasible to measure distances for a linear (or near-linear) number of node pairs, and typically in a highly structured way. Indeed, the most common framework for Internet measurements of this type is a beacon-based approach: one chooses uniformly at random a constant number of nodes (“beacons”) in the network, each node measures its distance to all beacons, and one then has access to only these measurements for the remainder of the algorithm. Moreover, beacon-based algorithms are often designed not for embedding but for the more basic problem of triangulation, in which one uses the triangle inequality to infer the distances that have not been measured. Here we give algorithms with provable performance guarantees for beacon-based triangulation and embedding. We show that in addition to multiplicative error in the distances, performance guarantees for beacon-based algorithms typically must include a notion of slack—a certain fraction of all distances may be arbitrarily distorted. For metric spaces of bounded doubling dimension (which have been proposed as a reasonable abstraction of Internet latencies), we show that triangulation-based distance reconstruction with a constant number of beacons can achieve multiplicative error 1 + δ on a 1 − ε fraction of distances, for arbitrarily small constants δ and ε. For this same class of metric spaces, we give a beacon-based embedding algorithm that achieves constant distortion on a 1 − ε fraction of distances; this provides some theoretical justification for the success of the recent Global Network Positioning algorithm of Ng and Zhang [2002], and it forms an interesting contrast with lower bounds showing that it is not possible to embed all distances in a doubling metric space with constant distortion. We also give results for other classes of metric spaces, as well as distributed algorithms that require only a sparse set of distances but do not place too much measurement load on any one node.
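The triangulation step itself is elementary: each beacon yields a triangle-inequality bracket on an unmeasured distance, and a point estimate can then be taken inside the resulting interval. A sketch:

```python
def triangulation_bounds(dist_u_to_beacons, dist_v_to_beacons):
    """Triangle-inequality bounds on the unmeasured distance d(u, v).

    Given each node's measured distances to the same set of beacons:
      max_b |d(u,b) - d(v,b)|  <=  d(u,v)  <=  min_b (d(u,b) + d(v,b)).
    """
    pairs = list(zip(dist_u_to_beacons, dist_v_to_beacons))
    lower = max(abs(du - dv) for du, dv in pairs)
    upper = min(du + dv for du, dv in pairs)
    return lower, upper
```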

Journal ArticleDOI
TL;DR: This paper has integrated the linguistic two-tuple representation model, which allows the symbolic translation of a label by only considering one parameter, with an efficient modification of the well known (2 + 2) Pareto archived evolution strategy.
Abstract: In this paper, we propose the use of a multiobjective evolutionary approach to generate a set of linguistic fuzzy-rule-based systems with different tradeoffs between accuracy and interpretability in regression problems. Accuracy and interpretability are measured in terms of approximation error and rule base (RB) complexity, respectively. The proposed approach is based on concurrently learning RBs and parameters of the membership functions of the associated linguistic labels. To manage the size of the search space, we have integrated the linguistic two-tuple representation model, which allows the symbolic translation of a label by only considering one parameter, with an efficient modification of the well known (2 + 2) Pareto archived evolution strategy (PAES). We tested our approach on nine real world datasets of different sizes and with different numbers of variables. Besides the (2 + 2)PAES, we have also used the well known nondominated sorting genetic algorithm (NSGA-II) and an accuracy-driven single-objective evolutionary algorithm (EA). We employed these optimization techniques both to concurrently learn rules and parameters and to learn only rules. We compared the different approaches by applying a nonparametric statistical test for pairwise comparisons, thus taking into consideration three representative points from the obtained Pareto fronts in the case of the multiobjective EAs. Finally, a data complexity measure, which is typically used in pattern recognition to evaluate the data density in terms of average number of patterns per variable, has been introduced to characterize regression problems. Results confirm the effectiveness of our approach, particularly for (possibly high dimensional) datasets with high values of the complexity metric.