
Showing papers on "Metric (mathematics) published in 2002"


Proceedings ArticleDOI
01 Jan 2002
TL;DR: The wide-baseline stereo problem, i.e. the problem of establishing correspondences between a pair of images taken from different viewpoints, is studied and an efficient and practically fast detection algorithm is presented for an affinely-invariant stable subset of extremal regions, the maximally stable extremal region (MSER).
Abstract: The wide-baseline stereo problem, i.e. the problem of establishing correspondences between a pair of images taken from different viewpoints, is studied. A new set of image elements that are put into correspondence, the so-called extremal regions, is introduced. Extremal regions possess highly desirable properties: the set is closed under (1) continuous (and thus projective) transformation of image coordinates and (2) monotonic transformation of image intensities. An efficient (near linear complexity) and practically fast detection algorithm (near frame rate) is presented for an affinely invariant stable subset of extremal regions, the maximally stable extremal regions (MSER). A new robust similarity measure for establishing tentative correspondences is proposed. The robustness ensures that invariants from multiple measurement regions (regions obtained by invariant constructions from extremal regions), some that are significantly larger (and hence discriminative) than the MSERs, may be used to establish tentative correspondences. The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes. Significant change of scale (3.5×), illumination conditions, out-of-plane rotation, occlusion, locally anisotropic scale change and 3D translation of the viewpoint are all present in the test problems. Good estimates of epipolar geometry (average distance from corresponding points to the epipolar line below 0.09 of the inter-pixel distance) are obtained.
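
For orientation, MSER detectors are now part of standard libraries; a minimal usage sketch with OpenCV (a later implementation, not the authors' original code; the input filename is hypothetical) might look like:

```python
# MSER detection sketch using OpenCV (cv2); "scene.png" is a hypothetical
# input file and the detector parameters are the library defaults.
import cv2

gray = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

mser = cv2.MSER_create()                    # maximally stable extremal regions
regions, bboxes = mser.detectRegions(gray)  # each region is an array of pixel coords

print(f"detected {len(regions)} MSERs")
```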

3,400 citations


Book ChapterDOI
07 Mar 2002
TL;DR: In this paper, the authors describe a peer-to-peer distributed hash table with provable consistency and performance in a fault-prone environment, which routes queries and locates nodes using a novel XOR-based metric topology.
Abstract: We describe a peer-to-peer distributed hash table with provable consistency and performance in a fault-prone environment. Our system routes queries and locates nodes using a novel XOR-based metric topology that simplifies the algorithm and facilitates our proof. The topology has the property that every message exchanged conveys or reinforces useful contact information. The system exploits this information to send parallel, asynchronous query messages that tolerate node failures without imposing timeout delays on users.
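
As a rough illustration of the XOR-based metric (a sketch based on the description above, not code from the paper):

```python
# Sketch of the XOR distance used by Kademlia-style DHTs. Node IDs are
# modeled here as plain integers; real systems use 160-bit hashes.
def xor_distance(a: int, b: int) -> int:
    return a ^ b

# Metric properties: d(a, a) == 0, d(a, b) == d(b, a), and the triangle
# inequality holds because a ^ c == (a ^ b) ^ (b ^ c) and x ^ y <= x + y
# for non-negative integers.
assert xor_distance(0b1010, 0b1010) == 0
assert xor_distance(0b1010, 0b0110) == xor_distance(0b0110, 0b1010)
```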

3,196 citations


Proceedings Article
01 Jan 2002
TL;DR: This paper presents an algorithm that, given examples of similar (and, if desired, dissimilar) pairs of points in ℝn, learns a distance metric over ℝn that respects these relationships.
Abstract: Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for the user to manually tweak the metric until sufficiently good clusters are found. For these and other applications requiring good metrics, it is desirable that we provide a more systematic way for users to indicate what they consider "similar." For instance, we may ask them to provide examples. In this paper, we present an algorithm that, given examples of similar (and, if desired, dissimilar) pairs of points in ℝn, learns a distance metric over ℝn that respects these relationships. Our method is based on posing metric learning as a convex optimization problem, which allows us to give efficient, local-optima-free algorithms. We also demonstrate empirically that the learned metrics can be used to significantly improve clustering performance.
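
Learned metrics of this kind are commonly parametrized as Mahalanobis-type distances d_A(x, y) = sqrt((x-y)^T A (x-y)) with A positive semidefinite; a minimal evaluation sketch, assuming A has already been learned by some procedure, is:

```python
# Sketch of evaluating a Mahalanobis-type learned metric
# d_A(x, y) = sqrt((x-y)^T A (x-y)); the matrix A here is a toy example,
# standing in for one produced by a metric-learning algorithm.
import numpy as np

def mahalanobis(x: np.ndarray, y: np.ndarray, A: np.ndarray) -> float:
    d = x - y
    return float(np.sqrt(d @ A @ d))

A = np.diag([2.0, 0.5])              # PSD: stretch dimension 0, shrink dimension 1
x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(mahalanobis(x, y, A))          # distance under the learned metric
```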

3,176 citations


Journal ArticleDOI
TL;DR: In this paper, a method for combining results across independent-groups and repeated measures designs is described, and the conditions under which such an analysis is appropriate are discussed, and a meta-analysis procedure using design-specific estimates of sampling variance is described.
Abstract: When a meta-analysis on results from experimental studies is conducted, differences in the study design must be taken into consideration. A method for combining results across independent-groups and repeated measures designs is described, and the conditions under which such an analysis is appropriate are discussed. Combining results across designs requires that (a) all effect sizes be transformed into a common metric, (b) effect sizes from each design estimate the same treatment effect, and (c) meta-analysis procedures use design-specific estimates of sampling variance to reflect the precision of the effect size estimates.
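
For concreteness, the standardized mean differences involved and a common conversion between them can be written as follows (standard formulas, not quoted from the paper; ρ denotes the pre-post correlation):

```latex
% Standardized mean differences for independent-groups (IG) and
% repeated-measures (RM) designs, and a standard conversion between them
% (illustrative, not quoted from the paper); \rho is the pre-post correlation.
d_{IG} = \frac{M_T - M_C}{SD_{\text{pooled}}}, \qquad
d_{RM} = \frac{M_D}{SD_D}, \qquad SD_D = SD\,\sqrt{2(1-\rho)},
\qquad\text{so}\qquad d_{IG} = d_{RM}\,\sqrt{2(1-\rho)}.
```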

1,949 citations


Book ChapterDOI
01 Jan 2002
TL;DR: Dynamic time warping (DTW) is a much more robust distance measure for time series, allowing similar shapes to match even if they are out of phase in the time axis, but does not obey the triangular inequality and, thus, has resisted attempts at exact indexing.
Abstract: The indexing of very large time series databases has attracted much research interest in the database community in recent years. Most algorithms used to index time series utilize the Euclidean distance or some variation thereof. However, it has been forcefully shown that the Euclidean distance is a very brittle distance measure. Dynamic time warping (DTW) is a much more robust distance measure for time series, allowing similar shapes to match even if they are out of phase in the time axis. Because of this flexibility, DTW is widely used in science, medicine, industry, and finance. Unfortunately, however, DTW does not obey the triangular inequality and thus has resisted attempts at exact indexing. Instead, many researchers have introduced approximate indexing techniques, or have abandoned the idea of indexing and concentrated on speeding up sequential search.
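
A minimal sketch of the classic DTW dynamic program (illustrative; the chapter is about indexing DTW, not this computation itself):

```python
# Classic O(n*m) dynamic time warping between two 1-D sequences
# (illustrative sketch, not the chapter's code).
import numpy as np

def dtw(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two similar shapes, out of phase in the time axis: DTW distance stays small.
print(dtw([0, 0, 1, 2, 1, 0], [0, 1, 2, 1, 0, 0]))
```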

1,033 citations


Journal ArticleDOI
TL;DR: A simple technique is adopted that ensures metric cancellation, and thus freestream preservation, even on highly distorted curvilinear meshes; the cancellation is guaranteed regardless of the manner in which grid speeds are defined.

950 citations


Proceedings ArticleDOI
07 Nov 2002
TL;DR: An efficient method to estimate the distance between discrete 3D surfaces represented by triangular 3D meshes based on an approximation of the Hausdorff distance is proposed.
Abstract: This paper proposes an efficient method to estimate the distance between discrete 3D surfaces represented by triangular 3D meshes. The metric used is based on an approximation of the Hausdorff distance, which has been appropriately implemented in order to reduce unnecessary computation and memory usage. Results show that when compared to similar tools, a significant gain in both memory and speed can be achieved.
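
As a sketch of the underlying computation (point-sampled surfaces and a k-d tree; the paper's implementation additionally samples across triangles and optimizes memory usage):

```python
# Sampled (approximate) Hausdorff distance between two surfaces represented
# as point sets (illustrative sketch, not the paper's tool).
import numpy as np
from scipy.spatial import cKDTree

def one_sided_hausdorff(A: np.ndarray, B: np.ndarray) -> float:
    d, _ = cKDTree(B).query(A)   # nearest-neighbor distance from each point of A to B
    return float(d.max())

def hausdorff(A: np.ndarray, B: np.ndarray) -> float:
    return max(one_sided_hausdorff(A, B), one_sided_hausdorff(B, A))

A = np.random.rand(1000, 3)                     # toy samples of two surfaces
B = A + 0.01 * np.random.randn(1000, 3)
print(hausdorff(A, B))
```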

751 citations


Proceedings ArticleDOI
Robert Malouf1
31 Aug 2002
TL;DR: A number of algorithms for estimating the parameters of ME models are considered, including iterative scaling, gradient ascent, conjugate gradient, and variable metric methods.
Abstract: Conditional maximum entropy (ME) models provide a general purpose machine learning technique which has been successfully applied to fields as diverse as computer vision and econometrics, and which is used for a wide variety of classification problems in natural language processing. However, the flexibility of ME models is not without cost. While parameter estimation for ME models is conceptually straightforward, in practice ME models for typical natural language tasks are very large, and may well contain many thousands of free parameters. In this paper, we consider a number of algorithms for estimating the parameters of ME models, including iterative scaling, gradient ascent, conjugate gradient, and variable metric methods. Surprisingly, the standardly used iterative scaling algorithms perform quite poorly in comparison to the others, and for all of the test problems, a limited-memory variable metric algorithm outperformed the other choices.
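
A minimal sketch of the winning approach, fitting a conditional ME (multinomial logistic) model with a limited-memory variable metric method via SciPy (toy data and shapes; not the paper's estimation code):

```python
# Fitting a conditional maximum entropy (multinomial logistic) model with
# L-BFGS, a limited-memory variable metric method (illustrative sketch).
import numpy as np
from scipy.optimize import minimize

X = np.random.randn(200, 5)                 # toy feature vectors
y = np.random.randint(0, 3, size=200)       # one of k = 3 labels
k, d = 3, X.shape[1]

def neg_log_likelihood(w_flat):
    W = w_flat.reshape(k, d)
    scores = X @ W.T
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    log_z = np.log(np.exp(scores).sum(axis=1))
    return -(scores[np.arange(len(y)), y] - log_z).sum()

res = minimize(neg_log_likelihood, np.zeros(k * d), method="L-BFGS-B")
print(res.fun, res.success)
```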

730 citations


Proceedings ArticleDOI
10 Dec 2002
TL;DR: A no-reference blur metric based on the analysis of the spread of the edges in an image is presented, which is shown to perform well over a range of image content.
Abstract: We present a no-reference blur metric for images and video. The blur metric is based on the analysis of the spread of the edges in an image. Its perceptual significance is validated through subjective experiments. The novel metric is near real-time, has low computational complexity and is shown to perform well over a range of image content. Potential applications include optimization of source coding, network resource management and autofocus of an image capturing device.
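
A rough sketch of the edge-spread idea (an illustrative reconstruction from the description above, not the authors' implementation):

```python
# Crude no-reference blur estimate: find strong horizontal intensity
# transitions in each row, then measure the pixel extent of each transition.
# A larger mean edge width suggests a blurrier image.
import numpy as np

def mean_edge_width(gray: np.ndarray, thresh: float = 30.0) -> float:
    widths = []
    for r in range(gray.shape[0]):
        row = gray[r].astype(float)
        grad = np.diff(row)
        for c in np.where(np.abs(grad) > thresh)[0]:
            sign = np.sign(grad[c])
            left = c
            while left > 0 and np.sign(row[left] - row[left - 1]) == sign:
                left -= 1
            right = c + 1
            while right < len(row) - 1 and np.sign(row[right + 1] - row[right]) == sign:
                right += 1
            widths.append(right - left)    # spread of this edge, in pixels
    return float(np.mean(widths)) if widths else 0.0
```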

643 citations


Journal ArticleDOI
TL;DR: The first nontrivial polynomial-time approximation algorithms are provided for a general family of classification problems of this type, the metric labeling problem, which contains as special cases a number of standard classification frameworks, including several arising from the theory of Markov random fields.
Abstract: In a traditional classification problem, we wish to assign one of k labels (or classes) to each of n objects, in a way that is consistent with some observed data that we have about the problem. An active line of research in this area is concerned with classification when one has information about pairwise relationships among the objects to be classified; this issue is one of the principal motivations for the framework of Markov random fields, and it arises in areas such as image processing, biometry, and document analysis. In its most basic form, this style of analysis seeks to find a classification that optimizes a combinatorial function consisting of assignment costs---based on the individual choice of label we make for each object---and separation costs---based on the pair of choices we make for two "related" objects.We formulate a general classification problem of this type, the metric labeling problem; we show that it contains as special cases a number of standard classification frameworks, including several arising from the theory of Markov random fields. From the perspective of combinatorial optimization, our problem can be viewed as a substantial generalization of the multiway cut problem, and equivalent to a type of uncapacitated quadratic assignment problem.We provide the first nontrivial polynomial-time approximation algorithms for a general family of classification problems of this type. Our main result is an O(log k log log k)-approximation algorithm for the metric labeling problem, with respect to an arbitrary metric on a set of k labels, and an arbitrary weighted graph of relationships on a set of objects. For the special case in which the labels are endowed with the uniform metric---all distances are the same---our methods provide a 2-approximation algorithm.
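
In symbols, the objective described above can be written as follows, where d is the metric on the label set L and w_uv weights the relationship between objects u and v:

```latex
% Metric labeling objective (as described above): assignment costs plus
% metric-weighted separation costs over related pairs of objects.
\min_{f : V \to L}\;\; \sum_{u \in V} c\bigl(u, f(u)\bigr)
  \;+\; \sum_{(u,v) \in E} w_{uv}\, d\bigl(f(u), f(v)\bigr)
```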

502 citations


Journal ArticleDOI
TL;DR: A simple modification to the Pk metric is proposed, called WindowDiff, which moves a fixed-sized window across the text and penalizes the algorithm whenever the number of boundaries within the window does not match the true number of boundaries for that window of text.

Abstract: The Pk evaluation metric, initially proposed by Beeferman, Berger, and Lafferty (1997), is becoming the standard measure for assessing text segmentation algorithms. However, a theoretical analysis of the metric finds several problems: the metric penalizes false negatives more heavily than false positives, overpenalizes near misses, and is affected by variation in segment size distribution. We propose a simple modification to the Pk metric that remedies these problems. This new metric, called WindowDiff, moves a fixed-sized window across the text and penalizes the algorithm whenever the number of boundaries within the window does not match the true number of boundaries for that window of text.
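
A minimal sketch of WindowDiff as described (the usual convention of setting k to half the mean true segment length is standard but assumed here):

```python
# WindowDiff sketch: slide a window of size k and penalize any window where
# the hypothesized boundary count differs from the true boundary count.
def window_diff(ref, hyp, k):
    """ref, hyp: 0/1 lists where 1 marks a boundary after position i."""
    n = len(ref)
    errors = 0
    for i in range(n - k):
        r = sum(ref[i:i + k])     # true boundaries inside the window
        h = sum(hyp[i:i + k])     # hypothesized boundaries inside the window
        if r != h:
            errors += 1
    return errors / (n - k)

ref = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
hyp = [0, 1, 0, 0, 0, 0, 1, 0, 0, 0]   # a near miss on the first boundary
print(window_diff(ref, hyp, k=3))      # small penalty, not a double error
```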

Posted Content
TL;DR: The method of dual fitting and the idea of factor-revealing LP are formalized and used to design and analyze two greedy algorithms for the metric uncapacitated facility location problem.
Abstract: In this paper, we will formalize the method of dual fitting and the idea of factor-revealing LP. This combination is used to design and analyze two greedy algorithms for the metric uncapacitated facility location problem. Their approximation factors are 1.861 and 1.61, with running times of O(mlog m) and O(n^3), respectively, where n is the total number of vertices and m is the number of edges in the underlying complete bipartite graph between cities and facilities. The algorithms are used to improve recent results for several variants of the problem.

01 Nov 2002
TL;DR: This document refers to a metric for variation in delay of packets across Internet paths based on the difference in the One-Way-Delay of selected packets.
Abstract: This document refers to a metric for variation in delay of packets across Internet paths. The metric is based on the difference in the One-Way-Delay of selected packets. This difference in delay is called "IP Packet Delay Variation (ipdv)".
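
A minimal sketch of the computation described (here differencing consecutive packets; the metric allows other selection rules):

```python
# IP packet delay variation (ipdv) sketch: differences in one-way delay
# between selected packets, here taken to be consecutive ones.
def ipdv(one_way_delays):
    """one_way_delays: per-packet one-way delays in seconds, in send order."""
    return [b - a for a, b in zip(one_way_delays, one_way_delays[1:])]

delays = [0.100, 0.103, 0.099, 0.120]   # toy measurements
print(ipdv(delays))                     # ≈ [0.003, -0.004, 0.021]
```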

Journal ArticleDOI
TL;DR: In this paper, a method is described for representing human movement compactly, in terms of a linear superimposition of simpler movements termed primitives; it is part of a larger research project aimed at modeling motor control and imitation using the notion of perceptuo-motor primitives.
Abstract: We describe a new method for representing human movement compactly, in terms of a linear superimposition of simpler movements termed primitives. This method is part of a larger research project aimed at modeling motor control and imitation using the notion of perceptuo-motor primitives, a basis set of coupled perceptual and motor routines. In our model, the perceptual system is biased by the set of motor behaviors the agent can execute. Thus, an agent can automatically classify observed movements into its executable repertoire. In this paper, we describe a method for automatically deriving a set of primitives directly from human movement data. We used movement data gathered from a psychophysical experiment on human imitation to derive the primitives. The data were first filtered, then segmented, and principal component analysis was applied to the segments. The eigenvectors corresponding to a few of the highest eigenvalues provide us with a basis set of primitives. These are used, through superposition and sequencing, to reconstruct the training movements as well as novel ones. The validation of the method was performed on a humanoid simulation with physical dynamics. The effectiveness of the motion reconstruction was measured through an error metric. We also explored and evaluated a technique of clustering in the space of primitives for generating controllers for executing frequently used movements.
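
A toy sketch of the derivation step described, PCA over movement segments (illustrative shapes and random data, not the authors' pipeline):

```python
# Deriving movement "primitives" by PCA over segmented movement data, then
# reconstructing a segment as a linear superposition (illustrative sketch).
import numpy as np

segments = np.random.randn(500, 30)   # 500 segments, each flattened to 30 dims
segments -= segments.mean(axis=0)     # center the data

# Principal components via SVD; rows of Vt are candidate primitives.
U, S, Vt = np.linalg.svd(segments, full_matrices=False)
primitives = Vt[:5]                   # keep components with the largest variance

# Reconstruct a segment as a superposition of primitives.
coeffs = segments[0] @ primitives.T
reconstruction = coeffs @ primitives
print(np.linalg.norm(segments[0] - reconstruction))   # residual error
```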

Journal ArticleDOI
TL;DR: A chi-squared distance analysis is used to compute a flexible metric for producing neighborhoods that are highly adaptive to query locations and the class conditional probabilities are smoother in the modified neighborhoods, whereby better classification performance can be achieved.
Abstract: Nearest-neighbor classification assumes locally constant class conditional probabilities. This assumption becomes invalid in high dimensions with finite samples due to the curse of dimensionality. Severe bias can be introduced under these conditions when using the nearest-neighbor rule. We propose a locally adaptive nearest-neighbor classification method to try to minimize bias. We use a chi-squared distance analysis to compute a flexible metric for producing neighborhoods that are highly adaptive to query locations. Neighborhoods are elongated along less relevant feature dimensions and constricted along most influential ones. As a result, the class conditional probabilities are smoother in the modified neighborhoods, whereby better classification performance can be achieved. The efficacy of our method is validated and compared against other techniques using both simulated and real-world data.

Book ChapterDOI
14 Apr 2002
TL;DR: In this paper, an alternative information theoretic measure of anonymity is proposed that takes into account the probabilities of users sending and receiving the messages, and it is shown how to calculate it for a message in a standard mix-based anonymity system.
Abstract: In this paper we look closely at the popular metric of anonymity, the anonymity set, and point out a number of problems associated with it. We then propose an alternative information theoretic measure of anonymity which takes into account the probabilities of users sending and receiving the messages and show how to calculate it for a message in a standard mix-based anonymity system. We also use our metric to compare a pool mix to a traditional threshold mix, which was impossible using anonymity sets. We also show how the maximum route length restriction which exists in some fielded anonymity systems can lead to the attacker performing more powerful traffic analysis. Finally, we discuss open problems and future work on anonymity measurements.
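
A minimal sketch of an entropy-based anonymity measure in this spirit (illustrative; the paper develops the measure for concrete mix systems):

```python
# Entropy of the attacker's distribution over possible senders; the
# "effective anonymity set size" is then 2**H. Illustrative sketch.
import math

def anonymity_entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [0.25] * 4                # classic anonymity set of size 4
skewed = [0.7, 0.1, 0.1, 0.1]       # same set size, much less anonymity
print(anonymity_entropy(uniform))   # 2.0 bits -> effective size 4
print(anonymity_entropy(skewed))    # ~1.36 bits -> effective size ~2.6
```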


Journal ArticleDOI
Pengzi Miao1
TL;DR: In this paper, a class of non-smooth asymptotically flat manifolds is studied on which the metric fails to be $C^1$ across a hypersurface $\Sigma$, and the Positive Mass Theorem is shown to still hold on these manifolds if a geometric boundary condition is satisfied by the metrics separated by $\Sigma$.
Abstract: We study a class of non-smooth asymptotically flat manifolds on which the metric fails to be $C^1$ across a hypersurface $\Sigma$. We first give an approximation scheme to mollify the metric, then we prove that the Positive Mass Theorem still holds on these manifolds if a geometric boundary condition is satisfied by the metrics separated by $\Sigma$.

Journal ArticleDOI
Alexander Barg1, D.Yu. Nogin
TL;DR: The Gilbert-Varshamov and Hamming bounds for packings of spheres (codes) in the Grassmann manifolds over R and C are derived.
Abstract: We derive the Gilbert-Varshamov and Hamming bounds for packings of spheres (codes) in the Grassmann manifolds over R and C. Asymptotic expressions are obtained for the geodesic metric and projection Frobenius (chordal) metric on the manifold.
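
For reference, the two metrics compared can be written in terms of the principal angles θ_1, ..., θ_k between two k-dimensional subspaces P and Q (standard definitions, stated here for orientation):

```latex
% Geodesic and projection Frobenius (chordal) metrics on the Grassmann
% manifold, in terms of principal angles (standard definitions).
d_{\mathrm{geo}}(P,Q) = \Bigl(\sum_{i=1}^{k} \theta_i^{2}\Bigr)^{1/2},
\qquad
d_{\mathrm{chord}}(P,Q) = \Bigl(\sum_{i=1}^{k} \sin^{2}\theta_i\Bigr)^{1/2}.
```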

Journal ArticleDOI
TL;DR: In this paper, a uniformization theory for a different type of generalized conformal structure is developed for a smooth Riemannian surface Z homeomorphic to the 2-sphere.
Abstract: According to the classical uniformization theorem, every smooth Riemannian surface Z homeomorphic to the 2-sphere is conformally diffeomorphic to S2 (the unit sphere in R3 equipped with the Riemannian metric induced by the ambient Euclidean metric). The availability of a similar uniformization procedure for spheres with a “generalized conformal structure” is highly desirable, in particular in connection with Thurston’s hyperbolization conjecture. This was addressed by Cannon in his combinatorial Riemann mapping theorem [7]. He considers topological surfaces equipped with a sequence of “shinglings”—a combinatorial structure that leads to a notion of approximate conformal moduli of rings. He then finds conditions that imply the existence of coordinate systems on the surface that relate these combinatorial moduli to classical analytic moduli in the plane. In this paper we develop a uniformization theory for a different type of generalized conformal structure. We start with a metric space Z homeomorphic to S2 and ask for conditions under which Z can be mapped onto S2 by a quasisymmetric homeomorphism. The class of quasisymmetries is an appropriate analog of conformal mappings in a metric space context. Quasisymmetric homeomorphisms also arise in the theory of Gromov hyperbolic metric spaces—quasi-isometries between Gromov hyperbolic spaces induce quasisymmetric boundary homeomorphisms. Our setup has the advantage that we can exploit recent notions and methods from analysis.
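
For reference, the standard definition of an η-quasisymmetric homeomorphism between metric spaces (with η a homeomorphism of [0, ∞)) is:

```latex
% Standard definition of a quasisymmetric homeomorphism f: X -> Y
% (stated for reference; distances written with |.| for brevity).
\frac{|f(x)-f(a)|}{|f(x)-f(b)|} \le
\eta\!\left(\frac{|x-a|}{|x-b|}\right)
\quad \text{for all distinct } x, a, b \in X.
```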

Posted Content
TL;DR: This work provides a summary and some new results concerning bounds among some important probability metrics/distances used by statisticians and probabilists, together with examples showing that rates of convergence can strongly depend on the metric chosen.
Abstract: When studying convergence of measures, an important issue is the choice of probability metric. In this review, we provide a summary and some new results concerning bounds among ten important probability metrics/distances that are used by statisticians and probabilists. We focus on these metrics because they are either well-known, commonly used, or admit practical bounding techniques. We summarize these relationships in a handy reference diagram, and also give examples to show how rates of convergence can depend on the metric chosen.
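
As a small illustration of the kind of relationships surveyed, here is total variation distance together with Pinsker's inequality bounding it by Kullback-Leibler divergence (a standard bound, not a new result of the review):

```python
# Total variation distance between discrete distributions, and Pinsker's
# inequality TV <= sqrt(KL/2) (natural log). Illustrative sketch.
import math

def total_variation(p, q):
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def kl_divergence(p, q):
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
tv, kl = total_variation(p, q), kl_divergence(p, q)
print(tv, math.sqrt(kl / 2), tv <= math.sqrt(kl / 2))   # bound holds
```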

Book ChapterDOI
Maxim Sviridenko1
27 May 2002
TL;DR: A new approximation algorithm for the metric uncapacitated facility location problem is designed, of LP rounding type and is based on a rounding technique developed in [5,6,7].
Abstract: We design a new approximation algorithm for the metric uncapacitated facility location problem. This algorithm is of LP rounding type and is based on a rounding technique developed in [5,6,7].

Journal ArticleDOI
TL;DR: This investigation shows that, even in the large-system limit, jammed systems of hard spheres can be generated with a wide range of packing fractions from φ ≈ 0.52 to the fcc limit, indicating that the density alone does not uniquely characterize a packing.
Abstract: Recently the conventional notion of random close packing has been supplanted by the more appropriate concept of the maximally random jammed (MRJ) state. This inevitably leads to the necessity of distinguishing the MRJ state among the entire collection of jammed packings. While the ideal method of addressing this question would be to enumerate and classify all possible jammed hard-sphere configurations, practical limitations prevent such a method from being employed. Instead, we generate numerically a large number of representative jammed hard-sphere configurations (primarily relying on a slight modification of the Lubachevsky-Stillinger algorithm to do so) and evaluate several commonly employed order metrics for each of these packings. Our investigation shows that, even in the large-system limit, jammed systems of hard spheres can be generated with a wide range of packing fractions from φ ≈ 0.52 to the fcc limit (φ ≈ 0.74). Moreover, at a fixed packing fraction, the variation in the order can be substantial, indicating that the density alone does not uniquely characterize a packing. Interestingly, each order metric evaluated yielded a relatively consistent estimate for the packing fraction of the maximally random jammed state (φ_MRJ ≈ 0.63). This estimate, however, is compromised by the weaknesses in the order metrics available, and we propose several guiding principles for future efforts to define more broadly applicable metrics.

Proceedings ArticleDOI
26 Jul 2002
TL;DR: This work builds a surface parametrization specialized to its signal, derived from a Taylor expansion of signal error, which is pre-integrated over the surface as a metric tensor for fast evaluation.
Abstract: To reduce memory requirements for texture mapping a model, we build a surface parametrization specialized to its signal (such as color or normal). Intuitively, we want to allocate more texture samples in regions with greater signal detail. Our approach is to minimize signal approximation error --- the difference between the original surface signal and its reconstruction from the sampled texture. Specifically, our signal-stretch parametrization metric is derived from a Taylor expansion of signal error. For fast evaluation, this metric is pre-integrated over the surface as a metric tensor. We minimize this nonlinear metric using a novel coarse-to-fine hierarchical solver, further accelerated with a fine-to-coarse propagation of the integrated metric tensor. Use of metric tensors permits anisotropic squashing of the parametrization along directions of low signal gradient. Texture area can often be reduced by a factor of 4 for a desired signal accuracy compared to non-specialized parametrizations.

Proceedings ArticleDOI
03 Jun 2002
TL;DR: The concept of compatibility matrix is introduced as the means to provide a probabilistic connection from the observation to the underlying true value and a new metric match is proposed to capture the "real support" of a pattern which would be expected if a noise-free environment is assumed.
Abstract: Pattern discovery in long sequences is of great importance in many applications including computational biology study, consumer behavior analysis, system performance analysis, etc. In a noisy environment, an observed sequence may not accurately reflect the underlying behavior. For example, in a protein sequence, the amino acid N is likely to mutate to D with little impact to the biological function of the protein. It would be desirable if the occurrence of D in the observation can be related to a possible mutation from N in an appropriate manner. Unfortunately, the support measure (i.e., the number of occurrences) of a pattern does not serve this purpose. In this paper, we introduce the concept of compatibility matrix as the means to provide a probabilistic connection from the observation to the underlying true value. A new metric match is also proposed to capture the "real support" of a pattern which would be expected if a noise-free environment is assumed. In addition, in the context we address, a pattern could be very long. The standard pruning technique developed for the market basket problem may not work efficiently. As a result, a novel algorithm that combines statistical sampling and a new technique (namely border collapsing) is devised to discover long patterns in a minimal number of scans of the sequence database with sufficiently high confidence. Empirical results demonstrate the robustness of the match model (with respect to the noise) and the efficiency of the probabilistic algorithm.
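
A toy sketch of the match idea as described (the compatibility matrix values and names here are illustrative assumptions, not the paper's):

```python
# Sketch of the "match" measure: weight each window of the sequence by the
# probability, given by a compatibility matrix, that the observed symbols
# reflect the pattern's true symbols. Toy matrix and names, not the paper's.
compat = {                        # compat[observed][true] = P(true | observed)
    "D": {"D": 0.9, "N": 0.1},
    "N": {"N": 0.9, "D": 0.1},
}

def match(pattern, sequence):
    total = 0.0
    for i in range(len(sequence) - len(pattern) + 1):
        p = 1.0
        for obs, true in zip(sequence[i:i + len(pattern)], pattern):
            p *= compat.get(obs, {}).get(true, 0.0)
        total += p                # "real support" contributed by this window
    return total

print(match("ND", "NDDN"))        # windows ND, DD, DN: 0.81 + 0.09 + 0.01 = 0.91
```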

Journal ArticleDOI
TL;DR: Two experiments tested predictions from a theory in which processing load depends on relational complexity (RC), the number of variables related in a single decision, and indicated that the RC approach to defining cognitive complexity is applicable to different content domains.

Proceedings ArticleDOI
05 Jun 2002
TL;DR: In the last several years, a number of very interesting results have been proved about finite metric spaces; this talk surveys these results and the many fascinating open problems in the area.
Abstract: In the last several years a number of very interesting results were proved about finite metric spaces. Some of this work is motivated by practical considerations: Large data sets (coming e.g. from computational molecular biology, brain research or data mining) can be viewed as large metric spaces that should be analyzed (e.g. correctly clustered).On the other hand, these investigations connect to some classical areas of geometry - the asymptotic theory of finite-dimensional normed spaces and differential geometry. Finally, the metric theory of finite graphs has proved very useful in the study of graphs per se and the design of approximation algorithms for hard computational problems. In this talk I will try to explain some of the results and review some of the emerging new connections and the many fascinating open problems in this area.

Proceedings ArticleDOI
03 Jun 2002
TL;DR: This paper defines what constitutes a good choice of a reference set, proposes sampling-based algorithms to identify them, and demonstrates the practical utility of the solutions using large collections of real and synthetic XML data sets.
Abstract: XML is widely recognized as the data interchange standard for tomorrow, because of its ability to represent data from a wide variety of sources. Hence, XML is likely to be the format through which data from multiple sources is integrated. In this paper we study the problem of integrating XML data sources through correlations realized as join operations. A challenging aspect of this operation is the XML document structure. Two documents might convey approximately or exactly the same information but may be quite different in structure. Consequently, approximate match in structure, in addition to content, has to be folded into the join operation. We quantify approximate match in structure and content using well defined notions of distance. For structure, we propose computationally inexpensive lower and upper bounds for the tree edit distance metric between two trees. We then show how the tree edit distance, and other metrics that quantify distance between trees, can be incorporated in a join framework. We introduce the notion of reference sets to facilitate this operation. Intuitively, a reference set consists of data elements used to project the data space. We characterize what constitutes a good choice of a reference set and we propose sampling-based algorithms to identify them. This gives rise to a variety of algorithmic approaches for the problem, which we formulate and analyze. We demonstrate the practical utility of our solutions using large collections of real and synthetic XML data sets.

Book ChapterDOI
TL;DR: The Non-negative Matrix Factorization (NMF) technique is introduced in the context of face classification, and a direct comparison with Principal Component Analysis (PCA) is also analyzed.
Abstract: The computer vision problem of face classification under several ambient and unfavorable conditions is considered in this study. Changes in expression, different lighting conditions, and occlusions are the relevant factors studied in this contribution. The Non-negative Matrix Factorization (NMF) technique is introduced in the context of face classification, and a direct comparison with Principal Component Analysis (PCA) is also analyzed. Two leading techniques in face recognition are also considered in this study, noticing that NMF is able to improve these techniques when a high-dimensional feature space is used. Finally, different distance metrics (L1, L2, and correlation) are evaluated in the feature space defined by NMF in order to determine the best one for this specific problem. Experiments demonstrate that correlation is the most suitable metric for this problem.
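
A minimal sketch of the pipeline described, NMF features compared under a correlation metric, using scikit-learn (toy random data; not the study's code):

```python
# NMF features for face classification with a correlation metric
# (illustrative sketch; toy non-negative data stands in for face images).
import numpy as np
from sklearn.decomposition import NMF

faces = np.abs(np.random.rand(100, 64 * 64))    # 100 non-negative "images"
model = NMF(n_components=25, init="nndsvd", max_iter=500)
W = model.fit_transform(faces)                  # per-image feature coefficients

def correlation_distance(a, b):
    a, b = a - a.mean(), b - b.mean()
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

# Nearest neighbor in NMF feature space under the correlation metric.
query = W[0]
dists = [correlation_distance(query, w) for w in W[1:]]
print(int(np.argmin(dists)) + 1)
```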

Journal ArticleDOI
Kentaro Toyama1, Andrew Blake1
TL;DR: A new, exemplar-based, probabilistic paradigm for visual tracking is presented, which provides alternatives to standard learning algorithms by allowing the use of metrics that are not embedded in a vector space and uses a noise model that is learned from training data.
Abstract: A new, exemplar-based, probabilistic paradigm for visual tracking is presented. Probabilistic mechanisms are attractive because they handle fusion of information, especially temporal fusion, in a principled manner. Exemplars are selected representatives of raw training data, used here to represent probabilistic mixture distributions of object configurations. Their use avoids tedious hand-construction of object models, and problems with changes of topology. Using exemplars in place of a parameterized model poses several challenges, addressed here with what we call the “Metric Mixture” (M2) approach, which has a number of attractions. Principally, it provides alternatives to standard learning algorithms by allowing the use of metrics that are not embedded in a vector space. Secondly, it uses a noise model that is learned from training data. Lastly, it eliminates any need for an assumption of probabilistic pixelwise independence. Experiments demonstrate the effectiveness of the M2 model in two domains: tracking walking people using “chamfer” distances on binary edge images, and tracking mouth movements by means of a shuffle distance.
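
As an illustration of the kind of non-vector-space metric the M2 approach accommodates, here is a minimal chamfer-distance sketch between binary edge maps (illustrative, using SciPy's distance transform):

```python
# Chamfer distance between binary edge images: average, over the template's
# edge pixels, of the distance to the nearest image edge pixel.
import numpy as np
from scipy.ndimage import distance_transform_edt

def chamfer(template_edges: np.ndarray, image_edges: np.ndarray) -> float:
    """Both inputs are boolean edge maps of equal shape."""
    dist_to_edges = distance_transform_edt(~image_edges)   # distance to nearest edge
    return float(dist_to_edges[template_edges].mean())

img = np.zeros((64, 64), dtype=bool); img[32, 10:50] = True
tpl = np.zeros((64, 64), dtype=bool); tpl[34, 10:50] = True   # shifted by 2 rows
print(chamfer(tpl, img))   # ≈ 2.0 pixels
```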