
Showing papers on "Metric (mathematics)" published in 1999


Proceedings Article
07 Sep 1999
TL;DR: A novel scheme for approximate similarity search based on hashing is examined; experiments show that it scales well even for a relatively large number of dimensions and gives significant improvement in running time over methods for searching in high-dimensional spaces based on hierarchical tree decomposition.
Abstract: The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image databases, document collections, time-series databases, and genome databases. Unfortunately, all known techniques for solving this problem fall prey to the "curse of dimensionality." That is, the data structures scale poorly with data dimensionality; in fact, if the number of dimensions exceeds 10 to 20, searching in k-d trees and related structures involves the inspection of a large fraction of the database, thereby doing no better than brute-force linear search. It has been suggested that since the selection of features and the choice of a distance metric in typical applications is rather heuristic, determining an approximate nearest neighbor should suffice for most practical purposes. In this paper, we examine a novel scheme for approximate similarity search based on hashing. The basic idea is to hash the points from the database so as to ensure that the probability of collision is much higher for objects that are close to each other than for those that are far apart. We provide experimental evidence that our method gives significant improvement in running time over other methods for searching in high-dimensional spaces based on hierarchical tree decomposition. Experimental results also indicate that our scheme scales well even for a relatively large number of dimensions (more than 50).
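A minimal sketch of the locality-sensitive hashing idea, assuming random-hyperplane (cosine) hashes for brevity; the paper's own construction hashes bit-sampled Hamming embeddings, so this illustrates only the collision principle, and all class names and parameters are hypothetical:

```python
# Locality-sensitive hashing sketch: close points collide in hash buckets
# with high probability, far points rarely do, so each query inspects only
# a few buckets instead of the whole database.
import numpy as np
from collections import defaultdict

class CosineLSH:
    def __init__(self, dim, n_bits=12, n_tables=8, seed=0):
        rng = np.random.default_rng(seed)
        # One set of random hyperplanes per hash table.
        self.planes = [rng.normal(size=(n_bits, dim)) for _ in range(n_tables)]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.points = []

    def _key(self, planes, x):
        # Sign pattern of the projections -> one bucket key per table.
        return tuple((planes @ x > 0).astype(int))

    def index(self, x):
        i = len(self.points)
        self.points.append(x)
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, x)].append(i)

    def query(self, q):
        # Candidates = union of the buckets q falls into.
        cand = set()
        for planes, table in zip(self.planes, self.tables):
            cand.update(table.get(self._key(planes, q), []))
        if not cand:
            return None
        return min(cand, key=lambda i: np.linalg.norm(self.points[i] - q))

rng = np.random.default_rng(1)
data = rng.normal(size=(1000, 64))
lsh = CosineLSH(dim=64)
for x in data:
    lsh.index(x)
print("approx NN of a perturbed data[0]:",
      lsh.query(data[0] + 0.01 * rng.normal(size=64)))
```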

3,705 citations


Journal ArticleDOI
TL;DR: In this paper, a spatial model of dependence among agents using a metric of economic distance is presented; the metric provides cross-sectional data with a structure similar to that provided by the time index in time-series data.

1,954 citations


Journal ArticleDOI
TL;DR: A theoretical proof is given which shows that the absence of skew in the image plane is sufficient to allow for self-calibration and a method to detect critical motion sequences is proposed.
Abstract: In this paper the theoretical and practical feasibility of self-calibration in the presence of varying intrinsic camera parameters is under investigation. The paper's main contribution is to propose a self-calibration method which efficiently deals with all kinds of constraints on the intrinsic camera parameters. Within this framework a practical method is proposed which can retrieve metric reconstruction from image sequences obtained with uncalibrated zooming/focusing cameras. The feasibility of the approach is illustrated on real and synthetic examples. Besides this a theoretical proof is given which shows that the absence of skew in the image plane is sufficient to allow for self-calibration. A counting argument is developed which—depending on the set of constraints—gives the minimum sequence length for self-calibration and a method to detect critical motion sequences is proposed.

829 citations


Journal ArticleDOI
TL;DR: Assessment of the approach on quantitative and qualitative grounds demonstrates its effectiveness in two very different domains, Wall Street Journal news articles and television broadcast news story transcripts, using a new probabilistically motivated error metric.
Abstract: This paper introduces a new statistical approach to automatically partitioning text into coherent segments. The approach is based on a technique that incrementally builds an exponential model to extract features that are correlated with the presence of boundaries in labeled training text. The models use two classes of features: topicality features that use adaptive language models in a novel way to detect broad changes of topic, and cue-word features that detect occurrences of specific words, which may be domain-specific, that tend to be used near segment boundaries. Assessment of our approach on quantitative and qualitative grounds demonstrates its effectiveness in two very different domains, Wall Street Journal news articles and television broadcast news story transcripts. Quantitative results on these domains are presented using a new probabilistically motivated error metric, which combines precision and recall in a natural and flexible way. This metric is used to make a quantitative assessment of the relative contributions of the different feature types, as well as a comparison with decision trees and previously proposed text segmentation algorithms.
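The windowed statistic below is a sketch of a probabilistically motivated segmentation error metric of the kind described, assuming the familiar P_k formulation (slide a window of width k over the text and count reference/hypothesis disagreements about whether its endpoints share a segment); function names and the choice of k are illustrative:

```python
# P_k-style segmentation error: penalizes both missed boundaries and
# false alarms in a single probabilistic quantity.
def to_segment_ids(boundaries, n):
    """boundaries: sorted positions where a new segment starts (excluding 0)."""
    ids, seg, bset = [], 0, set(boundaries)
    for i in range(n):
        if i in bset:
            seg += 1
        ids.append(seg)
    return ids

def p_k(ref_boundaries, hyp_boundaries, n, k):
    ref = to_segment_ids(ref_boundaries, n)
    hyp = to_segment_ids(hyp_boundaries, n)
    errors = 0
    for i in range(n - k):
        same_ref = ref[i] == ref[i + k]
        same_hyp = hyp[i] == hyp[i + k]
        errors += same_ref != same_hyp   # disagreement at this window
    return errors / (n - k)

# 20 units, true boundary at 10, hypothesis puts it at 12.
print(p_k([10], [12], n=20, k=5))
```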

728 citations


01 Sep 1999
TL;DR: This memo defines a metric for one-way delay of packets across Internet paths.
Abstract: This memo defines a metric for one-way delay of packets across Internet paths. [STANDARDS-TRACK]

443 citations


Journal ArticleDOI
TL;DR: An implementation of stochastic mapping that uses a delayed nearest neighbor data association strategy to initialize new features into the map, match measurements to map features, and delete out-of-date features is described.
Abstract: The task of building a map of an unknown environment and concurrently using that map to navigate is a central problem in mobile robotics research. This paper addresses the problem of how to perform concurrent mapping and localization (CML) adaptively using sonar. Stochastic mapping is a feature-based approach to CML that generalizes the extended Kalman filter to incorporate vehicle localization and environmental mapping. The authors describe an implementation of stochastic mapping that uses a delayed nearest neighbor data association strategy to initialize new features into the map, match measurements to map features, and delete out-of-date features. The authors introduce a metric for adaptive sensing that is defined in terms of Fisher information and represents the sum of the areas of the error ellipses of the vehicle and feature estimates in the map. Predicted sensor readings and expected dead-reckoning errors are used to estimate the metric for each potential action of the robot, and the action that yields the greatest expected gain in information is selected.
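A hedged sketch of an error-ellipse-area objective like the one described; how each candidate action maps to predicted covariances is application-specific and merely stubbed here with made-up matrices:

```python
# Score a candidate action by the total area of the 1-sigma error ellipses
# of the vehicle and feature estimates: area = pi * sqrt(det(P)) for each
# 2x2 position covariance block P. Smaller total area = better action.
import numpy as np

def ellipse_area_metric(covariances):
    """covariances: iterable of 2x2 position covariance blocks."""
    return sum(np.pi * np.sqrt(np.linalg.det(P)) for P in covariances)

# Hypothetical predicted covariances after two candidate actions.
action_a = [np.diag([0.20, 0.10]), np.diag([0.05, 0.05])]
action_b = [np.diag([0.08, 0.08]), np.diag([0.09, 0.04])]
best = min([("a", action_a), ("b", action_b)],
           key=lambda kv: ellipse_area_metric(kv[1]))
print("chosen action:", best[0])
```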

373 citations


Journal ArticleDOI
TL;DR: A new method for source localization is described that is based on a modification of the well-known MUSIC algorithm, and a general form of the RAP-MUSIC algorithm is described for the case of diversely polarized sources.
Abstract: A new method for source localization is described that is based on a modification of the well-known MUSIC algorithm. In classical MUSIC, the array manifold vector is projected onto an estimate of the signal subspace. Errors in the estimate of the signal subspace can make localization of multiple sources difficult. Recursively applied and projected (RAP) MUSIC uses each successively located source to form an intermediate array gain matrix and projects both the array manifold and the signal subspace estimate into its orthogonal complement. The MUSIC projection to find the next source is then performed in this reduced subspace. Special assumptions about the array manifold structure, such as Vandermonde or shift invariance, are not required. Using the metric of principal angles, we describe a general form of the RAP-MUSIC algorithm for the case of diversely polarized sources. Through a uniform linear array simulation with two highly correlated sources, we demonstrate the improved Monte Carlo error performance of RAP-MUSIC relative to MUSIC and two other sequential subspace methods: S-MUSIC and IES-MUSIC. We then demonstrate the more general utility of this algorithm for multidimensional array manifolds in a magnetoencephalography (MEG) source localization simulation.
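The principal-angle metric mentioned above can be computed from the singular values of Q1^T Q2 for orthonormal bases Q1 and Q2 of the two subspaces; a small self-contained sketch with illustrative data, not the MEG setup:

```python
# Principal angles between two subspaces via QR + SVD: the cosines of the
# angles are the singular values of Q1^T Q2.
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians) between the column spans of A and B."""
    Q1, _ = np.linalg.qr(A)
    Q2, _ = np.linalg.qr(B)
    s = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 2))                      # e.g. a signal-subspace estimate
B = A @ rng.normal(size=(2, 1)) + 0.01 * rng.normal(size=(8, 1))
print(principal_angles(A, B))                    # small angle: B nearly in span(A)
```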

365 citations


Book
01 Jan 1999
TL;DR: A novel approach to navigating through a collection of images for the purpose of image retrieval is presented, leading to a new paradigm for image database search; images are summarized by distributions of color or texture features, and a metric between any two such distributions is defined.
Abstract: The increasing amount of information available in today's world raises the need to retrieve relevant data efficiently. Unlike text-based retrieval, where keywords are successfully used to index into documents, content-based image retrieval poses up front the fundamental questions how to extract useful image features and how to use them for intuitive retrieval. We present a novel approach to the problem of navigating through a collection of images for the purpose of image retrieval, which leads to a new paradigm for image database search. We summarize the appearance of images by distributions of color or texture features, and we define a metric between any two such distributions. This metric, which we call the "Earth Mover's Distance" (EMD), represents the least amount of work that is needed to rearrange the mass in one distribution in order to obtain the other. We show that the EMD matches perceptual dissimilarity better than other dissimilarity measures, and argue that it has many desirable properties for image retrieval. Using this metric, we employ Multi-Dimensional Scaling techniques to embed a group of images as points in a two- or three-dimensional Euclidean space so that their distances reflect image dissimilarities as well as possible. Such geometric embeddings exhibit the structure in the image set at hand, allowing the user to understand better the result of a database query and to refine the query in a perceptually intuitive way. By iterating this process, the user can quickly zoom in to the portion of the image space of interest. We also apply these techniques to other modalities such as mug-shot retrieval.
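In one dimension the EMD has a closed form, the area between the two cumulative distributions, which makes the "least work to move mass" reading concrete; general ground distances require a transportation LP or min-cost flow solver. A toy sketch under a unit-spaced bin assumption:

```python
# 1-D Earth Mover's Distance: total mass that must cross each gap between
# adjacent bins, i.e. the L1 distance between the two CDFs.
import numpy as np

def emd_1d(p, q):
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()        # normalize to equal total mass
    return np.abs(np.cumsum(p - q)).sum()  # unit distance between neighbors

hist_a = [0, 4, 1, 0, 0]
hist_b = [0, 0, 1, 4, 0]
print(emd_1d(hist_a, hist_b))   # 1.6: mass 0.8 moved 2 bins to the right
```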

345 citations


Proceedings ArticleDOI
01 Jul 1999
TL;DR: Results show the method preserves visual quality while achieving significant computational gains in areas of images with high frequency texture patterns, geometric details, and lighting variations.
Abstract: We introduce a new concept for accelerating realistic image synthesis algorithms. At the core of this procedure is a novel physical error metric that correctly predicts the perceptual threshold for detecting artifacts in scene features. Built into this metric is a computational model of the human visual system's loss of sensitivity at high background illumination levels, high spatial frequencies, and high contrast levels (visual masking). An important feature of our model is that it handles the luminance-dependent processing and spatially-dependent processing independently. This allows us to precompute the expensive spatially-dependent component, making our model extremely efficient. We illustrate the utility of our procedure with global illumination algorithms used for realistic image synthesis. The expense of global illumination computations is many orders of magnitude higher than the expense of direct illumination computations and can greatly benefit by applying our perceptually based technique. Results show our method preserves visual quality while achieving significant computational gains in areas of images with high frequency texture patterns, geometric details, and lighting variations.

299 citations


01 Sep 1999
TL;DR: This memo defines a metric for one-way packet loss across Internet paths.
Abstract: This memo defines a metric for one-way packet loss across Internet paths. [STANDARDS-TRACK]

290 citations


Journal ArticleDOI
TL;DR: The multivantage point tree (mvp-tree), which uses more than one vantage point to partition the space into spherical cuts at each level, is introduced; generalizing the idea of using multiple vantage points, the results show that it may be best to use a large number of vantage points in an internal node, ending up with a single directory node and keeping as many of the precomputed distances as possible for more efficient filtering during search operations.
Abstract: One of the common queries in many database applications is finding approximate matches to a given query item from a collection of data items. For example, given an image database, one may want to retrieve all images that are similar to a given query image. Distance-based index structures are proposed for applications where the distance computations between objects of the data domain are expensive (such as high-dimensional data) and the distance function is metric. In this paper we consider using distance-based index structures for similarity queries on large metric spaces. We elaborate on the approach that uses reference points (vantage points) to partition the data space into spherical shell-like regions in a hierarchical manner. We introduce the multivantage point tree structure (mvp-tree) that uses more than one vantage point to partition the space into spherical cuts at each level. In answering similarity-based queries, the mvp-tree also utilizes the precomputed (at construction time) distances between the data points and the vantage points.We summarize the experiments comparing mvp-trees to vp-trees that have a similar partitioning strategy, but use only one vantage point at each level and do not make use of the precomputed distances. Empirical studies show that the mvp-tree outperforms the vp-tree by 20% to 80% for varying query ranges and different distance distributions. Next, we generalize the idea of using multiple vantage points and discuss the results of experiments we have made to see how varying the number of vantage points in a node affects performance and how much is gained in performance by making use of precomputed distances. The results show that, after all, it may be best to use a large number of vantage points in an internal node in order to end up with a single directory node and keep as many of the precomputed distances as possible to provide more efficient filtering during search operations. Finally, we provide some experimental results that compare mvp-trees with M-trees, which is a dynamic distance-based index structure for metric domains.
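A hedged sketch of the single-vantage-point case that the mvp-tree generalizes: partition by the median distance to a vantage point and prune subtrees with the triangle inequality. The mvp-tree additionally uses several vantage points per node and the precomputed distances for filtering; this simplified vp-tree is for illustration only:

```python
# Vantage-point tree with triangle-inequality pruning for range queries.
import random

class VPNode:
    def __init__(self, points, dist):
        self.vp = points[0]
        rest = points[1:]
        if rest:
            ds = sorted(dist(self.vp, p) for p in rest)
            self.mu = ds[len(ds) // 2]                     # median radius
            inner = [p for p in rest if dist(self.vp, p) < self.mu]
            outer = [p for p in rest if dist(self.vp, p) >= self.mu]
            self.inner = VPNode(inner, dist) if inner else None
            self.outer = VPNode(outer, dist) if outer else None
        else:
            self.mu, self.inner, self.outer = None, None, None

    def range_query(self, q, r, dist, out):
        d = dist(self.vp, q)
        if d <= r:
            out.append(self.vp)
        if self.mu is None:
            return
        # Triangle inequality: visit a side only if the query ball can
        # intersect it.
        if self.inner and d - r < self.mu:
            self.inner.range_query(q, r, dist, out)
        if self.outer and d + r >= self.mu:
            self.outer.range_query(q, r, dist, out)

dist = lambda a, b: abs(a - b)                 # any metric works here
pts = random.sample(range(1000), 200)
tree = VPNode(pts, dist)
hits = []
tree.range_query(500, 25, dist, hits)
print(sorted(hits))
```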

Proceedings ArticleDOI
17 Oct 1999
TL;DR: This work presents approximation algorithms for the metric uncapacitated facility location problem and the metric k-median problem achieving guarantees of 3 and 6 respectively, and a new extension of the primal-dual schema.
Abstract: We present approximation algorithms for the metric uncapacitated facility location problem and the metric k-median problem achieving guarantees of 3 and 6 respectively. The distinguishing feature of our algorithms is their low running time: O(m log m) and O(m log m(L+log(n))) respectively, where n and m are the total number of vertices and edges in the underlying graph. The main algorithmic idea is a new extension of the primal-dual schema.

Proceedings ArticleDOI
21 Sep 1999
TL;DR: This work proposes a new data structure, called sa-tree (“spatial approximation tree”), which is based on approaching the searched objects spatially, that is, getting closer and closer to them, rather than the classic divide-and-conquer approach of other data structures.
Abstract: We propose a novel data structure to search in metric spaces. A metric space is formed by a collection of objects and a distance function defined among them, which satisfies the triangular inequality. The goal is, given a set of objects and a query, retrieve those objects close enough to the query. The number of distances computed to achieve this goal is the complexity measure. Our data structure, called sa-tree ("spatial approximation tree"), is based on approaching spatially the searched objects. We analyze our method and show that the number of distance evaluations to search among n objects is o(n). We show experimentally that the sa-tree is the best existing technique when the metric space is high-dimensional or the query has low selectivity. These are the most difficult cases in real applications.
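As a toy rendering of the spatial-approximation idea (not the paper's exact construction or search rules, which are more involved), the sketch below picks as neighbors the points that are closer to the root than to any previously chosen neighbor, and answers queries by greedily stepping toward the query:

```python
# Simplified sa-tree-style structure: build by spatial neighbor selection,
# search by greedy descent toward the query (approximate for this toy).
def build(points, dist):
    root = points[0]
    rest = sorted(points[1:], key=lambda p: dist(root, p))
    neighbors, pending = [], []
    for p in rest:
        # p becomes a neighbor only if it is closer to the root than to
        # every already chosen neighbor.
        if all(dist(p, root) < dist(p, n["point"]) for n in neighbors):
            neighbors.append({"point": p, "bag": []})
        else:
            pending.append(p)
    for p in pending:   # send p to the subtree of its closest neighbor
        min(neighbors, key=lambda n: dist(p, n["point"]))["bag"].append(p)
    return {"point": root,
            "children": [build([n["point"]] + n["bag"], dist) for n in neighbors]}

def greedy_search(node, q, dist):
    # Step to whichever child is closest to q while that gets us closer.
    best = node["point"]
    while node["children"]:
        child = min(node["children"], key=lambda c: dist(c["point"], q))
        if dist(child["point"], q) >= dist(node["point"], q):
            break
        node = child
        if dist(node["point"], q) < dist(best, q):
            best = node["point"]
    return best

dist = lambda a, b: abs(a - b)
tree = build([50, 10, 90, 30, 70, 20, 80], dist)
print(greedy_search(tree, 27, dist))   # -> 30, the nearest stored point
```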

Proceedings ArticleDOI
19 May 1999
TL;DR: In this paper, a distortion metric for color video sequences is presented based on a contrast gain control model of the human visual system that incorporates spatial and temporal aspects of vision as well as color perception.
Abstract: In this paper I present a distortion metric for color video sequences. It is based on a contrast gain control model of the human visual system that incorporates spatial and temporal aspects of vision as well as color perception. The model achieves a close fit to contrast sensitivity and contrast masking data from several different psychophysical experiments for both luminance and color stimuli. The metric is used to assess the quality of MPEG-coded sequences.

Journal ArticleDOI
TL;DR: A technique to measure channel quality in terms of signal-to-interference plus noise ratio (SINR) for the transmission of signals over fading channels and proposes a set of coded modulation schemes which utilize the SINR estimate to adapt between modulations, thus improving the data throughput.
Abstract: We propose a technique to measure channel quality in terms of signal-to-interference plus noise ratio (SINR) for the transmission of signals over fading channels. The Euclidean distance (ED) metric, associated with the decoded information sequence or a suitable modification thereof, is used as a channel quality measure. Simulations show that the filtered or averaged metric is a reliable channel quality measure which remains consistent across different coded modulation schemes and at different mobile speeds. The average scaled ED metric can be mapped to the SINR per symbol. We propose the use of this SINR estimate for data rate adaptation, in addition to mobile assisted handoff (MAHO) and power control. We particularly focus on data rate adaptation and propose a set of coded modulation schemes which utilize the SINR estimate to adapt between modulations, thus improving the data throughput. Simulation results show that the proposed metric works well across the entire range of Dopplers to provide near-optimal rate adaptation to average SINR. This method of adaptation averages out short-term variations due to Rayleigh fading and adapts to the long-term effects such as shadowing. At low Dopplers, the metric can track Rayleigh fading and match the rate to a short-term average of the SINR, thus further increasing throughput.
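A hedged sketch of the adaptation loop described: exponentially average a per-frame ED metric, map it to an SINR estimate, and choose the highest-rate modulation whose threshold is met. The metric-to-SINR mapping and the thresholds below are placeholders, not values from the paper:

```python
# Rate adaptation driven by a filtered decoding metric. The exponential
# average removes short-term Rayleigh variation so adaptation follows
# longer-term effects such as shadowing.
def adapt_rate(ed_metric_stream, alpha=0.5):
    # (SINR threshold in dB, scheme), most to least demanding.
    table = [(18.0, "64-QAM"), (12.0, "16-QAM"),
             (6.0, "QPSK"), (float("-inf"), "BPSK")]
    smoothed = None
    for m in ed_metric_stream:
        smoothed = m if smoothed is None else (1 - alpha) * smoothed + alpha * m
        sinr_db = 25.0 - 10.0 * smoothed      # placeholder metric->SINR map
        scheme = next(s for th, s in table if sinr_db >= th)
        yield sinr_db, scheme

stream = [0.4, 0.5, 0.45, 1.2, 1.3, 1.25, 0.3, 0.2]
for sinr, scheme in adapt_rate(stream):
    print(f"{sinr:5.1f} dB -> {scheme}")
```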

01 Jan 1999
TL;DR: This dissertation presents a simplification algorithm, based on iterative vertex pair contraction, that can simplify both the geometry and topology of manifold as well as non-manifold surfaces, and proves a direct mathematical connection between the quadric metric and surface curvature.
Abstract: Many applications in computer graphics and related fields can benefit from automatic simplification of complex polygonal surface models. Applications are often confronted with either very densely over-sampled surfaces or models too complex for the limited available hardware capacity. An effective algorithm for rapidly producing high-quality approximations of the original model is a valuable tool for managing data complexity. In this dissertation, I present my simplification algorithm, based on iterative vertex pair contraction. This technique provides an effective compromise between the fastest algorithms, which often produce poor quality results, and the highest-quality algorithms, which are generally very slow. For example, a 1000 face approximation of a 100,000 face model can be produced in about 10 seconds on a PentiumPro 200. The algorithm can simplify both the geometry and topology of manifold as well as non-manifold surfaces. In addition to producing single approximations, my algorithm can also be used to generate multiresolution representations such as progressive meshes and vertex hierarchies for view-dependent refinement. The foundation of my simplification algorithm is the quadric error metric which I have developed. It provides a useful and economical characterization of local surface shape, and I have proven a direct mathematical connection between the quadric metric and surface curvature. A generalized form of this metric can accommodate surfaces with material properties, such as RGB color or texture coordinates. I have also developed a closely related technique for constructing a hierarchy of well-defined surface regions composed of disjoint sets of faces. This algorithm involves applying a dual form of my simplification algorithm to the dual graph of the input surface. The resulting structure is a hierarchy of face clusters which is an effective multiresolution representation for applications such as radiosity.
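The quadric error metric itself is compact enough to sketch: each plane contributes a 4x4 quadric, the quadrics of a vertex's incident planes are summed, and the cost of placing a vertex at v is the quadratic form v^T Q v (the sum of squared distances to those planes); quadrics simply add when two vertices are contracted. A minimal illustration:

```python
# Quadric error metric: plane n.x + d = 0 contributes K = p p^T with
# p = [n, d]; a vertex's error is v_h^T Q v_h in homogeneous coordinates.
import numpy as np

def plane_quadric(n, d):
    p = np.append(n, d)              # homogeneous plane [a, b, c, d]
    return np.outer(p, p)

def quadric_error(Q, v):
    vh = np.append(v, 1.0)           # homogeneous vertex
    return float(vh @ Q @ vh)        # sum of squared plane distances

# A vertex lying on both planes has zero error; moving it off costs.
Q = plane_quadric(np.array([0.0, 0.0, 1.0]), 0.0) \
  + plane_quadric(np.array([0.0, 1.0, 0.0]), 0.0)
print(quadric_error(Q, np.array([1.0, 0.0, 0.0])))   # 0.0: on both planes
print(quadric_error(Q, np.array([1.0, 0.5, 2.0])))   # 4.25 = 2^2 + 0.5^2
```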

Journal ArticleDOI
TL;DR: This work presents a simple yet novel algorithm for reducing dimensionality by identifying which axes (metrics) convey information related to affinity for a given receptor and which axes can be safely discarded as being irrelevant to the given receptor.
Abstract: Following brief comments regarding the advantages of cell-based diversity algorithms and the selection of low-dimensional chemistry-space metrics needed to implement such algorithms, the notion of metric validation is discussed. Activity-seeded, structure-based clustering is presented as an ideal approach for the validation of either high- or low-dimensional chemistry-space metrics when validation by computer-graphic visualization is not possible. Whereas typical methods for reducing the dimensionality of chemistry-space inevitably discard potentially important information, we present a simple yet novel algorithm for reducing dimensionality by identifying which axes (metrics) convey information related to affinity for a given receptor and which axes can be safely discarded as being irrelevant to the given receptor. This algorithm often reveals a three- or two-dimensional subspace of a (typically six-dimensional) BCUT chemistry-space and, thus, enables computer-graphic visualization of the actual coordinates.
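A hedged sketch of the axis-selection idea: score each axis of the chemistry-space by how tightly the known actives cluster along it relative to the full library, and keep only the localized axes. The spread-ratio score is an illustrative stand-in for the paper's activity-seeded, structure-based clustering analysis:

```python
# Keep the axes along which actives are localized relative to the library;
# the remaining axes are treated as irrelevant to this receptor.
import numpy as np

def relevant_axes(library, actives, threshold=0.5):
    """library, actives: (n, d) arrays of chemistry-space coordinates."""
    lib_spread = library.std(axis=0)
    act_spread = actives.std(axis=0)
    ratio = act_spread / lib_spread
    return np.where(ratio < threshold)[0]     # axes where actives cluster

rng = np.random.default_rng(0)
library = rng.uniform(0, 10, size=(5000, 6))          # 6-D BCUT-like space
actives = library[:40].copy()
actives[:, [1, 4]] = rng.normal(5, 0.4, size=(40, 2)) # localized on axes 1, 4
print(relevant_axes(library, actives))                # -> [1 4]
```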

Proceedings ArticleDOI
Piotr Indyk1
01 May 1999
TL;DR: In this article, the authors give approximation algorithms whose running time is linear in the number of metric space points for Furthest Pair, k-median, Minimum Routing Cost Spanning Tree, Multiple Sequence Alignment, Maximum Traveling Salesman Problem, Maximum Spanning Tree, and Average Distance.
Abstract: In this paper we give approximation algorithms for the following problems on metric spaces: Furthest Pair, k-median, Minimum Routing Cost Spanning Tree, Multiple Sequence Alignment, Maximum Traveling Salesman Problem, Maximum Spanning Tree and Average Distance. The key property of our algorithms is that their running time is linear in the number of metric space points. As the full specification of an n-point metric space is of size Θ(n²), the complexity of our algorithms is sublinear with respect to the input size. All previous algorithms (exact or approximate) for the problems we consider have running time Ω(n²). We believe that our techniques can be applied to get similar bounds for other problems.
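The flavor of these linear-time metric algorithms can be seen in the classic 2-approximation of the furthest pair: by the triangle inequality, the largest distance from a single arbitrary anchor is within a factor 2 of the diameter, using only n-1 distance evaluations. The paper's algorithms achieve stronger guarantees; this sketch is only an analogy:

```python
# 2-approximate diameter: d(u, v) <= d(u, x) + d(x, v) <= 2 * max_y d(x, y)
# for any anchor x, so one sweep from an arbitrary anchor suffices.
def approx_diameter(points, dist):
    anchor = points[0]
    far = max(points[1:], key=lambda p: dist(anchor, p))
    return anchor, far, dist(anchor, far)     # true diameter <= 2 * this

pts = [(0, 0), (3, 4), (-5, 1), (10, -2)]
dist = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
print(approx_diameter(pts, dist))
```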

Proceedings ArticleDOI
Satish Rao1
13 Jun 1999
TL;DR: The results give improvements for Feige's and Vempala's approximation algorithms for planar and Euclidean metrics and an improved bound for volume respecting embeddings for Euclidean metrics.
Abstract: A finite metric space, (S, d), contains a finite set of points and a distance function on pairs of points. A contraction is an embedding, h, of a finite metric space (S, d) into R^d where for any u, v in S, the Euclidean (l_2) distance between h(u) and h(v) is no more than d(u, v). The distortion of the embedding is the maximum over pairs of the ratio of d(u, v) and the Euclidean distance between h(u) and h(v). Bourgain showed that any graphical metric could be embedded with distortion O(log n). Linial, London and Rabinovich and Aumann and Rabani used such embeddings to prove an O(log k) approximate max-flow min-cut theorem for k-commodity flow problems. A generalization of embeddings that preserve distances between pairs of points are embeddings that preserve volumes of larger sets. In particular, a (k, c)-volume respecting embedding of n points in any metric space is a contraction where every subset of k points has within a c^(k-1) factor of its maximal possible (k-1)-dimensional volume. Feige invented these embeddings in devising a polylogarithmic approximation algorithm for the bandwidth problem, showed that such volume respecting embeddings exist, and recently found improved bounds; Feige's methods have subsequently been used by Vempala for approximating versions of the VLSI layout problem. For metrics arising from planar graphs (planar metrics), we give (k, O(√(log n))) volume respecting contractions. As a corollary, we give embeddings for planar metrics with distortion O(√(log n)). This gives rise to an O(√(log n))-approximate max-flow min-cut theorem for multicommodity flow problems in planar graphs. We also give an improved bound for volume respecting embeddings for Euclidean metrics: in particular, a (k, O(√(log k log D))) volume respecting embedding, where D is the ratio of the largest distance to the smallest distance in the metric. Our results give improvements for Feige's and Vempala's approximation algorithms for planar and Euclidean metrics. Moreover, our volume respecting embeddings do not degrade very fast when preserving the volumes of large subsets, which may be useful in the future for approximation algorithms or if volume respecting embeddings prove to be of independent interest.

Posted Content
TL;DR: In contrast to the usual Lipschitz seminorms associated to ordinary metrics on compact spaces, it is shown by examples that Lipschitz seminorms on possibly non-commutative compact spaces are usually not determined by the restriction of the metric they define on the state space to the extreme points of the state space.
Abstract: In contrast to the usual Lipschitz seminorms associated to ordinary metrics on compact spaces, we show by examples that Lipschitz seminorms on possibly non-commutative compact spaces are usually not determined by the restriction of the metric they define on the state space, to the extreme points of the state space. We characterize the Lipschitz norms which are determined by their metric on the whole state space as being those which are lower semicontinuous. We show that their domain of Lipschitz elements can be enlarged so as to form a dual Banach space, which generalizes the situation for ordinary Lipschitz seminorms. We give a characterization of the metrics on state spaces which come from Lipschitz seminorms. The natural (broader) setting for these results is provided by the "function spaces" of Kadison. A variety of methods for constructing Lipschitz seminorms is indicated.

Journal ArticleDOI
TL;DR: It is argued that far from ignoring geometric and metric properties the 'line graph' internalises them into the structure of the graph, and in doing so allows the graph analysis to pick up the nonlocal, or extrinsic, properties of spaces that are critical to the movement dynamics through which a city evolves its essential structures.
Abstract: A common objection to the space syntax analysis of cities is that even in its own terms the technique of using a nonuniform line representation of space and analysing it by measures that are essentially topological ignores too much geometric and metric detail to be credible. In this paper it is argued that far from ignoring geometric and metric properties the 'line graph' internalises them into the structure of the graph and in doing so allows the graph analysis to pick up the nonlocal, or extrinsic, properties of spaces that are critical to the movement dynamics through which a city evolves its essential structures. Nonlocal properties are those which are defined by the relation of elements to all others in the system, rather than those which are intrinsic to the element itself. The method also leads to a powerful analysis of urban structures because cities are essentially nonlocal systems.

Journal ArticleDOI
TL;DR: It is shown that in the limit as triangle area goes to zero on a differentiable surface, the quadric error is directly related to surface curvature, and in this limit, a triangulation that minimizes the quadric error metric achieves the optimal triangle aspect ratio in that it minimizes the L2 geometric error.
Abstract: Many algorithms for reducing the number of triangles in a surface model have been proposed, but to date there has been little theoretical analysis of the approximations they produce. Previously we described an algorithm that simplifies polygonal models using a quadric error metric. This method is fast and produces high quality approximations in practice. Here we provide some theory to explain why the algorithm works as well as it does. Using methods from differential geometry and approximation theory, we show that in the limit as triangle area goes to zero on a differentiable surface, the quadric error is directly related to surface curvature. Also, in this limit, a triangulation that minimizes the quadric error metric achieves the optimal triangle aspect ratio in that it minimizes the L2 geometric error. This work represents a new theoretical approach for the analysis of simplification algorithms. © 1999 Elsevier Science B.V. All rights reserved.

Proceedings Article
01 Jan 1999
TL;DR: A new composite similarity metric is presented that combines information from multiple linguistic indicators to measure semantic distance between pairs of small textual units and is evaluated against standard information retrieval techniques, establishing that the new method is more effective in identifying closely related textual units.
Abstract: We present a new composite similarity metric that combines information from multiple linguistic indicators to measure semantic distance between pairs of small textual units. Several potential features are investigated and an optimal combination is selected via machine learning. We discuss a more restrictive definition of similarity than traditional, document-level and information retrieval-oriented, notions of similarity, and motivate it by showing its relevance to the multi-document text summarization problem. Results from our system are evaluated against standard information retrieval techniques, establishing that the new method is more effective in identifying closely related textual units.

Journal ArticleDOI
TL;DR: A weak partial metric on the poset of formal balls of a metric space can be used to construct the completion of classical metric spaces from the domain-theoretic rounded ideal completion.
Abstract: Partial metrics are generalised metrics with non-zero self-distances. We slightly generalise Matthews' original definition of partial metrics, yielding a notion of weak partial metric. After considering weak partial metric spaces in general, we introduce a weak partial metric on the poset of formal balls of a metric space. This weak partial metric can be used to construct the completion of classical metric spaces from the domain-theoretic rounded ideal completion.

Proceedings ArticleDOI
17 Oct 1999
TL;DR: This work provides the first non-trivial polynomial-time approximation algorithms for a general family of classification problems of this type, the metric labeling problem, and shows that it contains as special cases a number of standard classification frameworks, including several arising from the theory of Markov random fields.
Abstract: In a traditional classification problem, we wish to assign one of k labels (or classes) to each of n objects, in a way that is consistent with some observed data that we have about the problem. An active line of research in this area is concerned with classification when one has information about pairwise relationships among the objects to be classified; this issue is one of the principal motivations for the framework of Markov random fields, and it arises in areas such as image processing, biometry, and document analysis. In its most basic form, this style of analysis seeks a classification that optimizes a combinatorial function consisting of assignment costs, based on the individual choice of label we make for each object, and separation costs, based on the pair of choices we make for two "related" objects. We formulate a general classification problem of this type, the metric labeling problem; we show that it contains as special cases a number of standard classification frameworks, including several arising from the theory of Markov random fields. From the perspective of combinatorial optimization, our problem can be viewed as a substantial generalization of the multiway cut problem, and equivalent to a type of uncapacitated quadratic assignment problem. We provide the first non-trivial polynomial-time approximation algorithms for a general family of classification problems of this type. Our main result is an O(log k log log k)-approximation algorithm for the metric labeling problem, with respect to an arbitrary metric on a set of k labels, and an arbitrary weighted graph of relationships on a set of objects. For the special case in which the labels are endowed with the uniform metric-all distances are the same-our methods provide a 2-approximation.
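A small sketch of the metric labeling objective as formulated: assignment costs per object plus, for each related pair, the edge weight times the metric distance between the chosen labels. Brute force over labelings for intuition only; the point of the paper is approximation, since exact optimization is intractable in general:

```python
# Metric labeling objective and a tiny brute-force optimizer.
from itertools import product

def labeling_cost(assign_cost, edges, label_dist, labeling):
    c = sum(assign_cost[u][labeling[u]] for u in range(len(labeling)))
    c += sum(w * label_dist[labeling[u]][labeling[v]] for u, v, w in edges)
    return c

def best_labeling(assign_cost, edges, label_dist):
    n, k = len(assign_cost), len(label_dist)
    return min(product(range(k), repeat=n),
               key=lambda L: labeling_cost(assign_cost, edges, label_dist, L))

assign_cost = [[0, 5], [4, 1], [4, 1]]        # n=3 objects, k=2 labels
edges = [(0, 1, 3), (1, 2, 3)]                # (u, v, weight) relationships
label_dist = [[0, 1], [1, 0]]                 # uniform metric on labels
print(best_labeling(assign_cost, edges, label_dist))   # -> (0, 1, 1)
```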

Journal ArticleDOI
TL;DR: The Self-Organizing Map has been applied in monitoring and modeling of complex industrial processes; case studies, including pulp process, steel production, and paper industry, are described.
Abstract: The Self-Organizing Map (SOM) is a powerful neural network method for analysis and visualization of high-dimensional data. It maps nonlinear statistical dependencies between high-dimensional measurement data into simple geometric relationships on a usually two-dimensional grid. The mapping roughly preserves the most important topological and metric relationships of the original data elements and, thus, inherently clusters the data. The need for visualization and clustering occurs, for instance, in the analysis of various engineering problems. In this paper, the SOM has been applied in monitoring and modeling of complex industrial processes. Case studies, including pulp process, steel production, and paper industry are described.
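A minimal SOM training loop, sketching how high-dimensional samples are mapped onto a 2-D grid whose prototype vectors roughly preserve neighborhood structure; grid size, decay schedules, and other parameters are illustrative:

```python
# Self-organizing map: find the best-matching unit for each sample and
# pull it and its grid neighbors toward the sample, with decaying
# learning rate and neighborhood radius.
import numpy as np

def train_som(data, grid=(10, 10), iters=2000, lr0=0.5, sigma0=3.0, seed=0):
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.normal(size=(h, w, data.shape[1]))
    ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    for t in range(iters):
        x = data[rng.integers(len(data))]
        # Best-matching unit: grid cell whose weight vector is closest.
        d = np.linalg.norm(weights - x, axis=2)
        bi, bj = np.unravel_index(np.argmin(d), d.shape)
        frac = t / iters
        lr = lr0 * (1 - frac)
        sigma = sigma0 * (1 - frac) + 0.5
        # Gaussian neighborhood on the grid around the BMU.
        g = np.exp(-((ii - bi) ** 2 + (jj - bj) ** 2) / (2 * sigma ** 2))
        weights += lr * g[..., None] * (x - weights)
    return weights

data = np.random.default_rng(1).normal(size=(500, 5))
som = train_som(data)
print(som.shape)    # (10, 10, 5): a 2-D grid of prototype vectors
```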

01 Sep 1999
TL;DR: This memo defines a metric for round-trip delay of packets across Internet paths.
Abstract: This memo defines a metric for round-trip delay of packets across Internet paths. [STANDARDS-TRACK]

Journal ArticleDOI
TL;DR: This article argues for an approach to grammar acquisition that builds on the cue-based parametric model of Dresher and Kaye (1990), in which cues to parameters become progressively more abstract and grammar-internal.
Abstract: This article argues for an approach to grammar acquisition that builds on the cue-based parametric model of Dresher and Kaye (1990). On this view, acquisition proceeds by means of an ordered path, in which cues to parameters become progressively more abstract and grammar-internal. A learner does not attempt to match target forms (contra Gibson and Wexler 1994), but uses them as evidence for parameter setting. Cues are local, and there is no global fitness metric (contra Clark and Roberts 1993). Acquisition of representations and acquisition of grammar proceed together and cannot be decoupled in the manner of Tesar and Smolensky (1998).

Proceedings ArticleDOI
06 Oct 1999
TL;DR: This paper presents a metric, called MoJo (Move-Join), that can be used in evaluating the similarity of two different decompositions of a software system, and calculates a distance between two partitions of the same set of software resources.
Abstract: The software clustering problem has attracted much attention recently, since it is an integral part of the process of reverse engineering large software systems. A key problem in this research is the difficulty in comparing different approaches in an objective fashion. In this paper, we present a metric, called MoJo (Move-Join), that can be used in evaluating the similarity of two different decompositions of a software system. Our metric calculates a distance between two partitions of the same set of software resources. We begin by introducing the model we use. Then we present a heuristic algorithm that calculates the distance in an efficient fashion. Finally, we discuss some experiments that showcase the performance of the algorithm and the effectiveness of the metric.
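A hedged sketch of a MoJo-style distance: keep, for each cluster of decomposition A, its best-matching cluster in B (Moving everything else), then Join clusters mapped to the same target. The paper's heuristic algorithm computes the distance more carefully; this simple version yields an upper bound for illustration:

```python
# MoJo-style Move+Join count between two partitions of the same entities.
from collections import Counter, defaultdict

def mojo_upper_bound(A, B):
    """A, B: dicts mapping entity -> cluster id; same key set."""
    overlap = defaultdict(Counter)
    for e, ca in A.items():
        overlap[ca][B[e]] += 1
    moves, targets = 0, Counter()
    for ca, counts in overlap.items():
        tb, kept = counts.most_common(1)[0]   # best-matching B cluster
        moves += sum(counts.values()) - kept  # move the mismatched entities
        targets[tb] += 1
    joins = sum(k - 1 for k in targets.values())  # merge co-targeted clusters
    return moves + joins

A = {"f1": 1, "f2": 1, "f3": 2, "f4": 2, "f5": 3}
B = {"f1": "x", "f2": "x", "f3": "x", "f4": "y", "f5": "y"}
print(mojo_upper_bound(A, B))   # e.g. 2: one Move plus one Join
```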

Journal ArticleDOI
TL;DR: A stratified approach is proposed, which goes from projective through affine to metric, which allows retrieval of the affine calibration for constant intrinsic parameters and is also suited for use in conjunction with scene knowledge.
Abstract: In computer vision and especially for 3D reconstruction, one of the key issues is the retrieval of the calibration parameters of the camera. These are needed to obtain metric information about the scene from the camera. Often these parameters are obtained through cumbersome calibration procedures. There is a way to avoid explicit calibration of the camera. Self-calibration is based on finding the set of calibration parameters which satisfy some constraints (e.g., constant calibration parameters). Several techniques have been proposed but it often proved difficult to reach a metric calibration at once. Therefore, in the paper, a stratified approach is proposed, which goes from projective through affine to metric. The key concept to achieve this is the modulus constraint. It allows retrieval of the affine calibration for constant intrinsic parameters. It is also suited for use in conjunction with scene knowledge. In addition, if the affine calibration is known, it can also be used to cope with a changing focal length.