Topic

Metric (mathematics)

About: Metric (mathematics) is a research topic. Over its lifetime, 42,617 publications have been published within this topic, receiving 836,571 citations. The topic is also known as: distance function, metric.


Papers
Journal Article
TL;DR: Probability Binning, as shown here, provides a useful metric for determining the probability that two or more flow cytometric data distributions are different, and can be used to rank distributions to identify which are most similar or dissimilar.
Abstract: Background: While several algorithms for the comparison of univariate distributions arising from flow cytometric analyses have been developed and studied for many years, algorithms for comparing multivariate distributions remain elusive. Such algorithms could be useful for comparing differences between samples based on several independent measurements, rather than differences based on any single measurement. It is conceivable that distributions could be completely distinct in multivariate space, but unresolvable in any combination of univariate histograms. Multivariate comparisons could also be useful for providing feedback about instrument stability, when only subtle changes in measurements are occurring. Methods: We apply a variant of Probability Binning, described in the accompanying article, to multidimensional data. In this approach, hyper-rectangles of n dimensions (where n is the number of measurements being compared) comprise the bins used for the chi-squared statistic. These hyper-dimensional bins are constructed such that the control sample has the same number of events in each bin; the bins are then applied to the test samples for chi-squared calculations. Results: Using a Monte-Carlo simulation, we determined the distribution of chi-squared values obtained by comparing sets of events from the same distribution; this distribution of chi-squared values was identical to that for the univariate algorithm. Hence, the same formulae can be used to construct a metric, analogous to a t-score, that estimates the probability with which distributions are distinct. As for univariate comparisons, this metric scales with the difference between two distributions, and can be used to rank samples according to similarity to a control. We apply the algorithm to multivariate immunophenotyping data, and demonstrate that it can be used to discriminate distinct samples and to rank samples according to a biologically meaningful difference.
Conclusion: Probability binning, as shown here, provides a useful metric for determining the probability with which two or more multivariate distributions represent distinct sets of data. The metric can be used to identify the similarity or dissimilarity of samples. Finally, as demonstrated in the accompanying paper, the algorithm can be used to gate on events in one sample that are different from a control sample, even if those events cannot be distinguished on the basis of any combination of univariate or bivariate displays. Cytometry 45:47–55, 2001. Published 2001 Wiley-Liss, Inc.
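The binning idea in the abstract above can be illustrated with a minimal Python sketch. Several details here are assumptions rather than the paper's stated method: the bins are built by recursively splitting the control sample at the median of its highest-variance dimension, and the statistic is a simple two-sample chi-squared-style sum over bin fractions. The function names (`build_bins`, `bin_fractions`, `binned_chi2`) are hypothetical, and the t-score-like metric from the paper is not reproduced.

```python
import numpy as np

def build_bins(control, depth):
    """Recursively split the control events into 2**depth hyper-rectangles.

    Assumption: each split is at the median of the highest-variance
    dimension, so every bin holds (nearly) the same number of control events.
    """
    if depth == 0:
        return None  # leaf: one hyper-rectangular bin
    dim = int(np.argmax(control.var(axis=0)))
    cut = float(np.median(control[:, dim]))
    lower = control[control[:, dim] <= cut]
    upper = control[control[:, dim] > cut]
    return (dim, cut, build_bins(lower, depth - 1), build_bins(upper, depth - 1))

def bin_fractions(tree, events, depth):
    """Fraction of `events` falling into each bin defined by `tree`."""
    labels = np.zeros(len(events), dtype=int)

    def walk(node, mask, label):
        if node is None:
            labels[mask] = label
            return
        dim, cut, lower, upper = node
        below = events[:, dim] <= cut
        walk(lower, mask & below, 2 * label)
        walk(upper, mask & ~below, 2 * label + 1)

    walk(tree, np.ones(len(events), dtype=bool), 0)
    return np.bincount(labels, minlength=2 ** depth) / len(events)

def binned_chi2(control, test, depth=4):
    """Chi-squared-style comparison of bin fractions (assumed form)."""
    tree = build_bins(control, depth)
    c = bin_fractions(tree, control, depth)  # equal by construction
    s = bin_fractions(tree, test, depth)
    return float(np.sum((c - s) ** 2 / (c + s)))
```

Under these assumptions, a test sample drawn from the same distribution as the control should score close to zero, and the score should grow as the test distribution moves away from the control, which is the ranking behaviour the abstract describes.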

187 citations

Journal Article
TL;DR: In this paper, a transformed metric entropy measure of dependence is studied which satisfies many desirable properties, including being a proper measure of distance, and is capable of good performance in identifying dependence even in possibly nonlinear time series.
Abstract: A transformed metric entropy measure of dependence is studied which satisfies many desirable properties, including being a proper measure of distance. It is capable of good performance in identifying dependence even in possibly nonlinear time series, and is applicable for both continuous and discrete variables. A nonparametric kernel density implementation is considered here for many stylized models including linear and nonlinear MA, AR, GARCH, integrated series and chaotic dynamics. A related permutation test of independence is proposed and compared with several alternatives.
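The abstract does not give the formula for its measure, so the sketch below should be read as an illustration only. It assumes a Hellinger-type quantity, half the summed squared difference between the square roots of the joint distribution and the product of the marginals, computed for discrete samples, together with a plain permutation test of independence. The function names and the choice of this particular quantity are assumptions; the kernel density implementation for continuous variables is not attempted.

```python
import numpy as np

def hellinger_dependence(x, y):
    """0.5 * sum (sqrt(p_xy) - sqrt(p_x * p_y))**2 for discrete samples.

    Assumed stand-in for an entropy-type dependence measure: it is 0 when
    the empirical joint equals the product of marginals, and at most 1.
    """
    xs, xi = np.unique(x, return_inverse=True)
    ys, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (xi, yi), 1.0)  # empirical joint counts
    joint /= joint.sum()
    product = np.outer(joint.sum(axis=1), joint.sum(axis=0))
    return 0.5 * float(np.sum((np.sqrt(joint) - np.sqrt(product)) ** 2))

def permutation_pvalue(x, y, n_perm=200, seed=0):
    """Permutation test: shuffle y to break any dependence on x."""
    rng = np.random.default_rng(seed)
    observed = hellinger_dependence(x, y)
    null = [hellinger_dependence(x, rng.permutation(y)) for _ in range(n_perm)]
    exceed = sum(stat >= observed for stat in null)
    return (1 + exceed) / (n_perm + 1)
```

Because the statistic is built from a distance between two distributions, it is symmetric in its arguments and vanishes only when the joint factorizes, which is the kind of property the abstract refers to as being a proper measure of distance.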

187 citations

Journal Article
TL;DR: The computational model BX is used to give domain-theoretic proofs of Banach's fixed point theorem and of two classical results of Hutchinson: on a complete metric space, every hyperbolic iterated function system has a unique non-empty compact attractor, and every iterated function system with probabilities has a unique invariant measure with bounded support.
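The attractor result can be seen numerically in the classical metric-space setting (this is not the paper's domain-theoretic construction). The Python sketch below assumes the two-map Cantor system x/3 and x/3 + 2/3 on the real line, applies the Hutchinson operator repeatedly to a finite set, and records the Hausdorff distance between successive iterates. All names are hypothetical.

```python
def hutchinson(points, maps):
    """Apply every map to every point and take the union of the images."""
    return sorted({f(p) for p in points for f in maps})

def hausdorff(a, b):
    """Hausdorff distance between two finite, non-empty sets of reals."""
    def to_set(p, s):
        return min(abs(p - q) for q in s)
    return max(max(to_set(p, b) for p in a), max(to_set(q, a) for q in b))

# Assumed example system: both maps contract distances by a factor of 1/3.
maps = [lambda x: x / 3.0, lambda x: x / 3.0 + 2.0 / 3.0]

current = [0.0]
distances = []
for _ in range(6):
    nxt = hutchinson(current, maps)
    distances.append(hausdorff(current, nxt))
    current = nxt
```

Each recorded distance is at most one third of the previous one, so the iterates form a Cauchy sequence in the Hausdorff metric; Banach's theorem applied to the Hutchinson operator is what guarantees a unique limit set.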

187 citations

Journal Article
01 Sep 2006
TL;DR: This article proposes line simplification to reduce the size of mobile-device trajectories, analyzes which distance functions keep query errors bounded, and adopts existing linguistic constructs to manage the uncertainty introduced by the trajectory approximation.
Abstract: A common way of storing spatio-temporal information about mobile devices is in the form of a 3D (2D geography + time) trajectory. We argue that when cellular phones and Personal Digital Assistants become location-aware, the size of the spatio-temporal information generated may prohibit efficient processing. We propose to adopt a technique studied in computer graphics, namely line-simplification, as an approximation technique to solve this problem. Line simplification will reduce the size of the trajectories. Line simplification uses a distance function in producing the trajectory approximation. We postulate the desiderata for such a distance-function: it should be sound, namely the error of the answers to spatio-temporal queries must be bounded. We analyze several distance functions, and prove that some are sound in this sense for some types of queries, while others are not. A distance function that is sound for all common spatio-temporal query types is introduced and analyzed. Then we propose an aging mechanism which gradually shrinks the size of the trajectories as time progresses. We also propose to adopt existing linguistic constructs to manage the uncertainty introduced by the trajectory approximation. Finally, we analyze experimentally the effectiveness of line-simplification in reducing the size of a trajectory database.
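As a rough illustration of how a distance function drives line simplification, the Python sketch below applies the well-known Douglas-Peucker recursion to (x, y, t) points. The distance used is an assumption chosen for the example: the gap between a point and the position an object moving uniformly along the simplified segment would occupy at the same time. The abstract does not say which distance functions the paper analyzes, and the names `sync_distance` and `simplify` are hypothetical.

```python
import math

def sync_distance(p, a, b):
    """Distance from p to the time-interpolated position on segment a -> b.

    Points are (x, y, t) tuples. Assumption: uniform motion between a and b.
    """
    (xa, ya, ta), (xb, yb, tb), (xp, yp, tp) = a, b, p
    r = 0.0 if tb == ta else (tp - ta) / (tb - ta)
    return math.hypot(xp - (xa + r * (xb - xa)), yp - (ya + r * (yb - ya)))

def simplify(traj, eps):
    """Douglas-Peucker recursion: keep a point only if dropping it would
    leave some original point more than eps away from the approximation."""
    if len(traj) < 3:
        return list(traj)
    a, b = traj[0], traj[-1]
    worst, dmax = max(
        ((k, sync_distance(traj[k], a, b)) for k in range(1, len(traj) - 1)),
        key=lambda pair: pair[1],
    )
    if dmax <= eps:
        return [a, b]
    left = simplify(traj[: worst + 1], eps)
    right = simplify(traj[worst:], eps)
    return left[:-1] + right  # avoid duplicating the split point
```

With this construction every dropped point lies within eps of the approximation under the chosen distance, which is the sort of bounded error the abstract calls soundness; whether that bound carries over to a given query type depends on the distance function, which is the question the paper studies.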

187 citations

Proceedings Article
04 Nov 2008
TL;DR: This paper proposes a generalization-based approach to trajectory k-anonymity that applies to trajectories and sequences in general, and presents trajectory anonymization techniques for time- and space-sensitive applications.
Abstract: Trajectory datasets are becoming more and more popular due to the massive usage of GPS and other location-based devices and services. In this paper, we address privacy issues regarding the identification of individuals in static trajectory datasets. We provide privacy protection by defining trajectory k-anonymity, meaning every released piece of information refers to at least k users/trajectories. We propose a novel generalization-based approach that applies to trajectories and sequences in general. We also suggest the use of a simple random reconstruction of the original dataset from the anonymization, to overcome possible drawbacks of generalization approaches. We present a utility metric that maximizes the probability of a good representation and propose trajectory anonymization techniques to address time- and space-sensitive applications. The experimental results over synthetic trajectory datasets show the effectiveness of the proposed approach.
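The notion of generalizing trajectories until each released record covers at least k of them can be sketched as follows. This Python example is a simplification under stated assumptions: trajectories are equal-length lists of (x, y) points, groups are formed naively in input order, and each group is released as one sequence of bounding boxes. The paper's own grouping, alignment, utility metric, and random reconstruction step are not reproduced, and the function names are hypothetical.

```python
def generalize_group(trajs):
    """Replace a group of equal-length trajectories with one sequence of
    bounding boxes (min_x, min_y, max_x, max_y), one box per time step."""
    boxes = []
    for points in zip(*trajs):
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes

def anonymize(trajs, k):
    """Release (group size, boxes) pairs where every group has >= k members.

    Assumption: grouping in input order; a real system would group similar
    trajectories together to keep the boxes small.
    """
    if len(trajs) < k:
        raise ValueError("need at least k trajectories")
    groups = [list(trajs[i:i + k]) for i in range(0, len(trajs), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)  # merge the short remainder into its neighbor
    return [(len(g), generalize_group(g)) for g in groups]
```

Each released record then corresponds to at least k original trajectories, at the cost of spatial precision; the size of the boxes is one simple way to think about the utility lost to generalization.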

186 citations


Network Information
Related Topics (5)
Cluster analysis: 146.5K papers, 2.9M citations, 83% related
Optimization problem: 96.4K papers, 2.1M citations, 83% related
Fuzzy logic: 151.2K papers, 2.3M citations, 83% related
Robustness (computer science): 94.7K papers, 1.6M citations, 83% related
Support vector machine: 73.6K papers, 1.7M citations, 82% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2022  53
2021  3,191
2020  3,141
2019  2,843
2018  2,731
2017  2,341