
Showing papers on "Euclidean distance published in 2011"


Journal ArticleDOI
TL;DR: This paper introduces a product quantization-based approach for approximate nearest neighbor search, which decomposes the space into a Cartesian product of low-dimensional subspaces and quantizes each subspace separately.
Abstract: This paper introduces a product quantization-based approach for approximate nearest neighbor search. The idea is to decompose the space into a Cartesian product of low-dimensional subspaces and to quantize each subspace separately. A vector is represented by a short code composed of its subspace quantization indices. The Euclidean distance between two vectors can be efficiently estimated from their codes. An asymmetric version increases precision, as it computes the approximate distance between a vector and a code. Experimental results show that our approach searches for nearest neighbors efficiently, in particular in combination with an inverted file system. Results for SIFT and GIST image descriptors show excellent search accuracy, outperforming three state-of-the-art approaches. The scalability of our approach is validated on a data set of two billion vectors.

2,559 citations
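
The encoding and asymmetric-distance steps lend themselves to a compact sketch. Below is a minimal, illustrative NumPy version with toy parameters (4 subspaces, 256 centroids each); random database sub-vectors stand in for trained k-means centroids, and the paper's inverted-file combination is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, k = 16, 4, 256     # vector dimension, subspaces, centroids per subspace
sub = d // m             # dimensionality of each subspace

X = rng.normal(size=(10000, d))   # database vectors
# One codebook per subspace; real PQ trains these with k-means, here we
# simply sample k database sub-vectors as stand-in centroids.
codebooks = [X[rng.choice(len(X), k, replace=False), i*sub:(i+1)*sub]
             for i in range(m)]

def encode(x):
    """Short code: index of the nearest centroid in each subspace."""
    return np.array([np.argmin(((codebooks[i] - x[i*sub:(i+1)*sub])**2).sum(axis=1))
                     for i in range(m)], dtype=np.uint8)

codes = np.stack([encode(x) for x in X])

def asymmetric_distances(q):
    """Approximate squared Euclidean distances from an uncompressed query q
    to all encoded database vectors, via per-subspace lookup tables."""
    tables = [((codebooks[i] - q[i*sub:(i+1)*sub])**2).sum(axis=1) for i in range(m)]
    return sum(tables[i][codes[:, i]] for i in range(m))

print(np.argsort(asymmetric_distances(rng.normal(size=d)))[:5])  # approx. 5-NN
```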


Journal ArticleDOI
TL;DR: This letter describes algorithms for nonnegative matrix factorization (NMF) with the β-divergence, a family of cost functions parameterized by a single shape parameter β that takes the Euclidean distance, the Kullback-Leibler divergence, and the Itakura-Saito divergence as special cases.
Abstract: This letter describes algorithms for nonnegative matrix factorization (NMF) with the β-divergence (β-NMF). The β-divergence is a family of cost functions parameterized by a single shape parameter β that takes the Euclidean distance, the Kullback-Leibler divergence, and the Itakura-Saito divergence as special cases (β = 2, 1, 0 respectively). The proposed algorithms are based on a surrogate auxiliary function (a local majorization of the criterion function). We first describe a majorization-minimization algorithm that leads to multiplicative updates, which differ from standard heuristic multiplicative updates by a β-dependent power exponent. The monotonicity of the heuristic algorithm can, however, be proven for β ∈ (0, 1) using the proposed auxiliary function. Then we introduce the concept of the majorization-equalization (ME) algorithm, which produces updates that move along constant level sets of the auxiliary function and lead to larger steps than MM. Simulations on synthetic and real data illustrate the faster convergence of the ME approach. The letter also describes how the proposed algorithms can be adapted to two common variants of NMF: penalized NMF (when a penalty function of the factors is added to the criterion function) and convex NMF (when the dictionary is assumed to belong to a known subspace).

846 citations
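
The multiplicative update at the heart of the MM algorithm can be written compactly. A sketch of the update of H (W is handled symmetrically), using the β-dependent exponent γ(β) described in the letter:

```python
import numpy as np

def beta_nmf_update_H(V, W, H, beta, eps=1e-9):
    """One MM multiplicative update of H for the beta-divergence:
    H <- H * (W^T((WH)^(beta-2) * V) / W^T((WH)^(beta-1)))^gamma(beta)."""
    WH = W @ H + eps
    num = W.T @ (WH ** (beta - 2) * V)
    den = W.T @ (WH ** (beta - 1))
    if beta < 1:
        gamma = 1.0 / (2.0 - beta)      # exponent ensuring monotonicity
    elif beta <= 2:
        gamma = 1.0                     # heuristic and MM updates coincide
    else:
        gamma = 1.0 / (beta - 1.0)
    return H * (num / (den + eps)) ** gamma
```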


Journal ArticleDOI
TL;DR: A novel distance measure, called a weighted DTW (WDTW), which is a penalty-based DTW that penalizes points with higher phase difference between a reference point and a testing point in order to prevent minimum distance distortion caused by outliers is proposed.

537 citations
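
A hedged sketch of the penalty idea for univariate series, assuming a modified logistic weight over the phase difference |i - j| (the constants g and w_max below are illustrative defaults, not the paper's tuned values):

```python
import numpy as np

def wdtw(x, y, g=0.05, w_max=1.0):
    """Weighted DTW sketch: the local cost is scaled by a weight that grows
    with the phase difference |i - j|, discouraging far-off alignments."""
    n, m = len(x), len(y)
    span = max(n, m)
    # modified logistic weight, centered at the midpoint of the series
    w = w_max / (1.0 + np.exp(-g * (np.arange(span) - span / 2.0)))
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = w[abs(i - j)] * (x[i - 1] - y[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]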


Journal ArticleDOI
TL;DR: The methodology is comprised of a C-means-based fuzzy clustering and a fuzzy classification performed using a fuzzy membership matrix and the Euclidean distance to the cluster centers, yielding a unitary index score.
Abstract: This paper proposes a computational technique for the classification of electricity consumption profiles. The methodology is comprised of two steps. In the first one, a C-means-based fuzzy clustering is performed in order to find consumers with similar consumption profiles. Afterwards, a fuzzy classification is performed using a fuzzy membership matrix and the Euclidean distance to the cluster centers. Then, the distance measures are normalized and ordered, yielding a unitary index score, where the potential fraudsters or users with irregular patterns of consumption have the highest scores. The approach was tested and validated on a real database, showing good performance in tasks of fraud and measurement defect detection.

238 citations
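
A sketch of the second (classification) step under stated assumptions: given cluster centers from fuzzy C-means, compute the fuzzy membership matrix and a normalized distance-based index (the exact normalization and ordering behind the paper's fraud score is a guess here).

```python
import numpy as np

def irregularity_scores(X, centers, m=2.0):
    """Fuzzy memberships from Euclidean distances to the cluster centers,
    plus a min-max normalized distance index in [0, 1]; higher scores flag
    potentially irregular consumption profiles."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
    # standard fuzzy c-means membership: u[i,k] = 1 / sum_j (d_ik/d_ij)^(2/(m-1))
    u = 1.0 / ((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))).sum(axis=2)
    dmin = d.min(axis=1)                  # distance to the closest center
    score = (dmin - dmin.min()) / (dmin.max() - dmin.min() + 1e-12)
    return u, score
```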


Journal ArticleDOI
Deng-Feng Li1
01 Jun 2011
TL;DR: A closeness coefficient based nonlinear programming method for solving multiattribute decision making problems in which ratings of alternatives on attributes are expressed using interval-valued intuitionistic fuzzy (IVIF) sets and preference information on attributes is incomplete is developed.
Abstract: The aim of this paper is to develop a closeness coefficient based nonlinear programming method for solving multiattribute decision making problems in which ratings of alternatives on attributes are expressed using interval-valued intuitionistic fuzzy (IVIF) sets and preference information on attributes is incomplete. In this methodology, nonlinear programming models are constructed on the concept of the closeness coefficient, which is defined as a ratio of the square of the weighted Euclidean distance between an alternative and the IVIF negative ideal solution (IVIFNIS) to the sum of the squares of the weighted Euclidean distances between the alternative and the IVIF positive ideal solution (IVIFPIS) as well as the IVIFNIS. Simpler nonlinear programming models are deduced to calculate closeness intuitionistic fuzzy sets of alternatives to the IVIFPIS, which are used to estimate the optimal degrees of membership and thereby generate the ranking order of the alternatives. The derived auxiliary nonlinear programming models are shown to be flexible with different information structures and decision environments. The proposed method is validated and compared with other methods, and a real example is examined to demonstrate its applicability.

233 citations
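
The closeness coefficient itself is easy to state. The sketch below uses plain real-valued vectors for readability; in the paper the distances are weighted Euclidean distances between interval-valued intuitionistic fuzzy sets.

```python
import numpy as np

def closeness_coefficient(a, pis, nis, w):
    """Ratio defined in the abstract: squared weighted Euclidean distance to
    the negative ideal solution over the sum of the squared weighted
    distances to both ideal solutions (a, pis, nis, w are NumPy arrays)."""
    d2_pos = np.sum(w * (a - pis) ** 2)   # distance to positive ideal, squared
    d2_neg = np.sum(w * (a - nis) ** 2)   # distance to negative ideal, squared
    return d2_neg / (d2_pos + d2_neg)     # 1 is best, 0 is worst
```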


Journal ArticleDOI
TL;DR: This paper focuses on symmetric NMF (SNMF), which is a special case of NMF decomposition, and proposes another two fast parallel methods: α-SNMF and β-SNMF algorithms, which are applied to probabilistic clustering.
Abstract: Nonnegative matrix factorization (NMF) is an unsupervised learning method useful in various applications including image processing and semantic analysis of documents. This paper focuses on symmetric NMF (SNMF), which is a special case of NMF decomposition. Three parallel multiplicative update algorithms using level 3 basic linear algebra subprograms directly are developed for this problem. First, by minimizing the Euclidean distance, a multiplicative update algorithm is proposed, and its convergence under mild conditions is proved. Based on it, we further propose another two fast parallel methods: α-SNMF and β-SNMF algorithms. All of them are easy to implement. These algorithms are applied to probabilistic clustering. We demonstrate their effectiveness for facial image clustering, document categorization, and pattern clustering in gene expression.

184 citations
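
A hedged sketch of the Euclidean SNMF iteration: the damped multiplicative update below (with damp = 0.5) is a standard form for min_H ||A - HH^T||_F with H >= 0; the paper's α-SNMF and β-SNMF are faster refinements of this kind of step, not shown here.

```python
import numpy as np

def snmf(A, r, iters=200, damp=0.5, eps=1e-9):
    """Symmetric NMF sketch: factor a nonnegative symmetric matrix A as
    A ~ H H^T using a damped multiplicative update."""
    n = A.shape[0]
    H = np.abs(np.random.default_rng(0).normal(size=(n, r)))
    for _ in range(iters):
        AH = A @ H
        HHtH = H @ (H.T @ H)
        # damp = 0.5 is a common choice that preserves nonnegativity
        H *= (1.0 - damp) + damp * AH / (HHtH + eps)
    return H
```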


Journal ArticleDOI
TL;DR: A novel pointwise-adaptive speckle filter based on local homogeneous-region segmentation with pixel-relativity measurement and a novel evaluation metric of edge-preservation degree based on ratio of average is provided for more precise quantitative assessment.
Abstract: This paper provides a novel pointwise-adaptive speckle filter based on local homogeneous-region segmentation with pixel-relativity measurement. A ratio distance is proposed to measure the distance between two speckled-image patches. The theoretical proofs indicate that the ratio distance is valid for multiplicative speckle, whereas the traditional Euclidean distance fails in this case. The probability density function of the ratio distance is deduced to map the distance into a relativity value. This new relativity-measurement method is free of parameter setting and more flexible than Gaussian kernel-projection-based methods. The new measurement method is successfully applied to segment a local shape-adaptive homogeneous region for each pixel, and a simplified strategy for the segmentation implementation is given in this paper. After segmentation, the maximum likelihood rule is introduced to estimate the true signal within every homogeneous region. A novel evaluation metric of edge-preservation degree based on ratio of average is also provided for more precise quantitative assessment. The visual and numerical experimental results show that the proposed filter outperforms the existing state-of-the-art despeckling filters.

157 citations
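
The exact form of the ratio distance is given in the paper; the sketch below is only an illustrative patch distance in the same spirit, assuming that per-pixel ratios (rather than differences) are the right comparison under multiplicative speckle.

```python
import numpy as np

def ratio_patch_distance(p, q, eps=1e-12):
    """Illustrative ratio-based distance between two speckled patches: each
    pixel pair contributes through its ratio, which is insensitive to a
    common multiplicative factor, unlike the Euclidean difference."""
    r = (p + eps) / (q + eps)
    return np.mean(np.log(np.maximum(r, 1.0 / r)))   # symmetric in p and q
```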


Journal ArticleDOI
TL;DR: A new Euclidean distance is developed by using the induced OWA operator; it considers the standard Euclidean distance as a particular case, along with many other possible situations depending on the interests of the decision maker.
Abstract: Research highlights: a new Euclidean distance built with the induced OWA operator; the induced Euclidean ordered weighted averaging distance (IEOWAD) operator; most previous studies that use the Euclidean distance can be revised with this new approach; a new financial decision making method. We develop a new decision making method by using induced aggregation operators in the Euclidean distance. We introduce a new aggregation operator called the induced Euclidean ordered weighted averaging distance (IEOWAD) operator. It is an aggregation operator that parameterizes a wide range of distance measures, such as the maximum distance, the minimum distance, the normalized Euclidean distance (NED) and the weighted Euclidean distance (WED), by using the induced OWA (IOWA) operator. The main advantage of this operator is that it is able to consider complex attitudinal characters of the decision maker by using order-inducing variables in the aggregation of the Euclidean distance. As a result, we get a more general formulation of the Euclidean distance that contains it as a particular case, together with many other possible situations depending on the interests of the decision maker. We study some of its main properties, giving special attention to the analysis of different particular types of IEOWAD operators. We apply this aggregation operator in a business decision making problem regarding the selection of investments.

140 citations
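
The operator reduces to a few lines: individual distances are reordered by the order-inducing variables and then aggregated in quadratic (Euclidean) form. A minimal sketch:

```python
import numpy as np

def ieowad(u, x, y, w):
    """IEOWAD sketch: |x_i - y_i| values are reordered so that the j-th
    position carries the distance paired with the j-th largest inducing
    variable u_i, then aggregated with OWA weights w in Euclidean form."""
    b = np.abs(np.asarray(x, float) - np.asarray(y, float))
    order = np.argsort(u)[::-1]           # decreasing order-inducing variables
    return np.sqrt(np.sum(np.asarray(w, float) * b[order] ** 2))

# With w = (1/n, ..., 1/n) this reduces to the normalized Euclidean distance.
print(ieowad(u=[3, 1, 2], x=[0.7, 0.2, 0.9], y=[0.5, 0.1, 0.4], w=[0.5, 0.3, 0.2]))
```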


Proceedings ArticleDOI
21 Aug 2011
TL;DR: A new and simple method to speed up the widely-used Euclidean realization of LSH by the use of randomized Hadamard transforms in a non-linear setting and shows that using the new LSH in nearest-neighbor applications can improve their running times by significant amounts.
Abstract: Locality-sensitive hashing (LSH) is a basic primitive in several large-scale data processing applications, including nearest-neighbor search, de-duplication, clustering, etc. In this paper we propose a new and simple method to speed up the widely-used Euclidean realization of LSH. At the heart of our method is a fast way to estimate the Euclidean distance between two d-dimensional vectors; this is achieved by the use of randomized Hadamard transforms in a non-linear setting. This decreases the running time of a (k, L)-parameterized LSH from O(dkL) to O(d log d + kL). Our experiments show that using the new LSH in nearest-neighbor applications can improve their running times by significant amounts. To the best of our knowledge, this is the first running time improvement to LSH that is both provable and practical.

135 citations
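
For context, here is a sketch of the standard (k, L)-parameterized Euclidean LSH that the paper accelerates; evaluating all kL projections costs O(dkL) per point, which the randomized Hadamard transform brings down to O(d log d + kL). Parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, L, w = 64, 8, 10, 4.0
A = rng.normal(size=(L, k, d))          # p-stable (Gaussian) projections
B = rng.uniform(0, w, size=(L, k))      # random offsets in [0, w)

def hash_point(x):
    """One integer-tuple bucket key per table, h(x) = floor((a.x + b) / w);
    nearby points collide in some table with high probability."""
    return [tuple(np.floor((A[l] @ x + B[l]) / w).astype(int)) for l in range(L)]
```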


Proceedings Article
28 Jun 2011
TL;DR: This work addresses the problem of metric learning for multi-view data, namely the construction of embedding projections from data in different representations into a shared feature space, such that the Euclidean distance in this space provides a meaningful within-view as well as between-view similarity.
Abstract: We address the problem of metric learning for multi-view data, namely the construction of embedding projections from data in different representations into a shared feature space, such that the Euclidean distance in this space provides a meaningful within-view as well as between-view similarity. Our motivation stems from the problem of cross-media retrieval tasks, where the availability of a joint Euclidean distance function is a prerequisite to allow fast, in particular hashing-based, nearest neighbor queries. We formulate an objective function that expresses the intuitive concept that matching samples are mapped closely together in the output space, whereas non-matching samples are pushed apart, no matter in which view they are available. The resulting optimization problem is not convex, but it can be decomposed explicitly into a convex and a concave part, thereby allowing efficient optimization using the convex-concave procedure. Experiments on an image retrieval task show that nearest-neighbor based cross-view retrieval is indeed possible, and the proposed technique improves the retrieval accuracy over baseline techniques.

125 citations
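
An illustrative sketch of the objective's intuition for two views, assuming linear projections U and V into the shared space (the paper optimizes a related non-convex objective via the convex-concave procedure; this contrastive loss only conveys the pull-together/push-apart idea).

```python
import numpy as np

def multiview_objective(U, V, X, Y, pairs, margin=1.0):
    """X (view 1) and Y (view 2) are embedded by U and V; matching pairs
    should end up close in Euclidean distance, non-matching pairs at least
    `margin` apart. pairs is a list of (i, j, is_match) triples."""
    P, Q = X @ U, Y @ V                   # embedded views, shared space
    loss = 0.0
    for i, j, is_match in pairs:
        dist = np.linalg.norm(P[i] - Q[j])
        loss += dist ** 2 if is_match else max(0.0, margin - dist) ** 2
    return loss
```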


Journal ArticleDOI
TL;DR: This paper proposes a new approach to calculate the spectral direction of change, using the Spectral Angle Mapper and Spectral Correlation Mapper spectral-similarity measures, and shows that the distance and similarity measures are complementary and need to be applied together.
Abstract: The need to monitor the Earth’s surface over a range of spatial and temporal scales is fundamental in ecosystems planning and management. Change-Vector Analysis (CVA) is a bi-temporal method of change detection that considers the magnitude and direction of change vector. However, many multispectral applications do not make use of the direction component. The procedure most used to calculate the direction component using multiband data is the direction cosine, but the number of output direction cosine images is equal to the number of original bands and has a complex interpretation. This paper proposes a new approach to calculate the spectral direction of change, using the Spectral Angle Mapper and Spectral Correlation Mapper spectral-similarity measures. The chief advantage of this approach is that it generates a single image of change information insensitive to illumination variation. In this paper the magnitude component of the spectral similarity was calculated in two ways: as the standard Euclidean distance and as the Mahalanobis distance. In this test the best magnitude measure was the Euclidean distance and the best similarity measure was Spectral Angle Mapper. The results show that the distance and similarity measures are complementary and need to be applied together.
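
A sketch of the two change-vector components for a pair of co-registered multispectral images, assuming arrays of shape (rows, cols, bands): the Euclidean magnitude and a single direction image from the Spectral Angle Mapper (the Spectral Correlation Mapper variant mean-centers each spectrum first).

```python
import numpy as np

def cva_sam(t1, t2):
    """Per-pixel change magnitude (Euclidean distance between the two dates)
    and change direction (spectral angle), each a single-band image."""
    diff = t2 - t1
    magnitude = np.sqrt((diff ** 2).sum(axis=2))        # Euclidean distance
    dot = (t1 * t2).sum(axis=2)
    norms = np.linalg.norm(t1, axis=2) * np.linalg.norm(t2, axis=2) + 1e-12
    sam = np.arccos(np.clip(dot / norms, -1.0, 1.0))    # spectral angle, rad
    return magnitude, sam
```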

Journal ArticleDOI
11 Oct 2011 - Sensors
TL;DR: KDIsomap is used to perform nonlinear dimensionality reduction on the extracted local binary patterns (LBP) facial features, and produce low-dimensional discriminant embedded data representations with striking performance improvement on facial expression recognition tasks.
Abstract: Facial expression recognition is an interesting and challenging subject. Considering the nonlinear manifold structure of facial images, a new kernel-based manifold learning method, called kernel discriminant isometric mapping (KDIsomap), is proposed. KDIsomap aims to nonlinearly extract the discriminant information by maximizing the interclass scatter while minimizing the intraclass scatter in a reproducing kernel Hilbert space. KDIsomap is used to perform nonlinear dimensionality reduction on the extracted local binary patterns (LBP) facial features, and produce low-dimensional discriminant embedded data representations with striking performance improvement on facial expression recognition tasks. The nearest neighbor classifier with the Euclidean metric is used for facial expression classification. Facial expression recognition experiments are performed on two popular facial expression databases, i.e., the JAFFE database and the Cohn-Kanade database. Experimental results indicate that KDIsomap obtains the best accuracy of 81.59% on the JAFFE database, and 94.88% on the Cohn-Kanade database. KDIsomap outperforms the other used methods such as principal component analysis (PCA), linear discriminant analysis (LDA), kernel principal component analysis (KPCA), kernel linear discriminant analysis (KLDA) as well as kernel isometric mapping (KIsomap).

Posted Content
TL;DR: It is concluded that every bounded subset K of $\mathbb{R}^{n}$ embeds into the Hamming cube {-1, 1}^m with a small distortion in the Gromov–Hausdorff metric.
Abstract: Given a subset K of the unit Euclidean sphere, we estimate the minimal number m = m(K) of hyperplanes that generate a uniform tessellation of K, in the sense that the fraction of the hyperplanes separating any pair x, y in K is nearly proportional to the Euclidean distance between x and y. Random hyperplanes prove to be almost ideal for this problem; they achieve the almost optimal bound m = O(w(K)^2) where w(K) is the Gaussian mean width of K. Using the map that sends x in K to the sign vector with respect to the hyperplanes, we conclude that every bounded subset K of R^n embeds into the Hamming cube {-1, 1}^m with a small distortion in the Gromov–Hausdorff metric. Since for many sets K one has m = m(K) << n, this yields a new discrete mechanism of dimension reduction for sets in Euclidean spaces.
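
The sign-vector map is easy to test empirically. For unit vectors and Gaussian hyperplanes through the origin, the separating fraction concentrates around angle(x, y)/π, which is comparable to the distance between x and y on the sphere; the sketch below checks this.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 5000
A = rng.normal(size=(m, n))             # rows = random hyperplane normals

def sign_map(x):
    return np.sign(A @ x)               # embedding into the Hamming cube

x, y = rng.normal(size=n), rng.normal(size=n)
x, y = x / np.linalg.norm(x), y / np.linalg.norm(y)   # points on the sphere
frac = np.mean(sign_map(x) != sign_map(y))            # separating fraction
print(frac, np.arccos(np.clip(x @ y, -1, 1)) / np.pi) # nearly equal values
```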

Journal ArticleDOI
TL;DR: Practical and efficient algorithms to construct iso-contours, bisectors, and Voronoi diagrams of point sites on M, based on an exact geodesic metric, are presented.
Abstract: In the research of computer vision and machine perception, 3D objects are usually represented by 2-manifold triangular meshes M. In this paper, we present practical and efficient algorithms to construct iso-contours, bisectors, and Voronoi diagrams of point sites on M, based on an exact geodesic metric. Compared to Euclidean metric spaces, the Voronoi diagrams on M exhibit many special properties that cause all of the existing Euclidean Voronoi algorithms to fail. To provide practical algorithms for constructing geodesic-metric-based Voronoi diagrams on M, this paper studies the analytic structure of iso-contours, bisectors, and Voronoi diagrams on M. After a necessary preprocessing of model M, practical algorithms are proposed for quickly obtaining full information about iso-contours, bisectors, and Voronoi diagrams on M. The complexity of the construction algorithms is also analyzed. Finally, three interesting applications (surface sampling and reconstruction, 3D skeleton extraction, and point pattern analysis) are presented that show the potential power of the proposed algorithms in pattern analysis.

Journal ArticleDOI
TL;DR: In this paper, a new adaptive Mahalanobis distance, which takes into account the local structure of dependence of the variables, is proposed to evaluate the distance of an observation to its nearest neighbors in the learning sample constituted of observations under control.
Abstract: In recent years, fault detection has become a crucial issue in semiconductor manufacturing. Indeed, it is necessary to constantly improve equipment productivity. Rapid detection of abnormal behavior is one of the primary objectives. Statistical methods such as control charts are the most widely used approaches for fault detection. Due to the number of variables and the possible correlations between them, these control charts need to be multivariate. Among them, the most popular is probably the Hotelling T2 rule. However, this rule only makes sense when the variables are Gaussian, which is rarely true in practice. A possible alternative is to use nonparametric control charts, such as the k-nearest neighbor detection rule proposed by He and Wang in 2007, which is constructed only from the learning sample and makes no assumption on the distribution of the variables. This approach consists in evaluating the distance of an observation to its nearest neighbors in the learning sample constituted of observations under control. A fault is declared if this distance is too large. In this paper, a new adaptive Mahalanobis distance, which takes into account the local structure of dependence of the variables, is proposed. Simulation trials are performed to study the benefit of the new distance against the Euclidean distance. The method is applied to the photolithography step of integrated circuit manufacturing.
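
A sketch of the detection statistic with a pluggable metric: passing no covariance gives the Euclidean k-NN rule, while passing a (locally estimated) covariance gives a Mahalanobis-type adaptive distance in the spirit of the paper.

```python
import numpy as np

def knn_detection_stat(x, X_ref, k=5, metric_cov=None):
    """Mean squared distance from observation x to its k nearest neighbors
    in the in-control learning sample X_ref; a fault is declared when this
    statistic exceeds an empirical threshold (e.g., a reference quantile)."""
    diff = X_ref - x
    if metric_cov is None:
        d2 = (diff ** 2).sum(axis=1)                      # Euclidean rule
    else:
        d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(metric_cov), diff)
    return np.sort(d2)[:k].mean()
```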

Journal ArticleDOI
TL;DR: In this paper, the principles of one-class classifiers are introduced, together with the distinctions between soft/hard, conjoint/disjoint and modelling/discriminatory methods.
Abstract: The principles of one-class classifiers are introduced, together with the distinctions between one-class/multiclass, soft/hard, conjoint/disjoint and modelling/discriminatory methods. The methods are illustrated using case studies, namely from nuclear magnetic resonance metabolomic profiling, thermal analysis of polymers, and simulations. Two main groups of classifier are described, namely statistically based distance metrics from centroids (Euclidean distance and quadratic discriminant analysis) and support vector domain description (SVDD). The statistical basis of the D statistic and its relationship with the F statistic, χ2, the normal distribution and T2 is discussed. The SVDD D value is described. Methods for assessing the distance of residuals to disjoint principal component models (Q statistic) and their combination with distance-based methods to give the G statistic are outlined. Copyright © 2011 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: A global annual cascading failure effect (GACFE) metric as well as a GACFE-based cost improvement metric are introduced to contribute to coupled utility system design or retrofit, given that current guidelines or recommended practices in the utility industry mostly rely on minimum Euclidean distances and are yet to include interdependent effects in their provisions.

Journal ArticleDOI
TL;DR: An automatic algorithm that applies the traditional watershed transformation to obtain the lesion boundary in the belt between internal and external markers; the results confirm its potential to allow reliable segmentation and quantification of breast lesions in mammograms.
Abstract: Lesion segmentation, which is a critical step in computer-aided diagnosis system, is a challenging task as lesion boundaries are usually obscured, irregular, and low contrast. In this paper, an accurate and robust algorithm for the automatic segmentation of breast lesions in mammograms is proposed. The traditional watershed transformation is applied to the smoothed (by the morphological reconstruction) morphological gradient image to obtain the lesion boundary in the belt between the internal and external markers. To automatically determine the internal and external markers, the rough region of the lesion is identified by a template matching and a thresholding method. Then, the internal marker is determined by performing a distance transform and the external marker by morphological dilation. The proposed algorithm is quantitatively compared to the dynamic programming boundary tracing method and the plane fitting and dynamic programming method on a set of 363 lesions (size range, 5–42 mm in diameter; mean, 15 mm), using the area overlap metric (AOM), Hausdorff distance (HD), and average minimum Euclidean distance (AMED). The mean ± SD of the values of AOM, HD, and AMED for our method were respectively 0.72 ± 0.13, 5.69 ± 2.85 mm, and 1.76 ± 1.04 mm, a better performance than the two comparison methods. The results also confirm the potential of the proposed algorithm to allow reliable segmentation and quantification of breast lesions in mammograms.
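
A hedged sketch of the marker-controlled step, assuming SciPy/scikit-image and a rough lesion mask already found by template matching and thresholding; the paper uses the morphological gradient of the reconstruction-smoothed image, for which a Sobel gradient stands in here, and the threshold and dilation size are illustrative.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import sobel
from skimage.segmentation import watershed

def segment_lesion(image, rough_mask, dilate_iters=20):
    """Internal marker from a distance transform of the rough mask, external
    marker from morphological dilation, then watershed on a gradient image;
    the lesion boundary settles in the belt between the two markers."""
    dist = ndi.distance_transform_edt(rough_mask)
    markers = np.zeros(image.shape, dtype=int)
    markers[dist > 0.5 * dist.max()] = 2                         # internal
    markers[~ndi.binary_dilation(rough_mask,
                                 iterations=dilate_iters)] = 1   # external
    labels = watershed(sobel(image), markers)
    return labels == 2                                           # lesion mask
```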

Journal ArticleDOI
TL;DR: In this article, a GWR model is established to explore spatially varying relationships between house price and floor area with sampled house prices in London, and the output using network distance with a fixed kernel makes a significant improvement.
Abstract: Geographically Weighted Regression (GWR) is a local modelling technique to estimate regression models with spatially varying relationships. Generally, the Euclidean distance is the default metric for calibrating a GWR model in previous research and applications; however, it may not always be the most reasonable choice when the study area is partitioned by natural or man-made features. Thus, we attempt to use a non-Euclidean distance metric in GWR. In this study, a GWR model is established to explore spatially varying relationships between house price and floor area with sampled house prices in London. To calibrate this GWR model, network distance is adopted. Compared with the other results from calibrations with Euclidean distance or adaptive kernels, the output using network distance with a fixed kernel makes a significant improvement, and the river Thames has a clear cut-off effect on the parameter estimations.
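
Swapping metrics only changes the weight computation. A minimal GWR calibration sketch with a fixed Gaussian kernel and a pluggable dist_fn (Euclidean by default in most software; a network-distance function can be passed instead):

```python
import numpy as np

def gwr_coefficients(y, X, coords, bandwidth, dist_fn):
    """Local weighted-least-squares fit at each observation point; dist_fn
    maps two coordinate records to a scalar distance (Euclidean, network, ...)."""
    n = len(y)
    betas = np.empty((n, X.shape[1]))
    for i in range(n):
        d = np.array([dist_fn(coords[i], coords[j]) for j in range(n)])
        w = np.exp(-0.5 * (d / bandwidth) ** 2)        # fixed Gaussian kernel
        XtW = X.T * w                                  # weight each observation
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)   # local coefficients
    return betas

euclidean = lambda a, b: np.linalg.norm(np.asarray(a) - np.asarray(b))
```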

Journal ArticleDOI
TL;DR: This work implements the proposed methods to segment the breast medical images into different regions, each corresponding to a different tissue, based on the signal enhancement-time information.
Abstract: This paper presents an automatic and effective fuzzy c-means segmentation method for segmenting breast cancer MRI, based on the standard fuzzy c-means algorithm. To obtain an effective segmentation method, the paper introduces a novel objective function that replaces the original Euclidean distance on the feature space with a new hyper tangent function. This hyper tangent function is derived from an existing one so as to perform effectively on large amounts of data from noisier medical images and to yield strong clusters. From the proposed objective function, an effective method to construct the membership matrix for objects and a robust method for updating centers are derived. Experiments are first carried out on an artificially generated data set to show how effectively the new fuzzy c-means obtains clusters; the proposed method is then used to segment breast medical images into different regions, each corresponding to a different tissue, based on the signal enhancement-time information. The results are compared with those of the standard fuzzy c-means algorithm, and the correct classification rate of the proposed segmentation method is obtained using the silhouette method.

Journal ArticleDOI
TL;DR: This paper presents an alternative procedure, which can be interpreted as a refinement of Rippa’s algorithm for a cost function based on the Euclidean norm, and points out how this method is related to the procedure of maximum likelihood estimation, which is used for identifying covariance parameters of stochastic processes in spatial statistics.
Abstract: The impact of the scaling parameter c on the accuracy of interpolation schemes using radial basis functions (RBFs) has been pointed out by several authors. Rippa (Adv Comput Math 11:193–210, 1999) proposes an algorithm based on the idea of cross validation for selecting a good such parameter value. In this paper we present an alternative procedure, that can be interpreted as a refinement of Rippa's algorithm for a cost function based on the Euclidean norm. We point out how this method is related to the procedure of maximum likelihood estimation, which is used for identifying covariance parameters of stochastic processes in spatial statistics. Using the same test functions as Rippa we show that our algorithm compares favorably with cross validation in many cases and discuss its limitations. Finally we present some computational aspects of our algorithm.
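
Rippa's closed-form leave-one-out errors make this kind of cost function cheap to evaluate. A sketch for a Gaussian RBF kernel (the choice of basis is illustrative), using e_k = α_k / (A⁻¹)_kk with α = A⁻¹f, so that no n separate interpolation problems have to be solved:

```python
import numpy as np

def loocv_cost(points, f, c):
    """Euclidean norm of the leave-one-out residuals of a Gaussian RBF
    interpolant with scale c; minimize over c to pick the scaling parameter."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    A = np.exp(-d2 / c ** 2)               # Gaussian RBF kernel matrix
    Ainv = np.linalg.inv(A)
    errors = (Ainv @ f) / np.diag(Ainv)    # closed-form LOO residuals
    return np.linalg.norm(errors)
```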

Journal ArticleDOI
TL;DR: This paper proposes three such distance measures based on the audio content: first, a low-level measure based on tempo-related description; second, a high-level semantic measure based on the inference of different musical dimensions by support vector machines; and third, a hybrid measure which combines the above-mentioned distance measures.
Abstract: Measuring music similarity is essential for multimedia retrieval. For music items, this task can be regarded as obtaining a suitable distance measurement between songs defined on a certain feature space. In this paper, we propose three such distance measures based on the audio content: first, a low-level measure based on tempo-related description; second, a high-level semantic measure based on the inference of different musical dimensions by support vector machines. These dimensions include genre, culture, moods, instruments, rhythm, and tempo annotations. Third, a hybrid measure which combines the above-mentioned distance measures with two existing low-level measures: a Euclidean distance based on principal component analysis of timbral, temporal, and tonal descriptors, and a timbral distance based on single Gaussian Mel-frequency cepstral coefficient (MFCC) modeling. We evaluate our proposed measures against a number of baseline measures. We do this objectively based on a comprehensive set of music collections, and subjectively based on listeners' ratings. Results show that the proposed methods achieve accuracies comparable to the baseline approaches in the case of the tempo and classifier-based measures. The highest accuracies are obtained by the hybrid distance. Furthermore, the proposed classifier-based approach opens up the possibility to explore distance measures that are based on semantic notions.

Posted Content
TL;DR: BoostMetric as mentioned in this paper uses rank-one positive semidefinite matrices as weak learners within an efficient and scalable boosting-based learning process to learn a valid Mahalanobis distance metric.
Abstract: The success of many machine learning and pattern recognition methods relies heavily upon the identification of an appropriate distance metric on the input data. It is often beneficial to learn such a metric from the input training data, instead of using a default one such as the Euclidean distance. In this work, we propose a boosting-based technique, termed BoostMetric, for learning a quadratic Mahalanobis distance metric. Learning a valid Mahalanobis distance metric requires enforcing the constraint that the matrix parameter to the metric remains positive definite. Semidefinite programming is often used to enforce this constraint, but it does not scale well and is not easy to implement. BoostMetric is instead based on the observation that any positive semidefinite matrix can be decomposed into a linear combination of trace-one rank-one matrices. BoostMetric thus uses rank-one positive semidefinite matrices as weak learners within an efficient and scalable boosting-based learning process. The resulting methods are easy to implement, efficient, and can accommodate various types of constraints. We extend traditional boosting algorithms in that the weak learner is a positive semidefinite matrix with trace and rank being one rather than a classifier or regressor. Experiments on various datasets demonstrate that the proposed algorithms compare favorably to those state-of-the-art methods in terms of classification accuracy and running time.
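
The decomposition observation is the whole trick: with M = Σᵢ wᵢ uᵢuᵢᵀ (wᵢ ≥ 0, each uᵢ a unit vector), the squared Mahalanobis distance becomes a weighted sum of one-dimensional squared projections, so positive semidefiniteness never has to be checked explicitly. A minimal sketch:

```python
import numpy as np

def mahalanobis_sq(x, y, weights, units):
    """Squared Mahalanobis distance (x-y)^T M (x-y) for
    M = sum_i weights[i] * units[i] units[i]^T; units has shape
    (num_learners, d) with unit-norm rows, weights is nonnegative."""
    proj = units @ (x - y)                # u_i . (x - y), one per weak learner
    return np.sum(weights * proj ** 2)    # nonnegative by construction
```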

Journal ArticleDOI
Duhu Man1, Kenji Uda1, Hironobu Ueyama1, Yasuaki Ito1, Koji Nakano1 
TL;DR: A simple parallel algorithm for the EDM is developed and implemented and it achieves a speedup factor of 18 over the performance of a sequential algorithm using a single processor in the same system.
Abstract: Given a 2-D binary image of size n×n, the Euclidean Distance Map (EDM) is a 2-D array of the same size such that each element is storing the Euclidean distance to the nearest black pixel. It is known that a sequential algorithm can compute the EDM in O(n^2) and thus this algorithm is optimal. Also, work-time optimal parallel algorithms for the shared memory model have been presented. However, the presented parallel algorithms are too complicated to implement in existing shared memory parallel machines. The main contribution of this paper is to develop a simple parallel algorithm for the EDM and implement it in two different parallel platforms: multicore processors and Graphics Processing Units (GPUs). We have implemented our parallel algorithm in a Linux server with four Intel hexa-core processors (Intel Xeon X7460 2.66GHz). We have also implemented it in the following two modern GPU systems, Tesla C1060 and GTX 480, respectively. The experimental results have shown that, for an input binary image with size of 9216×9216, our implementation in the multicore system achieves a speedup factor of 18 over the performance of a sequential algorithm using a single processor in the same system. Meanwhile, for the same input binary image, our implementation on the GPU achieves a speedup factor of 26 over the sequential algorithm implementation.
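
For reference, a brute-force NumPy version that defines the output the parallel algorithm computes; it costs O(n² · b) time and memory for b black pixels and is only suitable for small images.

```python
import numpy as np

def edm_bruteforce(img):
    """Euclidean Distance Map: for each pixel of a binary image, the
    Euclidean distance to the nearest black pixel (img == 0)."""
    ys, xs = np.nonzero(img == 0)                      # black pixel coords
    black = np.stack([ys, xs], axis=1).astype(float)
    gy, gx = np.indices(img.shape)
    grid = np.stack([gy.ravel(), gx.ravel()], axis=1).astype(float)
    d2 = ((grid[:, None, :] - black[None, :, :]) ** 2).sum(-1)
    return np.sqrt(d2.min(axis=1)).reshape(img.shape)  # nearest-black distance
```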

Journal ArticleDOI
TL;DR: This paper finds the nearest trapezoidal approximation and the nearest symmetric trapezoid approximation to a given fuzzy number, with respect to the average Euclidean distance, preserving the value and ambiguity.
Abstract: Value and ambiguity are two parameters which were introduced to represent fuzzy numbers. In this paper, we find the nearest trapezoidal approximation and the nearest symmetric trapezoidal approximation to a given fuzzy number, with respect to the average Euclidean distance, preserving the value and ambiguity. To avoid the laborious calculus associated with the Karush-Kuhn-Tucker theorem, the working tool in some recent papers, a less sophisticated method is proposed. Algorithms for computing the approximations, many examples, proofs of continuity and two applications to ranking of fuzzy numbers and estimations of the defect of additivity for approximations are given.
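
With the usual r-weighted level-set integrals, the value and ambiguity of a trapezoidal fuzzy number have simple closed forms, which is what the approximations must preserve; the sketch below assumes the standard definitions Val(A) = ∫₀¹ r[L(r)+U(r)]dr and Amb(A) = ∫₀¹ r[U(r)−L(r)]dr.

```python
def value_ambiguity(a, b, c, d):
    """Value and ambiguity of the trapezoidal fuzzy number with support
    [a, d] and core [b, c], i.e. L(r) = a + (b-a)r and U(r) = d - (d-c)r."""
    value = a / 6 + b / 3 + c / 3 + d / 6        # integral of r*(L(r)+U(r))
    ambiguity = d / 6 + c / 3 - a / 6 - b / 3    # integral of r*(U(r)-L(r))
    return value, ambiguity
```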

Journal ArticleDOI
TL;DR: The local binary pattern (LBP) approach, a very effective feature descriptor, is used for feature extraction; for classification, Euclidean distance and Changed Manhattan distance methods are used.
Abstract: Down syndrome is associated with distinctive facial characteristics, so it can be recognized using facial features. This is a very challenging problem, however, given the similarity between the faces of people with and without Down syndrome. We therefore used the local binary pattern (LBP) approach, a very effective feature descriptor, for feature extraction. For classification, Euclidean distance and Changed Manhattan distance methods are used. In this way, we developed an efficient system to recognize Down syndrome.
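
A minimal 3×3 LBP sketch in NumPy; feature vectors are typically histograms of these codes over image blocks, compared with the Euclidean or Manhattan distance as in the paper.

```python
import numpy as np

def lbp_image(gray):
    """Basic 3x3 LBP: each interior pixel is encoded by thresholding its 8
    neighbors against the center value, yielding a code in [0, 255]."""
    c = gray[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]     # clockwise neighbors
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit   # one bit per neighbor
    return code

def lbp_features(gray):
    """Histogram of LBP codes, usable with Euclidean/Manhattan distance."""
    return np.bincount(lbp_image(gray).ravel(), minlength=256).astype(float)
```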

Journal ArticleDOI
TL;DR: This paper proposes an NC system, so-called Phoenix, which is based on the matrix factorization model, and shows that Phoenix achieves a scalable yet accurate end-to-end distances monitoring and is able to characterize TIV better than other existing NC systems.
Abstract: Network coordinate (NC) systems provide a lightweight and scalable way for predicting the distances, i.e., round-trip latencies among Internet hosts. Most existing NC systems embed hosts into a low dimensional Euclidean space. Unfortunately, the persistent occurrence of Triangle Inequality Violation (TIV) on the Internet largely limits the distance prediction accuracy of those NC systems. Some alternative systems aim at handling the persistent TIV, however, they only achieve comparable prediction accuracy with Euclidean distance based NC systems. In this paper, we propose an NC system, so-called Phoenix, which is based on the matrix factorization model. Phoenix introduces a weight to each reference NC and trusts the NCs with higher weight values more than the others. The weight-based mechanism can substantially reduce the impact of the error propagation. Using the representative aggregate data sets and the newly measured dynamic data set collected from the Internet, our simulations show that Phoenix achieves significantly higher prediction accuracy than other NC systems. We also show that Phoenix quickly converges to steady state, performs well under host churn, handles the drift of the NCs successfully by using regularization, and is robust against measurement anomalies. Phoenix achieves a scalable yet accurate end-to-end distances monitoring. In addition, we study how well an NC system can characterize the TIV property on the Internet by introducing two new quantitative metrics, so-called RERPL and AERPL. We show that Phoenix is able to characterize TIV better than other existing NC systems.
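
The core of the matrix-factorization model is that a host-to-host distance is predicted as a dot product, which, unlike a Euclidean embedding, is not bound by the triangle inequality and need not be symmetric; the sketch below shows only the prediction model, omitting Phoenix's weight mechanism and the decentralized coordinate updates.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hosts, dim = 100, 8
X = rng.uniform(size=(n_hosts, dim))   # outgoing coordinate of each host
Y = rng.uniform(size=(n_hosts, dim))   # incoming coordinate of each host

def predicted_rtt(i, j):
    """Predicted distance D(i, j) ~ X_i . Y_j; TIV-prone measurements can be
    fitted because the dot product imposes no triangle inequality."""
    return X[i] @ Y[j]
```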

Proceedings ArticleDOI
13 Jun 2011
TL;DR: In a general metric space the tail bounds of the distribution of the MST length cannot be approximated to any multiplicative factor in polynomial time under the assumption that P ≠ NP.
Abstract: We study the complexity of geometric minimum spanning trees under a stochastic model of input: Suppose we are given a master set of points s_1, s_2, ..., s_n in d-dimensional Euclidean space, where each point s_i is active with some independent and arbitrary but known probability p_i. We want to compute the expected length of the minimum spanning tree (MST) of the active points. This particular form of stochastic problem is motivated by the uncertainty inherent in many sources of geometric data but, to the best of our knowledge, has not been investigated before in computational geometry. Our main results include the following. We show that the stochastic MST problem is #P-hard for any dimension d ≥ 2. We present a simple fully polynomial randomized approximation scheme (FPRAS) for a metric space, and thus also for any Euclidean space. For d = 2, we present two deterministic approximation algorithms: an O(n^4)-time constant-factor algorithm, and a PTAS based on a combination of shifted quadtrees and dynamic programming. We show that in a general metric space the tail bounds of the distribution of the MST length cannot be approximated to any multiplicative factor in polynomial time under the assumption that P ≠ NP. In addition to this existential model of stochastic input, we also briefly consider a locational model where each point is present with certainty but its location is probabilistic.
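
A naive Monte Carlo estimator makes the existential model concrete (SciPy supplies the MST); the paper's FPRAS is a refined, provably accurate version of exactly this sampling idea.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def expected_mst_length(points, p, samples=1000,
                        rng=np.random.default_rng(0)):
    """Monte Carlo estimate of E[MST length]: point i is active
    independently with probability p[i]; inactive draws contribute 0."""
    total = 0.0
    for _ in range(samples):
        active = points[rng.random(len(points)) < p]   # sample active set
        if len(active) >= 2:
            W = squareform(pdist(active))              # Euclidean distances
            total += minimum_spanning_tree(W).sum()    # MST edge-length sum
    return total / samples
```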

Journal ArticleDOI
TL;DR: It was found that the ROCED parameter achieves a better balance between sensitivity and specificity for both the training and prediction sets than other indices such as the Matthews correlation coefficient, Wilks' lambda, or parameters like the area under the ROC curve.
Abstract: There are several indices that indicate different aspects of the performance of QSAR classification models, with the area under a Receiver Operating Characteristic (ROC) curve still being the most powerful test to assess such performance overall. All ROC-related parameters can be calculated for both the training and test sets, but none of them constitutes an absolute indicator of the classification performance by itself. Moreover, one of the biggest drawbacks is the computing time needed to obtain the area under the ROC curve, which naturally slows down any calculation algorithm. The present study proposes two new parameters based on distances in a ROC curve for the selection of classification models with an appropriate balance in both training and test sets, namely the following: the ROC graph Euclidean distance (ROCED) and the ROC graph Euclidean distance corrected with the Fitness Function (FIT(λ)) (ROCFIT). The behavior of these indices was observed through the study on the ...

Journal ArticleDOI
TL;DR: In this paper, generalized Majorana spinors for arbitrary dimensions and signature of the metric were defined for d = 2, 3, 4, 8, 9 mod 8, independently of the signature.