
Showing papers on "Euclidean distance published in 2017"


Proceedings ArticleDOI
16 Mar 2017
TL;DR: SVDNet as mentioned in this paper proposes to optimize the deep representation learning process with Singular Vector Decomposition (SVD) to reduce the correlation among the projection vectors, which produces more discriminative FC descriptors, and significantly improves the reID accuracy.
Abstract: This paper proposes the SVDNet for retrieval problems, with a focus on the application of person re-identification (re-ID). We view each weight vector within a fully connected (FC) layer in a convolutional neural network (CNN) as a projection basis. It is observed that the weight vectors are usually highly correlated. This problem leads to correlations among entries of the FC descriptor, and compromises the retrieval performance based on the Euclidean distance. To address the problem, this paper proposes to optimize the deep representation learning process with Singular Vector Decomposition (SVD). Specifically, with the restraint and relaxation iteration (RRI) training scheme, we are able to iteratively integrate the orthogonality constraint in CNN training, yielding the so-called SVDNet. We conduct experiments on the Market-1501, CUHK03, and DukeMTMC-reID datasets, and show that RRI effectively reduces the correlation among the projection vectors, produces more discriminative FC descriptors, and significantly improves the re-ID accuracy. On the Market-1501 dataset, for instance, rank-1 accuracy is improved from 55.3% to 80.5% for CaffeNet, and from 73.8% to 82.3% for ResNet-50.
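The orthogonalization at the heart of the restraint step can be illustrated in a few lines. The following is a minimal NumPy sketch (our illustration, not the authors' code, and the matrix shape is a toy assumption): the FC weight matrix W is decomposed as W = U S Vᵀ and replaced by U S, whose columns are mutually orthogonal projection vectors spanning the same subspace.

```python
import numpy as np

def restraint_step(W):
    """Orthogonalize an FC weight matrix: decompose W = U S V^T and return U S,
    whose columns are mutually orthogonal projection vectors spanning the same
    subspace. Illustrative sketch only, not the authors' implementation."""
    U, S, _ = np.linalg.svd(W, full_matrices=False)
    return U * S  # scale each orthonormal column of U by its singular value

W = np.random.randn(2048, 512)        # toy FC weight matrix, one column per projection vector
W_ortho = restraint_step(W)
gram = W_ortho.T @ W_ortho            # off-diagonal correlations vanish
print(np.max(np.abs(gram - np.diag(np.diag(gram)))))   # ~0
```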

625 citations


Posted Content
TL;DR: This paper proposes the SVDNet for retrieval problems, with focus on the application of person re-identification (reID), and shows that RRI effectively reduces the correlation among the projection vectors, produces more discriminative FC descriptors, and significantly improves the re-ID accuracy.
Abstract: This paper proposes the SVDNet for retrieval problems, with a focus on the application of person re-identification (re-ID). We view each weight vector within a fully connected (FC) layer in a convolutional neural network (CNN) as a projection basis. It is observed that the weight vectors are usually highly correlated. This problem leads to correlations among entries of the FC descriptor, and compromises the retrieval performance based on the Euclidean distance. To address the problem, this paper proposes to optimize the deep representation learning process with Singular Vector Decomposition (SVD). Specifically, with the restraint and relaxation iteration (RRI) training scheme, we are able to iteratively integrate the orthogonality constraint in CNN training, yielding the so-called SVDNet. We conduct experiments on the Market-1501, CUHK03, and Duke datasets, and show that RRI effectively reduces the correlation among the projection vectors, produces more discriminative FC descriptors, and significantly improves the re-ID accuracy. On the Market-1501 dataset, for instance, rank-1 accuracy is improved from 55.3% to 80.5% for CaffeNet, and from 73.8% to 82.3% for ResNet-50.

544 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: In this paper, a 2D-to-3D distance matrix regression model is proposed for 3D human pose estimation from a single image, where the 2D position of the N body joints is first detected using a CNN-based detector, and then these observations are used to infer 3D pose.
Abstract: This paper addresses the problem of 3D human pose estimation from a single image. We follow a standard two-step pipeline by first detecting the 2D position of the N body joints, and then using these observations to infer 3D pose. For the first step, we use a recent CNN-based detector. For the second step, most existing approaches perform 2N-to-3N regression of the Cartesian joint coordinates. We show that more precise pose estimates can be obtained by representing both the 2D and 3D human poses using NxN distance matrices, and formulating the problem as a 2D-to-3D distance matrix regression. For learning such a regressor we leverage simple Neural Network architectures, which by construction enforce positivity and symmetry of the predicted matrices. The approach also has the advantage of naturally handling missing observations and allows hypothesizing the position of non-observed joints. Quantitative results on the Humaneva and Human3.6M datasets demonstrate consistent performance gains over the state of the art. Qualitative evaluation on in-the-wild images from the LSP dataset, using the regressor learned on Human3.6M, reveals very promising generalization results.
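The distance-matrix representation itself is easy to construct. A minimal NumPy sketch (hypothetical, not the authors' code; N = 14 is just a toy choice) builds the NxN Euclidean distance matrix that such a 2D-to-3D regressor consumes and predicts:

```python
import numpy as np

def distance_matrix(joints):
    """joints: (N, D) array of joint coordinates (D = 2 or 3).
    Returns the N x N matrix of pairwise Euclidean distances."""
    diff = joints[:, None, :] - joints[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

pose_2d = np.random.rand(14, 2)    # toy 2D detections for N = 14 joints
D2 = distance_matrix(pose_2d)      # symmetric, non-negative, zero diagonal
# a 2D-to-3D regressor maps D2 to the corresponding 3D distance matrix
```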

402 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: This paper proposes a new metric learning scheme, based on structured prediction, that is aware of the global structure of the embedding space, and which is designed to optimize a clustering quality metric (NMI).
Abstract: Learning image similarity metrics in an end-to-end fashion with deep networks has demonstrated excellent results on tasks such as clustering and retrieval. However, current methods all focus on a very local view of the data. In this paper, we propose a new metric learning scheme, based on structured prediction, that is aware of the global structure of the embedding space, and which is designed to optimize a clustering quality metric (NMI). We show state-of-the-art performance on standard datasets, such as CUB200-2011 [37], Cars196 [18], and Stanford online products [30], on the NMI and R@K evaluation metrics.
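Since the method optimizes a clustering quality metric, NMI is also the natural way to evaluate the learned embedding: cluster the embedded points and compare the assignments with ground-truth labels. A minimal evaluation sketch, assuming scikit-learn and using random placeholders for the embeddings and labels:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score

embeddings = np.random.randn(1000, 64)            # stand-in for learned embeddings
labels = np.random.randint(0, 100, size=1000)     # stand-in for ground-truth class ids

pred = KMeans(n_clusters=100, n_init=10).fit_predict(embeddings)
print("NMI:", normalized_mutual_info_score(labels, pred))
```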

278 citations


Journal ArticleDOI
01 Apr 2017
TL;DR: In this paper, an expository paper on the theory of gradient flows, and in particular of those PDEs which can be interpreted as gradient flows for the Wasserstein metric on the space of probability measures, is presented.
Abstract: This is an expository paper on the theory of gradient flows, and in particular of those PDEs which can be interpreted as gradient flows for the Wasserstein metric on the space of probability measures (a distance induced by optimal transport). The starting point is the Euclidean theory, and then its generalization to metric spaces, according to the work of Ambrosio, Gigli and Savare. Then comes an independent exposition of the Wasserstein theory, with a short introduction to the optimal transport tools that are needed and to the notion of geodesic convexity, followed by a precise description of the Jordan-Kinderlehrer-Otto scheme, with a proof of convergence in the easiest case: the linear Fokker-Planck equation. A discussion of other gradient-flow PDEs and of numerical methods based on these ideas is also provided. The paper ends with a new theoretical development, due to Ambrosio, Gigli, Savare, Kuwada and Ohta: the study of the heat flow in metric measure spaces.
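For reference, the Jordan-Kinderlehrer-Otto (JKO) minimizing-movement scheme discussed in the paper advances a probability measure rho by iterating, with time step tau:

```latex
\rho_{k+1} \in \operatorname*{arg\,min}_{\rho}
  \left\{ F(\rho) + \frac{1}{2\tau}\, W_2^2(\rho, \rho_k) \right\},
\qquad
F(\rho) = \int \rho \log \rho \, dx + \int V \, d\rho ,
```

where W_2 is the quadratic Wasserstein distance and F the driving energy; with this choice of F the scheme converges, as tau tends to 0, to the linear Fokker-Planck equation, the case whose convergence proof the paper presents in detail.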

213 citations


Proceedings ArticleDOI
05 Mar 2017
TL;DR: TristouNet as mentioned in this paper is a neural network architecture based on Long Short-Term Memory recurrent networks, meant to project speech sequences into a fixed-dimensional Euclidean space, thanks to the triplet loss paradigm used for training.
Abstract: TristouNet is a neural network architecture based on Long Short-Term Memory recurrent networks, meant to project speech sequences into a fixed-dimensional Euclidean space. Thanks to the triplet loss paradigm used for training, the resulting sequence embeddings can be compared directly with the Euclidean distance, for speaker comparison purposes. Experiments on short (between 500 ms and 5 s) speech turn comparison and speaker change detection show that TristouNet brings significant improvements over the current state-of-the-art techniques for both tasks.
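The triplet loss that shapes the embedding space can be summarized in a few lines. A hypothetical NumPy sketch (the margin value is illustrative, not taken from the paper):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on fixed-dimensional embeddings: push the
    anchor-positive distance below the anchor-negative distance by `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def same_speaker(e1, e2, threshold):
    """At test time, speaker comparison reduces to a Euclidean distance between embeddings."""
    return np.linalg.norm(e1 - e2) < threshold
```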

197 citations


Journal ArticleDOI
TL;DR: A metric transfer learning framework (MTLF) is proposed to encode metric learning in transfer learning and make knowledge transfer across domains more effective; general solutions to both classification and regression problems are developed on top of MTLF.
Abstract: Transfer learning has been proven to be effective for problems where training data from a source domain and test data from a target domain are drawn from different distributions. To reduce the distribution divergence between the source domain and the target domain, many previous studies have focused on designing and optimizing objective functions with the Euclidean distance to measure dissimilarity between instances. However, in some real-world applications, the Euclidean distance may be inappropriate for capturing the intrinsic similarity or dissimilarity between instances. To deal with this issue, in this paper, we propose a metric transfer learning framework (MTLF) to encode metric learning in transfer learning. In MTLF, instance weights are learned and exploited to bridge the distributions of different domains, while a Mahalanobis distance is learned simultaneously to minimize the intra-class distances and maximize the inter-class distances for the target domain. Unlike previous work where instance weights and the Mahalanobis distance are trained in a pipelined framework that potentially leads to error propagation across different components, MTLF attempts to learn instance weights and a Mahalanobis distance in a parallel framework to make knowledge transfer across domains more effective. Furthermore, we develop general solutions to both classification and regression problems on top of MTLF. We conduct extensive experiments on several real-world datasets on object recognition, handwriting recognition, and WiFi location to verify the effectiveness of MTLF compared with a number of state-of-the-art methods.
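The Mahalanobis distance that MTLF learns generalizes the Euclidean distance through a positive semi-definite matrix M. A minimal sketch of the distance itself (the matrix M here is a random placeholder; in MTLF it would be learned jointly with the instance weights):

```python
import numpy as np

def mahalanobis(x, y, M):
    """d_M(x, y) = sqrt((x - y)^T M (x - y)); M = I recovers the Euclidean distance."""
    d = x - y
    return np.sqrt(d @ M @ d)

dim = 5
L = np.random.randn(dim, dim)
M = L.T @ L                              # any positive semi-definite matrix; MTLF learns it from data
x, y = np.random.rand(dim), np.random.rand(dim)
print(mahalanobis(x, y, M), mahalanobis(x, y, np.eye(dim)))
```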

170 citations


Proceedings ArticleDOI
01 Oct 2017
TL;DR: This work proposes a regularization term, inspired by the properties of the uniform distribution, to maximize the spread of feature descriptors, and shows that the proposed regularization combined with the triplet loss outperforms existing Euclidean-distance-based descriptor learning techniques by a large margin.
Abstract: We propose a simple, yet powerful regularization technique that can be used to significantly improve both the pairwise and triplet losses in learning local feature descriptors. The idea is that in order to fully utilize the expressive power of the descriptor space, good local feature descriptors should be sufficiently “spread-out” over the space. In this work, we propose a regularization term, inspired by the properties of the uniform distribution, to maximize the spread of the feature descriptors. We show that the proposed regularization with the triplet loss outperforms existing Euclidean-distance-based descriptor learning techniques by a large margin. As an extension, the proposed regularization technique can also be used to improve image-level deep feature embedding.
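A rough sketch of the idea (our reading, not necessarily the paper's exact regularizer): if L2-normalized descriptors were spread uniformly over the unit sphere, dot products of non-matching pairs would have mean 0 and second moment 1/d, so one can penalize deviations from those two statistics.

```python
import numpy as np

def spread_out_penalty(desc_a, desc_b):
    """Illustrative 'spread-out' regularizer on L2-normalized descriptors: penalize
    the mean and the excess second moment of dot products between non-matching
    pairs relative to a uniform distribution on the sphere (0 and 1/d).
    See the paper for the exact formulation."""
    dots = np.sum(desc_a * desc_b, axis=1)   # one dot product per non-matching pair
    d = desc_a.shape[1]
    return dots.mean() ** 2 + max(0.0, (dots ** 2).mean() - 1.0 / d)
```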

138 citations


Proceedings Article
06 Aug 2017
TL;DR: In this article, a differentiable learning loss between time series is proposed, based on the celebrated dynamic time warping (DTW) discrepancy, which is robust to shifts or dilatations across the time dimension.
Abstract: We propose in this paper a differentiable learning loss between time series, building upon the celebrated dynamic time warping (DTW) discrepancy. Unlike the Euclidean distance, DTW can compare time series of variable size and is robust to shifts or dilatations across the time dimension. To compute DTW, one typically solves a minimal-cost alignment problem between two time series using dynamic programming. Our work takes advantage of a smoothed formulation of DTW, called soft-DTW, that computes the soft-minimum of all alignment costs. We show in this paper that soft-DTW is a differentiable loss function, and that both its value and gradient can be computed with quadratic time/space complexity (DTW has quadratic time but linear space complexity). We show that this regularization is particularly well suited to average and cluster time series under the DTW geometry, a task for which our proposal significantly outperforms existing baselines (Petitjean et al., 2011). Next, we propose to tune the parameters of a machine that outputs time series by minimizing its fit with ground-truth labels in a soft-DTW sense.
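The key ingredient is replacing the hard minimum in the DTW recursion with a soft minimum. A compact NumPy sketch of the soft-DTW forward pass (illustrative only; the paper additionally derives the gradient, also in quadratic time):

```python
import numpy as np

def soft_min(values, gamma):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma)))."""
    v = np.asarray(values, dtype=float)
    m = v.min()
    return m - gamma * np.log(np.exp(-(v - m) / gamma).sum())

def soft_dtw(x, y, gamma=1.0):
    """x: (n, d), y: (m, d). Soft-DTW discrepancy with squared Euclidean ground cost."""
    n, m = len(x), len(y)
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.sum((x[i - 1] - y[j - 1]) ** 2)
            R[i, j] = cost + soft_min([R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]], gamma)
    return R[n, m]
```

As gamma tends to 0 the soft minimum approaches the hard minimum and the value approaches classical DTW; a positive gamma is what makes the loss differentiable.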

134 citations


Journal ArticleDOI
13 Jan 2017-PLOS ONE
TL;DR: This paper argues that the generalisation of Ward’s linkage method to incorporate Manhattan distances is theoretically sound and provides an example of where this method outperforms the method using Euclidean distances.
Abstract: The claim that Ward's linkage algorithm in hierarchical clustering is limited to use with Euclidean distances is investigated. In this paper, Ward's clustering algorithm is generalised for use with the l1 norm, or Manhattan, distance. We argue that the generalisation of Ward's linkage method to incorporate Manhattan distances is theoretically sound and provide an example where this method outperforms the method using Euclidean distances. As an application, we perform statistical analyses on languages using methods normally applied to biology and genetic classification. We aim to quantify differences in character traits between languages and use a statistical language signature based on relative bi-gram (sequence of two letters) frequencies to calculate a distance matrix between 32 Indo-European languages. We then use Ward's method of hierarchical clustering to classify the languages, using the Euclidean distance and the Manhattan distance. Results obtained using the different distance metrics are compared to show that Ward's characteristic of minimising intra-cluster variation and maximising inter-cluster variation is not violated when using the Manhattan metric.
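The language signatures and the distance matrices feeding the clustering are easy to reproduce in outline. A hypothetical sketch (the corpora are placeholders; note that SciPy's built-in Ward linkage assumes Euclidean inputs, so the generalized Ward step of the paper would need its own implementation):

```python
import numpy as np
from collections import Counter
from itertools import product
from scipy.spatial.distance import pdist, squareform

def bigram_signature(text, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Relative bigram frequencies over a fixed alphabet (toy preprocessing)."""
    bigrams = [a + b for a, b in product(alphabet, repeat=2)]
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    vec = np.array([counts[bg] for bg in bigrams], dtype=float)
    return vec / max(vec.sum(), 1.0)

texts = ["the quick brown fox ...", "der schnelle braune fuchs ..."]  # placeholder corpora
signatures = np.vstack([bigram_signature(t) for t in texts])
manhattan_dm = squareform(pdist(signatures, metric="cityblock"))   # l1 distance matrix
euclidean_dm = squareform(pdist(signatures, metric="euclidean"))
```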

134 citations


Proceedings ArticleDOI
15 Aug 2017
TL;DR: The results show that in semantic segmentation the method can replace DenseCRF inference with a cascade of segmentation-aware filters, and that in optical flow it produces clearly sharper responses than comparable networks that do not use segmentation.
Abstract: We introduce an approach to integrate segmentation information within a convolutional neural network (CNN). This counter-acts the tendency of CNNs to smooth information across regions and increases their spatial precision. To obtain segmentation information, we set up a CNN to provide an embedding space where region co-membership can be estimated based on Euclidean distance. We use these embeddings to compute a local attention mask relative to every neuron position. We incorporate such masks in CNNs and replace the convolution operation with a “segmentation-aware” variant that allows a neuron to selectively attend to inputs coming from its own region. We call the resulting network a segmentation-aware CNN because it adapts its filters at each image point according to local segmentation cues, while at the same time remaining fully-convolutional. We demonstrate the merit of our method on two widely different dense prediction tasks, that involve classification (semantic segmentation) and regression (optical flow). Our results show that in semantic segmentation we can replace DenseCRF inference with a cascade of segmentation-aware filters, and in optical flow we obtain clearly sharper responses than the ones obtained with comparable networks that do not use segmentation. In both cases segmentation-aware convolution yields systematic improvements over strong baselines.
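A toy sketch of the mask construction (our illustration, with `lam` an assumed hyper-parameter): pixels whose embeddings lie far, in Euclidean distance, from the reference pixel's embedding receive small attention weights. The paper restricts this to a local window and folds it into the convolution.

```python
import numpy as np

def attention_mask(embeddings, center, lam=1.0):
    """embeddings: (H*W, D) pixel embeddings; center: index of the reference pixel.
    Returns per-pixel weights that decay with the embedding (Euclidean) distance
    to the reference pixel, down-weighting inputs from other regions."""
    d = np.linalg.norm(embeddings - embeddings[center], axis=-1)
    return np.exp(-lam * d)
```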

Journal ArticleDOI
TL;DR: A multi-viewpoint remote sensing image registration method in which a geometric constraint term is introduced into the L2E-based energy function so that the non-rigid transformation is better behaved; the method is compared with five state-of-the-art methods.
Abstract: Remote sensing image registration plays an important role in military and civilian fields, such as natural disaster damage assessment, military damage assessment, and ground target identification. However, due to ground relief variations and imaging viewpoint changes, non-rigid geometric distortion occurs between remote sensing images with different viewpoints, which further increases the difficulty of remote sensing image registration. To address the problem, we propose a multi-viewpoint remote sensing image registration method with the following contributions. (i) A multiple-features-based finite mixture model is constructed for dealing with different types of image features. (ii) Three features are combined and substituted into the mixture model to form a feature complementation, i.e., the Euclidean distance and shape context are used to measure the similarity of geometric structure, and the SIFT (scale-invariant feature transform) distance, which is endowed with the intensity information, is used to measure the scale space extrema. (iii) To prevent an ill-posed problem, a geometric constraint term is introduced into the L2E-based energy function so that the non-rigid transformation is better behaved. We evaluated the performance of the proposed method on three series of remote sensing images obtained from an unmanned aerial vehicle (UAV) and Google Earth, and compared it with five state-of-the-art methods; our method shows the best alignments in most cases.

Journal ArticleDOI
TL;DR: Discriminant information is introduced into SPP to arrive at a novel supervised feature extraction method named the Uncorrelated Discriminant SPP (UDSPP) algorithm, which can effectively express discriminant information while preserving local neighbor relationships.
Abstract: Feature extraction has always been an important step in face recognition, the quality of which directly determines the recognition result. Making full use of the advantages of Sparse Preserving Projection (SPP) for feature extraction, discriminant information is introduced into SPP to arrive at a novel supervised feature extraction method named the Uncorrelated Discriminant SPP (UDSPP) algorithm. The projection obtained by sparsely preserving intra-class structure and maximizing inter-class distances can effectively express discriminant information while preserving local neighbor relationships. Moreover, a statistical uncorrelated constraint is also added to decrease redundancy among feature vectors, so as to obtain as much information as possible with as few vectors as possible. The experimental results show that the recognition rate is improved compared with SPP. The method is also superior to recognition methods based on the Euclidean distance when processing face databases under varying illumination.

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper proposed to exploit depth information to provide more invariant body shape and skeleton information regardless of illumination and color change, and further proposed a locally rotation invariant depth shape descriptor called Eigen-depth feature to describe pedestrian body shape.
Abstract: Person re-identification (re-id) aims to match people across non-overlapping camera views. So far the RGB-based appearance is widely used in most existing works. However, when people appeared in extreme illumination or changed clothes, the RGB appearance-based re-id methods tended to fail. To overcome this problem, we propose to exploit depth information to provide more invariant body shape and skeleton information regardless of illumination and color change. More specifically, we exploit depth voxel covariance descriptor and further propose a locally rotation invariant depth shape descriptor called Eigen-depth feature to describe pedestrian body shape. We prove that the distance between any two covariance matrices on the Riemannian manifold is equivalent to the Euclidean distance between the corresponding Eigen-depth features. Furthermore, we propose a kernelized implicit feature transfer scheme to estimate Eigen-depth feature implicitly from RGB image when depth information is not available. We find that combining the estimated depth features with RGB-based appearance features can sometimes help to better reduce visual ambiguities of appearance features caused by illumination and similar clothes. The effectiveness of our models was validated on publicly available depth pedestrian datasets as compared to related methods for re-id.
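The stated equivalence is reminiscent of the log-Euclidean view of covariance matrices, where the Riemannian distance reduces to a Frobenius norm of matrix logarithms; the sketch below assumes that metric and is only illustrative (see the paper for the exact construction of the Eigen-depth feature).

```python
import numpy as np

def spd_log(C):
    """Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T

def log_euclidean_distance(C1, C2):
    """Frobenius norm of the difference of matrix logarithms: equivalently, a plain
    Euclidean distance between the vectorized log-covariance matrices."""
    return np.linalg.norm(spd_log(C1) - spd_log(C2))
```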

Journal ArticleDOI
TL;DR: This research proposes a hybrid strategy for efficient classification of human activities from a given video sequence by integrating four major steps: segmenting the moving objects by fusing novel uniform segmentation and expectation maximization, extracting a new set of fused features using local binary patterns with histogram of oriented gradients and Haralick features, selecting features by a novel Euclidean distance and joint entropy-PCA-based method, and classifying features using a multi-class support vector machine.
Abstract: Human activity monitoring in video sequences is an intriguing computer vision domain which incorporates numerous applications, e.g., surveillance systems, human-computer interaction, and traffic control systems. In this research, our primary focus is on proposing a hybrid strategy for efficient classification of human activities from a given video sequence. The proposed method integrates four major steps: (a) segment the moving objects by fusing novel uniform segmentation and expectation maximization, (b) extract a new set of fused features using local binary patterns with histogram of oriented gradients and Haralick features, (c) feature selection by a novel Euclidean distance and joint entropy-PCA-based method, and (d) feature classification using a multi-class support vector machine. Three benchmark datasets (MIT, CAVIAR, and BMW-10) are used for training the classifier for human classification; for testing, we utilized multi-camera pedestrian videos along with the MSR Action, INRIA, and CASIA datasets. Additionally, the results are also validated using a dataset recorded by our research group. For action recognition, four publicly available datasets are selected, namely Weizmann, KTH, UIUC, and Muhavi, achieving recognition rates of 95.80, 99.30, 99, and 99.40%, respectively, which confirms the authenticity of our proposed work. Promising results are achieved in terms of greater precision compared to existing techniques.

Journal ArticleDOI
Deng Sheng, Lan Du, Chen Li, Jun Ding, Hongwei Liu
TL;DR: A deep learning method based on a multilayer autoencoder (AE) combined with a supervised constraint to use the limited training images well and to prevent overfitting caused by supervised learning is proposed.
Abstract: Deep learning algorithms have been introduced into target recognition of synthetic aperture radar (SAR) images for extracting deep features because of their accuracy on various recognition problems when sufficient training samples are available. However, applying deep structures to SAR image recognition may suffer from a lack of training samples. Therefore, a deep learning method is proposed in this study based on a multilayer autoencoder (AE) combined with a supervised constraint. We bind the original AE algorithm with a restriction based on the Euclidean distance to make good use of the limited training images. Moreover, a dropout step is added to our algorithm, which is designed to prevent overfitting caused by supervised learning. Experimental results on the MSTAR dataset demonstrate the effectiveness of the proposed method on real SAR images.

Posted Content
TL;DR: It is shown that the vector representations of nodes obtained by spectral embedding, using the largest eigenvalues by magnitude, provide strongly consistent latent position estimates with asymptotically Gaussian error, up to indefinite orthogonal transformation.
Abstract: A generalisation of a latent position network model known as the random dot product graph is considered. We show that, whether the normalised Laplacian or adjacency matrix is used, the vector representations of nodes obtained by spectral embedding, using the largest eigenvalues by magnitude, provide strongly consistent latent position estimates with asymptotically Gaussian error, up to indefinite orthogonal transformation. The mixed membership and standard stochastic block models constitute special cases where the latent positions live respectively inside or on the vertices of a simplex, crucially, without assuming the underlying block connectivity probability matrix is positive-definite. Estimation via spectral embedding can therefore be achieved by respectively estimating this simplicial support, or fitting a Gaussian mixture model. In the latter case, the use of $K$-means (with Euclidean distance), as has been previously recommended, is suboptimal and for identifiability reasons unsound. Indeed, Euclidean distances and angles are not preserved under indefinite orthogonal transformation, and we show stochastic block model examples where such quantities vary appreciably. Empirical improvements in link prediction (over the random dot product graph), as well as the potential to uncover richer latent structure (than posited under the mixed membership or standard stochastic block models) are demonstrated in a cyber-security example.
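Spectral embedding with the largest-magnitude eigenvalues is simple to sketch. A hypothetical NumPy version for a symmetric adjacency matrix (our illustration of the generic construction, not the authors' code):

```python
import numpy as np

def adjacency_spectral_embedding(A, d):
    """Embed each node of a symmetric adjacency matrix A using the d eigenpairs of
    largest absolute eigenvalue: X_hat = U_d |Lambda_d|^(1/2). Negative eigenvalues
    are kept, which is what induces the indefinite orthogonal ambiguity."""
    eigvals, eigvecs = np.linalg.eigh(A)
    idx = np.argsort(-np.abs(eigvals))[:d]
    return eigvecs[:, idx] * np.sqrt(np.abs(eigvals[idx]))
```

In line with the paper's recommendation, the rows of the embedding would then be fitted with a Gaussian mixture model rather than clustered with Euclidean K-means.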

Journal ArticleDOI
TL;DR: Experimental results show that EED outperforms existing methods that estimate Euclidean distances in an indirect manner and the application of EED to the Minimal Learning Machine (MLM), a distance-based supervised learning method, provides promising results.

Journal ArticleDOI
TL;DR: A distance measure named the weighted heterogeneous value distance metric, which handles continuous and discrete attributes simultaneously better than the standard Euclidean distance, is used together with a genetic algorithm that automatically learns the attribute weights involved in this distance measure.

Journal ArticleDOI
TL;DR: To solve the correntropy-based joint sparsity model, a half-quadratic optimization technique is developed to convert the original nonconvex and nonlinear optimization problem into an iteratively reweighted JSR problem.
Abstract: Joint sparse representation (JSR) has been a popular technique for hyperspectral image classification, where a testing pixel and its spatial neighbors are simultaneously approximated by a sparse linear combination of all training samples, and the testing pixel is classified based on the joint reconstruction residual of each class. Due to the least-squares representation of the approximation error, the JSR model is usually sensitive to outliers, such as background, noisy pixels, and outlying bands. In order to eliminate such effects, we propose three correntropy-based robust JSR (RJSR) models, i.e., RJSR for handling pixel noise, RJSR for handling band noise, and RJSR for handling both pixel and band noise. The proposed RJSR models replace the traditional square of the Euclidean distance with the correntropy-based metric in measuring the joint approximation error. To solve the correntropy-based joint sparsity model, a half-quadratic optimization technique is developed to convert the original nonconvex and nonlinear optimization problem into an iteratively reweighted JSR problem. As a result, the optimization of our models can handle the noise in neighboring pixels and the noise in spectral bands. It can adaptively assign small weights to noisy pixels or bands and put more emphasis on noise-free pixels or bands. The experimental results using real and simulated data demonstrate the effectiveness of our models in comparison with the related state-of-the-art JSR models.

Journal ArticleDOI
TL;DR: A new LHF-VIKOR (linguistic hesitant fuzzy Vlsekriterijumska Optimizacija I Kompromisno Resenje) method for solving multiple criteria decision-making problems with LHFSs is developed and an intelligent transportation system evaluation example is analyzed to demonstrate the effectiveness and feasibility of the proposed method.

Journal ArticleDOI
TL;DR: In this paper, the authors propose a novel open-circuit fault diagnosis method for single and multiple transistors in two-level three-phase pulse-width-modulation rectifiers, based on topology symmetry analysis in healthy and faulty conditions.
Abstract: This paper proposes a novel open-circuit fault diagnosis method for single and multiple transistors in the two-level three-phase pulse-width-modulation rectifier, based on an analysis of the topology symmetry in healthy and faulty conditions. First, similarity measures between any two phase-current sample lists, each containing the samples of one period after reconstruction and shape analysis, such as the Euclidean distance, the correlation coefficient, and the cosine angle, are used to describe the symmetry of the topology and divide the faults into three classes; then two extra features are extracted to locate the fault to the broken leg and to the transistors, respectively. The proposed diagnostic method is robust, low cost, and easy to integrate into the existing control system. Its effectiveness and merit were evaluated by experimental results.
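The three symmetry measures named in the abstract are standard. A hypothetical sketch on two equal-length phase-current sample sequences (the 50 Hz waveforms and sampling are toy assumptions):

```python
import numpy as np

def symmetry_measures(i_a, i_b):
    """Euclidean distance, Pearson correlation coefficient, and cosine of the angle
    between two phase-current sample sequences of equal length (toy sketch)."""
    euclidean = np.linalg.norm(i_a - i_b)
    correlation = np.corrcoef(i_a, i_b)[0, 1]
    cosine = np.dot(i_a, i_b) / (np.linalg.norm(i_a) * np.linalg.norm(i_b))
    return euclidean, correlation, cosine

t = np.linspace(0.0, 0.02, 200, endpoint=False)       # one 50 Hz period, toy sampling
ia = np.sin(2 * np.pi * 50 * t)
ib = np.sin(2 * np.pi * 50 * t - 2 * np.pi / 3)       # healthy phases sit 120 degrees apart
print(symmetry_measures(ia, ib))
```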

Journal ArticleDOI
TL;DR: A novel selective Euclidean distance approach in the nondominated sorting genetic algorithm II (NSGA-II) is proposed to steer the candidate solutions toward a better solution to improve the beampattern.
Abstract: Collaborative beamforming is usually characterized by high, asymmetrical sidelobe levels due to the randomness of node locations. Previous works have shown that the optimization methods aiming to reduce the peak sidelobe level (PSL) alone do not guarantee the overall sidelobe reduction of the beampattern, especially when the nodes are random and cannot be manipulated. Hence, this paper proposes a multiobjective amplitude and phase optimization technique with two objective functions: PSL minimization and directivity maximization, in order to improve the beampattern. A novel selective Euclidean distance approach in the nondominated sorting genetic algorithm II (NSGA-II) is proposed to steer the candidate solutions toward a better solution. Results obtained by the proposed NSGA with selective distance (NSGA-SD) are compared with the single-objective PSL optimization performed using both GA and particle swarm optimization. The proposed multiobjective NSGA provides up to 40% improvement in PSL reduction and 50% improvement in directivity maximization and up to 10% increased performance compared to the legacy NSGA-II. The analysis of the optimization method when considering mutual coupling between the nodes shows that this improvement is valid when the inter-node Euclidean separations are large.

Journal ArticleDOI
31 Jul 2017
TL;DR: It is proved that, for d ≥ 3, the problem of computing DBSCAN clusters from scratch requires Ω(n^{4/3}) time to solve, unless very significant breakthroughs (ones widely believed to be impossible) could be made in theoretical computer science.
Abstract: DBSCAN is a method proposed in 1996 for clustering multi-dimensional points, and has received extensive applications. Its computational hardness is still unsolved to this date. The original KDD'96 paper claimed an algorithm of O(n log n) "average runtime complexity" (where n is the number of data points) without a rigorous proof. In 2013, a genuine O(n log n)-time algorithm was found in 2D space under Euclidean distance. The hardness of dimensionality d ≥ 3 has remained open ever since. This article considers the problem of computing DBSCAN clusters from scratch (assuming no existing indexes) under Euclidean distance. We prove that, for d ≥ 3, the problem requires Ω(n^{4/3}) time to solve, unless very significant breakthroughs (ones widely believed to be impossible) could be made in theoretical computer science. Motivated by this, we propose a relaxed version of the problem called ρ-approximate DBSCAN, which returns the same clusters as DBSCAN, unless the clusters are "unstable" (i.e., they change once the input parameters are slightly perturbed). The ρ-approximate problem can be settled in O(n) expected time regardless of the constant dimensionality d. The article also enhances the previous result on the exact DBSCAN problem in 2D space. We show that, if the n data points have been pre-sorted on each dimension (i.e., one sorted list per dimension), the problem can be settled in O(n) worst-case time. As a corollary, when all the coordinates are integers, the 2D DBSCAN problem can be solved in O(n log log n) time deterministically, improving the existing O(n log n) bound.

Journal ArticleDOI
TL;DR: In this work, a graph-based discriminant analysis with a spectral similarity (GDA-SS) measurement is proposed, which fully accounts for how spectral curves change across bands; the proposed method is shown to be superior to traditional methods, such as supervised LPP, and to the state-of-the-art sparse graph-based discriminant analysis (SGDA).
Abstract: Recently, graph embedding has drawn great attention for dimensionality reduction in hyperspectral imagery. For example, locality preserving projection (LPP) utilizes the typical Euclidean distance in a heat kernel to create an affinity matrix and projects the high-dimensional data into a lower-dimensional space. However, the Euclidean distance is not sufficiently correlated with the intrinsic spectral variation of a material, which may result in an inappropriate graph representation. In this work, a graph-based discriminant analysis with spectral similarity (denoted GDA-SS) measurement is proposed, which fully accounts for how spectral curves change across bands. Experimental results based on real hyperspectral images demonstrate that the proposed method is superior to traditional methods, such as supervised LPP, and to the state-of-the-art sparse graph-based discriminant analysis (SGDA).

Journal ArticleDOI
TL;DR: The proposed improved sqrt-cosine similarity measure is applied to a variety of document-understanding tasks, such as text classification, clustering, and query search, and experimental results show that the proposed method is indeed effective.
Abstract: Text similarity measurement aims to find the commonality existing among text documents, which is fundamental to most information extraction, information retrieval, and text mining problems. Cosine similarity based on Euclidean distance is currently one of the most widely used similarity measurements. However, Euclidean distance is generally not an effective metric for dealing with probabilities, which are often used in text analytics. In this paper, we propose a new similarity measure based on sqrt-cosine similarity. We apply the proposed improved sqrt-cosine similarity to a variety of document-understanding tasks, such as text classification, clustering, and query search. Comprehensive experiments are then conducted to evaluate our new similarity measurement in comparison to existing methods. These experimental results show that our proposed method is indeed effective.

Journal ArticleDOI
TL;DR: The Manhattan distance is introduced to the WKNN algorithm to distinguish the influence of different reference nodes and a new method is proposed to increase the accuracy of the algorithm by adjusting the weight of adjacent reference nodes.
Abstract: The weighted K-nearest neighbor algorithm (WKNN) is widely used in indoor positioning based on Wi-Fi. However, the accuracy of this traditional algorithm using Euclidean distance is not high enough due to the ignorance of statistical regularities from the training set. In this paper, the Manhattan distance is introduced to the WKNN algorithm to distinguish the influence of different reference nodes. Simultaneously, a new method is proposed to increase the accuracy of the algorithm by adjusting the weight of adjacent reference nodes. The simulation and experiment results show that the improved algorithm can have a better performance by increasing the accuracy by 33.82%.
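A minimal sketch of a WKNN position estimate contrasting the two metrics (fingerprint data, inverse-distance weighting, and parameter names are placeholders; the paper's additional adjustment of adjacent reference-node weights is not reproduced here):

```python
import numpy as np

def wknn_position(rss_query, rss_db, positions, k=3, metric="manhattan"):
    """Weighted k-nearest-neighbor indoor positioning: average the coordinates of
    the k fingerprints closest to the query RSS vector, weighted by inverse distance."""
    if metric == "manhattan":
        d = np.abs(rss_db - rss_query).sum(axis=1)
    else:                                      # Euclidean
        d = np.linalg.norm(rss_db - rss_query, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + 1e-9)
    return (positions[idx] * w[:, None]).sum(axis=0) / w.sum()
```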

Journal ArticleDOI
TL;DR: New stochastic distances for the Fisher–Tippett distribution are derived and used as patch distance measures in a modified version of the BM3D algorithm for despeckling log-compressed ultrasound images.
Abstract: Ultrasound image despeckling is an important research field, since it can improve the interpretability of one of the main categories of medical imaging. Many techniques have been tried over the years for ultrasound despeckling, and more recently, a great deal of attention has been focused on patch-based methods, such as non-local means and block-matching collaborative filtering (BM3D). A common idea in these recent methods is the measure of distance between patches, originally proposed as the Euclidean distance, for filtering additive white Gaussian noise. In this paper, we derive new stochastic distances for the Fisher–Tippett distribution, based on well-known statistical divergences, and use them as patch distance measures in a modified version of the BM3D algorithm for despeckling log-compressed ultrasound images. State-of-the-art results in filtering simulated, synthetic, and real ultrasound images confirm the potential of the proposed approach.

Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper proposes the coupling of a Gaussian mixture of linear inverse regressions with a ConvNet and describes the methodological foundations and the associated algorithm to jointly train the deep network and the regression function, and is the first to incorporate inverse regression into deep learning for computer vision applications.
Abstract: Convolutional Neural Networks (ConvNets) have become the state of the art for many classification and regression problems in computer vision. When it comes to regression, approaches such as measuring the Euclidean distance between targets and predictions are often employed at the output layer. In this paper, we propose the coupling of a Gaussian mixture of linear inverse regressions with a ConvNet, and we describe the methodological foundations and the associated algorithm to jointly train the deep network and the regression function. We test our model on the head-pose estimation problem. In this particular problem, we show that inverse regression outperforms regression models currently used by state-of-the-art computer vision methods. Our method does not require the incorporation of additional data, as is often proposed in the literature, and is thus able to work well on relatively small training datasets. Finally, it outperforms state-of-the-art methods in head-pose estimation using a widely used head-pose dataset. To the best of our knowledge, we are the first to incorporate inverse regression into deep learning for computer vision applications.

Journal ArticleDOI
01 Jun 2017
TL;DR: d-NN is applied to synthetic datasets with different statistical distributions and to four benchmark datasets; the results show the superiority of d-NN in terms of accuracy and computation cost compared with other popular machine learning methods.
Abstract: Highlights: d-NN combines the similarity and the dependency between the query and the labeled samples. d-NN maps samples that are more similar to and more dependent on the query closer to the origin, toward the +x axis. d-NN has an adaptive dependency region, which is determined by the dependency angle and radius. Higher accuracies were obtained by using the adaptive dependency region instead of a constant number of nearest neighbors (k). The k nearest neighbor (kNN) rule is one of the basic processes behind various machine learning methods. In kNN, the relation of a query to a neighboring sample is basically measured by a similarity metric, such as the Euclidean distance. This process starts with mapping the training dataset onto a one-dimensional distance space based on the calculated similarities, and then labeling the query in accordance with the most dominant label, or the mean of the labels, of the k nearest neighbors, in classification or regression problems, respectively. The number of nearest neighbors (k) is chosen according to the desired limit of success. Nonetheless, two distinct samples may have equal distances to the query but different angles in the feature space. The similarity of the query to these two samples needs to be weighted in accordance with the angle between the query and each of the samples, so as to differentiate between the two distances using angular information. This idea can be analyzed in the context of dependency and can be utilized to increase the precision of the classifier. With this point of view, instead of kNN, the query is labeled according to its nearest dependent neighbors, which are determined by a joint function built on the similarity and the dependency. This method may therefore be called dependent NN (d-NN). To demonstrate d-NN, it is applied to synthetic datasets with different statistical distributions and to four benchmark datasets: Pima Indian, Hepatitis, approximate Sinc, and CASP. The results showed the superiority of d-NN in terms of accuracy and computation cost as compared to other popular machine learning methods.