
Showing papers on "Euclidean distance published in 2020"


Journal ArticleDOI
TL;DR: The proposed model outperforms state-of-the-art deep learning approaches applied to place recognition and is easily trained via the standard backpropagation method.
Abstract: We propose an end-to-end place recognition model based on a novel deep neural network. First, we propose to exploit the spatial pyramid structure of the images to enhance the vector of locally aggregated descriptors (VLAD) such that the enhanced VLAD features can reflect the structural information of the images. To encode this feature extraction into the deep learning method, we build a spatial pyramid-enhanced VLAD (SPE-VLAD) layer. Next, we impose weight constraints on the terms of the traditional triplet loss (T-loss) function such that the weighted T-loss (WT-loss) function avoids the suboptimal convergence of the learning process. The loss function works well under weakly supervised scenarios in that it determines the semantically positive and negative samples of each query through not only the GPS tags but also the Euclidean distance between the image representations. The SPE-VLAD layer and the WT-loss layer are integrated with the VGG-16 network or ResNet-18 network to form a novel end-to-end deep neural network that can be easily trained via the standard backpropagation method. We conduct experiments on three benchmark data sets, and the results demonstrate that the proposed model outperforms the state-of-the-art deep learning approaches applied to place recognition.
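A minimal numpy sketch of a weighted triplet loss built on Euclidean distances between image embeddings follows; the specific weights, the margin value, and the pairing of every positive with every negative are illustrative assumptions rather than the paper's exact WT-loss formulation.

```python
import numpy as np

def weighted_triplet_loss(anchor, positives, negatives, w_pos, w_neg, margin=0.5):
    """Hedged sketch of a weighted triplet (WT) loss over Euclidean embedding distances."""
    d_pos = w_pos * np.linalg.norm(positives - anchor, axis=1)   # weighted anchor-positive distances
    d_neg = w_neg * np.linalg.norm(negatives - anchor, axis=1)   # weighted anchor-negative distances
    # Hinge over all (positive, negative) pairs; the unweighted T-loss is the special case w_pos = w_neg = 1.
    terms = np.maximum(0.0, d_pos[:, None] - d_neg[None, :] + margin)
    return terms.mean()
```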

281 citations


Journal ArticleDOI
TL;DR: A new Grey Wolf Optimizer algorithm integrated with a two-phase mutation solves wrapper-based feature selection for classification, reducing the number of selected features while preserving high classification accuracy.
Abstract: Because of their high dimensionality, large datasets can hinder the data mining process. Feature selection is therefore a mandatory preprocessing phase that reduces the dimensionality of datasets by using only the most informative features while maximizing the classification accuracy. This paper proposes a new Grey Wolf Optimizer algorithm integrated with a two-phase mutation to solve the feature selection problem for classification using wrapper methods. The sigmoid function is used to transform the continuous search space into a binary one in order to match the binary nature of the feature selection problem. The two-phase mutation enhances the exploitation capability of the algorithm. The purpose of the first mutation phase is to reduce the number of selected features while preserving high classification accuracy. The purpose of the second mutation phase is to add more informative features that increase the classification accuracy. As the mutation phase can be time-consuming, the two-phase mutation is applied with a small probability. Because wrapper methods can give high-quality solutions, one of the best-known wrapper classifiers, the k-Nearest Neighbor (k-NN) classifier, is used, with the Euclidean distance computed to find the k nearest neighbors. Each dataset is split into training and testing data using K-fold cross-validation to overcome the overfitting problem. Several comparisons are made with well-known and recent algorithms such as the flower algorithm, particle swarm optimization, the multi-verse optimizer, the whale optimization algorithm, and the bat algorithm. The experiments use 35 datasets. Statistical analyses are performed to demonstrate the effectiveness of the proposed algorithm and its superiority.
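As a rough illustration of the wrapper setup described above, the sketch below maps a wolf's continuous position to a binary feature mask with the sigmoid function and scores the mask with a Euclidean-distance k-NN classifier; the values k=5, five-fold cross-validation, and the accuracy/size trade-off weight are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def binarize(position, rng):
    """Map a continuous wolf position to a binary feature mask via the sigmoid transfer function."""
    prob = 1.0 / (1.0 + np.exp(-position))
    return (rng.random(position.shape) < prob).astype(bool)

def fitness(mask, X, y, alpha=0.99):
    """Wrapper fitness: Euclidean-distance k-NN accuracy traded off against subset size."""
    if not mask.any():
        return 0.0
    knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
    acc = cross_val_score(knn, X[:, mask], y, cv=5).mean()
    return alpha * acc + (1.0 - alpha) * (1.0 - mask.mean())  # slight reward for fewer features
```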

213 citations


Proceedings ArticleDOI
14 Jun 2020
TL;DR: An adaptive dilated convolution and a novel supervised learning framework named self-correction (SC) supervision are proposed that achieves better performance than the state-of-the-art methods on all benchmark datasets.
Abstract: The counting problem aims to estimate the number of objects in images. Due to large scale variation and labeling deviations, it remains a challenging task. The static density map supervised learning framework is widely used in existing methods: it uses a Gaussian kernel to generate a density map as the learning target and utilizes the Euclidean distance to optimize the model. However, the framework is intolerant of labeling deviations and cannot reflect the scale variation. In this paper, we propose an adaptive dilated convolution and a novel supervised learning framework named self-correction (SC) supervision. At the supervision level, SC supervision utilizes the outputs of the model to iteratively correct the annotations and employs the SC loss to simultaneously optimize the model from both the whole and the individuals. At the feature level, the proposed adaptive dilated convolution predicts a continuous value as the specific dilation rate for each location, which adapts to the scale variation better than a discrete and static dilation rate. Extensive experiments illustrate that our approach achieves a consistent improvement on four challenging benchmarks. In particular, our approach achieves better performance than the state-of-the-art methods on all benchmark datasets.
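The static framework that the paper improves on can be summarized in a few lines: annotated head positions are smoothed with a Gaussian kernel into a density-map target, and the model is fit by minimizing the pixel-wise Euclidean (L2) distance to that target. The kernel width below is an illustrative choice.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(points, shape, sigma=4.0):
    """Static density-map target: a unit impulse at each annotation, smoothed by a Gaussian kernel."""
    target = np.zeros(shape, dtype=np.float64)
    for x, y in points:                      # (column, row) head annotations
        target[int(y), int(x)] += 1.0
    return gaussian_filter(target, sigma=sigma)

def l2_loss(prediction, target):
    """Pixel-wise Euclidean-distance loss used by the static supervision the SC framework replaces."""
    return float(np.sum((prediction - target) ** 2))
```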

132 citations


Journal ArticleDOI
TL;DR: A sparse-adaptive hypergraph discriminant analysis (SAHDA) method is proposed to obtain the embedding features of the HSI; it achieves better classification accuracies than traditional graph learning methods.
Abstract: A hyperspectral image (HSI) contains complex multiple structures. Therefore, the key problem in analyzing the intrinsic properties of an HSI is how to represent its structural relationships effectively. A hypergraph is very effective at describing the intrinsic relationships of the HSI. In general, the Euclidean distance is adopted to construct the hypergraph. However, this method cannot effectively represent the structural properties of high-dimensional data. To address this problem, we propose a sparse-adaptive hypergraph discriminant analysis (SAHDA) method to obtain the embedding features of the HSI in this letter. SAHDA uses the sparse representation to reveal the structural relationships of the HSI adaptively. Then, an adaptive hypergraph is constructed by using the intraclass sparse coefficients. Finally, we develop an adaptive dimensionality reduction model to calculate the weights of the hyperedges and the projection matrix. SAHDA can adaptively reveal the intrinsic properties of the HSI and enhance the performance of the embedding features. Experiments on the Washington DC Mall hyperspectral data set demonstrate the effectiveness of the proposed SAHDA method, which achieves better classification accuracies than traditional graph learning methods.

113 citations


Journal ArticleDOI
TL;DR: The proposed multi-layer neural network architecture encodes transferable knowledge extracted from a large annotated dataset of base categories and is applied to novel categories containing only a few samples, which produces competitive performance compared to previous work.
Abstract: This paper proposes a multi-layer neural network structure for few-shot image recognition of novel categories. The proposed multi-layer neural network architecture encodes transferable knowledge extracted from a large annotated dataset of base categories. This architecture is then applied to novel categories containing only a few samples. The transfer of knowledge is carried out at the feature-extraction and classification levels, distributed across the two training stages. In the first training stage, we introduce the relative feature to capture the structure of the data as well as to obtain a low-dimensional discriminative space. Secondly, we account for the variable variance of different categories by using a network to predict the variance of each class. Classification is then performed by computing the Mahalanobis distance to the mean-class representation, in contrast to previous approaches that used the Euclidean distance. In the second training stage, a category-agnostic mapping is learned from the mean-sample representation to its corresponding class-prototype representation. This is because the mean-sample representation may not accurately represent the novel category prototype. Finally, we evaluate the proposed network structure on four standard few-shot image recognition datasets, where our proposed few-shot learning system produces competitive performance compared to previous work. We also extensively study and analyze the contribution of each component of our proposed framework.
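The distance swap described above can be illustrated with a diagonal-covariance simplification: the query is assigned to the class whose prototype is nearest, either by Euclidean distance or by a Mahalanobis distance that divides each dimension by the class variance predicted by the network. The diagonal form is an assumption for brevity.

```python
import numpy as np

def nearest_prototype(query, prototypes, variances):
    """Compare Euclidean and (diagonal) Mahalanobis nearest-prototype classification.

    prototypes: (C, D) mean-class representations; variances: (C, D) predicted per-class variances.
    """
    euclidean = np.linalg.norm(query - prototypes, axis=1)
    mahalanobis = np.sqrt(np.sum((query - prototypes) ** 2 / variances, axis=1))
    return int(euclidean.argmin()), int(mahalanobis.argmin())
```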

102 citations


Journal ArticleDOI
TL;DR: The GARUDA approach is based on clustering feature patterns incrementally and then representing features in a different transformation space using a novel fuzzy Gaussian dissimilarity measure; it results in improved accuracy and detection rates for the U2R and R2L attack classes when compared to other approaches.
Abstract: The objective of any anomaly detection system is to efficiently detect several types of malicious traffic patterns that cannot be detected by conventional firewall systems. Designing an efficient intrusion detection system involves three primary challenges: addressing the high-dimensionality problem, the choice of learning algorithm, and the distance or similarity measure used to find the similarity value between any two traffic patterns or input observations. Feature representation and dimensionality reduction have been studied and addressed widely in the literature and have also been applied to the design of intrusion detection systems (IDS). The choice of classifiers is likewise studied and applied widely in the design of IDS. However, at the heart of an IDS lies the choice of distance measure required to judge an incoming observation as normal or abnormal. This challenge has been understudied and relatively less addressed in the research literature, both in academia and in industry. This research aims at introducing a novel distance measure that can be used to perform feature clustering and feature representation for efficient intrusion detection. Recent studies such as CANN proposed feature reduction techniques, based on the Euclidean distance, for improving the detection and accuracy rates of IDS. However, the accuracies for attack classes such as U2R and R2L are not significantly promising. Our approach, GARUDA, is based on clustering feature patterns incrementally and then representing features in a different transformation space using a novel fuzzy Gaussian dissimilarity measure. Experiments are conducted on both the KDD and NSL-KDD datasets. The accuracy and detection rates of the proposed approach are compared for classifiers such as kNN, J48, and naive Bayes, along with the CANN and CLAPP approaches. The experimental results show that the proposed approach improves the accuracy and detection rates for the U2R and R2L attack classes when compared to other approaches.
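The abstract does not spell out the fuzzy Gaussian dissimilarity measure, so the sketch below is only a generic Gaussian-kernel dissimilarity that conveys the general idea: identical traffic patterns score 0 and increasingly different patterns approach 1. GARUDA's actual measure and its fuzzy membership handling may differ.

```python
import numpy as np

def gaussian_dissimilarity(x, y, sigma=1.0):
    """Generic Gaussian dissimilarity between two traffic-feature vectors (illustrative stand-in only)."""
    return 1.0 - np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))
```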

93 citations


Journal ArticleDOI
15 Apr 2020
TL;DR: Multi-criteria decision-making methods provide decision makers with the information needed to decide whether or not to adopt a particular strategy to solve a specific problem.
Abstract: Decision-making is an important part of daily and business life for both individuals and organizations. Although the multi-criteria decision-making methods provide decision makers the necessary too...

89 citations


Proceedings Article
30 Apr 2020
TL;DR: This paper characterizes the norm required to realize a function as a single hidden-layer ReLU network with an unbounded number of units, but where the Euclidean norm of the weights is bounded, including precisely characterizing which functions can be realized with finite norm.
Abstract: A key element of understanding the efficacy of overparameterized neural networks is characterizing how they represent functions as the number of weights in the network approaches infinity. In this paper, we characterize the norm required to realize any function as a single hidden-layer ReLU network with an unbounded number of units (infinite width), but where the Euclidean norm of the weights is bounded, including precisely characterizing which functions can be realized with finite norm. This was settled for univariate functions in Savarese et al. (2019), where it was shown that the required norm is determined by the L1-norm of the second derivative of the function. We extend the characterization to multivariate functions (i.e., multiple input units), relating the required norm to the L1-norm of the Radon transform of a higher-order Laplacian of the function. This characterization allows us to show that all functions in a Sobolev space can be represented with bounded norm, to calculate the required norm for several specific functions, and to obtain a depth separation result. These results have important implications for understanding generalization performance and the distinction between neural networks and more traditional kernel learning.
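For the univariate case, the Savarese et al. (2019) result referenced in the abstract can be written (the boundary term here is quoted from memory and should be treated as a hedged paraphrase) as $\|f\|_{\mathcal{R}} = \max\left(\int_{\mathbb{R}} |f''(x)|\,dx,\ |f'(-\infty) + f'(+\infty)|\right)$, i.e., the required weight norm is essentially the $L^1$-norm of the second derivative; the multivariate extension replaces $f''$ with the Radon transform of a higher-order Laplacian of $f$.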

85 citations


Journal ArticleDOI
TL;DR: The diagnostic outcomes show that the suggested feature reduction technique improves the classification accuracy with a smaller feature subset and considerable time savings.
Abstract: Bearing failure can cause hazardous effects on rotating machinery, so fault diagnosis is critical for reliable operation. The main steps of the machine learning process are feature extraction, selection, and classification. Feature selection involves identifying the features that yield better classification accuracy with fewer features and less computational time. For a large feature dimension, a careful study is required to find the best feature subset for proper diagnosis. This paper therefore presents a unique feature ordering and selection technique called Feature Ranking and Subset Selection based on Euclidean distance (FRSSED). Two bearing databases are considered to verify the robustness of the proposed technique: one obtained from the authors' experiment, and one publicly available database collected from Case Western Reserve University (CWRU). Initially, vibration signals were captured from bearings with individual as well as combined defects in various components, along with a healthy bearing. Ensemble empirical mode decomposition (EEMD) was applied to these signals, and the sensitive intrinsic mode function (IMF) was selected using the envelope spectrum. Feature extraction was then carried out from the selected IMF using fifteen statistical features. Afterward, the extracted features were passed through the FRSSED algorithm for feature ordering, and the ordered features were fed into various classifiers. Classification accuracy and time consumption were compared among the generalized method (without feature ordering), principal component analysis (PCA), and FRSSED. The diagnostic outcomes show that the suggested feature reduction technique improves the classification accuracy with a smaller feature subset and considerable time savings.

78 citations


Journal ArticleDOI
TL;DR: A distributed model to compute a score that measures the quality of each feature with respect to multiple labels on Apache Spark is proposed and results validated through statistical analysis indicate that ENM is able to outperform the reference methods by maximizing the relevance while minimizing the redundancy of the selected features in constant selection time.
Abstract: Multi-label learning generalizes traditional learning by allowing an instance to belong to multiple labels simultaneously. This causes multi-label data to be characterized by its large label space dimensionality and the dependencies among labels. These challenges have been addressed by feature selection techniques which improve the final model accuracy. However, the large number of features along with a large number of labels call for new approaches to manage data effectively and efficiently in distributed computing environments. This paper proposes a distributed model to compute a score that measures the quality of each feature with respect to multiple labels on Apache Spark. We propose two different approaches that study how to aggregate the mutual information of multiple labels: Euclidean Norm Maximization (ENM) and Geometric Mean Maximization (GMM). The former selects the features with the largest L2-norm whereas the latter selects the features with the largest geometric mean. Experiments compare 9 distributed multi-label feature selection methods on 12 datasets and 12 metrics. Results validated through statistical analysis indicate that ENM is able to outperform the reference methods by maximizing the relevance while minimizing the redundancy of the selected features in constant selection time.
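Given a matrix of per-label mutual information values, the two aggregation rules reduce to one line each, as sketched below; computing the mutual information itself and the Spark distribution are omitted, and the epsilon guard in the geometric mean is an implementation assumption.

```python
import numpy as np

def enm_scores(mi):
    """Euclidean Norm Maximization: mi is (n_features, n_labels); score = L2-norm of each feature's row."""
    return np.linalg.norm(mi, axis=1)

def gmm_scores(mi, eps=1e-12):
    """Geometric Mean Maximization: score = geometric mean of each feature's per-label mutual information."""
    return np.exp(np.log(mi + eps).mean(axis=1))

# Selecting the k best features is then, e.g., np.argsort(-enm_scores(mi))[:k].
```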

76 citations


Journal ArticleDOI
TL;DR: This article proposes two enhanced RF classifiers, namely the Euclidean distance based reduced kernel RF (RK-RF_ED) and the K-means clustering based reduced kernel RF (RK-RF_Kmeans), for FDD.
Abstract: The random forest (RF) classifier, which is a combination of tree predictors, is one of the most powerful classification algorithms that has been recently applied for fault detection and diagnosis (FDD) of industrial processes. However, RF is still suffering from some limitations such as the noncorrelation between variables. These limitations are due to the direct use of variables measured at nodes and therefore the only use of static information from the process data. Thus, this article proposes two enhanced RF classifiers, namely the Euclidean distance based reduced kernel RF (RK-RF_ED) and the K-means clustering based reduced kernel RF (RK-RF_Kmeans), for FDD. Based on the kernel principal component analysis, the proposed classifiers consist of two main stages: feature extraction and selection, and fault classification. In the first stage, the number of observations in the training data set is reduced using two methods: the first method consists of using the Euclidean distance as dissimilarity metric so that only one measurement is kept in case of redundancy between samples. The second method aims at reducing the amount of the training data based on the K-means clustering technique. Once the characteristics of the process are extracted, the most sensitive features are selected. During the second phase, the selected features are fed to an RF classifier. An emulated grid-connected PV system is used to validate the performance of the proposed RK-RF_ED and RK-RF_Kmeans classifiers. The presented results confirm the high classification accuracy of the developed techniques with low computation time.

Proceedings ArticleDOI
14 Jun 2020
TL;DR: The proposed DualConvMesh-Nets (DCM-Net) is a family of deep hierarchical convolutional networks over 3D geometric data that combines two types of convolutions, geodesic and Euclidean, and borrows well-established mesh simplification methods from the geometry processing domain, adapting them to define mesh-preserving pooling and unpooling operations.
Abstract: We propose DualConvMesh-Nets (DCM-Net), a family of deep hierarchical convolutional networks over 3D geometric data that combines two types of convolutions. The first type, geodesic convolutions, defines the kernel weights over mesh surfaces or graphs. That is, the convolutional kernel weights are mapped to the local surface of a given mesh. The second type, Euclidean convolutions, is independent of any underlying mesh structure. The convolutional kernel is applied to a neighborhood obtained from a local affinity representation based on the Euclidean distance between 3D points. Intuitively, geodesic convolutions can easily separate objects that are spatially close but have disconnected surfaces, while Euclidean convolutions can better represent interactions between nearby objects, as they are oblivious to object surfaces. To realize a multi-resolution architecture, we borrow well-established mesh simplification methods from the geometry processing domain and adapt them to define mesh-preserving pooling and unpooling operations. We experimentally show that combining both types of convolutions in our architecture leads to significant performance gains for 3D semantic segmentation, and we report competitive results on three scene segmentation benchmarks. Models and code will be made publicly available.

Journal ArticleDOI
TL;DR: This paper designs a fast feature selection algorithm for multi-label data using the PageRank algorithm, an effective method for calculating the importance of web pages on the Internet, and shows the superiority of the proposed method in terms of classification criteria and run-time.
Abstract: In multi-label data, each instance corresponds to a set of labels instead of a single label: in the data set, instances that belong to a label are assigned 1 in that label's column, while instances that do not belong to it are assigned 0. This type of data is usually high-dimensional, so many methods, using machine learning algorithms, seek to choose the best subset of features to reduce the dimensionality of the data and then create an acceptable classification model. In this paper, we design a fast algorithm for feature selection on multi-label data using the PageRank algorithm, an effective method used to calculate the importance of web pages on the Internet. This algorithm, called multi-label graph-based feature selection (MGFS), first constructs an M × L matrix, called the Correlation Distance Matrix (CDM), where M is the number of features and L is the number of class labels. Then, MGFS creates a complete weighted graph, called the Feature-Label Graph (FLG), where each feature is considered a vertex and the weight between two vertices (or features) is their Euclidean distance in the CDM. Finally, the importance of each graph vertex (or feature) is estimated via the PageRank algorithm. In the proposed method, the number of features can be determined by the user. To demonstrate the performance of the proposed algorithm, we test it against several multi-label feature selection methods on several multi-label datasets with different dimensions. The results show the superiority of the proposed method in terms of classification criteria and run-time.
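A compact sketch of the scoring pipeline described above is given below: rows of the CDM are compared by Euclidean distance to weight a complete feature graph, and feature importance is estimated with a weighted PageRank power iteration. The damping factor and row-stochastic normalization are standard PageRank assumptions rather than details taken from the paper.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def mgfs_scores(cdm, damping=0.85, iterations=100):
    """cdm: (M x L) correlation-distance matrix, one row per feature and one column per label."""
    weights = squareform(pdist(cdm, metric="euclidean"))     # complete weighted feature graph
    transition = weights / weights.sum(axis=1, keepdims=True)
    rank = np.full(len(cdm), 1.0 / len(cdm))
    for _ in range(iterations):                              # weighted PageRank by power iteration
        rank = (1.0 - damping) / len(cdm) + damping * transition.T @ rank
    return rank                                              # higher rank = more important feature
```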

Journal ArticleDOI
TL;DR: The results show that the proposed algorithm can significantly improve the positioning accuracy of WKNN algorithm and the nearest RPs can be selected more accurately based on the APDs between the user and different RPs.
Abstract: In Wi-Fi fingerprint positioning, what matters most is the distance relationship between the user and the reference points (RPs). However, most existing weighted k-nearest neighbor (WKNN) algorithms use the Euclidean distance between received signal strengths (RSS) as the distance measure for fingerprint matching, and the RSS Euclidean distance is not consistent with the position distance. To address this issue, this paper analyzes the relationship between RSS similarity and position distance and proposes a novel WKNN algorithm based on signal similarity and spatial position. Firstly, we obtain the weighted Euclidean distance (WED) by balancing the RSS difference against the signal propagation distance difference according to the attenuation law of the spatial signal. Then, we obtain the approximate position distance (APD) by making full use of the position distances and WEDs between RPs. Finally, the nearest RPs can be selected more accurately based on the APDs between the user and different RPs, and the position of the user can be estimated by the proposed APD-based WKNN (APD-WKNN) algorithm. To fully evaluate the proposed algorithm, we use three fingerprint databases for comparison experiments with eight fingerprint positioning algorithms. The results show that the proposed algorithm significantly improves the positioning accuracy of the WKNN algorithm.
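For reference, the conventional RSS-Euclidean WKNN baseline that APD-WKNN improves on fits in a few lines; k = 4 and inverse-distance weighting are common choices rather than the paper's parameters.

```python
import numpy as np

def wknn_position(rss_query, rss_db, rp_positions, k=4):
    """Baseline WKNN: rank reference points by RSS Euclidean distance, average the k nearest RP positions."""
    d = np.linalg.norm(rss_db - rss_query, axis=1)       # RSS Euclidean distances to every RP
    nearest = np.argsort(d)[:k]
    w = 1.0 / (d[nearest] + 1e-9)                        # inverse-distance weights
    return (w[:, None] * rp_positions[nearest]).sum(axis=0) / w.sum()
```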

Journal ArticleDOI
TL;DR: It is proposed that the geodesic distance is an effective way to compare the correlation structure of the brain across a broad range of studies, and it is suggested that low-dimensional distance visualizations based on the geodesic approach help uncover the geometry of task functional connectivity in relation to that of the resting state.

Journal ArticleDOI
TL;DR: A novel automated pixel clustering and color image segmentation algorithm that can find an appropriate set of clusters for a set of well-known benchmark images is presented.
Abstract: Color image segmentation is a fundamental challenge in the field of image analysis and pattern recognition. In this paper, a novel automated pixel clustering and color image segmentation algorithm is presented. The proposed method operates in three successive stages. In the first stage, a three-dimensional histogram of pixel colors based on the RGB model is smoothed using a Gaussian filter. This process helps eliminate unreliable and non-dominating peaks that are too close to one another in the histogram. In the next stage, the peaks representing different clusters in the histogram are identified using a multimodal particle swarm optimization algorithm. Finally, pixels are assigned to the most appropriate cluster based on Euclidean distance. Determining the number of clusters is often a manual process left to the user and represents a challenge for various segmentation algorithms. The proposed method is designed to determine an appropriate number of clusters, in addition to the actual peaks, automatically. Experiments confirm that the proposed approach yields desirable results, demonstrating that it can find an appropriate set of clusters for a set of well-known benchmark images.
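The final assignment stage reduces to a nearest-center rule in RGB space, as sketched below; the peak detection itself (histogram smoothing plus multimodal particle swarm optimization) is not shown.

```python
import numpy as np

def assign_pixels(image, peaks):
    """Assign each pixel of an (H, W, 3) RGB image to the nearest of the (K, 3) histogram peaks."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    distances = np.linalg.norm(pixels[:, None, :] - peaks[None, :, :], axis=2)  # (H*W, K) Euclidean distances
    return distances.argmin(axis=1).reshape(image.shape[:2])                    # per-pixel cluster labels
```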

Journal ArticleDOI
TL;DR: This paper applies the particle swarm optimization-based variational mode decomposition to decompose the raw vibration signals into a series of intrinsic modes, and selects ten time-domain indicators and five frequency-domain statistical characteristics for feature extraction.
Abstract: The data-driven fault indicator for rotating machinery is designed to reveal possible fault scenarios from observed statistical vibration signals. This study develops a novel ensemble extreme learning machine (EELM) network to replace the conventional layout of combining binary classifiers (e.g., binary relevance) for compound-fault diagnosis of rotating machinery. The proposed EELM consists of two sub-networks: the first extreme learning machine (ELM) for clustering, and the second for multi-label classification. The first network generates the Euclidean distance representations from each point to every centroid with unsupervised clustering, and the second identifies potential output tags through multiple-output-node multi-label learning. Compared to existing multi-label classifiers (e.g., multi-label radial basis function, rank support vector machine, back-propagation multi-label learning, and binary classifiers with binary relevance), the theoretical verification reveals that EELMs perform the best in Hamming loss, one-error, and training time, and achieve the best overall evaluation on two real-world databases (Yeast and Image). For the real test on compound-fault diagnosis of rotating machinery, this paper applies particle swarm optimization-based variational mode decomposition to decompose the raw vibration signals into a series of intrinsic modes, and selects ten time-domain indicators and five frequency-domain statistical characteristics for feature extraction. The experimental results illustrate that the EELM-based fault diagnosis method achieves the best overall performance.

Proceedings Article
12 Jul 2020
TL;DR: A probabilistic model generates statistically independent samples for molecules from their graph representations; it learns a low-dimensional manifold that preserves the geometry of local atomic neighborhoods through a principled learning representation based on Euclidean distance geometry.
Abstract: Great computational effort is invested in generating equilibrium states for molecular systems using, for example, Markov chain Monte Carlo. We present a probabilistic model that generates statistically independent samples for molecules from their graph representations. Our model learns a low-dimensional manifold that preserves the geometry of local atomic neighborhoods through a principled learning representation that is based on Euclidean distance geometry. In a new benchmark for molecular conformation generation, we show experimentally that our generative model achieves state-of-the-art accuracy. Finally, we show how to use our model as a proposal distribution in an importance sampling scheme to compute molecular properties.
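The geometric principle the model builds on, Euclidean distance geometry, can be illustrated with classical multidimensional scaling: given a matrix of pairwise Euclidean distances between atoms, coordinates are recovered up to a rigid motion. This sketch shows only that principle, not the paper's generative model.

```python
import numpy as np

def coordinates_from_distances(D, dim=3):
    """Recover point coordinates (up to rotation/translation) from a pairwise Euclidean distance matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    gram = -0.5 * J @ (D ** 2) @ J               # Gram matrix of centered coordinates
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]        # keep the dim largest eigenvalues
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
```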

Journal ArticleDOI
TL;DR: In this article, an unsupervised manifold learning method is proposed to retrieve topological quantum phase transitions in momentum and real space, where the Chebyshev distance between two data points sharpens the characteristic features of topological quantum phase transitions in momentum space.
Abstract: The discovery of topological features of quantum states plays an important role in modern condensed matter physics and various artificial systems. Due to the absence of local order parameters, the detection of topological quantum phase transitions remains a challenge. Machine learning may provide effective methods for identifying topological features. In this work we show that the unsupervised manifold learning can successfully retrieve topological quantum phase transitions in momentum and real space. Our results show that the Chebyshev distance between two data points sharpens the characteristic features of topological quantum phase transitions in momentum space, while the widely used Euclidean distance is in general suboptimal. Then a diffusion map or isometric map can be applied to implement the dimensionality reduction, and to learn about topological quantum phase transitions in an unsupervised manner. We demonstrate this method on the prototypical Su-Schrieffer-Heeger (SSH) model, the Qi-Wu-Zhang (QWZ) model, and the quenched SSH model in momentum space, and further provide implications and demonstrations for learning in real space, where the topological invariants could be unknown or hard to compute. The interpretable good performance of our approach shows the capability of manifold learning, when equipped with a suitable distance metric, in exploring topological quantum phase transitions.
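The metric comparison at the heart of the method is straightforward to set up, as sketched below for a feature matrix X whose rows are samples (for example, wavefunction features over momentum points, an assumption here); a diffusion map or isometric map would then be applied to the chosen distance matrix.

```python
import numpy as np
from scipy.spatial.distance import cdist

def distance_matrices(X):
    """Pairwise distances under the two metrics compared in the paper."""
    d_euclidean = cdist(X, X, metric="euclidean")   # reported as generally suboptimal here
    d_chebyshev = cdist(X, X, metric="chebyshev")   # max coordinate-wise difference; sharpens transitions
    return d_euclidean, d_chebyshev
```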

Proceedings ArticleDOI
01 May 2020
TL;DR: This paper presents a novel self-supervised scale-aware framework for learning Euclidean distance and ego-motion from raw monocular fisheye videos without applying rectification, and obtains state-of-the-art results comparable to other self-supervised monocular methods.
Abstract: Fisheye cameras are commonly used in applications like autonomous driving and surveillance to provide a large field of view (> 180°). However, they come at the cost of strong non-linear distortions which require more complex algorithms. In this paper, we explore Euclidean distance estimation on fisheye cameras for automotive scenes. Obtaining accurate and dense depth supervision is difficult in practice, but self-supervised learning approaches show promising results and could potentially overcome the problem. We present a novel self-supervised scale-aware framework for learning Euclidean distance and ego-motion from raw monocular fisheye videos without applying rectification. While it is possible to perform piece-wise linear approximation of the fisheye projection surface and apply standard rectilinear models, this has its own set of issues, such as re-sampling distortion and discontinuities in transition regions. To encourage further research in this area, we will release our dataset as part of the WoodScape project [1]. We further evaluated the proposed algorithm on the KITTI dataset and obtained state-of-the-art results comparable to other self-supervised monocular methods. Qualitative results on an unseen fisheye video demonstrate impressive performance.

Journal ArticleDOI
TL;DR: A probabilistic approach is proposed to generalize Song's approach, such that the Euclidean distance in the latent space is represented by KL divergence; as a consequence of this generalization, the authors can use probability distributions as inputs rather than points in the latent space.
Abstract: An autoencoder that learns a latent space in an unsupervised manner has many applications in signal processing. However, the latent space of an autoencoder does not pursue the same clustering goal as K-means or GMM. A recent work proposes to artificially re-align each point in the latent space of an autoencoder to its nearest class neighbors during training (Song et al. 2013). The resulting new latent space is found to be much more suitable for clustering, since clustering information is used. Inspired by this previous work (Song et al. 2013), in this letter we propose several extensions to the technique. First, we propose a probabilistic approach to generalize Song's approach, such that Euclidean distance in the latent space is now represented by KL divergence. Second, as a consequence of this generalization, we can now use probability distributions as inputs rather than points in the latent space. Third, we propose using a Bayesian Gaussian mixture model for clustering in the latent space. We demonstrate the proposed method on digit recognition datasets (MNIST, USPS, and SVHN) as well as scene datasets (Scene15 and MIT67) with interesting findings.
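For two diagonal Gaussians in the latent space, the replacement of the squared Euclidean distance between means by a KL divergence looks as follows; the exact direction of the divergence and how it enters the training objective are not specified in the abstract, so this is only an illustration.

```python
import numpy as np

def kl_diagonal_gaussians(mu0, var0, mu1, var1):
    """KL(N(mu0, diag(var0)) || N(mu1, diag(var1))); compare with the Euclidean np.sum((mu0 - mu1) ** 2)."""
    return 0.5 * np.sum(np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)
```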

Journal ArticleDOI
TL;DR: This study discusses the calculation of the Euclidean distance formula in KNN and compares it with the normalized Euclidean distance, Manhattan, and normalized Manhattan distances to achieve an optimal value in finding the distance to the nearest neighbor.
Abstract: K-Nearest Neighbor (KNN) is a method for classifying objects based on the learning data closest to the object, i.e., by comparison between previous and current data. In the learning process, KNN calculates the distance to the nearest neighbors using the Euclidean distance formula, while other methods have optimized the distance formula by comparing it with similar formulas in order to get optimal results. This study discusses the calculation of the Euclidean distance formula in KNN and compares it with the normalized Euclidean distance, Manhattan, and normalized Manhattan distances to achieve an optimal value in finding the distance to the nearest neighbor.
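The four distance variants compared in the study can be written compactly as below; the range-based attribute normalization is one common convention and may differ from the normalization used in the paper.

```python
import numpy as np

def euclidean(a, b):
    return np.sqrt(np.sum((a - b) ** 2))

def manhattan(a, b):
    return np.sum(np.abs(a - b))

def normalized(distance, a, b, X):
    """Normalized variant: rescale each attribute by its range over the training data X first."""
    spread = X.max(axis=0) - X.min(axis=0)
    spread[spread == 0] = 1.0
    return distance(a / spread, b / spread)

# KNN then simply keeps the k training points with the smallest chosen distance to the query.
```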

Journal ArticleDOI
TL;DR: A variational autoencoder-based just-in-time (JIT) learning framework for soft sensor modeling is proposed, with Gaussian process regression as the nonlinear regression model; Kullback-Leibler divergence is employed to evaluate the similarity between historical samples and a query sample.

Journal ArticleDOI
TL;DR: A particle swarm optimization algorithm based on the ring neighborhood topology of Euclidean distance between particles is proposed, which is called the close neighbor mobility optimization algorithm, and has better performance than most single-objective multi-modal algorithms.

Journal ArticleDOI
TL;DR: A new CBIR methodology is proposed in which the region of interest of the image is first found using the Sobel and Canny methods and the output is then mapped to the HSV color space, which is closer to human visual perception; the adequacy of any CBIR framework relies upon the features extracted from a color picture.
Abstract: Nowadays the content-based image retrieval (CBIR) framework is drawing the attention of numerous researchers because of its far-reaching applications in many areas. In this paper, a new CBIR methodology is proposed; the adequacy of any CBIR framework relies upon the features extracted from a color picture. In this work, the region of interest of the image is first found using the Sobel and Canny methods, and the output is then mapped to the HSV color space, which is closer to human visual perception. For classification, a neural network is used to categorize the data with class labels. The similarity distance between the query image and stored images is estimated with different similarity metrics such as the Manhattan distance, Euclidean distance, Chebyshev distance, Hamming distance, and Jaccard distance. The experimental results are evaluated in terms of accuracy and precision on two well-known databases, the Corel-1k and Corel-5k datasets; the new methodology achieves better accuracy results of up to 87.33% and 68.93%, respectively, and also improves the precision results up to 86.36% and 68.47%, respectively. In this paper, results are also extended up to 80%.

Journal ArticleDOI
TL;DR: A novel three-way decisions approach is proposed and applied to medical diagnosis, and it is shown that different types of parameters can reflect the level of the tolerance relation and the risk preference of decision makers.

Journal ArticleDOI
TL;DR: In this article, it is shown that for a random Borel set independent of the planar Gaussian free field, the Hausdorff dimensions with respect to the Euclidean metric and with respect to the Liouville quantum gravity (LQG) metric are related by the KPZ formula, and that the Hausdorff dimension of the continuum LQG metric equals the exponent $d_\gamma$ which describes distances in discrete approximations of LQG.
Abstract: Let $\gamma\in (0,2)$, let $h$ be the planar Gaussian free field, and let $D_h$ be the associated $\gamma$-Liouville quantum gravity (LQG) metric. We prove that for any random Borel set $X \subset \mathbb{C}$ which is independent from $h$, the Hausdorff dimensions of $X$ with respect to the Euclidean metric and with respect to the $\gamma$-LQG metric $D_h$ are a.s. related by the (geometric) KPZ formula. As a corollary, we deduce that the Hausdorff dimension of the continuum $\gamma$-LQG metric is equal to the exponent $d_\gamma > 2$ studied by Ding and Gwynne (2018), which describes distances in discrete approximations of $\gamma$-LQG such as random planar maps. We also derive "worst-case" bounds relating the Euclidean and $\gamma$-LQG dimensions of $X$ when $X$ and $h$ are not necessarily independent, which answers a question posed by Aru (2015). Using these bounds, we obtain an upper bound for the Euclidean Hausdorff dimension of a $\gamma$-LQG geodesic which equals $1.312\dots$ when $\gamma = \sqrt{8/3}$; and an upper bound of $1.9428\dots$ for the Euclidean Hausdorff dimension of a connected component of the boundary of a $\sqrt{8/3}$-LQG metric ball. We use the axiomatic definition of the $\gamma$-LQG metric, so the paper can be understood by readers with minimal background knowledge beyond a basic level of familiarity with the Gaussian free field.

Journal ArticleDOI
TL;DR: A scale-invariant PC geometry quality assessment metric is proposed based on a new type of correspondence, namely between a point and a distribution of points, which is able to reliably measure the geometry quality for PCs with different intrinsic characteristics and degraded by several coding solutions.
Abstract: Nowadays, point clouds (PCs) are a promising representation format for immersive content and target several emerging applications, notably in virtual and augmented reality. However, efficient coding solutions are critically needed due to the large amount of PC data required for high quality user experiences. To address these needs, several PC coding standards were developed and thus, objective PC quality metrics able to accurately account for the subjective impact of coding artifacts are needed. In this paper, a scale-invariant PC geometry quality assessment metric is proposed based on a new type of correspondence, namely between a point and a distribution of points. This metric is able to reliably measure the geometry quality for PCs with different intrinsic characteristics and degraded by several coding solutions. Experimental results show the superiority of the proposed PC quality metric over relevant state-of-the-art.

Journal ArticleDOI
TL;DR: A new data-driven dissimilarity measure, called MADD, is used; it turns the distance concentration phenomenon to its advantage, and as a result, clustering algorithms based on MADD usually perform well for high-dimensional data.
Abstract: Popular clustering algorithms based on usual distance functions (e.g., the Euclidean distance) often suffer in high dimension, low sample size (HDLSS) situations, where concentration of pairwise distances and violation of neighborhood structure have adverse effects on their performance. In this article, we use a new data-driven dissimilarity measure, called MADD, which takes care of these problems. MADD uses the distance concentration phenomenon to its advantage, and as a result, clustering algorithms based on MADD usually perform well for high dimensional data. We establish it using theoretical as well as numerical studies. We also address the problem of estimating the number of clusters. This is a challenging problem in cluster analysis, and several algorithms are available for it. We show that many of these existing algorithms have superior performance in high dimensions when they are constructed using MADD. We also construct a new estimator based on a penalized version of the Dunn index and prove its consistency in the HDLSS asymptotic regime. Several simulated and real data sets are analyzed to demonstrate the usefulness of MADD for cluster analysis of high dimensional data.
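Written from the usual description of MADD (mean absolute difference of distances), the dissimilarity between two observations is the average absolute difference of their Euclidean distances to every other observation; consult the paper for the exact definition and any scaling.

```python
import numpy as np

def madd(i, j, X):
    """MADD-style dissimilarity between rows i and j of the data matrix X."""
    others = np.array([k for k in range(len(X)) if k not in (i, j)])
    d_i = np.linalg.norm(X[others] - X[i], axis=1)   # Euclidean distances from x_i to all other points
    d_j = np.linalg.norm(X[others] - X[j], axis=1)   # Euclidean distances from x_j to all other points
    return float(np.mean(np.abs(d_i - d_j)))
```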

Journal ArticleDOI
TL;DR: The Euclidean distance degree of an algebraic variety is a well-studied topic in applied algebra and geometry as mentioned in this paper, and has direct applications in geometric modeling, computer vision, and statistics.
Abstract: The Euclidean distance degree of an algebraic variety is a well-studied topic in applied algebra and geometry. It has direct applications in geometric modeling, computer vision, and statistics. We ...