
Showing papers on "Metric (mathematics)" published in 2021


Journal ArticleDOI
TL;DR: A deeply supervised (DS) attention metric-based network (DSAMNet) is proposed in this article to learn change maps by means of deep metric learning, in which convolutional block attention modules (CBAM) are integrated to provide more discriminative features.
Abstract: Change detection (CD) aims to identify surface changes from bitemporal images. In recent years, deep learning (DL)-based methods have made substantial breakthroughs in the field of CD. However, CD results can be easily affected by external factors, including illumination, noise, and scale, which leads to pseudo-changes and noise in the detection map. To deal with these problems and achieve more accurate results, a deeply supervised (DS) attention metric-based network (DSAMNet) is proposed in this article. A metric module is employed in DSAMNet to learn change maps by means of deep metric learning, in which convolutional block attention modules (CBAM) are integrated to provide more discriminative features. As an auxiliary, a DS module is introduced to enhance the feature extractor's learning ability and generate more useful features. Moreover, another challenge encountered by data-driven DL algorithms is posed by the limitations in change detection datasets (CDDs). Therefore, we create a CD dataset, Sun Yat-Sen University (SYSU)-CD, for bitemporal image CD, which contains a total of 20,000 aerial image pairs of size 256 x 256. Experiments are conducted on both the CDD and the SYSU-CD dataset. Compared to other state-of-the-art methods, our network achieves the highest accuracy on both datasets, with an F1 of 93.69% on the CDD dataset and 78.18% on the SYSU-CD dataset.
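The abstract above relies on convolutional block attention modules (CBAM) to obtain more discriminative features. As background, the sketch below is a minimal, generic PyTorch implementation of a CBAM-style block (channel attention followed by spatial attention); the class names, reduction ratio and kernel size are illustrative assumptions, and this is not the DSAMNet code.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze spatial dims with avg/max pooling, then reweight channels."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))             # (B, C)
        mx = self.mlp(x.amax(dim=(2, 3)))              # (B, C)
        w = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * w

class SpatialAttention(nn.Module):
    """Pool over channels, then learn a per-pixel attention map."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)              # (B, 1, H, W)
        mx = x.amax(dim=1, keepdim=True)               # (B, 1, H, W)
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w

class CBAM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))

feat = torch.randn(2, 64, 32, 32)
print(CBAM(64)(feat).shape)  # torch.Size([2, 64, 32, 32])
```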

206 citations


Journal ArticleDOI
TL;DR: This paper proposes a new domain adaptation method named Adversarial Tight Match (ATM), which enjoys the benefits of both adversarial training and metric learning, and proposes a novel distance loss, named Maximum Density Divergence (MDD), to quantify the distribution divergence.
Abstract: Unsupervised domain adaptation addresses the problem of transferring knowledge from a well-labeled source domain to an unlabeled target domain where the two domains have distinctive data distributions. Thus, the essence of domain adaptation is to mitigate the distribution divergence between the two domains. The state-of-the-art methods practice this very idea by either conducting adversarial training or minimizing a metric which defines the distribution gaps. In this paper, we propose a new domain adaptation method named adversarial tight match (ATM) which enjoys the benefits of both adversarial training and metric learning. Specifically, at first, we propose a novel distance loss, named maximum density divergence (MDD), to quantify the distribution divergence. MDD minimizes the inter-domain divergence (“match” in ATM) and maximizes the intra-class density (“tight” in ATM). Then, to address the equilibrium challenge in adversarial domain adaptation, we consider leveraging the proposed MDD within the adversarial domain adaptation framework. At last, we tailor the proposed MDD as a practical learning loss and report our ATM. Both empirical evaluation and theoretical analysis are reported to verify the effectiveness of the proposed method. The experimental results on four benchmarks, both classical and large-scale, show that our method is able to achieve new state-of-the-art performance on most evaluations.

171 citations


Journal ArticleDOI
TL;DR: Higher order tracking accuracy (HOTA), proposed in this paper, explicitly balances the effect of performing accurate detection, association and localization into a single unified metric for comparing trackers, and captures important aspects of MOT performance not previously taken into account by established metrics.
Abstract: Multi-object tracking (MOT) has been notoriously difficult to evaluate. Previous metrics overemphasize the importance of either detection or association. To address this, we present a novel MOT evaluation metric, higher order tracking accuracy (HOTA), which explicitly balances the effect of performing accurate detection, association and localization into a single unified metric for comparing trackers. HOTA decomposes into a family of sub-metrics which are able to evaluate each of five basic error types separately, which enables clear analysis of tracking performance. We evaluate the effectiveness of HOTA on the MOTChallenge benchmark, and show that it is able to capture important aspects of MOT performance not previously taken into account by established metrics. Furthermore, we show HOTA scores better align with human visual evaluation of tracking performance.
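As background on the metric itself, at a single localization threshold α HOTA factors into a detection-accuracy term and an association-accuracy term, and the final score averages HOTA_α over a range of thresholds. The toy function below follows that published decomposition for one threshold; the TP/FN/FP counts and per-true-positive association counts are assumed to come from an upstream matcher.

```python
import math

def hota_alpha(num_tp, num_fn, num_fp, assoc_counts):
    """HOTA at one localization threshold.

    assoc_counts: list of (tpa, fna, fpa) triples, one per true-positive
    match, counting associations that agree/disagree with the ground-truth
    identity across the whole sequence.
    """
    det_a = num_tp / (num_tp + num_fn + num_fp)
    ass_a = sum(tpa / (tpa + fna + fpa) for tpa, fna, fpa in assoc_counts) / num_tp
    return math.sqrt(det_a * ass_a)

# Two TPs: one perfectly associated, one with identity errors.
print(hota_alpha(num_tp=2, num_fn=1, num_fp=0,
                 assoc_counts=[(10, 0, 0), (4, 3, 3)]))
```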

169 citations


Proceedings ArticleDOI
01 Jan 2021
TL;DR: In this article, the authors proposed a Deep Attentive Center Loss (DACL) method to adaptively select a subset of significant feature elements for enhanced discrimination, which integrates an attention mechanism to estimate attention weights correlated with feature importance.
Abstract: Learning discriminative features for Facial Expression Recognition (FER) in the wild using Convolutional Neural Networks (CNNs) is a non-trivial task due to the significant intra-class variations and inter-class similarities. Deep Metric Learning (DML) approaches such as center loss and its variants jointly optimized with softmax loss have been adopted in many FER methods to enhance the discriminative power of learned features in the embedding space. However, equally supervising all features with the metric learning method might include irrelevant features and ultimately degrade the generalization ability of the learning algorithm. We propose a Deep Attentive Center Loss (DACL) method to adaptively select a subset of significant feature elements for enhanced discrimination. The proposed DACL integrates an attention mechanism to estimate attention weights correlated with feature importance using the intermediate spatial feature maps in CNN as context. The estimated weights accommodate the sparse formulation of center loss to selectively achieve intra-class compactness and inter-class separation for the relevant information in the embedding space. An extensive study on two widely used wild FER datasets demonstrates the superiority of the proposed DACL method compared to state-of-the-art methods.
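For context, vanilla center loss penalizes the squared distance between each embedding and its class center, and DACL reweights individual feature dimensions with estimated attention. The snippet below is a schematic NumPy version of such a weighted center loss, assuming the attention weights are produced elsewhere; it is not the authors' exact formulation.

```python
import numpy as np

def attentive_center_loss(features, labels, centers, attention):
    """Weighted center loss: 0.5 * mean_i sum_d a[i,d] * (x[i,d] - c[y_i,d])^2.

    features:  (N, D) embeddings
    labels:    (N,)   integer class ids
    centers:   (K, D) one learnable center per class
    attention: (N, D) per-dimension weights in [0, 1] (assumed given)
    """
    diff = features - centers[labels]          # (N, D)
    return 0.5 * np.mean(np.sum(attention * diff ** 2, axis=1))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
y = rng.integers(0, 3, size=8)
c = rng.normal(size=(3, 4))
a = rng.uniform(size=(8, 4))
print(attentive_center_loss(x, y, c, a))
```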

137 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the authors explore and analyze the latent style space of Style-GAN2, a state-of-the-art architecture for image generation, using models pretrained on several different datasets.
Abstract: We explore and analyze the latent style space of Style-GAN2, a state-of-the-art architecture for image generation, using models pretrained on several different datasets. We first show that StyleSpace, the space of channel-wise style parameters, is significantly more disentangled than the other intermediate latent spaces explored by previous works. Next, we describe a method for discovering a large collection of style channels, each of which is shown to control a distinct visual attribute in a highly localized and dis-entangled manner. Third, we propose a simple method for identifying style channels that control a specific attribute, using a pretrained classifier or a small number of example images. Manipulation of visual attributes via these StyleSpace controls is shown to be better disentangled than via those proposed in previous works. To show this, we make use of a newly proposed Attribute Dependency metric. Finally, we demonstrate the applicability of StyleSpace controls to the manipulation of real images. Our findings pave the way to semantically meaningful and well-disentangled image manipulations via simple and intuitive interfaces.

123 citations


Journal ArticleDOI
TL;DR: A novel few-shot learning method named multi-scale metric learning (MSML) is proposed to extract multi-scale features and learn the multi-scale relations between samples for few-shot classification.
Abstract: Few-shot learning in image classification is developed to learn a model that aims to identify unseen classes with only a few training samples for each class. Fewer training samples and new classification tasks make many traditional classification models no longer applicable. In this paper, a novel few-shot learning method named multi-scale metric learning (MSML) is proposed to extract multi-scale features and learn the multi-scale relations between samples for few-shot classification. In the proposed method, a feature pyramid structure is introduced for multi-scale feature embedding, which aims to combine high-level strong semantic features with low-level but abundant visual features. Then a multi-scale relation generation network (MRGN) is developed for hierarchical metric learning, in which high-level features correspond to deeper metric learning while low-level features correspond to lighter metric learning. Moreover, a novel loss function named intra-class and inter-class relation loss (IIRL) is proposed to optimize the proposed deep network, which aims to strengthen the correlation between homogeneous groups of samples and weaken the correlation between heterogeneous groups of samples. Experimental results on miniImageNet and tieredImageNet demonstrate that the proposed method achieves superior performance on the few-shot learning problem.

122 citations


Journal ArticleDOI
TL;DR: Zhang et al. as mentioned in this paper proposed a novel feature descriptor named the Histogram of Orientated Phase Congruency (HOPC), which is based on the structural properties of images.
Abstract: Automatic registration of multimodal remote sensing data (e.g., optical, LiDAR, SAR) is a challenging task due to the significant non-linear radiometric differences between these data. To address this problem, this paper proposes a novel feature descriptor named the Histogram of Orientated Phase Congruency (HOPC), which is based on the structural properties of images. Furthermore, a similarity metric named HOPCncc is defined, which uses the normalized correlation coefficient (NCC) of the HOPC descriptors for multimodal registration. In the definition of the proposed similarity metric, we first extend the phase congruency model to generate its orientation representation, and use the extended model to build HOPCncc. Then a fast template matching scheme for this metric is designed to detect the control points between images. The proposed HOPCncc aims to capture the structural similarity between images, and has been tested with a variety of optical, LiDAR, SAR and map data. The results show that HOPCncc is robust against complex non-linear radiometric differences and outperforms the state-of-the-art similarity metrics (i.e., NCC and mutual information) in matching performance. Moreover, a robust registration method is also proposed in this paper based on HOPCncc, which is evaluated using six pairs of multimodal remote sensing images. The experimental results demonstrate the effectiveness of the proposed method for multimodal image registration.
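Since HOPCncc is built on the normalized correlation coefficient of descriptor vectors, a small reference implementation of NCC may help; this is the standard definition rather than code from the paper.

```python
import numpy as np

def ncc(a, b, eps=1e-12):
    """Normalized correlation coefficient between two descriptor vectors.

    Returns a value in [-1, 1]; values near 1 indicate similar structure.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

print(ncc([1, 2, 3, 4], [2, 4, 6, 8]))   # 1.0 (identical up to affine scaling)
print(ncc([1, 2, 3, 4], [4, 3, 2, 1]))   # -1.0
```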

95 citations


Journal ArticleDOI
Fuyuan Xiao1
TL;DR: A generalized evidential distance measure called the CED is proposed, which can measure the difference or dissimilarity between CBBAs in complex evidence theory and is applied to a medical diagnosis problem to illustrate its practicability.
Abstract: Evidence theory is an effective methodology for modeling and processing uncertainty that has been widely applied in various fields. In evidence theory, a number of distance measures have been presented, which play an important role in representing the degree of difference between pieces of evidence. However, the existing evidential distances focus on traditional basic belief assignments (BBAs) modeled in terms of real numbers and are not compatible with complex BBAs (CBBAs) extended to the complex plane. Therefore, in this article, a generalized evidential distance measure called the complex evidential distance (CED) is proposed, which can measure the difference or dissimilarity between CBBAs in complex evidence theory. This is the first work to consider distance measures for CBBAs, and it provides a promising way to measure the differences between pieces of evidence in a more general framework of complex plane space. Furthermore, the CED is a strict distance metric with the properties of nonnegativity, nondegeneracy, symmetry, and triangle inequality that satisfies the axioms of a distance. In particular, when the CBBAs degenerate into classical BBAs, the CED will degenerate into Jousselme et al.'s distance. Therefore, the proposed CED is a generalization of the traditional evidential distance, but it has a greater ability to measure the difference or dissimilarity between pieces of evidence. Finally, a decision-making algorithm for pattern recognition is devised based on the CED and is applied to a medical diagnosis problem to illustrate its practicability.
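Because the CED is stated to reduce to Jousselme et al.'s distance for classical BBAs, that special case is a convenient reference point. The sketch below implements the classical Jousselme distance, d(m1, m2) = sqrt(0.5 (m1 - m2)^T D (m1 - m2)) with D(A, B) = |A ∩ B| / |A ∪ B|; the complex-valued generalization introduced in the paper is not reproduced here.

```python
import numpy as np

def jousselme_distance(m1, m2):
    """Jousselme distance between two classical BBAs.

    m1, m2: dicts mapping focal sets (frozensets) to masses summing to 1.
    """
    focal = sorted(set(m1) | set(m2), key=lambda s: (len(s), sorted(s)))
    v1 = np.array([m1.get(A, 0.0) for A in focal])
    v2 = np.array([m2.get(A, 0.0) for A in focal])
    # Jaccard similarity matrix between focal sets.
    D = np.array([[len(A & B) / len(A | B) for B in focal] for A in focal])
    diff = v1 - v2
    return float(np.sqrt(0.5 * diff @ D @ diff))

a, b = frozenset({"a"}), frozenset({"b"})
m1 = {a: 0.7, frozenset({"a", "b"}): 0.3}
m2 = {b: 0.7, frozenset({"a", "b"}): 0.3}
print(jousselme_distance(m1, m2))  # 0.7
```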

89 citations


Journal ArticleDOI
TL;DR: This paper provides a global explainable AI model which is based on Lorenz decomposition, thus extending previous contributions based on variance decompositions, and provides a unifying variable importance criterion that combines predictive accuracy with explainability, using a normalised and easy to interpret metric.
Abstract: Explainability of artificial intelligence methods has become a crucial issue, especially in the most regulated fields, such as health and finance. In this paper, we provide a global explainable AI method which is based on Lorenz decompositions, thus extending previous contributions based on variance decompositions. This allows the resulting Shapley-Lorenz decomposition to be more generally applicable, and provides a unifying variable importance criterion that combines predictive accuracy with explainability, using a normalised and easy to interpret metric. The proposed decomposition is illustrated within the context of a real financial problem: the prediction of bitcoin prices.

87 citations


Journal ArticleDOI
TL;DR: The proposed metric has demonstrated the state-of-the-art performance for predicting the subjective point cloud quality compared with multiple full-reference and no-reference models, e.g., the weighted peak signal-to-noise ratio (PSNR), structural similarity (SSIM), feature similarity (FSIM) and natural image quality evaluator (NIQE).
Abstract: Point clouds have emerged as a promising media format to represent realistic 3D objects or scenes in applications such as virtual reality, teleportation, etc. How to accurately quantify the subjective point cloud quality for application-driven optimization, however, is still a challenging and open problem. In this paper, we attempt to tackle this problem in a systematic manner. First, we produce a fairly large point cloud dataset where ten popular point clouds are augmented with seven types of impairments (e.g., compression, photometry/color noise, geometry noise, scaling) at six different distortion levels, and organize a formal subjective assessment with tens of subjects to collect mean opinion scores (MOS) for all 420 processed point cloud samples (PPCS). We then try to develop an objective metric that can accurately estimate the subjective quality. Towards this goal, we choose to project the 3D point cloud onto six perpendicular image planes of a cube for the color texture image and corresponding depth image, and aggregate image-based global (e.g., Jensen-Shannon (JS) divergence) and local features (e.g., edge, depth, pixel-wise similarity, complexity) among all projected planes for a final objective index. Model parameters are fixed constants after performing the regression using a small and independent dataset previously published. The proposed metric has demonstrated the state-of-the-art performance for predicting the subjective point cloud quality compared with multiple full-reference and no-reference models, e.g., the weighted peak signal-to-noise ratio (PSNR), structural similarity (SSIM), feature similarity (FSIM) and natural image quality evaluator (NIQE). The dataset is made publicly accessible at http://smt.sjtu.edu.cn or http://vision.nju.edu.cn for all interested audiences.

82 citations


Journal ArticleDOI
TL;DR: Using information geometry, it is proved that irreversible entropy production is bounded from below by a modified Wasserstein distance between the initial and final states, thus strengthening the Clausius inequality in the reversible-Markov case.
Abstract: We derive geometrical bounds on the irreversibility in both quantum and classical Markovian open systems that satisfy the detailed balance condition. Using information geometry, we prove that irreversible entropy production is bounded from below by a modified Wasserstein distance between the initial and final states, thus strengthening the Clausius inequality in the reversible-Markov case. The modified metric can be regarded as a discrete-state generalization of the Wasserstein metric, which has been used to bound dissipation in continuous-state Langevin systems. Notably, the derived bounds can be interpreted as the quantum and classical speed limits, implying that the associated entropy production constrains the minimum time of transforming a system state. We illustrate the results on several systems and show that a tighter bound than the Carnot bound for the efficiency of quantum heat engines can be obtained.

Journal ArticleDOI
TL;DR: This paper proposes a new approach, called the Feature Space Metric-based Meta-learning Model (FSM3), a mixture of general supervised learning and episodic metric meta-learning, to overcome the challenge of few-shot fault diagnosis under multiple limited data conditions.

Journal ArticleDOI
TL;DR: Experiments indicate that the proposed approaches outperform state-of-the-art highly imbalanced learning methods and are more robust to high IR.
Abstract: With the expansion of data, increasing imbalanced data has emerged. When the imbalance ratio (IR) of data is high, most existing imbalanced learning methods decline seriously in classification performance. In this paper, we systematically investigate the highly imbalanced data classification problem, and propose an uncorrelated cost-sensitive multiset learning (UCML) approach for it. Specifically, UCML first constructs multiple balanced subsets through random partition, and then employs the multiset feature learning (MFL) to learn discriminant features from the constructed multiset. To enhance the usability of each subset and deal with the non-linearity issue existed in each subset, we further propose a deep metric based UCML (DM-UCML) approach. DM-UCML introduces the generative adversarial network technique into the multiset constructing process, such that each subset can own similar distribution with the original dataset. To cope with the non-linearity issue, DM-UCML integrates deep metric learning with MFL, such that more favorable performance can be achieved. In addition, DM-UCML designs a new discriminant term to enhance the discriminability of learned metrics. Experiments on eight traditional highly class-imbalanced datasets and two large-scale datasets indicate that: the proposed approaches outperform state-of-the-art highly imbalanced learning methods and are more robust to high IR.
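The first step of UCML, randomly partitioning a highly imbalanced dataset into several balanced subsets, is easy to illustrate. The toy helper below shows one such partition for a binary problem; the multiset feature learning, GAN-based generation and deep metric components of DM-UCML are not covered, and the function name is invented for illustration.

```python
import numpy as np

def balanced_subsets(X, y, minority_label, rng=None):
    """Split the majority class into chunks of minority size and pair each
    chunk with all minority samples, yielding roughly balanced subsets."""
    rng = np.random.default_rng(rng)
    min_idx = np.flatnonzero(y == minority_label)
    maj_idx = rng.permutation(np.flatnonzero(y != minority_label))
    n = len(min_idx)
    subsets = []
    for start in range(0, len(maj_idx), n):
        chunk = maj_idx[start:start + n]
        idx = np.concatenate([min_idx, chunk])
        subsets.append((X[idx], y[idx]))
    return subsets

X = np.arange(40).reshape(20, 2)
y = np.array([1] * 4 + [0] * 16)          # imbalance ratio 4:1
for Xs, ys in balanced_subsets(X, y, minority_label=1, rng=0):
    print(Xs.shape, np.bincount(ys))      # each subset holds 4 minority + 4 majority
```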

Journal ArticleDOI
TL;DR: In this article, the authors employ the Newman-Janis procedure to construct a rotating generalisation of the non-rotating regular geometries and obtain a stationary, axially symmetric metric that depends on mass, spin and an additional real parameter.
Abstract: The recent opening of gravitational wave astronomy has shifted the debate about black hole mimickers from a purely theoretical arena to a phenomenological one. In this respect, missing a definitive quantum gravity theory, the possibility to have simple, meta-geometries describing in a compact way alternative phenomenologically viable scenarios is potentially very appealing. A recently proposed metric by Simpson and Visser is exactly an example of such meta-geometry describing, for different values of a single parameter, different non-rotating black hole mimickers. Here, we employ the Newman--Janis procedure to construct a rotating generalisation of such geometry. We obtain a stationary, axially symmetric metric that depends on mass, spin and an additional real parameter $\ell$. According to the value of such parameter, the metric may represent a rotating traversable wormhole, a rotating regular black hole with one or two horizons, or three more limiting cases. By studying the internal and external rich structure of such solutions, we show that the obtained metric describes a family of interesting and simple regular geometries providing viable Kerr black hole mimickers for future phenomenological studies.
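For readers who want the starting point of the construction, the non-rotating Simpson-Visser geometry that serves as the seed of the Newman-Janis procedure can be written as below (the rotating line element derived in the paper is longer and is not reproduced here); roughly, ℓ = 0 recovers Schwarzschild, small ℓ gives a regular black hole, and sufficiently large ℓ a traversable wormhole.

```latex
ds^2 = -\left(1 - \frac{2m}{\sqrt{r^2 + \ell^2}}\right) dt^2
     + \left(1 - \frac{2m}{\sqrt{r^2 + \ell^2}}\right)^{-1} dr^2
     + \left(r^2 + \ell^2\right)\left(d\theta^2 + \sin^2\theta \, d\phi^2\right)
```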

Proceedings ArticleDOI
30 May 2021
TL;DR: In this article, a probabilistic, multi-modal, multi-object tracking system consisting of different trainable modules is proposed to provide robust and data-driven tracking results.
Abstract: Multi-object tracking is an important ability for an autonomous vehicle to safely navigate a traffic scene. Current state-of-the-art follows the tracking-by-detection paradigm where existing tracks are associated with detected objects through some distance metric. Key challenges to increase tracking accuracy lie in data association and track life cycle management. We propose a probabilistic, multi-modal, multi-object tracking system consisting of different trainable modules to provide robust and data-driven tracking results. First, we learn how to fuse features from 2D images and 3D LiDAR point clouds to capture the appearance and geometric information of an object. Second, we propose to learn a metric that combines the Mahalanobis and feature distances when comparing a track and a new detection in data association. And third, we propose to learn when to initialize a track from an unmatched object detection. Through extensive quantitative and qualitative results, we show that when using the same object detectors our method outperforms state-of-the-art approaches on the NuScenes and KITTI datasets.
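The association step described above combines a motion-based Mahalanobis distance with a learned feature (appearance) distance. A generic version of such a combined affinity is sketched below; the fixed mixing weight alpha and the use of cosine distance are illustrative assumptions, whereas the paper learns the combination.

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Distance of a detection x from a track's predicted state distribution."""
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

def cosine_distance(f1, f2, eps=1e-12):
    return 1.0 - float(np.dot(f1, f2) /
                       (np.linalg.norm(f1) * np.linalg.norm(f2) + eps))

def combined_cost(det_state, det_feat, track_mean, track_cov, track_feat,
                  alpha=0.5):
    """Weighted sum of geometric and appearance terms (alpha is assumed)."""
    return (alpha * mahalanobis(det_state, track_mean, track_cov)
            + (1 - alpha) * cosine_distance(det_feat, track_feat))

track_mean = np.array([0.0, 0.0])
track_cov = np.diag([0.5, 0.5])
print(combined_cost(np.array([0.3, -0.2]), np.array([0.9, 0.1, 0.4]),
                    track_mean, track_cov, np.array([0.8, 0.2, 0.5])))
```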

Journal ArticleDOI
TL;DR: All the algorithms studied in this paper are evaluated with exhaustive testing to analyze their capabilities in standard classification problems, particularly considering dimensionality reduction and kernelization.

Journal ArticleDOI
TL;DR: In this paper, the authors show how to generate set-valued predictions from a black-box predictor that control the expected loss on future test points at a user-specified level.
Abstract: While improving prediction accuracy has been the focus of machine learning in recent years, this alone does not suffice for reliable decision-making. Deploying learning systems in consequential settings also requires calibrating and communicating the uncertainty of predictions. To convey instance-wise uncertainty for prediction tasks, we show how to generate set-valued predictions from a black-box predictor that control the expected loss on future test points at a user-specified level. Our approach provides explicit finite-sample guarantees for any dataset by using a holdout set to calibrate the size of the prediction sets. This framework enables simple, distribution-free, rigorous error control for many tasks, and we demonstrate it in five large-scale machine learning problems: (1) classification problems where some mistakes are more costly than others; (2) multi-label classification, where each observation has multiple associated labels; (3) classification problems where the labels have a hierarchical structure; (4) image segmentation, where we wish to predict a set of pixels containing an object of interest; and (5) protein structure prediction. Lastly, we discuss extensions to uncertainty quantification for ranking, metric learning and distributionally robust learning.
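The calibration idea at the heart of the approach, using a holdout set to pick the most aggressive prediction-set threshold whose empirical risk stays below the user's level, can be illustrated with a short sketch. This simplified version omits the finite-sample correction (e.g., the upper confidence bound on the risk) that gives the method its formal guarantee, and the softmax-thresholding rule is an assumption.

```python
import numpy as np

def calibrate_threshold(probs, labels, target_risk, grid=None):
    """Pick the largest softmax threshold whose holdout miscoverage risk is
    below target_risk. Prediction set = {classes with prob >= threshold}."""
    grid = np.linspace(0.0, 1.0, 101) if grid is None else grid
    best = 0.0
    for t in grid:
        sets = probs >= t                         # (N, K) boolean prediction sets
        miss = ~sets[np.arange(len(labels)), labels]
        if miss.mean() <= target_risk:
            best = t                              # larger t -> smaller sets
    return best

rng = np.random.default_rng(0)
logits = rng.normal(size=(200, 5))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = probs.argmax(axis=1)                     # toy holdout: labels follow the model
print("threshold:", calibrate_threshold(probs, labels, target_risk=0.1))
```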

Proceedings ArticleDOI
29 Jun 2021
TL;DR: In this paper, a method is presented for reducing a multi-class Confusion Matrix into a 2 × 2 version, enabling the use of the relevant performance metrics and methods, like the Receiver Operating Characteristic and the Area Under the Curve, for the assessment of different classification algorithms.
Abstract: The paper presents a novel method for reducing a multi-class Confusion Matrix into a 2 × 2 version enabling the use of the relevant performance metrics and methods like the Receiver Operating Characteristic and the Area Under the Curve for the assessment of different classification algorithms. The reduction method is based on class grouping and leads to a specific Confusion Matrix type. The developed method is then exploited for the assessment of several state-of-the-art machine learning algorithms applied to a customer experience metric.
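The grouping idea itself is simple to demonstrate: choose a subset of classes to act as the positive group, collapse the remaining classes into the negative group, and sum the corresponding blocks of the multi-class confusion matrix. The helper below is a plain illustration of that reduction, not the specific Confusion Matrix type developed in the paper.

```python
import numpy as np

def reduce_confusion_matrix(cm, positive_classes):
    """Collapse a KxK confusion matrix (rows = true, cols = predicted)
    into a 2x2 matrix [[TP, FN], [FP, TN]] for a chosen class grouping."""
    cm = np.asarray(cm)
    pos = np.zeros(cm.shape[0], dtype=bool)
    pos[list(positive_classes)] = True
    neg = ~pos
    tp = cm[np.ix_(pos, pos)].sum()
    fn = cm[np.ix_(pos, neg)].sum()
    fp = cm[np.ix_(neg, pos)].sum()
    tn = cm[np.ix_(neg, neg)].sum()
    return np.array([[tp, fn], [fp, tn]])

cm = np.array([[50,  2,  3],
               [ 4, 40,  6],
               [ 1,  5, 60]])
print(reduce_confusion_matrix(cm, positive_classes=[0, 1]))  # [[96 9] [6 60]]
```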

Journal ArticleDOI
TL;DR: A novel few-shot learning framework named hybrid inference network (HIN) is proposed to tackle the problem of SAR target recognition with only a few training samples by combining the inductive inference and the transductive inference methods.
Abstract: Synthetic aperture radar (SAR) automatic target recognition (ATR) plays an important role in SAR image interpretation. However, at least hundreds of training samples are usually required for each target type in the existing SAR ATR algorithms. In this article, a novel few-shot learning framework named hybrid inference network (HIN) is proposed to tackle the problem of SAR target recognition with only a few training samples. The recognition procedure of HIN consists of two main stages. In the first stage, an embedding network is utilized to map the SAR images into an embedding space. In the second stage, a hybrid inference strategy that combines the inductive inference and the transductive inference is adopted to classify the samples in the embedding space. In the inductive inference section, each sample is recognized independently according to a metric based on Euclidean distance. In the transductive inference section, all samples are recognized as a whole according to their manifold structures by label propagation. Finally, in the hybrid inference section, the classification result is obtained by combining the above two inference methods. To train the framework more effectively, a novel loss function named enhanced hybrid loss is proposed to constrain samples to gain better interclass separability in the embedding space. Experimental results on the moving and stationary target acquisition and recognition (MSTAR) benchmark data set illustrate that HIN performs well in few-shot SAR image classification.

Journal ArticleDOI
TL;DR: A novel method called deep transfer metric learning for kernel regression (DTMLKR) is proposed and applied to the remaining useful life (RUL) prediction of bearings under multiple operating conditions, and the superiority of the proposed method is verified.

Journal ArticleDOI
TL;DR: A metric-learning-based hashing network is introduced, which implicitly uses a big, pretrained DNN as an intermediate representation step without the need of retraining or fine-tuning and learns a semantic-based metric space where the features are optimized for the target retrieval task.
Abstract: Hashing methods have recently been shown to be very effective in the retrieval of remote sensing (RS) images due to their computational efficiency and fast search speed. Common hashing methods in RS are based on hand-crafted features on top of which they learn a hash function, which provides the final binary codes. However, these features are not optimized for the final task (i.e., retrieval using binary codes). On the other hand, modern deep neural networks (DNNs) have shown an impressive success in learning optimized features for a specific task in an end-to-end fashion. Unfortunately, typical RS data sets are composed of only a small number of labeled samples, which make the training (or fine-tuning) of big DNNs problematic and prone to overfitting. To address this problem, in this letter, we introduce a metric-learning-based hashing network, which: 1) implicitly uses a big, pretrained DNN as an intermediate representation step without the need of retraining or fine-tuning; 2) learns a semantic-based metric space where the features are optimized for the target retrieval task; and 3) computes compact binary hash codes for fast search. Experiments carried out on two RS benchmarks highlight that the proposed network significantly improves the retrieval performance under the same retrieval time when compared to the state-of-the-art hashing methods in RS.

Journal ArticleDOI
TL;DR: A novel deep metric learning model is proposed, in which machinery condition is classified by retrieving similarities, and a novel loss function called normalized softmax loss with adaptive angle margin (NSL-AAM) is developed to handle the imbalanced distribution of machinery health conditions.
Abstract: Intelligent fault diagnosis based on deep neural networks and big data has been an attractive field and shows great prospects for applications. However, applications in practice face the following problems. (1) Unexpected and unseen faults of machinery may be encountered in real environments. (2) Large collections of healthy condition samples and few fault condition samples result in the imbalanced distribution of machinery health conditions. This paper proposes a novel deep metric learning model, where machinery condition is classified by retrieving similarities. The trained deep metric learning model can learn and recognize new faults quickly and easily to address the first problem. As the core of deep metric learning, a novel loss function called normalized softmax loss with adaptive angle margin (NSL-AAM) is developed for the second problem. NSL-AAM can supervise neural networks learning imbalanced data without altering the original data distribution. Experiments for balanced and imbalanced fault diagnosis are conducted to verify the ability of the proposed model for fault diagnosis. The results demonstrate that the proposed model can not only extract more distinctive features automatically, but also balance the representation of both the majority and minority classes. Furthermore, the results of experiments for diagnosing new faults are reported, which proves the capability of the trained model for open-set classification.
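To give a concrete flavor of margin-based normalized softmax losses such as NSL-AAM, the sketch below computes an ArcFace-style loss with a fixed angular margin on L2-normalized features and class weights. The adaptive-margin mechanism that defines NSL-AAM is not reproduced; the fixed margin m and scale s are assumptions made for illustration.

```python
import numpy as np

def angular_margin_softmax_loss(features, weights, labels, s=30.0, m=0.3):
    """Cross-entropy over scaled cosine logits, with an additive angular
    margin applied to each sample's target class."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(f @ w.T, -1.0, 1.0)                    # (N, K) cosine similarities
    theta = np.arccos(cos)
    rows = np.arange(len(labels))
    cos[rows, labels] = np.cos(theta[rows, labels] + m)  # margin on target class
    logits = s * cos
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[rows, labels].mean())

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))        # embeddings
W = rng.normal(size=(4, 8))        # one weight vector per fault class
y = rng.integers(0, 4, size=6)
print(angular_margin_softmax_loss(x, W, y))
```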

Proceedings ArticleDOI
30 May 2021
TL;DR: In this paper, a symmetric KL-divergence is introduced to the ICP cost that reflects the difference between two probabilistic distributions to speed up the registration process while maintaining accuracy.
Abstract: In this paper, a three-dimensional light detection and ranging simultaneous localization and mapping (SLAM) method is proposed that is available for tracking and mapping with 500–1000 Hz processing. The proposed method significantly reduces the number of points used for point cloud registration using a novel ICP metric to speed up the registration process while maintaining accuracy. Point cloud registration with ICP is less accurate when the number of points is reduced because ICP basically minimizes the distance between points. To avoid this problem, symmetric KL-divergence is introduced to the ICP cost that reflects the difference between two probabilistic distributions. The cost includes not only the distance between points but also differences between distribution shapes. The experimental results on the KITTI dataset indicate that the proposed method has high computational efficiency, strongly outperforms other methods, and has similar accuracy to the state-of-the-art SLAM method.
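The registration cost hinges on the symmetric KL divergence between local Gaussian distributions fitted to the point clouds; the closed form for two multivariate Gaussians is standard and is sketched below. This is only the distribution-to-distribution term, not the full ICP pipeline of the paper.

```python
import numpy as np

def kl_gaussian(mu0, cov0, mu1, cov1):
    """KL(N0 || N1) for multivariate Gaussians."""
    d = len(mu0)
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

def symmetric_kl(mu0, cov0, mu1, cov1):
    return kl_gaussian(mu0, cov0, mu1, cov1) + kl_gaussian(mu1, cov1, mu0, cov0)

mu_a, cov_a = np.zeros(3), np.eye(3)
mu_b, cov_b = np.array([0.5, 0.0, 0.0]), np.diag([1.5, 1.0, 0.8])
print(symmetric_kl(mu_a, cov_a, mu_b, cov_b))
```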

Journal ArticleDOI
TL;DR: In this article, the authors investigate an alternative relative accuracy measure which avoids the bias of MAPE: the log of the accuracy ratio, log(prediction/actual), which is particularly relevant if the scatter in the data grows as the value of the variable grows (heteroscedasticity).
Abstract: Surveys show that the mean absolute percentage error (MAPE) is the most widely used measure of forecast accuracy in businesses and organizations. It is, however, biased: when used to select among competing prediction methods it systematically selects those whose predictions are too low. This is not widely discussed and so is not generally known among practitioners. We explain why this happens. We investigate an alternative relative accuracy measure which avoids this bias: the log of the accuracy ratio, log(prediction/actual). Relative accuracy is particularly relevant if the scatter in the data grows as the value of the variable grows (heteroscedasticity). We demonstrate using simulations that for heteroscedastic data (modelled by a multiplicative error factor) the proposed metric is far superior to MAPE for model selection. Another use for accuracy measures is in fitting parameters to prediction models. Minimum-MAPE models do not predict a simple statistic and so theoretical analysis is limited. We prove that when the proposed metric is used instead, the resulting least squares regression model predicts the geometric mean. This important property allows its theoretical properties to be understood.
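The bias described above is easy to reproduce numerically: under multiplicative (heteroscedastic) errors, a systematically low forecast can score better on MAPE than an unbiased one, while the absolute log accuracy ratio ranks them correctly. The simulation below is a minimal illustration under an assumed log-normal error model; the signal level and error spread are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
m = 100.0                                                 # underlying signal level
actual = m * np.exp(rng.normal(0.0, 0.7, size=200_000))   # multiplicative errors

forecasts = {"unbiased (100)": np.full_like(actual, m),
             "too low (70)":   np.full_like(actual, 0.7 * m)}

for name, f in forecasts.items():
    mape = np.mean(np.abs(f - actual) / actual)
    log_acc = np.mean(np.abs(np.log(f / actual)))
    print(f"{name:>15}:  MAPE={mape:.3f}   mean|log(f/actual)|={log_acc:.3f}")

# MAPE rewards the forecast that is systematically too low, while the
# log accuracy ratio correctly favours the unbiased forecast.
```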

Journal ArticleDOI
TL;DR: FovVideoVDP as mentioned in this paper is a video difference metric that models the spatial, temporal, and peripheral aspects of perception, which is derived from psychophysical studies of the early visual system, which model spatio-temporal contrast sensitivity, cortical magnification and contrast masking.
Abstract: FovVideoVDP is a video difference metric that models the spatial, temporal, and peripheral aspects of perception. While many other metrics are available, our work provides the first practical treatment of these three central aspects of vision simultaneously. The complex interplay between spatial and temporal sensitivity across retinal locations is especially important for displays that cover a large field-of-view, such as Virtual and Augmented Reality displays, and associated methods, such as foveated rendering. Our metric is derived from psychophysical studies of the early visual system, which model spatio-temporal contrast sensitivity, cortical magnification and contrast masking. It accounts for physical specification of the display (luminance, size, resolution) and viewing distance. To validate the metric, we collected a novel foveated rendering dataset which captures quality degradation due to sampling and reconstruction. To demonstrate our algorithm's generality, we test it on 3 independent foveated video datasets, and on a large image quality dataset, achieving the best performance across all datasets when compared to the state-of-the-art.

Journal ArticleDOI
05 Mar 2021
TL;DR: In this paper, it was shown that the Brownian map is the only random sphere-homeomorphic metric measure space with scale invariance and the conditional independence of the inside and outside of certain "slices" bounded by geodesics.
Abstract: The Brownian map is a random sphere-homeomorphic metric measure space obtained by "gluing together" the continuum trees described by the $x$ and $y$ coordinates of the Brownian snake. We present an alternative "breadth-first" construction of the Brownian map, which produces a surface from a certain decorated branching process. It is closely related to the peeling process, the hull process, and the Brownian cactus. Using these ideas, we prove that the Brownian map is the only random sphere-homeomorphic metric measure space with certain properties: namely, scale invariance and the conditional independence of the inside and outside of certain "slices" bounded by geodesics. We also formulate a characterization in terms of the so-called Levy net produced by a metric exploration from one measure-typical point to another. This characterization is part of a program for proving the equivalence of the Brownian map and Liouville quantum gravity with parameter $\gamma= \sqrt{8/3}$.

Journal ArticleDOI
TL;DR: In this article, new weak and strong existence and uniqueness results for the solutions of the multi-dimensional stochastic McKean-Vlasov equation are established under relaxed regularity conditions; uniqueness is obtained under assumptions on the diffusion coefficient only, without any regularity of the drift.
Abstract: New weak and strong existence and weak and strong uniqueness results for the solutions of multi-dimensional stochastic McKean–Vlasov equation are established under relaxed regularity conditions. Weak existence requires a non-degeneracy of diffusion and no more than a linear growth of both coefficients in the state variable. Weak and strong uniqueness are established under the restricted assumption of diffusion, yet without any regularity of the drift; this part is based on the analysis of the total variation metric.
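For context, the McKean-Vlasov equation is a stochastic differential equation whose coefficients depend on the law of the solution itself; in a generic form (standard notation, not the paper's exact assumptions) it reads

```latex
dX_t = b(t, X_t, \mu_t)\, dt + \sigma(t, X_t, \mu_t)\, dW_t,
\qquad \mu_t = \operatorname{Law}(X_t)
```

and the uniqueness analysis mentioned in the abstract compares laws in the total variation metric.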

Journal ArticleDOI
TL;DR: The key idea behind SR2CNN is to learn the representation of signal semantic feature space by introducing a proper combination of cross entropy loss, center loss and reconstruction loss, as well as adopting a suitable distance metric space such that semantic features have greater minimal inter-class distance than maximal intra- class distance.
Abstract: Signal recognition is one of the significant and challenging tasks in the signal processing and communications field. It is a common situation that no training data is accessible for some signal classes when performing a recognition task. Hence, as widely used in the image processing field, zero-shot learning (ZSL) is also very important for signal recognition. Unfortunately, ZSL in this field has hardly been studied due to inexplicable signal semantics. This paper proposes a ZSL framework, signal recognition and reconstruction convolutional neural networks (SR2CNN), to address relevant problems in this situation. The key idea behind SR2CNN is to learn the representation of the signal semantic feature space by introducing a proper combination of cross entropy loss, center loss and reconstruction loss, as well as adopting a suitable distance metric space such that semantic features have a greater minimal inter-class distance than maximal intra-class distance. The proposed SR2CNN can discriminate signals even if no training data is available for some signal class. Moreover, SR2CNN can gradually improve itself with the aid of signal detection, because of constantly refined class center vectors in the semantic feature space. These merits are all verified by extensive experiments with ablation studies.

Journal ArticleDOI
Wanyin Wu1, Dapeng Tao1, Hao Li1, Zhao Yang2, Jun Cheng 
TL;DR: This work summarizes the different types of features and metric learning approaches from a label-attributes perspective, and conducts comprehensive experiments on metric learning methods with two datasets, showing the relations of the loss function with the deep feature space and metric learning.