
Showing papers on "Feature vector" published in 2021


Proceedings ArticleDOI
20 Jun 2021
TL;DR: CoordAttention, as presented in this paper, embeds positional information into channel attention so that long-range dependencies are captured along one spatial direction while precise positional information is preserved along the other.
Abstract: Recent studies on mobile network design have demonstrated the remarkable effectiveness of channel attention (e.g., the Squeeze-and-Excitation attention) for lifting model performance, but they generally neglect the positional information, which is important for generating spatially selective attention maps. In this paper, we propose a novel attention mechanism for mobile networks by embedding positional information into channel attention, which we call "coordinate attention". Unlike channel attention that transforms a feature tensor to a single feature vector via 2D global pooling, the coordinate attention factorizes channel attention into two 1D feature encoding processes that aggregate features along the two spatial directions, respectively. In this way, long-range dependencies can be captured along one spatial direction and meanwhile precise positional information can be preserved along the other spatial direction. The resulting feature maps are then encoded separately into a pair of direction-aware and position-sensitive attention maps that can be complementarily applied to the input feature map to augment the representations of the objects of interest. Our coordinate attention is simple and can be flexibly plugged into classic mobile networks, such as MobileNetV2, MobileNeXt, and EfficientNet with nearly no computational overhead. Extensive experiments demonstrate that our coordinate attention is not only beneficial to ImageNet classification but more interestingly, behaves better in down-stream tasks, such as object detection and semantic segmentation. Code is available at https://github.com/Andrew-Qibin/CoordAttention.
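
The two-directional pooling is compact enough to sketch. Below is a minimal PyTorch rendition of the idea, not the authors' released implementation: the module name, the reduction ratio r, and the plain ReLU (the paper uses a hard-swish-style nonlinearity) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    """Minimal sketch of coordinate attention (illustrative, not the released code)."""
    def __init__(self, channels, r=32):
        super().__init__()
        mid = max(8, channels // r)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)   # paper uses a hard-swish variant
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        # 1D pooling along each spatial direction instead of 2D global pooling
        x_h = x.mean(dim=3, keepdim=True)                       # (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (n, c, w, 1)
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                       # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))   # (n, c, 1, w)
        return x * a_h * a_w    # direction-aware, position-sensitive attention

att = CoordAttention(64)
out = att(torch.randn(2, 64, 32, 32))
```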

1,372 citations


Proceedings ArticleDOI
26 Jan 2021
TL;DR: ProDA, as presented in this paper, aligns the prototypical assignments based on relative feature distances for two different views of the same target, producing a more compact target feature space; distilling the already learned knowledge to a self-supervised pretrained model further boosts performance.
Abstract: Self-training is a competitive approach in domain adaptive segmentation, which trains the network with pseudo labels on the target domain. However, the pseudo labels are inevitably noisy and the target features are dispersed due to the discrepancy between the source and target domains. In this paper, we rely on representative prototypes, the feature centroids of classes, to address these two issues for unsupervised domain adaptation. In particular, we take one step further and exploit the feature distances from prototypes, which provide richer information than the prototypes alone. Specifically, we use these distances to estimate the likelihood of pseudo labels to facilitate online correction in the course of training. Meanwhile, we align the prototypical assignments based on relative feature distances for two different views of the same target, producing a more compact target feature space. Moreover, we find that distilling the already learned knowledge to a self-supervised pretrained model further boosts the performance. Our method shows tremendous performance advantage over state-of-the-art methods. The code is available at https://github.com/microsoft/ProDA.
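
The distance-to-prototype weighting at the heart of the online correction can be illustrated with a small helper. This is a generic sketch, not ProDA's exact estimator; the function name and the temperature tau are assumptions.

```python
import torch

def pseudo_label_weights(feats, prototypes, tau=1.0):
    """Soft likelihood of each class pseudo-label from relative
    feature-to-prototype distances (hypothetical helper)."""
    d = torch.cdist(feats, prototypes)       # (N, K) distances to class centroids
    return torch.softmax(-d / tau, dim=1)    # closer prototype -> higher weight

# toy usage: 4 target features, 3 class prototypes, both 8-D
w = pseudo_label_weights(torch.randn(4, 8), torch.randn(3, 8))
```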

272 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: Patch-NetVLAD as discussed by the authors combines the advantages of both local and global descriptor methods by deriving patch-level features from NetVLAD residuals, which enables aggregation and matching of deep-learned local features defined over the feature-space grid.
Abstract: Visual Place Recognition is a challenging task for robotics and autonomous systems, which must deal with the twin problems of appearance and viewpoint change in an always changing world. This paper introduces Patch-NetVLAD, which provides a novel formulation for combining the advantages of both local and global descriptor methods by deriving patch-level features from NetVLAD residuals. Unlike the fixed spatial neighborhood regime of existing local keypoint features, our method enables aggregation and matching of deep-learned local features defined over the feature-space grid. We further introduce a multi-scale fusion of patch features that have complementary scales (i.e. patch sizes) via an integral feature space and show that the fused features are highly invariant to both condition (season, structure, and illumination) and viewpoint (translation and rotation) changes. Patch-NetVLAD achieves state-of-the-art visual place recognition results in computationally limited scenarios, validated on a range of challenging real-world datasets, including winning the Facebook Mapillary Visual Place Recognition Challenge at ECCV2020. It is also adaptable to user requirements, with a speed-optimised version operating over an order of magnitude faster than the state-of-the-art. By combining superior performance with improved computational efficiency in a configurable framework, Patch-NetVLAD is well suited to enhance both stand-alone place recognition capabilities and the overall performance of SLAM systems.
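
A simplified way to see the patch-level idea: pool a dense per-cell descriptor grid at several complementary patch sizes and L2-normalise each patch descriptor. The real method aggregates NetVLAD residuals through an integral feature space; the plain average pooling and the patch sizes below are stand-in assumptions.

```python
import torch
import torch.nn.functional as F

def patch_descriptors(grid, patch_sizes=(2, 5, 8), stride=1):
    """Pool a dense descriptor grid (1, D, H, W) into L2-normalised
    patch-level descriptors at several complementary patch sizes."""
    out = []
    for p in patch_sizes:
        pooled = F.avg_pool2d(grid, kernel_size=p, stride=stride)  # (1, D, H', W')
        flat = pooled.flatten(2).squeeze(0).t()                    # (H'*W', D)
        out.append(F.normalize(flat, dim=1))
    return out

descs = patch_descriptors(torch.randn(1, 128, 30, 40))
```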

199 citations


Journal ArticleDOI
TL;DR: This article proposes an intelligent fault diagnosis method based on an improved domain adaptation method and shows that the proposed method is effective and applicable in diagnosing faults with domain mismatch.
Abstract: Nowadays, the industrial Internet of Things (IIoT) has been successfully utilized in smart manufacturing. The massive amount of data in IIoT promotes the development of deep learning-based health monitoring for industrial equipment. Since monitoring data for mechanical fault diagnosis collected under different working conditions or on different equipment exhibit domain mismatch, models trained with training data may not work in practical applications. Therefore, it is essential to study fault diagnosis methods with domain adaptation ability. In this article, we propose an intelligent fault diagnosis method based on an improved domain adaptation method. Specifically, two feature extractors concerning feature space distance and domain mismatch are trained using maximum mean discrepancy and domain adversarial training, respectively, to enhance feature representation. Since separate classifiers are trained for the two feature extractors, ensemble learning is further utilized to obtain the final results. Experimental results indicate that the proposed method is effective and applicable in diagnosing faults with domain mismatch.
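
The maximum mean discrepancy term used to train one of the feature extractors has a standard closed-form estimator. The sketch below uses a single-bandwidth Gaussian kernel; the bandwidth sigma and batch shapes are assumptions, not the paper's training code.

```python
import torch

def mmd_rbf(xs, xt, sigma=1.0):
    """Squared maximum mean discrepancy between source and target feature
    batches under a Gaussian kernel (a standard estimator)."""
    k = lambda a, b: torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(xs, xs).mean() + k(xt, xt).mean() - 2 * k(xs, xt).mean()

loss = mmd_rbf(torch.randn(32, 64), torch.randn(32, 64))
```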

183 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: This paper proposes superpixel-guided clustering (SGC) and guided prototype allocation (GPA) modules for multiple prototype extraction and allocation: SGC extracts more representative prototypes by aggregating similar feature vectors, while GPA selects matched prototypes to provide more accurate guidance.
Abstract: Prototype learning is extensively used for few-shot segmentation. Typically, a single prototype is obtained from the support feature by averaging the global object information. However, using one prototype to represent all the information may lead to ambiguities. In this paper, we propose two novel modules, named superpixel-guided clustering (SGC) and guided prototype allocation (GPA), for multiple prototype extraction and allocation. Specifically, SGC is a parameter-free and training-free approach, which extracts more representative prototypes by aggregating similar feature vectors, while GPA is able to select matched prototypes to provide more accurate guidance. By integrating the SGC and GPA together, we propose the Adaptive Superpixel-guided Network (ASGNet), which is a lightweight model and adapts to object scale and shape variation. In addition, our network can easily generalize to k-shot segmentation with substantial improvement and no additional computational cost. In particular, our evaluations on COCO demonstrate that ASGNet surpasses the state-of-the-art method by 5% in 5-shot segmentation.
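
The GPA step can be sketched as a nearest-prototype lookup per query position. This is an illustrative reduction of the module, with hypothetical names and cosine similarity as the matching score:

```python
import torch
import torch.nn.functional as F

def allocate_prototypes(query_feat, prototypes):
    """For every query position, pick the most similar support prototype
    by cosine similarity and tile it back into a guidance map."""
    d, h, w = query_feat.shape
    q = F.normalize(query_feat.flatten(1).t(), dim=1)   # (H*W, D)
    p = F.normalize(prototypes, dim=1)                  # (K, D)
    idx = (q @ p.t()).argmax(dim=1)                     # best prototype per pixel
    return prototypes[idx].t().reshape(d, h, w)         # guidance feature map

guide = allocate_prototypes(torch.randn(64, 13, 13), torch.randn(5, 64))
```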

172 citations


Journal ArticleDOI
TL;DR: A deep translation based change detection network (DTCDN) for optical and SAR images is proposed that utilizes deep context features to separate the unchanged pixels and changed pixels in a supervised CD network.
Abstract: With the development of space-based imaging technology, an ever larger number of images with different modalities and resolutions has become available. Optical images reflect the abundant spectral information and geometric shape of ground objects, but their quality degrades easily in poor atmospheric conditions. Although synthetic aperture radar (SAR) images cannot provide the spectral features of the region of interest (ROI), they can capture all-weather and all-time polarization information. By nature, optical and SAR images encapsulate a wealth of complementary information, which is of great significance for change detection (CD) in poor weather situations. However, due to the difference in the imaging mechanisms of optical and SAR images, it is difficult to conduct CD directly using traditional difference or ratio algorithms. Most recent CD methods introduce image translation to reduce the difference between modalities, but the results are obtained by ordinary algebraic methods and threshold segmentation with limited accuracy. Towards this end, this work proposes a deep translation based change detection network (DTCDN) for optical and SAR images. The deep translation first maps images from one domain (e.g., optical) to another domain (e.g., SAR) through a cyclic structure into the same feature space. With similar characteristics after deep translation, they become comparable. Different from most previous research, the translation results are fed to a supervised CD network that utilizes deep context features to separate unchanged and changed pixels. In the experiments, the proposed DTCDN was tested on four representative datasets from Gloucester, California, and Shuguang village. Compared with state-of-the-art methods, the effectiveness and robustness of the proposed method were confirmed.

166 citations


Journal ArticleDOI
TL;DR: A new feature selection method based on the Dempster–Shafer theory is proposed, which takes into consideration the distribution of features and results in a significant increase in the performance of MI-based BCI systems.
Abstract: The common spatial pattern (CSP) algorithm is a well-recognized spatial filtering method for feature extraction in motor imagery (MI)-based brain–computer interfaces (BCIs). However, due to the influence of nonstationarity in electroencephalography (EEG) and inherent defects of the CSP objective function, the spatial filters and their corresponding features are not necessarily optimal in the feature space used within CSP. In this work, we design a new feature selection method to address this issue by selecting features based on an improved objective function. In particular, improvements are made in suppressing outliers and discovering features with larger interclass distances. Moreover, a fusion algorithm based on the Dempster–Shafer theory is proposed, which takes into consideration the distribution of features. With two competition datasets, we first evaluate the performance of the improved objective functions in terms of classification accuracy, feature distribution, and embeddability. Then, a comparison with other feature selection methods is carried out in both accuracy and computational time. Experimental results show that the proposed methods consume little additional computational cost and result in a significant increase in the performance of MI-based BCI systems.
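
For context, the CSP filters the paper starts from are the solution of a generalized eigenvalue problem over the two class covariance matrices. A minimal NumPy/SciPy sketch of that baseline (the selection and fusion improvements proposed in the paper are not shown):

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(cov_a, cov_b, m=3):
    """Classic CSP: solve cov_a w = lambda (cov_a + cov_b) w and keep the m
    eigenvectors from each end, which maximise variance for one class while
    minimising it for the other."""
    vals, vecs = eigh(cov_a, cov_a + cov_b)
    order = np.argsort(vals)
    return np.hstack([vecs[:, order[:m]], vecs[:, order[-m:]]])

# toy usage with random symmetric positive-definite covariance matrices
a, b = np.random.randn(8, 8), np.random.randn(8, 8)
W = csp_filters(a @ a.T + 1e-3 * np.eye(8), b @ b.T + 1e-3 * np.eye(8))
```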

143 citations


Journal ArticleDOI
TL;DR: A three-stage SSL approach using data augmentation (DA) and metric learning is proposed for intelligent bearing fault diagnosis under limited labeled data; experiments demonstrate that the proposed method performs better in bearing fault diagnosis with limited labeled samples than existing diagnostic methods.

141 citations


Journal ArticleDOI
TL;DR: In this paper, the authors construct a classifier for quantum machine learning and show that no classical learner can classify the data inverse-polynomially better than random guessing, assuming the widely believed hardness of the discrete logarithm problem.
Abstract: Recently, several quantum machine learning algorithms have been proposed that may offer quantum speed-ups over their classical counterparts. Most of these algorithms are either heuristic or assume that data can be accessed quantum-mechanically, making it unclear whether a quantum advantage can be proven without resorting to strong assumptions. Here we construct a classification problem with which we can rigorously show that heuristic quantum kernel methods can provide an end-to-end quantum speed-up with only classical access to data. To prove the quantum speed-up, we construct a family of datasets and show that no classical learner can classify the data inverse-polynomially better than random guessing, assuming the widely believed hardness of the discrete logarithm problem. Furthermore, we construct a family of parameterized unitary circuits, which can be efficiently implemented on a fault-tolerant quantum computer, and use them to map the data samples to a quantum feature space and estimate the kernel entries. The resulting quantum classifier achieves high accuracy and is robust against additive errors in the kernel entries that arise from finite sampling statistics. Many quantum machine learning algorithms have been proposed, but it is typically unknown whether they would outperform classical methods on practical devices. A specially constructed algorithm shows that a formal quantum advantage is possible.
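
On the classical side, once the kernel entries have been estimated from circuit measurements, the classifier is an ordinary kernel machine with a precomputed Gram matrix. The sketch below substitutes a classically computed stand-in kernel for the quantum estimates:

```python
import numpy as np
from sklearn.svm import SVC

# Suppose K_train (n x n) held kernel entries estimated by repeatedly running
# the parameterised circuits; the classical post-processing is an ordinary
# kernel machine. Here a Gaussian kernel stands in for those estimates.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 4))
K_train = np.exp(-np.linalg.norm(X[:, None] - X[None, :], axis=-1) ** 2)
y = rng.integers(0, 2, size=40)
clf = SVC(kernel="precomputed").fit(K_train, y)
pred = clf.predict(K_train)   # at test time, pass K(test, train) instead
```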

136 citations


Journal ArticleDOI
TL;DR: This paper proposes an additive angular margin loss (ArcFace), which not only has a clear geometric interpretation but also significantly enhances the discriminative power.
Abstract: Recently, a popular line of research in face recognition is adopting margins in the well-established softmax loss function to maximize class separability. In this paper, we first introduce an Additive Angular Margin Loss (ArcFace), which not only has a clear geometric interpretation but also significantly enhances the discriminative power. Since ArcFace is susceptible to massive label noise, we further propose sub-center ArcFace, in which each class contains K sub-centers and training samples only need to be close to any of the K positive sub-centers. Sub-center ArcFace encourages one dominant sub-class that contains the majority of clean faces and non-dominant sub-classes that include hard or noisy faces. Based on this self-propelled isolation, we boost the performance through automatically purifying raw web faces under massive real-world noise. Besides discriminative feature embedding, we also explore the inverse problem, mapping feature vectors to face images. Without training any additional generator or discriminator, the pre-trained ArcFace model can generate identity-preserved face images for subjects both inside and outside the training data, using only the network gradient and Batch Normalization (BN) priors. Extensive experiments demonstrate that ArcFace can enhance the discriminative feature embedding as well as strengthen the generative face synthesis.
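
The additive angular margin itself is a one-line change to the softmax logits. A minimal sketch, assuming scale s=64 and margin m=0.5 as in the paper's usual settings; the released code adds further numerical-stability details:

```python
import torch
import torch.nn.functional as F

def arcface_logits(emb, weight, labels, s=64.0, m=0.5):
    """Additive angular margin: add m to the angle between an embedding and
    its ground-truth class centre before rescaling by s."""
    cos = F.normalize(emb) @ F.normalize(weight).t()     # cos(theta)
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    target = F.one_hot(labels, weight.size(0)).bool()
    logits = torch.where(target, torch.cos(theta + m), cos) * s
    return logits                                        # feed into cross-entropy

logits = arcface_logits(torch.randn(8, 512), torch.randn(10, 512),
                        torch.randint(0, 10, (8,)))
```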

136 citations


Journal ArticleDOI
TL;DR: In this article, the authors present a review of state-of-the-art DL-based approaches for clustering analysis that are based on representation learning, which they hope to be useful for bioinformatics research.
Abstract: Clustering is central to much data-driven bioinformatics research and serves as a powerful computational method. In particular, clustering helps in analyzing unstructured and high-dimensional data in the form of sequences, expressions, texts and images. Further, clustering is used to gain insights into biological processes at the genomics level; e.g., clustering of gene expressions provides insights into the natural structure inherent in the data, understanding gene functions, cellular processes, subtypes of cells and gene regulation. Subsequently, clustering approaches, including hierarchical, centroid-based, distribution-based, density-based and self-organizing maps, have long been studied and used in classical machine learning settings. In contrast, deep learning (DL)-based representation and feature learning for clustering have not been reviewed or employed extensively. Since the quality of clustering depends not only on the distribution of data points but also on the learned representation, deep neural networks can be effective means of transforming mappings from a high-dimensional data space into a lower-dimensional feature space, leading to improved clustering results. In this paper, we review state-of-the-art DL-based approaches for cluster analysis that are based on representation learning, which we hope will be useful, particularly for bioinformatics research. Further, we explore in detail the training procedures of DL-based clustering algorithms, point out different clustering quality metrics and evaluate several DL-based approaches on three bioinformatics use cases, including bioimaging, cancer genomics and biomedical text mining. We believe this review and the evaluation results will provide valuable insights and serve as a starting point for researchers wanting to apply DL-based unsupervised methods to solve emerging bioinformatics research problems.
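
The simplest of the reviewed recipes, representation learning with an autoencoder followed by clustering in the latent space, fits in a few lines. The layer sizes, toy data, and cluster count are illustrative assumptions:

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

# Learn a low-dimensional representation, then cluster in the latent space.
enc = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
dec = nn.Sequential(nn.Linear(10, 256), nn.ReLU(), nn.Linear(256, 784))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
x = torch.rand(512, 784)                 # toy stand-in for expression/image data
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(dec(enc(x)), x)   # reconstruction objective
    loss.backward()
    opt.step()
labels = KMeans(n_clusters=5, n_init=10).fit_predict(enc(x).detach().numpy())
```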

Proceedings ArticleDOI
22 Mar 2021
TL;DR: Zhang et al. as mentioned in this paper decompose the sample similarity computation into two stages, i.e., the intra-camera and inter-camera computations, respectively, and propose a novel intra-inter camera similarity for pseudo-label generation.
Abstract: Most unsupervised person Re-Identification (Re-ID) works produce pseudo-labels by measuring feature similarity without considering the distribution discrepancy among cameras, leading to degraded accuracy in label computation across cameras. This paper targets this challenge by studying a novel intra-inter camera similarity for pseudo-label generation. We decompose the sample similarity computation into two stages, i.e., the intra-camera and inter-camera computations. The intra-camera computation directly leverages the CNN features for similarity computation within each camera. Pseudo-labels generated on different cameras train the re-id model in a multi-branch network. The second stage considers the classification scores of each sample on different cameras as a new feature vector. This new feature effectively alleviates the distribution discrepancy among cameras and generates more reliable pseudo-labels. We hence train our re-id model in two stages with intra-camera and inter-camera pseudo-labels, respectively. This simple intra-inter camera similarity produces surprisingly good performance on multiple datasets, e.g., achieving rank-1 accuracy of 89.5% on the Market1501 dataset, outperforming recent unsupervised works by 9+%, and is comparable with the latest transfer learning works that leverage extra annotations.
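
The second-stage representation is easy to state in code: each sample's per-camera classification scores are stacked into one new feature vector. The helper name and shapes below are hypothetical:

```python
import numpy as np

def inter_camera_features(score_list):
    """Stack each sample's classification scores from every per-camera
    classifier into one feature vector, which is less sensitive to
    camera-specific style (hypothetical helper)."""
    return np.concatenate(score_list, axis=1)  # (N, sum of per-camera classes)

# toy usage: 100 samples scored by two cameras with 50 and 40 identities
f = inter_camera_features([np.random.rand(100, 50), np.random.rand(100, 40)])
```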

Proceedings ArticleDOI
01 Jun 2021
TL;DR: This paper proposes a feature-space video coding network (FVC) that performs all major operations (i.e., motion estimation, motion compression, motion compensation and residual compression) in the feature space.
Abstract: Learning based video compression attracts increasing attention in the past few years. The previous hybrid coding approaches rely on pixel space operations to reduce spatial and temporal redundancy, which may suffer from inaccurate motion estimation or less effective motion compensation. In this work, we propose a feature-space video coding network (FVC) by performing all major operations (i.e., motion estimation, motion compression, motion compensation and residual compression) in the feature space. Specifically, in the proposed deformable compensation module, we first apply motion estimation in the feature space to produce motion information (i.e., the offset maps), which will be compressed by using the auto-encoder style network. Then we perform motion compensation by using deformable convolution and generate the predicted feature. After that, we compress the residual feature between the feature from the current frame and the predicted feature from our deformable compensation module. For better frame reconstruction, the reference features from multiple previous reconstructed frames are also fused by using the nonlocal attention mechanism in the multi-frame feature fusion module. Comprehensive experimental results demonstrate that the proposed framework achieves the state-of-the-art performance on four benchmark datasets including HEVC, UVG, VTL and MCL-JCV.
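
The deformable compensation step maps directly onto torchvision's deformable convolution: decoded offset maps steer the convolution over reference-frame features to produce the predicted feature. Channel counts and spatial sizes below are illustrative, not the paper's configuration:

```python
import torch
from torchvision.ops import DeformConv2d

# Deformable compensation in feature space (illustrative shapes).
dconv = DeformConv2d(64, 64, kernel_size=3, padding=1)
ref = torch.randn(1, 64, 32, 32)              # reference-frame features
offsets = torch.randn(1, 2 * 3 * 3, 32, 32)   # decoded motion information
pred = dconv(ref, offsets)                    # motion-compensated prediction
residual = torch.randn(1, 64, 32, 32) - pred  # residual feature to compress
```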

Journal ArticleDOI
TL;DR: This work studies the image retrieval problem at the wireless edge, where an edge device captures an image, which is then used to retrieve similar images from an edge server, and proposes two alternative schemes based on digital and analog communications.
Abstract: We study the image retrieval problem at the wireless edge, where an edge device captures an image, which is then used to retrieve similar images from an edge server. These can be images of the same person or a vehicle taken from other cameras at different times and locations. Our goal is to maximize the accuracy of the retrieval task under power and bandwidth constraints over the wireless link. Due to the stringent delay constraint of the underlying application, sending the whole image at a sufficient quality is not possible. We propose two alternative schemes based on digital and analog communications, respectively. In the digital approach, we first propose a deep neural network (DNN) aided retrieval-oriented image compression scheme, whose output bit sequence is transmitted over the channel using conventional channel codes. In the analog joint source and channel coding (JSCC) approach, the feature vectors are directly mapped into channel symbols. We evaluate both schemes on image based re-identification (re-ID) tasks under different channel conditions, including both static and fading channels. We show that the JSCC scheme significantly increases the end-to-end accuracy, speeds up the encoding process, and provides graceful degradation with channel conditions. The proposed architecture is evaluated through extensive simulations on different datasets and channel conditions, as well as through ablation studies.
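
The core of the analog JSCC scheme, mapping feature values directly to power-normalised channel symbols and passing them through a noisy channel, can be sketched as follows (an AWGN-only toy; the function name and SNR handling are assumptions):

```python
import numpy as np

def analog_jscc(features, snr_db, rng=None):
    """Map a real feature vector (even length) to unit-power complex channel
    symbols and add AWGN -- the analog scheme's core idea in miniature."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = features[0::2] + 1j * features[1::2]       # pair values into symbols
    x = x / np.sqrt(np.mean(np.abs(x) ** 2))       # power normalisation
    nvar = 10 ** (-snr_db / 10)                    # noise power for unit signal
    noise = np.sqrt(nvar / 2) * (rng.standard_normal(x.shape)
                                 + 1j * rng.standard_normal(x.shape))
    return x + noise

rx = analog_jscc(np.random.randn(512), snr_db=10)
```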

Journal ArticleDOI
TL;DR: Using semantic representation as input, the model verifies that more accurate results can be obtained by introducing a high-level semantic representation, and shows that it is feasible and effective to introduce high-level, abstract forms of knowledge representation into deep learning tasks.
Abstract: In visual reasoning, the achievements of deep learning have significantly improved the accuracy of results. Image features are primarily used as input to obtain answers. However, image features are too redundant to learn accurate characterizations within a limited complexity and time, whereas human reasoning usually relies on an abstract description of an image that avoids irrelevant details. Inspired by this, a higher-level representation named semantic representation is introduced. In this paper, a detailed visual reasoning model is proposed. This new model contains an image understanding model based on semantic representation, a feature extraction and processing model refined with the watershed and u-distance methods, a feature vector learning model using pyramidal pooling and a residual network, and a question understanding model combining a problem-embedding encoding method with a machine-translation decoding method. The feature vector can better represent the whole image instead of being overly focused on specific characteristics. The model using semantic representation as input verifies that more accurate results can be obtained by introducing a high-level semantic representation. The results also show that it is feasible and effective to introduce high-level, abstract forms of knowledge representation into deep learning tasks. This study lays a theoretical and experimental foundation for introducing different levels of knowledge representation into deep learning in the future.

Journal ArticleDOI
18 Apr 2021-Sensors
TL;DR: In this article, an innovative method called BCAoMID-F (Binarized Common Areas of Maximum Image Differences-Fusion) is proposed to extract features of thermal images of three angle grinders.
Abstract: The paper presents an analysis and classification method to evaluate the working condition of angle grinders by means of infrared (IR) thermography and IR image processing. An innovative method called BCAoMID-F (Binarized Common Areas of Maximum Image Differences—Fusion) is proposed in this paper. This method is used to extract features from thermal images of three angle grinders. The computed features are 1-element or 256-element vectors: each feature vector is the sum of the pixels of matrix V, the PCA of matrix V, or the histogram of matrix V. Three different cases of thermal images were considered: a healthy angle grinder, an angle grinder with 1 blocked air inlet, and an angle grinder with 2 blocked air inlets. The classification of feature vectors was carried out using two classifiers: Support Vector Machine and Nearest Neighbor. The total recognition efficiency for the 3 classes (TRAG) was in the range of 98.5–100%. The presented technique is efficient for fault diagnosis of electrical devices and electric power tools.
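
Two of the three feature-vector options described above are straightforward to reproduce; matrix V and the helper name below are placeholders for the output of the BCAoMID-F image processing chain:

```python
import numpy as np

def thermal_features(V):
    """Feature vectors from the processed image matrix V: a 1-element sum of
    pixels and a 256-element histogram (the PCA of V is the third option
    mentioned in the abstract and is omitted here)."""
    f_sum = np.array([V.sum()])                             # 1-element vector
    f_hist = np.histogram(V, bins=256, range=(0, 256))[0]   # 256-element vector
    return f_sum, f_hist

f1, f256 = thermal_features(np.random.randint(0, 256, (240, 320)))
```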

Proceedings ArticleDOI
01 Jun 2021
TL;DR: NBNet, as discussed in this paper, performs noise reduction by image-adaptive subspace projection, using a non-local attention module (SSA) to explicitly learn basis generation as well as subspace projection; it achieves state-of-the-art PSNR and SSIM with significantly less computational cost.
Abstract: In this paper, we introduce NBNet, a novel framework for image denoising. Unlike previous works, we propose to tackle this challenging problem from a new perspective: noise reduction by image-adaptive projection. Specifically, we propose to train a network that can separate signal and noise by learning a set of reconstruction bases in the feature space. Subsequently, image denoising can be achieved by selecting the corresponding bases of the signal subspace and projecting the input into such a space. Our key insight is that projection can naturally maintain the local structure of the input signal, especially for areas with low light or weak textures. Towards this end, we propose SSA, a non-local attention module we design to explicitly learn the basis generation as well as the subspace projection. We further incorporate SSA into NBNet, a UNet-structured network designed for end-to-end image denoising. We conduct evaluations on benchmarks, including SIDD and DND, and NBNet achieves state-of-the-art performance on PSNR and SSIM with significantly less computational cost.
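
Projection onto a learned signal subspace is ordinary linear algebra once the basis is given. A minimal sketch of the projection step only, with a random stand-in basis (in NBNet the basis comes from the SSA module and everything operates on feature maps rather than flat vectors):

```python
import torch

def subspace_project(x, basis):
    """Least-squares projection of feature vectors x (N, D) onto the signal
    subspace spanned by the columns of basis (D, K): B (B^T B)^-1 B^T x."""
    coef = torch.linalg.solve(basis.t() @ basis, basis.t() @ x.t())  # (K, N)
    return (basis @ coef).t()                                        # (N, D)

clean = subspace_project(torch.randn(100, 64), torch.randn(64, 8))
```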

Proceedings ArticleDOI
20 Jun 2021
TL;DR: This paper proposes a contrastive embedding for generalized zero-shot learning (GZSL), integrating the generation model with the embedding model to yield a hybrid GZSL framework that maps both the real and the synthetic samples produced by the generation model into an embedding space.
Abstract: Generalized zero-shot learning (GZSL) aims to recognize objects from both seen and unseen classes when only labeled examples from seen classes are provided. Recent feature generation methods learn a generative model that can synthesize the missing visual features of unseen classes to mitigate the data-imbalance problem in GZSL. However, the original visual feature space is suboptimal for GZSL classification since it lacks discriminative information. To tackle this issue, we propose to integrate the generation model with the embedding model, yielding a hybrid GZSL framework. The hybrid GZSL approach maps both the real and the synthetic samples produced by the generation model into an embedding space, where we perform the final GZSL classification. Specifically, we propose a contrastive embedding (CE) for our hybrid GZSL framework. The proposed contrastive embedding can leverage not only class-wise supervision but also instance-wise supervision, where the latter is usually neglected by existing GZSL research. We evaluate our proposed hybrid GZSL framework with contrastive embedding, named CE-GZSL, on five benchmark datasets. The results show that our CE-GZSL method can outperform the state of the art by a significant margin on three datasets. Our codes are available on https://github.com/Hanzy1996/CE-GZSL.
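
Instance-wise supervision of this kind is typically an InfoNCE-style objective. The sketch below shows one plausible form, with the temperature tau and single anchor/positive pair as assumptions; it is not the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def instance_contrastive_loss(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style instance-wise supervision: pull the anchor embedding
    towards its positive and away from negatives."""
    logits = torch.cat([(anchor * positive).sum().view(1),
                        negatives @ anchor]) / tau
    # the positive sits at index 0, so the "correct class" is 0
    return F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))

loss = instance_contrastive_loss(torch.randn(128), torch.randn(128),
                                 torch.randn(32, 128))
```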

Journal ArticleDOI
TL;DR: This study proposes a latent-factor-analysis-based online sparse-streaming-feature selection algorithm (LOSSA), whose main idea is to apply latent factor analysis to pre-estimate missing data in sparse streaming features before conducting feature selection, thereby addressing the missing-data issue effectively and efficiently.
Abstract: Online streaming feature selection (OSFS) has attracted extensive attention during the past decades. Current approaches commonly assume that the feature space of fixed data instances dynamically increases without any missing data. However, this assumption does not always hold in many real applications. Motivated by this observation, this study aims to implement online feature selection from sparse streaming features, i.e., features flow in one by one with missing data while the instance count remains fixed. To do so, this study proposes a latent-factor-analysis-based online sparse-streaming-feature selection algorithm (LOSSA). Its main idea is to apply latent factor analysis to pre-estimate missing data in sparse streaming features before conducting feature selection, thereby addressing the missing-data issue effectively and efficiently. Theoretical and empirical studies indicate that LOSSA can significantly improve the quality of OSFS when missing data are encountered in target instances.
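
The pre-estimation step can be sketched as plain latent factor analysis: fit a low-rank factorization on the observed entries only, then fill the missing ones from the product. The hyperparameters and update rule below are generic assumptions, not the authors' exact algorithm:

```python
import numpy as np

def lfa_impute(X, observed, k=5, lr=0.02, epochs=100, lam=0.05):
    """Stochastic-gradient latent factor analysis on observed entries only,
    then fill missing entries from the low-rank product (generic sketch)."""
    n, d = X.shape
    rng = np.random.default_rng(0)
    P, Q = rng.normal(0, 0.1, (n, k)), rng.normal(0, 0.1, (d, k))
    rows, cols = np.where(observed)
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            e = X[i, j] - P[i] @ Q[j]               # error on an observed entry
            P[i] += lr * (e * Q[j] - lam * P[i])    # regularised SGD updates
            Q[j] += lr * (e * P[i] - lam * Q[j])
    return np.where(observed, X, P @ Q.T)

X = np.random.rand(20, 10)
mask = np.random.rand(20, 10) > 0.3
X_full = lfa_impute(np.where(mask, X, 0.0), mask)
```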

Journal ArticleDOI
TL;DR: A knowledge mapping-based adversarial domain adaptation (KMADA) method with a discriminator and a feature extractor is proposed to generalize knowledge from the target to the source domain; experiments indicate the superiority of KMADA, which achieves the highest diagnosis accuracy.

Proceedings ArticleDOI
01 Jun 2021
TL;DR: Zhang et al. as mentioned in this paper proposed a self-guided learning approach in which the lost critical information is mined: by making an initial prediction for the annotated support image, the covered and uncovered foreground regions are encoded into primary and auxiliary support vectors using masked GAP, respectively.
Abstract: Few-shot segmentation has been attracting a lot of attention due to its effectiveness to segment unseen object classes with a few annotated samples. Most existing approaches use masked Global Average Pooling (GAP) to encode an annotated support image to a feature vector to facilitate query image segmentation. However, this pipeline unavoidably loses some discriminative information due to the average operation. In this paper, we propose a simple but effective self-guided learning approach, where the lost critical information is mined. Specifically, through making an initial prediction for the annotated support image, the covered and uncovered foreground regions are encoded to the primary and auxiliary support vectors using masked GAP, respectively. By aggregating both primary and auxiliary support vectors, better segmentation performances are obtained on query images. Enlightened by our self-guided module for 1-shot segmentation, we propose a cross-guided module for multiple shot segmentation, where the final mask is fused using predictions from multiple annotated samples with high-quality support vectors contributing more and vice versa. This module improves the final prediction in the inference stage without re-training. Extensive experiments show that our approach achieves new state-of-the-art performances on both PASCAL-5i and COCO-20i datasets. Source code is available at https://github.com/zbf1991/SCL.
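
Masked GAP itself, the operation from which the primary and auxiliary support vectors are built, is a weighted average of the support features over a (possibly partial) foreground mask:

```python
import torch

def masked_gap(feat, mask):
    """Masked global average pooling: average support features (C, H, W)
    over the binary foreground mask (H, W) to get one support vector."""
    w = mask / (mask.sum() + 1e-6)           # normalised foreground weights
    return (feat * w).flatten(1).sum(dim=1)  # (C,)

feat, mask = torch.randn(256, 50, 50), (torch.rand(50, 50) > 0.5).float()
primary = masked_gap(feat, mask)  # auxiliary vector: same call on the
                                  # uncovered-region mask
```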

Proceedings ArticleDOI
20 Jun 2021
TL;DR: Lee et al. as mentioned in this paper proposed a class margin equilibrium (CME) approach to optimize both feature space partition and novel class reconstruction in a systematic way, by using a fully connected layer to decouple localization features and then reserving adequate margin space for novel classes through a simple-yet-effective class margin loss during feature learning.
Abstract: Few-shot object detection has made substantial progress by representing novel class objects using the feature representation learned upon a set of base class objects. However, an implicit contradiction between novel class classification and representation is unfortunately ignored. On the one hand, to achieve accurate novel class classification, the distributions of any two base classes must be far away from each other (max-margin). On the other hand, to precisely represent novel classes, the distributions of base classes should be close to each other to reduce the intra-class distance of novel classes (min-margin). In this paper, we propose a class margin equilibrium (CME) approach, with the aim to optimize both feature space partition and novel class reconstruction in a systematic way. CME first converts the few-shot detection problem to the few-shot classification problem by using a fully connected layer to decouple localization features. CME then reserves adequate margin space for novel classes by introducing a simple-yet-effective class margin loss during feature learning. Finally, CME pursues margin equilibrium by disturbing the features of novel class instances in an adversarial min-max fashion. Experiments on Pascal VOC and MS-COCO datasets show that CME significantly improves upon two baseline detectors (up to 3 ~ 5% on average), achieving state-of-the-art performance. Code is available at https://github.com/BohaoLee/CME.
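
One plausible form of a class margin loss, a hinge that keeps base-class centres at least a margin apart, is sketched below; the exact formulation in CME differs in detail:

```python
import torch

def class_margin_loss(centers, margin=1.0):
    """Hinge on pairwise distances between base-class centres so that every
    pair stays at least `margin` apart (illustrative formulation)."""
    d = torch.cdist(centers, centers)
    off_diag = ~torch.eye(centers.size(0), dtype=torch.bool)
    return torch.relu(margin - d[off_diag]).mean()

loss = class_margin_loss(torch.randn(15, 128))
```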

Journal ArticleDOI
TL;DR: This work presents an MV-UFS model via cross-view local structure preserved diversity and consensus learning, referred to as CvLP-DCL, which exploits the fact that different views represent the same samples, and designs an efficient algorithm to solve the resultant optimization problem.
Abstract: Although demonstrating great success, previous multi-view unsupervised feature selection (MV-UFS) methods often construct a view-specific similarity graph and characterize the local structure of data within each single view. In such a way, the cross-view information can be ignored. In addition, they usually assume that different feature views are projected from a latent feature space, so the diversity of different views cannot be fully captured. In this work, we present an MV-UFS model via cross-view local structure preserved diversity and consensus learning, referred to briefly as CvLP-DCL. In order to exploit both the shared and distinguishing information across different views, we project each view into a label space which consists of a consensus part and a view-specific part, thereby exploiting the fact that different views represent the same samples. Meanwhile, a cross-view similarity graph learning term with matrix-induced regularization is embedded to preserve the local structure of data in the label space. By imposing the l2,1-norm on the feature projection matrices to constrain row sparsity, discriminative features can be selected from different views. An efficient algorithm is designed to solve the resultant optimization problem, and extensive experiments on six public datasets are conducted to validate the effectiveness of the proposed CvLP-DCL.
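
The row-sparsity regulariser is the standard l2,1 norm of a projection matrix, which drives whole rows (i.e., whole features) towards zero:

```python
import torch

def l21_norm(W):
    """Sum of row L2 norms; rows driven to zero correspond to features
    that are dropped from the selection."""
    return W.norm(dim=1).sum()

penalty = l21_norm(torch.randn(100, 20))  # 100 features -> 20-dim label space
```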

Journal ArticleDOI
TL;DR: The proposed method, the so-called deep stacked Laplacian restorer (DSLR), is capable of separately recovering the global illumination and local details from the original input and progressively combining them in the image space, and it outperforms state-of-the-art methods.
Abstract: Various images captured in complicated lighting conditions often suffer from deterioration of the image quality. Such poor quality not only dissatisfies the user expectation but also may lead to a significant performance drop in many applications. In this paper, a novel method for low-light image enhancement is proposed by leveraging useful properties of the Laplacian pyramid both in image and feature spaces. Specifically, the proposed method, the so-called deep stacked Laplacian restorer (DSLR), is capable of separately recovering the global illumination and local details from the original input, and progressively combining them in the image space. Moreover, the Laplacian pyramid defined in the feature space makes such recovering processes more efficient based on abundant connections of higher-order residuals in a multiscale structure. This decomposition-based scheme is fairly desirable for learning the highly nonlinear relation between degraded images and their enhanced results. Experimental results on various datasets demonstrate that the proposed DSLR outperforms state-of-the-art methods. The code and model are publicly available at: https://github.com/SeokjaeLIM/DSLR-release.
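
For reference, a Laplacian pyramid decomposition of the kind DSLR builds (here in image space only, on an image batch; DSLR additionally defines one over feature maps):

```python
import torch
import torch.nn.functional as F

def laplacian_pyramid(img, levels=3):
    """Decompose an image batch (N, C, H, W) into band-pass detail levels
    plus a low-frequency residual."""
    pyr, cur = [], img
    for _ in range(levels - 1):
        down = F.avg_pool2d(cur, 2)
        up = F.interpolate(down, size=cur.shape[-2:], mode="bilinear",
                           align_corners=False)
        pyr.append(cur - up)   # band-pass detail at this scale
        cur = down
    pyr.append(cur)            # coarsest level carries global illumination
    return pyr

levels = laplacian_pyramid(torch.rand(1, 3, 64, 64))
```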

Journal ArticleDOI
TL;DR: This work proposes a distributed sensor-fault detection and diagnosis system based on machine learning algorithms, in which the fault detection block is implemented in the sensor so that output is produced immediately after data collection; results show the efficiency of the proposed fuzzy learning-based model over classic neuro-fuzzy and non-fuzzy learning approaches.

Journal ArticleDOI
TL;DR: Zhang et al. as mentioned in this paper proposed a two-layer feature selection approach that combines a wrapper and an embedded method in constructing an appropriate subset of predictors to improve the prediction accuracy.
Abstract: Feature selection, as a critical pre-processing step for machine learning, aims at determining representative predictors from a high-dimensional feature space to improve prediction accuracy. However, the increase in feature space dimensionality, compared to the number of observations, poses a severe challenge to many existing feature selection methods with respect to computational efficiency and prediction performance. This paper presents a new two-layer feature selection approach that combines a wrapper and an embedded method in constructing an appropriate subset of predictors. In the first layer of the proposed method, the Genetic Algorithm (GA) has been adopted as a wrapper to search for the optimal subset of predictors, which aims to reduce the number of predictors and the prediction error. As a meta-heuristic approach, GA is selected due to its computational efficiency; however, GAs do not guarantee optimality. To address this issue, a second layer is added to the proposed method to eliminate any remaining redundant or irrelevant predictors and improve the prediction accuracy. Elastic Net (EN) has been selected as the embedded method in the second layer because of its flexibility in adjusting the penalty terms in the regularization process and its time efficiency. This two-layer approach has been applied to a maize genetic dataset from the NAM population, which consists of multiple subsets of data with different ratios of the number of predictors to the number of observations. The numerical results confirm the superiority of the proposed model.
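
The second layer can be sketched with scikit-learn: Elastic Net with cross-validated penalties prunes whatever subset the GA wrapper returned. The helper name and the GA-selected columns below are hypothetical:

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

def elastic_net_refine(X, y, ga_selected):
    """Second layer: Elastic Net zeroes the coefficients of redundant or
    irrelevant predictors among the GA-selected columns."""
    cols = np.asarray(ga_selected)
    en = ElasticNetCV(cv=5).fit(X[:, cols], y)
    return cols[np.abs(en.coef_) > 1e-8]   # surviving predictor indices

X, y = np.random.rand(60, 200), np.random.rand(60)
kept = elastic_net_refine(X, y, ga_selected=np.arange(0, 200, 4))
```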

Journal ArticleDOI
TL;DR: This paper proposes a new approach, called the Feature Space Metric-based Meta-learning Model (FSM3), a mixture of general supervised learning and episodic metric meta-learning, to overcome the challenge of few-shot fault diagnosis under multiple limited-data conditions.

Journal ArticleDOI
TL;DR: The proposed machine learning method for detecting viral epidemics by analyzing X-ray and CT images performs well, enabling the diagnosis of COVID-19 to be made quickly and effectively.
Abstract: Necessary screenings must be performed to control the spread of COVID-19 in daily life and to make a preliminary diagnosis of suspicious cases. The long duration of pathological laboratory tests and their sometimes questionable results led researchers to focus on different fields. Fast and accurate diagnoses are essential for effective interventions against COVID-19. The information obtained by using X-ray and Computed Tomography (CT) images is vital for making clinical diagnoses. Therefore, this study aims to develop a machine learning method for the detection of viral epidemics by analyzing X-ray and CT images. In this study, images belonging to six situations, including coronavirus images, are classified using a two-stage data enhancement approach. Since the number of images in the dataset is small and unbalanced, a shallow image augmentation approach was used in the first phase. It is more convenient to analyze these images with hand-crafted feature extraction methods, because the newly created dataset is still insufficient to train a deep architecture. Therefore, the synthetic minority over-sampling technique (SMOTE) algorithm is the second data enhancement step of this study. Finally, the feature vector is reduced in size by using stacked auto-encoder and principal component analysis methods to remove interconnected features in the feature vector. According to the obtained results, the proposed method performs well, especially in making the diagnosis of COVID-19 quickly and effectively. It is also thought to be a source of inspiration for future studies on deficient and unbalanced datasets.
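
The second enhancement stage plus the feature-vector reduction can be sketched with standard libraries; the toy feature matrix, class sizes, and component count are assumptions:

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 256))          # toy hand-crafted feature vectors
y = np.array([0] * 100 + [1] * 20)       # deficient, unbalanced classes
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)   # second-stage DA
X_red = PCA(n_components=50).fit_transform(X_bal)         # shrink the vector
```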

Journal ArticleDOI
TL;DR: This work proposes an unsupervised domain adaptation with disentangled representation (DR-UDA) approach to improve the generalization capability of PAD to new scenarios; it shows promising generalization capability in several public-domain face PAD databases.
Abstract: Face presentation attack detection (PAD) is essential for securing the widely used face recognition systems. Most of the existing PAD methods do not generalize well to unseen scenarios because labeled training data of the new domain is usually not available. In light of this, we propose an unsupervised domain adaptation with disentangled representation (DR-UDA) approach to improve the generalization capability of PAD into new scenarios. DR-UDA consists of three modules, i.e., ML-Net, UDA-Net and DR-Net. ML-Net aims to learn a discriminative feature representation using the labeled source domain face images via metric learning. UDA-Net performs unsupervised adversarial domain adaptation in order to optimize the source domain and target domain encoders jointly, and obtain a common feature space shared by both domains. As a result, the source domain PAD model can be effectively transferred to the unlabeled target domain for PAD. DR-Net further disentangles the features irrelevant to specific domains by reconstructing the source and target domain face images from the common feature space. Therefore, DR-UDA can learn a disentangled representation space which is generative for face images in both domains and discriminative for live vs. spoof classification. The proposed approach shows promising generalization capability in several public-domain face PAD databases.

Proceedings ArticleDOI
Gernot Riegler, Vladlen Koltun
01 Jun 2021
TL;DR: The core of Stable View Synthesis (SVS), as discussed by the authors, is view-dependent on-surface feature aggregation, in which directional feature vectors at each 3D point are processed to produce a new feature vector for a ray that maps this point into the new target view.
Abstract: We present Stable View Synthesis (SVS). Given a set of source images depicting a scene from freely distributed viewpoints, SVS synthesizes new views of the scene. The method operates on a geometric scaffold computed via structure-from-motion and multi-view stereo. Each point on this 3D scaffold is associated with view rays and corresponding feature vectors that encode the appearance of this point in the input images. The core of SVS is view-dependent on-surface feature aggregation, in which directional feature vectors at each 3D point are processed to produce a new feature vector for a ray that maps this point into the new target view. The target view is then rendered by a convolutional network from a tensor of features synthesized in this way for all pixels. The method is composed of differentiable modules and is trained end-to-end. It supports spatially-varying view-dependent importance weighting and feature transformation of source images at each point; spatial and temporal stability due to the smooth dependence of on-surface feature aggregation on the target view; and synthesis of view-dependent effects such as specular reflection. Experimental results demonstrate that SVS outperforms state-of-the-art view synthesis methods both quantitatively and qualitatively on three diverse real-world datasets, achieving unprecedented levels of realism in free-viewpoint video of challenging large-scale scenes. Code is available at https://github.com/intel-isl/StableViewSynthesis