
Showing papers on "Feature (machine learning)" published in 2021


Proceedings ArticleDOI
11 Mar 2021
TL;DR: MagFace as discussed by the authors introduces an adaptive mechanism to learn a well-structured within-class feature distribution by pulling easy samples toward class centers while pushing hard samples away, which prevents models from overfitting on noisy low-quality samples and improves face recognition in the wild.
Abstract: The performance of a face recognition system degrades when the variability of the acquired faces increases. Prior work alleviates this issue by either monitoring face quality in pre-processing or predicting the data uncertainty along with the face feature. This paper proposes MagFace, a category of losses that learn a universal feature embedding whose magnitude can measure the quality of the given face. Under the new loss, it can be proven that the magnitude of the feature embedding monotonically increases if the subject is more likely to be recognized. In addition, MagFace introduces an adaptive mechanism to learn a well-structured within-class feature distribution by pulling easy samples toward class centers while pushing hard samples away. This prevents models from overfitting on noisy low-quality samples and improves face recognition in the wild. Extensive experiments conducted on face recognition, quality assessment, and clustering demonstrate its superiority over state-of-the-art methods. The code is available at https://github.com/IrvingMeng/MagFace.
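
The magnitude-aware margin is the core of the loss. Below is a minimal PyTorch sketch of the idea, following the paper's m(a) and g(a) notation; the hyperparameter values and the helper name `magface_logits` are illustrative, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def magface_logits(features, weight, labels, s=64.0,
                   l_a=10.0, u_a=110.0, l_m=0.45, u_m=0.8, lambda_g=35.0):
    """features: (B, D); weight: (C, D) class prototypes; labels: (B,)."""
    a = features.norm(dim=1, keepdim=True).clamp(l_a, u_a)      # feature magnitudes
    m = (u_m - l_m) / (u_a - l_a) * (a - l_a) + l_m             # adaptive margin m(a)
    g = 1.0 / a + a / (u_a ** 2)                                # regularizer g(a)
    cos = F.linear(F.normalize(features), F.normalize(weight))  # cosine similarities
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    one_hot = F.one_hot(labels, cos.size(1)).bool()
    logits = torch.where(one_hot, torch.cos(theta + m), cos)    # margin on true class
    return s * logits, lambda_g * g.mean()

# logits, g_term = magface_logits(features, weight, labels)
# loss = F.cross_entropy(logits, labels) + g_term
```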

268 citations


Posted Content
TL;DR: Experiments on text-to-video retrieval and video question answering on six datasets demonstrate that ClipBERT outperforms (or is on par with) existing methods that exploit full-length videos, suggesting that end-to-end learning with just a few sparsely sampled clips is often more accurate than using densely extracted offline features from full-length videos, proving the proverbial less-is-more principle.
Abstract: The canonical approach to video-and-language learning (e.g., video question answering) dictates a neural model to learn from offline-extracted dense video features from vision models and text features from language models. These feature extractors are trained independently and usually on tasks different from the target domains, rendering these fixed features sub-optimal for downstream tasks. Moreover, due to the high computational overhead of dense video features, it is often difficult (or infeasible) to plug feature extractors directly into existing approaches for easy finetuning. To provide a remedy to this dilemma, we propose a generic framework ClipBERT that enables affordable end-to-end learning for video-and-language tasks, by employing sparse sampling, where only a single or a few sparsely sampled short clips from a video are used at each training step. Experiments on text-to-video retrieval and video question answering on six datasets demonstrate that ClipBERT outperforms (or is on par with) existing methods that exploit full-length videos, suggesting that end-to-end learning with just a few sparsely sampled clips is often more accurate than using densely extracted offline features from full-length videos, proving the proverbial less-is-more principle. Videos in the datasets span considerably different domains and lengths, ranging from 3-second generic-domain GIF videos to 180-second YouTube human activity videos, showing the generalization ability of our approach. Comprehensive ablation studies and thorough analyses are provided to dissect what factors lead to this success. Our code is publicly available at this https URL
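
The sparse-sampling strategy itself is simple to state in code. A minimal sketch, assuming frames are indexed 0..num_frames-1; the function name and defaults are illustrative:

```python
import random

def sample_sparse_clips(num_frames, num_clips=2, clip_len=16):
    """Return frame indices for `num_clips` randomly placed short clips."""
    clips = []
    for _ in range(num_clips):
        start = random.randint(0, max(0, num_frames - clip_len))
        clips.append(list(range(start, min(start + clip_len, num_frames))))
    return clips

# During training, only these clips are decoded and encoded end-to-end;
# at test time, predictions from several such clips are typically averaged.
```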

267 citations


Journal ArticleDOI
TL;DR: This study conceptually and empirically explores the most representative feature extraction algorithms (FEAs), determining the optimal sets of new features and the quality of the various transformed feature spaces in terms of statistical significance and power analysis, and the FEA efficacy in terms of classification accuracy and speed.

229 citations


Journal ArticleDOI
28 Apr 2021
TL;DR: In this paper, an attention-based deep learning architecture called AttnSleep was proposed to classify sleep stages using single-channel EEG signals, which leverages a multi-head attention mechanism to capture the temporal dependencies among the extracted features.
Abstract: Automatic sleep stage classification is of great importance for measuring sleep quality. In this paper, we propose a novel attention-based deep learning architecture called AttnSleep to classify sleep stages using single-channel EEG signals. This architecture starts with a feature extraction module based on a multi-resolution convolutional neural network (MRCNN) and adaptive feature recalibration (AFR). The MRCNN can extract low- and high-frequency features, and the AFR is able to improve the quality of the extracted features by modeling the inter-dependencies between them. The second module is the temporal context encoder (TCE), which leverages a multi-head attention mechanism to capture the temporal dependencies among the extracted features. Particularly, the multi-head attention deploys causal convolutions to model the temporal relations in the input features. We evaluate the performance of our proposed AttnSleep model using three public datasets. The results show that AttnSleep outperforms state-of-the-art techniques in terms of different evaluation metrics. Our source codes, experimental data, and supplementary materials are available at https://github.com/emadeldeen24/AttnSleep.
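
The causal convolutions mentioned above ensure that each time step only sees past context. A minimal PyTorch sketch of such a layer, with illustrative dimensions; in an attention block, queries/keys/values could each be produced this way before the scaled dot-product step:

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1-D convolution that only looks at past time steps."""
    def __init__(self, channels, kernel_size=7):
        super().__init__()
        self.pad = kernel_size - 1                 # left-pad so no future leaks in
        self.conv = nn.Conv1d(channels, channels, kernel_size)

    def forward(self, x):                          # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))    # pad only on the left
        return self.conv(x)
```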

205 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, a two-stage learning approach was proposed to utilize a dynamically expandable representation for more effective incremental concept modeling, where at each incremental step, the previously learned representation was augmented with additional feature dimensions from a new learnable feature extractor.
Abstract: We address the problem of class incremental learning, which is a core step towards achieving adaptive vision intelligence. In particular, we consider the task setting of incremental learning with limited memory and aim to achieve a better stability-plasticity trade-off. To this end, we propose a novel two-stage learning approach that utilizes a dynamically expandable representation for more effective incremental concept modeling. Specifically, at each incremental step, we freeze the previously learned representation and augment it with additional feature dimensions from a new learnable feature extractor. This enables us to integrate new visual concepts while retaining learned knowledge. We dynamically expand the representation according to the complexity of novel concepts by introducing a channel-level mask-based pruning strategy. Moreover, we introduce an auxiliary loss to encourage the model to learn diverse and discriminative features for novel concepts. We conduct extensive experiments on three class incremental learning benchmarks, and our method consistently outperforms other methods by a large margin.
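
The freeze-and-expand step can be summarized in a short PyTorch sketch. Here `make_extractor` is a hypothetical factory for the per-step backbone, and the channel-level masking/pruning strategy is omitted:

```python
import torch
import torch.nn as nn

class ExpandableNet(nn.Module):
    def __init__(self, make_extractor, feat_dim, num_classes):
        super().__init__()
        self.extractors = nn.ModuleList([make_extractor()])
        self.classifier = nn.Linear(feat_dim, num_classes)

    def expand(self, make_extractor, feat_dim, num_classes):
        for p in self.extractors.parameters():
            p.requires_grad_(False)               # freeze old representation
        self.extractors.append(make_extractor())  # new learnable extractor
        self.classifier = nn.Linear(feat_dim * len(self.extractors), num_classes)

    def forward(self, x):
        feats = torch.cat([e(x) for e in self.extractors], dim=1)
        return self.classifier(feats)
```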

196 citations


Journal ArticleDOI
01 Jan 2021
TL;DR: This study proposes a two-stage multiobjective feature-selection method that optimizes the number of features as well as model classification performance, and shows that the proposed model achieved similar classification performance while greatly reducing the cardinality of the feature subset.
Abstract: Many bankruptcy prediction models for small and medium-sized enterprises (SMEs) are built using accounting-based financial ratios. This study proposes a bankruptcy prediction model for SMEs that uses transactional data and payment network–based variables under a scenario where no financial (accounting) data are required. Offline and online test results both confirmed the predictive capability and economic benefit of transactional data–based variables. However, incorporating those features in predictive models produces high dimensional problems, which deteriorates model interpretability and increases feature acquisition costs. Thus, we propose a two-stage multiobjective feature-selection method that optimizes the number of features as well as model classification performance. The results showed that the proposed model achieved similar classification performance while greatly reducing the cardinality of the feature subset. Finally, the feature importance evaluation for features in the optimal subset confirmed the importance of transactional data and payment network-based variables for bankruptcy prediction.

173 citations


Journal ArticleDOI
TL;DR: An AI system based on deep meta learning is proposed in this research to accelerate analysis of chest X-ray (CXR) images in automatic detection of COVID-19 cases and achieves 95.6% accuracy and AUC of 0.97 in diagnosing COVID-19 from CXR images even with a limited number of training samples.

153 citations


Proceedings ArticleDOI
01 Jan 2021
TL;DR: In this article, the authors proposed a Deep Attentive Center Loss (DACL) method to adaptively select a subset of significant feature elements for enhanced discrimination, which integrates an attention mechanism to estimate attention weights correlated with feature importance.
Abstract: Learning discriminative features for Facial Expression Recognition (FER) in the wild using Convolutional Neural Networks (CNNs) is a non-trivial task due to the significant intra-class variations and inter-class similarities. Deep Metric Learning (DML) approaches such as center loss and its variants jointly optimized with softmax loss have been adopted in many FER methods to enhance the discriminative power of learned features in the embedding space. However, equally supervising all features with the metric learning method might include irrelevant features and ultimately degrade the generalization ability of the learning algorithm. We propose a Deep Attentive Center Loss (DACL) method to adaptively select a subset of significant feature elements for enhanced discrimination. The proposed DACL integrates an attention mechanism to estimate attention weights correlated with feature importance using the intermediate spatial feature maps in CNN as context. The estimated weights accommodate the sparse formulation of center loss to selectively achieve intra-class compactness and inter-class separation for the relevant information in the embedding space. An extensive study on two widely used wild FER datasets demonstrates the superiority of the proposed DACL method compared to state-of-the-art methods.
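
The attention-modulated center loss can be sketched compactly. A minimal PyTorch version, under the assumption that an attention network (omitted here) produces per-dimension weights in [0, 1]; in practice `centers` would be a learnable nn.Parameter updated jointly with the loss:

```python
import torch

def attentive_center_loss(features, weights, centers, labels):
    """features, weights: (B, D); centers: (C, D); labels: (B,)."""
    diff = features - centers[labels]          # distance to own class center
    return 0.5 * (weights * diff.pow(2)).sum(dim=1).mean()

# Weights near 0 exclude irrelevant feature dimensions, so only the
# discriminative elements are pulled toward the class center.
```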

137 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this article, Fang et al. proposed an autonomous, bidirectional and iterative ABINet for scene text recognition, which blocks gradient flow between vision and language models to enforce explicitly language modeling.
Abstract: Linguistic knowledge is of great benefit to scene text recognition. However, how to effectively model linguistic rules in end-to-end deep networks remains a research challenge. In this paper, we argue that the limited capacity of language models comes from: 1) implicit language modeling; 2) unidirectional feature representation; and 3) a language model with noisy input. Correspondingly, we propose an autonomous, bidirectional and iterative ABINet for scene text recognition. Firstly, the autonomous design blocks gradient flow between the vision and language models to enforce explicit language modeling. Secondly, a novel bidirectional cloze network (BCN) is proposed as the language model, based on bidirectional feature representation. Thirdly, we propose an iterative correction scheme for the language model which can effectively alleviate the impact of noisy input. Additionally, based on the ensemble of iterative predictions, we propose a self-training method which can learn from unlabeled images effectively. Extensive experiments indicate that ABINet has superiority on low-quality images and achieves state-of-the-art results on several mainstream benchmarks. Besides, the ABINet trained with ensemble self-training shows promising improvement in realizing human-level recognition. Code is available at https://github.com/FangShancheng/ABINet.
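
The "autonomous" principle amounts to detaching the vision model's predictions before feeding them to the language model. A minimal sketch with illustrative module names, not the authors' exact API:

```python
def recognize(vision_model, language_model, fusion, image):
    vis_logits = vision_model(image)                # visual prediction
    lm_input = vis_logits.softmax(dim=-1).detach()  # block gradient flow into the LM
    lm_logits = language_model(lm_input)            # explicit language modeling
    return fusion(vis_logits, lm_logits)            # fuse both predictions
```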

136 citations


Journal ArticleDOI
22 Mar 2021-Sensors
TL;DR: In this paper, the authors proposed a method for brain tumor classification using an ensemble of deep features and machine learning classifiers, where the top three deep features which perform well on several machine learning classifiers are selected and concatenated as an ensemble of deep features, which is then fed into several machine learning classifiers to predict the final output.
Abstract: Brain tumor classification plays an important role in clinical diagnosis and effective treatment. In this work, we propose a method for brain tumor classification using an ensemble of deep features and machine learning classifiers. In our proposed framework, we adopt the concept of transfer learning and use several pre-trained deep convolutional neural networks to extract deep features from brain magnetic resonance (MR) images. The extracted deep features are then evaluated by several machine learning classifiers. The top three deep features which perform well on several machine learning classifiers are selected and concatenated as an ensemble of deep features, which is then fed into several machine learning classifiers to predict the final output. To evaluate the different kinds of pre-trained models as deep feature extractors, the machine learning classifiers, and the effectiveness of an ensemble of deep features for brain tumor classification, we use three different brain magnetic resonance imaging (MRI) datasets that are openly accessible from the web. Experimental results demonstrate that an ensemble of deep features can help improve performance significantly, and in most cases, a support vector machine (SVM) with radial basis function (RBF) kernel outperforms the other machine learning classifiers, especially for large datasets.
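
A minimal sketch of the extraction-and-classification pipeline, using torchvision backbones and scikit-learn; the specific backbones are illustrative choices, not necessarily the ones used in the paper:

```python
import torch
import torch.nn as nn
from torchvision import models
from sklearn.svm import SVC

def make_extractor(model):
    return nn.Sequential(*list(model.children())[:-1])  # drop the FC head

extractors = [make_extractor(models.resnet18(weights="DEFAULT")),
              make_extractor(models.resnet50(weights="DEFAULT"))]
for e in extractors:
    e.eval()

@torch.no_grad()
def deep_features(x):                       # x: (B, 3, 224, 224), normalized
    feats = [e(x).flatten(1) for e in extractors]
    return torch.cat(feats, dim=1).numpy()  # concatenated ensemble of features

# clf = SVC(kernel="rbf").fit(deep_features(train_images), train_labels)
```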

128 citations


Journal ArticleDOI
TL;DR: A novel wavelet-driven deep neural network, termed WaveletKernelNet (WKN), is presented, where a continuous wavelet convolutional (CWConv) layer is designed to replace the first convolutional layer of the standard CNN.
Abstract: Convolutional neural network (CNN), with the ability of feature learning and nonlinear mapping, has demonstrated its effectiveness in prognostics and health management (PHM). However, an explanation of the physical meaning of a CNN architecture has rarely been studied. In this article, a novel wavelet-driven deep neural network, termed WaveletKernelNet (WKN), is presented, where a continuous wavelet convolutional (CWConv) layer is designed to replace the first convolutional layer of the standard CNN. This enables the first CWConv layer to discover more meaningful kernels. Furthermore, only the scale parameter and translation parameter are directly learned from raw data at this CWConv layer. This provides a very effective way to obtain a customized kernel bank, specifically tuned for extracting the defect-related impact component embedded in the vibration signal. In addition, three experimental studies using data from laboratory environments are carried out to verify the effectiveness of the proposed method for mechanical fault diagnosis. The experimental results show that the accuracy of the WKNs is higher than that of the standard CNN by more than 10%, which indicates the importance of the designed CWConv layer. Besides, through theoretical analysis and feature map visualization, it is found that the WKNs are interpretable, have fewer parameters, and converge faster within the same number of training epochs.
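
A minimal sketch of a CWConv-style layer in PyTorch, where the kernels are generated from a Morlet mother wavelet and only scale and shift are learnable; constants and initialization are illustrative, not the paper's exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CWConv(nn.Module):
    """First-layer conv whose kernels are sampled Morlet wavelets."""
    def __init__(self, out_channels, kernel_size=63):
        super().__init__()
        self.scale = nn.Parameter(torch.linspace(1.0, 10.0, out_channels))
        self.shift = nn.Parameter(torch.zeros(out_channels))
        self.register_buffer("t", torch.linspace(-1.0, 1.0, kernel_size))

    def forward(self, x):                            # x: (batch, 1, time)
        scale = self.scale.abs() + 0.1               # keep scales positive
        u = (self.t[None, :] - self.shift[:, None]) / scale[:, None]
        kernels = torch.cos(1.75 * u) * torch.exp(-u ** 2 / 2)  # Morlet wavelet
        kernels = kernels / scale[:, None].sqrt()    # energy normalization
        return F.conv1d(x, kernels.unsqueeze(1), padding="same")
```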

Journal ArticleDOI
TL;DR: A novel few-shot learning method named multi-scale metric learning (MSML) is proposed to extract multi-scale features and learn the multi-scale relations between samples for few-shot classification.
Abstract: Few-shot learning in image classification is developed to learn a model that aims to identify unseen classes with only a few training samples for each class. Fewer training samples and new classification tasks make many traditional classification models no longer applicable. In this paper, a novel few-shot learning method named multi-scale metric learning (MSML) is proposed to extract multi-scale features and learn the multi-scale relations between samples for few-shot classification. In the proposed method, a feature pyramid structure is introduced for multi-scale feature embedding, which aims to combine high-level strong semantic features with low-level but abundant visual features. Then a multi-scale relation generation network (MRGN) is developed for hierarchical metric learning, in which high-level features correspond to deeper metric learning while low-level features correspond to lighter metric learning. Moreover, a novel loss function named intra-class and inter-class relation loss (IIRL) is proposed to optimize the proposed deep network, which aims to strengthen the correlation between homogeneous groups of samples and weaken the correlation between heterogeneous groups of samples. Experimental results on miniImageNet and tieredImageNet demonstrate that the proposed method achieves superior performance on few-shot learning problems.

Journal ArticleDOI
TL;DR: The results suggest that self-supervision may pave the way to a wider use of deep learning models on EEG data, and linear classifiers trained on SSL-learned features consistently outperformed purely supervised deep neural networks in low-labeled data regimes while reaching competitive performance when all labels were available.
Abstract: Objective. Supervised learning paradigms are often limited by the amount of labeled data that is available. This phenomenon is particularly problematic in clinically-relevant data, such as electroencephalography (EEG), where labeling can be costly in terms of specialized expertise and human processing time. Consequently, deep learning architectures designed to learn on EEG data have yielded relatively shallow models and performances at best similar to those of traditional feature-based approaches. However, in most situations, unlabeled data is available in abundance. By extracting information from this unlabeled data, it might be possible to reach competitive performance with deep neural networks despite limited access to labels. Approach. We investigated self-supervised learning (SSL), a promising technique for discovering structure in unlabeled data, to learn representations of EEG signals. Specifically, we explored two tasks based on temporal context prediction as well as contrastive predictive coding on two clinically-relevant problems: EEG-based sleep staging and pathology detection. We conducted experiments on two large public datasets with thousands of recordings and performed baseline comparisons with purely supervised and hand-engineered approaches. Main results. Linear classifiers trained on SSL-learned features consistently outperformed purely supervised deep neural networks in low-labeled data regimes while reaching competitive performance when all labels were available. Additionally, the embeddings learned with each method revealed clear latent structures related to physiological and clinical phenomena, such as age effects. Significance. We demonstrate the benefit of SSL approaches on EEG data. Our results suggest that self-supervision may pave the way to a wider use of deep learning models on EEG data.
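
The linear-probe evaluation referred to above is straightforward: a linear classifier is fit on frozen SSL embeddings. A minimal scikit-learn sketch, assuming the encoder has already produced the embedding matrices:

```python
from sklearn.linear_model import LogisticRegression

def linear_probe(z_train, y_train, z_test, y_test):
    """z_*: embeddings from the frozen SSL encoder; y_*: stage/pathology labels."""
    probe = LogisticRegression(max_iter=1000).fit(z_train, y_train)
    return probe.score(z_test, y_test)   # accuracy of the linear classifier
```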

Journal ArticleDOI
TL;DR: The authors have presented a feature-based method for 2D face images, which uses Speeded-Up Robust Features (SURF) and the Scale-Invariant Feature Transform (SIFT) for feature extraction and achieves a maximum recognition accuracy of 99.7%.
Abstract: Face recognition is the process of identifying people through facial images. It has become vital for security and surveillance applications and is required everywhere, including in institutions, organizations, offices, and social places. There are a number of challenges faced in face recognition, which include face pose, age, gender, illumination, and other variable conditions. Another challenge is that the database size for these applications is usually small, so training and recognition become difficult. Face recognition methods can be divided into two major categories: appearance-based methods and feature-based methods. In this paper, the authors have presented a feature-based method for 2D face images. Speeded-Up Robust Features (SURF) and the Scale-Invariant Feature Transform (SIFT) are used for feature extraction. Five public datasets, namely Yale2B, Face 94, M2VTS, ORL, and FERET, are used for the experimental work. Various combinations of SIFT and SURF features with two classification techniques, namely decision tree and random forest, have been evaluated in this work. A maximum recognition accuracy of 99.7% has been reported by the authors with a combination of SIFT (64 components) and SURF (32 components).
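
A minimal sketch of such a feature-based pipeline with OpenCV and scikit-learn. SIFT alone is shown (SURF requires opencv-contrib builds), and the descriptor pooling is an illustrative simplification rather than the paper's exact scheme:

```python
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

sift = cv2.SIFT_create()

def face_descriptor(gray_img, n_keypoints=32):
    """Pool up to n_keypoints SIFT descriptors into one fixed-length vector."""
    kps, desc = sift.detectAndCompute(gray_img, None)
    if desc is None:                          # no keypoints found
        return np.zeros(128)
    return desc[:n_keypoints].mean(axis=0)    # 128-dim pooled descriptor

# X = np.stack([face_descriptor(img) for img in train_faces])
# clf = RandomForestClassifier(n_estimators=200).fit(X, train_labels)
```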

Journal ArticleDOI
TL;DR: In this paper, the authors performed a systematic comparison of 40 different EDA features using three feature selection methods, Joint Mutual Information (JMI), Conditional Mutual Information Maximization (CMIM), and Double Input Symmetrical Relevance (DISR), and found that approximately the same number of features is required to obtain the optimal accuracy for arousal recognition and valence recognition.
Abstract: Electrodermal activity (EDA) is indicative of psychological processes related to human cognition and emotions. Previous research has studied many methods for extracting EDA features; however, their appropriateness for emotion recognition has been tested using a small number of distinct feature sets and on different, usually small, data sets. In the current research, we reviewed 25 studies and implemented 40 different EDA features across the time, frequency and time-frequency domains on the publicly available AMIGOS dataset. We performed a systematic comparison of these EDA features using three feature selection methods, Joint Mutual Information (JMI), Conditional Mutual Information Maximization (CMIM) and Double Input Symmetrical Relevance (DISR), and machine learning techniques. We found that approximately the same number of features is required to obtain the optimal accuracy for arousal recognition and valence recognition. Also, the subject-dependent classification results were significantly higher than the subject-independent ones for both arousal and valence recognition. Statistical features related to the Mel-Frequency Cepstral Coefficients (MFCC) were explored for the first time for emotion recognition from EDA signals, and they outperformed all other feature groups, including the most commonly used Skin Conductance Response (SCR) related features.
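
A minimal sketch of mutual-information-based feature screening with scikit-learn. Note that this ranks features by relevance to the label only; JMI, CMIM and DISR additionally account for redundancy and complementarity between features:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def top_k_features(X, y, k=10):
    """X: (n_samples, n_features) EDA feature matrix; y: emotion labels."""
    mi = mutual_info_classif(X, y, random_state=0)  # I(feature; label) estimates
    return np.argsort(mi)[::-1][:k]                 # indices of the k most relevant

# selected = top_k_features(eda_features, arousal_labels, k=15)
```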

Proceedings ArticleDOI
TL;DR: In this article, the multi-task learning of lightweight convolutional neural networks is studied for face identification and classification of facial attributes (age, gender, ethnicity) trained on cropped faces without margins.
Abstract: In this paper, the multi-task learning of lightweight convolutional neural networks is studied for face identification and classification of facial attributes (age, gender, ethnicity) trained on cropped faces without margins. The necessity to fine-tune these networks to predict facial expressions is highlighted. Several models are presented based on MobileNet, EfficientNet and RexNet architectures. It was experimentally demonstrated that they lead to near state-of-the-art results in age, gender and race recognition on the UTKFace dataset and emotion classification on the AffectNet dataset. Moreover, it is shown that the usage of the trained models as feature extractors of facial regions in video frames leads to 4.5% higher accuracy than the previously known state-of-the-art single models for the AFEW and the VGAF datasets from the EmotiW challenges. The models and source code are publicly available at this https URL

Journal ArticleDOI
TL;DR: A novel feature-attention-based end-to-end approach for RUL prediction that dynamically gives greater attention weights to more important features during training and outperforms other recent approaches.
Abstract: Deep learning plays an increasingly important role in industrial applications, such as the remaining useful life (RUL) prediction of machines. However, when dealing with multifeature data, most deep learning approaches do not have effective mechanisms to weigh the input features adaptively. In this article, a novel feature-attention-based end-to-end approach is proposed for RUL prediction. First, the proposed feature-attention mechanism is applied directly to the input data, giving greater attention weights to more important features dynamically during training. This helps the model focus more on those critical inputs, and the prediction performance is therefore improved. Next, bidirectional gated recurrent units (BGRU) are used to extract long-term dependencies from the weighted input data, and convolutional neural networks are employed to capture local features from the output sequences of the BGRU. Finally, fully connected networks are used to learn the above-mentioned abstract representations to predict the RUL. The proposed approach is validated in a case study of turbofan engines. The experimental results demonstrate that the proposed approach outperforms other recent approaches.
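
A minimal PyTorch sketch of input feature attention followed by a BGRU; the convolutional stage after the BGRU is omitted, all layer sizes are illustrative, and the exact gating in the paper may differ:

```python
import torch
import torch.nn as nn

class FeatureAttentionRUL(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(n_features, n_features),
                                  nn.Softmax(dim=-1))
        self.bgru = nn.GRU(n_features, hidden, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, x):                             # x: (batch, time, n_features)
        weights = self.attn(x.mean(dim=1))            # one weight per input feature
        out, _ = self.bgru(x * weights.unsqueeze(1))  # re-weighted inputs
        return self.head(out[:, -1])                  # RUL estimate
```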

Journal ArticleDOI
TL;DR: A knowledge mapping-based adversarial domain adaptation (KMADA) method with a discriminator and a feature extractor to generalize knowledge from the target to the source domain; experiments indicate the superiority of KMADA, which achieves the highest diagnosis accuracy.

Posted ContentDOI
TL;DR: This paper proposes a reliable method based on discarding the masked region and deep learning-based features in order to address the problem of masked face recognition, and results show high recognition performance.
Abstract: The coronavirus disease (COVID-19) is an unparalleled crisis leading to a huge number of casualties and security problems. In order to reduce the spread of the coronavirus, people often wear masks to protect themselves. This makes face recognition a very difficult task, since certain parts of the face are hidden. A primary focus of researchers during the ongoing coronavirus pandemic is to come up with suggestions to handle this problem through rapid and efficient solutions. In this paper, we propose a reliable method based on occlusion removal and deep learning-based features in order to address the problem of masked face recognition. The first step is to remove the masked face region. Next, we apply three pre-trained deep Convolutional Neural Networks (CNN), namely VGG-16, AlexNet, and ResNet-50, and use them to extract deep features from the obtained regions (mostly eyes and forehead regions). The Bag-of-Features paradigm is then applied to the feature maps of the last convolutional layer in order to quantize them and obtain a lighter representation compared to the fully connected layer of a classical CNN. Finally, a Multilayer Perceptron (MLP) is applied for the classification process. Experimental results on the Real-World-Masked-Face-Dataset show high recognition performance compared to other state-of-the-art methods.
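
The Bag-of-Features quantization step can be sketched briefly: local descriptors taken from the last convolutional layer are assigned to codebook words and pooled into a histogram. A minimal scikit-learn version with hard assignment (the paper's soft-quantization variant would differ):

```python
import numpy as np
from sklearn.cluster import KMeans

def to_descriptors(fmap):
    """fmap: (C, H, W) conv feature map -> (H*W, C) local descriptors."""
    c, h, w = fmap.shape
    return fmap.reshape(c, h * w).T

def bof_histogram(fmap, codebook):
    words = codebook.predict(to_descriptors(fmap))            # quantize
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / hist.sum()                                  # normalized histogram

# codebook = KMeans(n_clusters=64).fit(np.vstack([to_descriptors(f) for f in train_maps]))
# X = np.stack([bof_histogram(f, codebook) for f in train_maps])  # MLP input
```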

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper proposed an efficient deep matrix factorization with review feature learning for the industrial recommender system (EDMF), which extracts the interactive features of a single review by convolutional neural networks with a word attention mechanism.
Abstract: Recommendation accuracy is a fundamental problem for the quality of a recommendation system. In this paper, we propose an efficient deep matrix factorization with review feature learning for the industrial recommender system (EDMF). Two characteristics of a user's review are revealed. First, interactivity between the user and the item, which can also be considered as the former's scoring behavior on the latter, is exploited in a review. Second, the review is only a partial description of the user's preferences for the item, which is revealed as the sparsity property. Specifically, for the first characteristic, EDMF extracts the interactive features of a single review by convolutional neural networks with a word attention mechanism. Subsequently, the L0 norm is leveraged to constrain the review, considering that the review information is a sparse feature, which is the second characteristic. Furthermore, the loss function is constructed by maximum a posteriori estimation theory, where the interactivity and sparsity properties are converted into two prior probability functions. Finally, an alternating minimization algorithm is introduced to optimize the loss functions. Experimental results on several datasets demonstrate that the proposed methods, which show good industrial conversion application prospects, outperform the state-of-the-art methods in terms of effectiveness and efficiency.

Journal ArticleDOI
TL;DR: Experimental results demonstrate that PEN outperforms 14 existing HAR algorithms on these datasets in terms of the F1-score; HARFLS with PEN obtains better recognition results on the WISDM and PAMAP2 datasets, compared with 11 existing federated learning systems with various feature extraction structures.
Abstract: With the rapid growth of mobile devices, wearable sensor-based human activity recognition (HAR) has become one of the hottest topics in the Internet of Things. However, it is challenging for traditional approaches to achieve high recognition accuracy while protecting users’ privacy and sensitive information. To this end, we design a federated learning system for HAR (HARFLS). Based on the FederatedAveraging method, HARFLS enables each user to handle its activity recognition task safely and collectively. However, the recognition accuracy largely depends on the system’s feature extraction ability. To capture sufficient features from HAR data, we design a perceptive extraction network (PEN) as the feature extractor for each user. PEN is mainly composed of a feature network and a relation network. The feature network, based on a convolutional block, is responsible for discovering local features from the HAR data, while the relation network, a combination of long short-term memory (LSTM) and an attention mechanism, focuses on mining global relationships hidden in the data. Four widely used datasets, i.e., WISDM, UCI_HAR 2012, OPPORTUNITY, and PAMAP2, are used for performance evaluation. Experimental results demonstrate that PEN outperforms 14 existing HAR algorithms on these datasets in terms of the F1-score; HARFLS with PEN obtains better recognition results on the WISDM and PAMAP2 datasets, compared with 11 existing federated learning systems with various feature extraction structures.

Journal ArticleDOI
TL;DR: Transfer learning (TL) is proposed to leverage knowledge learned from the source domain for the target domain, utilizing a multiadversarial learning strategy to obtain feature representations that are invariant to the multiple domain shifts and discriminative for the learning goal at the same time.
Abstract: Data-driven fault diagnosis methods are widely investigated when enough supervised samples of the target machine are available to build a reliable model. However, labeled samples from a practically operated machine are usually scarce and difficult to collect. If the model is built on sufficient labeled samples from different source machines, the diagnosis performance will degenerate owing to the domain discrepancy. To solve this issue, in this article, transfer learning (TL) is proposed, leveraging knowledge learned from the source domain for the target domain. While TL methods for fault diagnosis have been actively studied, most of them focus on learning from a single source. Since the labeled samples can come from multiple domains, more general diagnosis knowledge can be learned, which is beneficial to the prediction for the target domain. Therefore, a new TL approach based on multisource domain adaptation is proposed. A multiadversarial learning strategy is utilized to obtain feature representations that are invariant to the multiple domain shifts and discriminative for the learning goal at the same time. Extensive experimental analysis on four different bearing datasets is performed to illustrate the effectiveness and advantage of the proposed method.
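
Multiadversarial training of this kind is typically built on a gradient reversal layer, with one domain discriminator per source domain attached through it. A minimal PyTorch sketch of the reversal layer (a standard construction, not the authors' exact code):

```python
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)                       # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.alpha * grad_output, None     # flip gradients on the way back

def grad_reverse(x, alpha=1.0):
    return GradReverse.apply(x, alpha)

# domain_logits = discriminator(grad_reverse(shared_features))
```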

Journal ArticleDOI
TL;DR: A fine-grained VTC method using a lightweight convolutional neural network with feature optimization and a joint learning strategy combining softmax loss and contrastive-center loss to classify vehicle types is proposed, thereby improving the model’s fine-grained classification ability.
Abstract: Vehicle type classification (VTC) plays an important role in today’s intelligent transportation. Previous VTC systems usually run on a monitoring center’s host machine due to the models’ complexity, which consumes lots of computing resources and yields poor real-time performance. If these systems are deployed to embedded terminals by making the model lightweight while ensuring accuracy, then the problem can be addressed. To this end, we propose a fine-grained VTC method using a lightweight convolutional neural network with feature optimization and a joint learning strategy. Firstly, a lightweight convolutional network with feature optimization (LWCNN-FO) is designed. We use depthwise separable convolution to reduce network parameters. Besides, the SENet module is added to obtain the importance of each feature channel automatically through sample-based self-learning, which can improve recognition accuracy with little growth in network parameters. In addition, considering both between-class similarity and intra-class variance, this paper adopts a joint learning strategy combining softmax loss and contrastive-center loss to classify vehicle types, thereby improving the model’s fine-grained classification ability. We also build a dataset, called Car-159, consisting of 7998 pictures of 159 vehicle types, to evaluate our method. Compared with the state-of-the-art methods, experimental results show that our method can effectively decrease model complexity while maintaining accuracy.
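
A minimal PyTorch sketch of the two building blocks named above, a depthwise separable convolution and an SENet-style recalibration block; channel counts and the reduction ratio are illustrative:

```python
import torch
import torch.nn as nn

class DSConv(nn.Module):
    """Depthwise separable convolution: per-channel 3x3, then 1x1 mixing."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in)
        self.pointwise = nn.Conv2d(c_in, c_out, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class SEBlock(nn.Module):
    """Squeeze-and-excitation: learn per-channel importance weights."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                       # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))         # squeeze: global average pool
        return x * w[:, :, None, None]          # excite: reweight each channel
```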

Journal ArticleDOI
TL;DR: Tang et al. as mentioned in this paper proposed an attention consistent network (ACNet) based on the Siamese network for remote sensing image scene classification, which unifies the salient regions and compacts/separates the RS images from the same/different semantic categories.
Abstract: Remote sensing (RS) image scene classification is an important research topic in the RS community, which aims to assign semantics to the land covers. Recently, due to the strong behavior of convolutional neural networks (CNN) in feature representation, a growing number of CNN-based classification methods has been proposed for RS images. Although they achieve remarkable performance, there is still some room for improvement. First, apart from the global information, the local features are crucial to distinguish the RS images. The existing networks are good at capturing the global features owing to the CNNs’ hierarchical structure and nonlinear fitting capacity. However, the local features are not always emphasized. Second, to obtain satisfactory classification results, the distances of RS images from the same/different classes should be minimized/maximized. Nevertheless, these key points in pattern classification do not get the attention they deserve. To overcome the limitations mentioned above, we propose a new CNN named attention consistent network (ACNet) based on the Siamese network in this article. First, due to the dual-branch structure of ACNet, the input data are image pairs that are obtained by spatial rotation. This helps our model to fully explore the global features from RS images. Second, we introduce different attention techniques to mine the objects’ information from RS images comprehensively. Third, considering the influence of the spatial rotation and the similarities between RS images, we develop an attention consistent model to unify the salient regions and compact/separate the RS images from the same/different semantic categories. Finally, the classification results can be obtained using the learned features. Three popular RS scene datasets are selected to validate our ACNet. Compared with some existing networks, the proposed method can achieve better performance. The encouraging results illustrate that ACNet is effective for RS image scene classification. The source code of this method can be found at https://github.com/TangXu-Group/Remote-Sensing-Images-Classification/tree/main/GLCnet.

Proceedings ArticleDOI
19 Jun 2021
TL;DR: In this paper, the authors present two new natural world visual classification datasets, iNat2021 and NeWT, with the aim of benchmarking the performance of representation learning algorithms on a suite of challenging natural world binary classification tasks that go beyond standard species classification.
Abstract: Recent progress in self-supervised learning has resulted in models that are capable of extracting rich representations from image collections without requiring any explicit label supervision. However, to date the vast majority of these approaches have restricted themselves to training on standard benchmark datasets such as ImageNet. We argue that fine-grained visual categorization problems, such as plant and animal species classification, provide an informative testbed for self-supervised learning. In order to facilitate progress in this area we present two new natural world visual classification datasets, iNat2021 and NeWT. The former consists of 2.7M images from 10k different species uploaded by users of the citizen science application iNaturalist. We designed the latter, NeWT, in collaboration with domain experts with the aim of benchmarking the performance of representation learning algorithms on a suite of challenging natural world binary classification tasks that go beyond standard species classification. These two new datasets allow us to explore questions related to large-scale representation and transfer learning in the context of fine-grained categories. We provide a comprehensive analysis of feature extractors trained with and without supervision on ImageNet and iNat2021, shedding light on the strengths and weaknesses of different learned features across a diverse set of tasks. We find that features produced by standard supervised methods still outperform those produced by self-supervised approaches such as SimCLR. However, improved self-supervised learning methods are constantly being released and the iNat2021 and NeWT datasets are a valuable resource for tracking their progress.

Journal ArticleDOI
TL;DR: The proposed CNN-SVM system is applied to bearing fault diagnosis; it takes the time-domain diagram of bearing vibration data as the system input and has the advantages of low time consumption, high precision and strong generalization ability.

Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the authors propose to learn a dynamic alignment, which can effectively highlight both query regions and channels according to different local support information, which is achieved by first dynamically sampling the neighborhood of the feature position conditioned on the input few shot, based on which they further predict a both position-dependent and channel-dependent Dynamic Meta-filter.
Abstract: Few-shot learning (FSL), which aims to recognise new classes by adapting the learned knowledge with extremely limited few-shot (support) examples, remains an important open problem in computer vision. Most of the existing methods for feature alignment in few-shot learning only consider image-level or spatial-level alignment while omitting the channel disparity. Our insight is that these methods would lead to poor adaptation with redundant matching, and leveraging channel-wise adjustment is the key to well adapting the learned knowledge to new classes. Therefore, in this paper, we propose to learn a dynamic alignment, which can effectively highlight both query regions and channels according to different local support information. Specifically, this is achieved by first dynamically sampling the neighbourhood of the feature position conditioned on the input few shot, based on which we further predict a both position-dependent and channel-dependent dynamic meta-filter. The filter is used to align the query feature with position-specific and channel-specific knowledge. Moreover, we adopt Neural Ordinary Differential Equations (ODE) to enable a more accurate control of the alignment. In this sense, our model is able to better capture the fine-grained semantic context of the few-shot example and thus facilitates dynamical knowledge adaptation for few-shot learning. The resulting framework establishes a new state of the art on major few-shot visual recognition benchmarks, including miniImageNet and tieredImageNet.

Journal ArticleDOI
TL;DR: Ten different feature encoding schemes were explored with the goal of capturing key characteristics around 6mA sites, and Meta-i6mA was proposed, which combines the baseline models using a meta-predictor approach and outperforms the existing predictors.
Abstract: DNA N6-methyladenine (6mA) represents an important epigenetic modification, which is responsible for various cellular processes. The accurate identification of 6mA sites is one of the challenging tasks in genome analysis, which leads to an understanding of their biological functions. To date, several species-specific machine learning (ML)-based models have been proposed, but the majority of them did not test their models on other species. Hence, their practical application to other plant species is quite limited. In this study, we explored 10 different feature encoding schemes, with the goal of capturing key characteristics around 6mA sites. We selected five feature encoding schemes based on physicochemical and position-specific information that possess high discriminative capability. The resultant feature sets were inputted to six commonly used ML methods (random forest, support vector machine, extremely randomized tree, logistic regression, naive Bayes and AdaBoost). The Rosaceae genome was employed to train the above classifiers, which generated 30 baseline models. To integrate their individual strengths, Meta-i6mA was proposed, which combines the baseline models using a meta-predictor approach. In extensive independent tests, Meta-i6mA showed high Matthews correlation coefficient values of 0.918, 0.827 and 0.635 on Rosaceae, rice and Arabidopsis thaliana, respectively, and outperformed the existing predictors. We anticipate that Meta-i6mA can be applied across different plant species. Furthermore, we developed an online user-friendly web server, which is available at http://kurata14.bio.kyutech.ac.jp/Meta-i6mA/.
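
The meta-predictor idea maps directly onto stacking. A minimal scikit-learn sketch in which the six base classifiers' probabilities feed a logistic-regression meta-model; the feature encoding step and all hyperparameters are illustrative:

```python
from sklearn.ensemble import (RandomForestClassifier, ExtraTreesClassifier,
                              AdaBoostClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

base_models = [
    ("rf", RandomForestClassifier()),
    ("ert", ExtraTreesClassifier()),
    ("svm", SVC(probability=True)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("nb", GaussianNB()),
    ("ada", AdaBoostClassifier()),
]
meta = StackingClassifier(estimators=base_models,
                          final_estimator=LogisticRegression(),
                          stack_method="predict_proba")
# meta.fit(X_train_encoded, y_train); meta.predict(X_test_encoded)
```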

Journal ArticleDOI
TL;DR: Experimental results show that the proposed methods can improve the emotion recognition rate on datasets of different sizes and are effective in comparison with studies that used EEG signals to classify human emotions in the two-dimensional space.

Journal ArticleDOI
TL;DR: A new automated deep learning method is proposed for the classification of multiclass brain tumors using a modified genetic algorithm based on metaheuristics and a non-redundant serial-based approach.
Abstract: Multiclass classification of brain tumors is an important area of research in the field of medical imaging. Since accuracy is crucial in the classification, a number of techniques have been introduced by computer vision researchers; however, they still face the issue of low accuracy. In this article, a new automated deep learning method is proposed for the classification of multiclass brain tumors. To realize the proposed method, the pre-trained DenseNet201 deep learning model is fine-tuned and later trained using deep transfer learning on imbalanced data. The features of the trained model are extracted from the average pool layer, which represents the very deep information of each type of tumor. However, the features of this layer are not sufficient for a precise classification; therefore, two techniques for the selection of features are proposed. The first technique is Entropy–Kurtosis-based High Feature Values (EKbHFV) and the second is a modified genetic algorithm (MGA) based on metaheuristics. The selected features of the MGA are further refined by the proposed new threshold function. Finally, both EKbHFV and MGA-based features are fused using a non-redundant serial-based approach and classified using a multiclass SVM cubic classifier. For the experimental process, two datasets, BRATS2018 and BRATS2019, are used without augmentation, achieving an accuracy of more than 95%. The precise comparison of the proposed method with other neural nets shows the significance of this work.
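
A minimal sketch of entropy/kurtosis-based screening in the spirit of EKbHFV; the scoring rule and the keep-ratio threshold here are illustrative guesses, not the paper's exact formulation:

```python
import numpy as np
from scipy.stats import kurtosis, entropy

def ekb_select(X, keep_ratio=0.5):
    """X: (n_samples, n_features) deep feature matrix from the average pool layer."""
    probs = np.abs(X) / (np.abs(X).sum(axis=0, keepdims=True) + 1e-12)
    ent = entropy(probs, axis=0)              # per-feature entropy
    kur = kurtosis(X, axis=0)                 # per-feature kurtosis
    score = ent + kur                         # combined "high feature value" score
    k = int(X.shape[1] * keep_ratio)
    return np.argsort(score)[::-1][:k]        # indices of the selected features
```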