Showing papers in "Neurocomputing in 2021"
[...]
TL;DR: A comprehensive review of the recent developments on deep face recognition can be found in this paper, covering broad topics on algorithm designs, databases, protocols, and application scenes, as well as the technical challenges and several promising directions.
Abstract: Deep learning applies multiple processing layers to learn representations of data with multiple levels of feature extraction. This emerging technique has reshaped the research landscape of face recognition (FR) since 2014, launched by the breakthroughs of DeepFace and DeepID. Since then, deep learning techniques, characterized by hierarchical architectures that stitch together pixels into an invariant face representation, have dramatically improved state-of-the-art performance and fostered successful real-world applications. In this survey, we provide a comprehensive review of the recent developments on deep FR, covering broad topics on algorithm designs, databases, protocols, and application scenes. First, we summarize different network architectures and loss functions proposed in the rapid evolution of deep FR methods. Second, the related face processing methods are categorized into two classes: “one-to-many augmentation” and “many-to-one normalization”. Then, we summarize and compare the commonly used databases for both model training and evaluation. Third, we review miscellaneous scenes in deep FR, such as cross-factor, heterogeneous, multiple-media and industrial scenes. Finally, the technical challenges and several promising directions are highlighted.
169 citations
[...]
TL;DR: Results for every optimization task demonstrate that LSEOFOA can provide a high-performing and stable tradeoff between exploration and exploitation, and overall research findings show that the proposed model is superior in terms of classification accuracy, Matthews correlation coefficient, sensitivity, and specificity.
Abstract: Bankruptcy prediction is a crucial application in financial fields that aids accurate decision making for business enterprises. Many models stagnate at low-accuracy results due to poorly chosen parameters. This paper presents a bankruptcy prediction model based on the kernel extreme learning machine (KELM) and proposes a new, efficient version of the fruit fly optimization algorithm (FOA), called LSEOFOA, to evolve and harmonize the penalty and kernel parameters of KELM. The upgraded version of FOA is built on three modifications. The first is the inclusion of Levy flight to improve exploration; the second is based on the slime mould algorithm (SMA) to avoid premature convergence and enhance the stability of the exploration and exploitation patterns; and the last is elite opposition-based learning to accelerate convergence. The algorithmic behavior of this optimizer is first verified, and it is then evaluated within a bankruptcy prediction module. To further demonstrate the superiority of the LSEOFOA method, comparison studies are performed against the conventional FOA, other FOA variants, and a set of advanced algorithms including EBOwithCMAR. Experimental results for every optimization task demonstrate that LSEOFOA provides a high-performing and stable tradeoff between exploration and exploitation. The developed KELM classifier is then applied to bankruptcy prediction, with its optimal parameter set found by the proposed optimizer. The effectiveness of the LSEOFOA-KELM model is rigorously evaluated on a financial dataset and compared with KELM-based models tuned by other competitive optimizers such as LSHADE-RSP. Overall, the findings show that the proposed model is superior in terms of classification accuracy, Matthews correlation coefficient, sensitivity, and specificity. The proposed LSEOFOA-KELM model can therefore be regarded as a promising early-warning tool for financial decision making, with successful performance in bankruptcy prediction. Readers interested in the idea and related material of LSEOFOA-KELM can find the public web service at https://aliasgharheidari.com. Information and source code for the slime mould algorithm (SMA) in Python, MATLAB, and other languages are shared publicly at https://aliasgharheidari.com/SMA.html.
107 citations
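As a rough illustration of the two hyperparameters that LSEOFOA tunes, the sketch below trains a kernel extreme learning machine with an RBF kernel; the penalty C and kernel width gamma are exactly the values an outer optimizer would search over. The function names and toy usage are assumptions, not the authors' code.

```python
import numpy as np

def kelm_train(X, T, C=1.0, gamma=0.1):
    """Kernel ELM: solve (I/C + K) beta = T for the output weights beta."""
    # RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    return np.linalg.solve(np.eye(len(X)) / C + K, T)

def kelm_predict(X_train, X_new, beta, gamma=0.1):
    d = (np.sum(X_new**2, axis=1)[:, None]
         + np.sum(X_train**2, axis=1)[None, :] - 2 * X_new @ X_train.T)
    return np.exp(-gamma * d) @ beta

# An optimizer such as LSEOFOA would evaluate candidate (C, gamma) pairs by
# training on one split and scoring classification accuracy on another.
```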
[...]
TL;DR: Online learning as mentioned in this paper is a family of machine learning methods, where a learner attempts to tackle some predictive (or any type of decision-making) task by learning from a sequence of data instances one by one at each time.
Abstract: Online learning represents a family of machine learning methods, where a learner attempts to tackle some predictive (or any type of decision-making) task by learning from a sequence of data instances one at a time. The goal of online learning is to maximize the accuracy/correctness for the sequence of predictions/decisions made by the online learner given the knowledge of correct answers to previous prediction/learning tasks and possibly additional information. This is in contrast to traditional batch or offline machine learning methods that are often designed to learn a model from the entire training data set at once. Online learning has become a promising technique for learning from continuous streams of data in many real-world applications. This paper aims to provide a comprehensive survey of the online machine learning literature through a systematic review of basic ideas and key principles and a proper categorization of different algorithms and techniques. Generally speaking, according to the types of learning tasks and the forms of feedback information, the existing online learning works can be classified into three major categories: (i) online supervised learning where full feedback information is always available, (ii) online learning with limited feedback, and (iii) online unsupervised learning where no feedback is available. Due to space limitations, the survey mainly focuses on the first category, but also briefly covers some basics of the other two categories. Finally, we also discuss some open issues and attempt to shed light on potential future research directions in this field.
82 citations
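As a minimal illustration of the online supervised setting the survey focuses on, here is an online (stochastic) gradient step for logistic regression that processes one instance at a time, predicting before each label is revealed; the data stream and learning rate are placeholders.

```python
import numpy as np

def online_logistic_regression(stream, dim, lr=0.1):
    """Process (x, y) pairs one at a time; y in {0, 1}."""
    w = np.zeros(dim)
    mistakes = 0
    for x, y in stream:
        p = 1.0 / (1.0 + np.exp(-w @ x))   # predict before seeing the label
        mistakes += int((p >= 0.5) != y)
        w -= lr * (p - y) * x              # update with the revealed label
    return w, mistakes
```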
[...]
TL;DR: A novel image segmentation method based on deep reinforcement learning (DRL) is developed in this paper for the quantitative analysis of GICS, which can accurately distinguish the test line and the control line in GICS images.
Abstract: The gold immunochromatographic strip (GICS) is a widely used lateral flow immunoassay technique. A novel image segmentation method based on deep reinforcement learning (DRL) is developed in this paper for the quantitative analysis of GICS, which can accurately distinguish the test line and the control line in GICS images. The deep belief network (DBN) is employed in the deep Q network of our DRL algorithm. Meanwhile, the multi-factor learning curve is introduced in the DRL algorithm to dynamically adjust the capacity of the replay buffer and the sampling size, which leads to enhanced learning efficiency. It is worth mentioning that the states, actions, and rewards in the developed DRL algorithm are determined based on the characteristics of GICS images. Experimental results demonstrate the feasibility and reliability of the proposed DRL-based image segmentation method and show that it outperforms some existing image segmentation methods for the quantitative analysis of GICS images.
62 citations
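The paper's multi-factor learning curve adjusts the replay buffer capacity and sampling size during training. A generic sketch of that idea is given below; the schedule function is a made-up placeholder, not the authors' formula.

```python
import random
from collections import deque

class AdaptiveReplayBuffer:
    """Replay buffer whose capacity and sample size follow a training schedule."""
    def __init__(self, init_capacity=10_000):
        self.buffer = deque(maxlen=init_capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def resize(self, new_capacity):
        # deque maxlen is immutable, so rebuild while keeping the newest items
        self.buffer = deque(list(self.buffer)[-new_capacity:], maxlen=new_capacity)

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

def capacity_schedule(step, total_steps, lo=5_000, hi=50_000):
    """Placeholder 'learning curve': grow the capacity as training progresses."""
    frac = step / total_steps
    return int(lo + frac * (hi - lo))
```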
[...]
TL;DR: In this article, a supervised multi-label classification framework based on deep convolutional neural networks (CNNs) was proposed for predicting the presence of 14 common thoracic diseases and observations.
Abstract: Chest radiography is one of the most common types of diagnostic radiology exams and is critical for the screening and diagnosis of many thoracic diseases. Specialized algorithms have been developed to detect several specific pathologies such as lung nodules or lung cancer. However, accurately detecting the presence of multiple diseases from chest X-rays (CXRs) is still a challenging task. This paper presents a supervised multi-label classification framework based on deep convolutional neural networks (CNNs) for predicting the presence of 14 common thoracic diseases and observations. We tackle this problem by training state-of-the-art CNNs that exploit hierarchical dependencies among abnormality labels. We also propose to use the label smoothing technique to better handle uncertain samples, which occupy a significant portion of almost every CXR dataset. Our model is trained on over 200,000 CXRs of the recently released CheXpert dataset and achieves a mean area under the curve (AUC) of 0.940 in predicting 5 selected pathologies from the validation set, the highest AUC score reported to date. The proposed method is also evaluated on the independent test set of the CheXpert competition, which is composed of 500 CXR studies annotated by a panel of 5 experienced radiologists. The performance is on average better than 2.6 out of 3 other individual radiologists, with a mean AUC of 0.930, which ranks first on the CheXpert leaderboard at the time of writing this paper.
40 citations
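The label smoothing mentioned for uncertain CheXpert samples can be sketched as mapping an uncertain label to a soft target before the multi-label binary cross-entropy loss. The smoothing interval below is illustrative, not necessarily the exact values used in the paper.

```python
import torch
import torch.nn.functional as F

def smooth_targets(labels, uncertain_value=-1, low=0.55, high=0.85):
    """labels: float tensor with entries in {0, 1, uncertain_value}.
    Uncertain entries are replaced by a random soft target in [low, high]."""
    targets = labels.clone().float()
    mask = labels == uncertain_value
    targets[mask] = low + (high - low) * torch.rand(int(mask.sum()))
    return targets

def multilabel_loss(logits, labels):
    # One binary cross-entropy term per disease label, with smoothed targets
    return F.binary_cross_entropy_with_logits(logits, smooth_targets(labels))
```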
[...]
TL;DR: A model with joint weak saliency and attention awareness is proposed, which obtains more complete global features by weakening saliency features and obtains diversified saliency features via attention diversity, improving the performance of the model.
Abstract: Attention mechanisms can extract salient features in images, which has been proven effective for person re-identification. However, focusing only on the saliency of an image is not enough. On the one hand, the salient features extracted by the model are not necessarily the features needed, e.g., a similar background may also be mistaken for a salient feature; on the other hand, diverse salient features are often more conducive to improving the performance of the model. Based on this, a model with joint weak saliency and attention awareness is proposed in this paper, which obtains more complete global features by weakening saliency features. The model then obtains diversified saliency features via attention diversity to improve its performance. Experiments on commonly used datasets prove the effectiveness of the proposed method.
38 citations
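One generic way to encourage "diversified saliency features via attention diversity" is to penalize overlap between attention maps of different heads. The sketch below captures that general idea under stated assumptions; it is not the authors' exact loss.

```python
import torch
import torch.nn.functional as F

def attention_diversity_loss(attn):
    """attn: (batch, heads, H*W) spatial attention maps.
    Penalizes pairwise cosine similarity between different attention heads."""
    attn = F.normalize(attn, dim=-1)
    sim = attn @ attn.transpose(1, 2)              # (batch, heads, heads)
    eye = torch.eye(attn.size(1), device=attn.device)
    off_diag = sim * (1 - eye)                     # ignore self-similarity
    return off_diag.sum(dim=(1, 2)).mean()
```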
[...]
TL;DR: An adaptive NN controller is established which can ensure that all the signals in the closed-loop system are bounded under a class of switching signals with average dwell time and the tracking error converges to the predefined bounds.
Abstract: This article investigates the problem of adaptive neural output-feedback tracking control for a class of switched uncertain nonlinear systems in nonstrict-feedback form with average dwell time. For the system under study, many factors are taken into consideration, such as unknown nonlinearities, unmeasurable states, external disturbance, unknown dead-zone input, and prescribed performance bounds. A switched NN state observer is established to observe the unmeasurable states and alleviate the conservativeness induced by using a common observer. To overcome the difficulty originating from the nonstrict-feedback structure, an effective adaptive law is introduced by exploiting the properties of NNs. The influence of the dead-zone on control performance is restricted by designing a special adaptive law in the last step of the backstepping design framework. The stability of the closed-loop system is proved by the average dwell time approach and Lyapunov stability theory. By utilizing the multiple Lyapunov function method and the backstepping technique together with the prescribed performance bounds, an adaptive NN controller is established which ensures that all signals in the closed-loop system are bounded under a class of switching signals with average dwell time and that the tracking error converges to the predefined bounds. The feasibility of the presented control scheme is illustrated by simulation results.
33 citations
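For readers unfamiliar with the average dwell time (ADT) constraint used throughout such switched-system results, the standard definition (a textbook condition, not specific to this paper) is:

```latex
% A switching signal \sigma has average dwell time \tau_a if there exist
% N_0 \ge 0 and \tau_a > 0 such that the number of switches N_\sigma(t, T)
% on any interval [t, T) satisfies
N_\sigma(t, T) \le N_0 + \frac{T - t}{\tau_a}, \quad \forall\, T \ge t \ge 0.
```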
[...]
TL;DR: A new missing-data estimation method using tensor heterogeneous ensemble learning based on a fuzzy neural network (FNN), named FNNTEL, is proposed in this paper; its performance is better than that of other commonly used techniques under different missing-data generation models.
Abstract: The Internet of Vehicles (IoV) can obtain traffic information from the large amounts of data collected by sensors. However, missing data, abnormal data, and other low-quality problems have seriously restricted the development and application of the IoV. To solve the problem of missing data in a large-scale road network, previous research shows that tensor decomposition methods have advantages in multi-dimensional data imputation problems, so we adopt this tensor model for traffic velocity data. A new missing-data estimation method with tensor heterogeneous ensemble learning based on a fuzzy neural network (FNN), named FNNTEL, is proposed in this paper. The performance of this method is evaluated by our experiments and analysis. The proposed method is tested on real data captured in Guangzhou and Tianjin, China. A large number of experimental tests show that the performance of the new method is better than that of other commonly used techniques under different missing-data generation models.
32 citations
[...]
TL;DR: This paper formulates causality extraction as a sequence labeling problem based on a novel causality tagging scheme, and proposes a neural causality extractor with the BiLSTM-CRF model as the backbone, named SCITE (Self-attentive BiLSTM-CRF wIth Transferred Embeddings), which can directly extract cause and effect without extracting candidate causal pairs and identifying their relations separately.
Abstract: Causality extraction from natural language texts is a challenging open problem in artificial intelligence. Existing methods utilize patterns, constraints, and machine learning techniques to extract causality, heavily depending on domain knowledge and requiring considerable human effort and time for feature engineering. In this paper, we formulate causality extraction as a sequence labeling problem based on a novel causality tagging scheme. On this basis, we propose a neural causality extractor with the BiLSTM-CRF model as the backbone, named SCITE (Self-attentive BiLSTM-CRF wIth Transferred Embeddings), which can directly extract cause and effect without extracting candidate causal pairs and identifying their relations separately. To address the problem of data insufficiency, we transfer contextual string embeddings, also known as Flair embeddings, which were trained on a large corpus, to our task. In addition, to improve the performance of causality extraction, we introduce a multi-head self-attention mechanism into SCITE to learn the dependencies between causal words. We evaluate our method on a public dataset, and experimental results demonstrate that our method achieves significant and consistent improvement compared to baselines.
31 citations
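As an illustration of casting causality extraction as sequence labeling, a sentence can be tagged with BIO-style cause/effect labels which a BiLSTM-CRF then predicts token by token. The example sentence and tag names below are hypothetical; the paper's exact tagging scheme may differ.

```python
# Hypothetical example of a causality tagging scheme (tag names are illustrative).
tokens = ["Heavy", "rain", "caused", "severe", "flooding", "in", "the", "city", "."]
tags   = ["B-Cause", "I-Cause", "O", "B-Effect", "I-Effect", "O", "O", "O", "O"]

# A tagger such as SCITE learns P(tags | tokens) with a BiLSTM encoder,
# self-attention over the hidden states, and a CRF decoding layer.
for tok, tag in zip(tokens, tags):
    print(f"{tok}\t{tag}")
```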
[...]
TL;DR: A novel loss function is proposed that gives rise to a novel method, Outlier Exposure with Confidence Control (OECC), which achieves superior results in out-of-distribution detection with OE both on image and text classification tasks without requiring access to OOD samples.
Abstract: Deep neural networks have achieved great success in classification tasks in recent years. However, one major obstacle on the path towards artificial intelligence is the inability of neural networks to accurately detect samples from novel class distributions; therefore, most existing classification algorithms assume that all classes are known prior to the training stage. In this work, we propose a methodology for training a neural network that allows it to efficiently detect out-of-distribution (OOD) examples without compromising much of its classification accuracy on the test examples from known classes. We propose a novel loss function that gives rise to a novel method, Outlier Exposure with Confidence Control (OECC), which achieves superior results in OOD detection with OE both on image and text classification tasks without requiring access to OOD samples. Additionally, we experimentally show that the combination of OECC with state-of-the-art post-training OOD detection methods, like the Mahalanobis Detector (MD) and the Gramian Matrices (GM) methods, further improves their performance in the OOD detection task, demonstrating the potential of combining training and post-training methods for OOD detection.
28 citations
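A rough sketch of the outlier exposure (OE) training idea that OECC builds on: the usual cross-entropy on in-distribution data plus a term that pushes the model's output on auxiliary outlier data toward the uniform distribution. OECC's actual loss adds further confidence-control terms; the version below is only the generic OE baseline.

```python
import torch
import torch.nn.functional as F

def oe_style_loss(logits_in, labels_in, logits_out, lam=0.5):
    """Cross-entropy on the in-distribution batch + uniformity penalty on outliers."""
    ce = F.cross_entropy(logits_in, labels_in)
    log_probs_out = F.log_softmax(logits_out, dim=1)
    # Cross-entropy to the uniform distribution: minimized when p is uniform
    uniformity = -log_probs_out.mean()
    return ce + lam * uniformity
```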
[...]
TL;DR: Four main methods of transfer learning are described and their practical applications in EEG signal analysis in recent years are explored.
Abstract: Electroencephalogram (EEG) signal analysis, which is widely used for human-computer interaction and neurological disease diagnosis, requires a large amount of labeled data for training. However, collecting substantial EEG data can be difficult owing to its randomness and non-stationarity. Moreover, there are notable individual differences in EEG data, which affect the reusability and generalization of models. To mitigate the adverse effects of these factors, transfer learning is applied in this field to transfer the knowledge learnt in one domain into a different but related domain. Transfer learning adapts models with small-scale data from the task at hand and maintains learning ability in the presence of individual differences. This paper describes four main methods of transfer learning and explores their practical applications in EEG signal analysis in recent years. Finally, we discuss challenges and opportunities of transfer learning and suggest areas for further study.
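One of the transfer-learning strategies typically reviewed in this setting, parameter/model transfer, amounts to reusing a network trained on source subjects and fine-tuning only part of it on the small target-subject dataset. A generic PyTorch-style sketch follows; the `feature_extractor` and `classifier` attributes are placeholders, not a specific model from the paper.

```python
import torch.nn as nn
import torch.optim as optim

def fine_tune_for_target_subject(pretrained_model, target_loader, epochs=5):
    """Freeze the feature extractor learned on source subjects and retrain
    only the classifier head on the target subject's few labeled trials."""
    for p in pretrained_model.feature_extractor.parameters():   # placeholder attribute
        p.requires_grad = False
    optimizer = optim.Adam(pretrained_model.classifier.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in target_loader:
            optimizer.zero_grad()
            loss = loss_fn(pretrained_model(x), y)
            loss.backward()
            optimizer.step()
    return pretrained_model
```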
[...]
TL;DR: Results demonstrate that the proposed CMWOA outperforms three other methods in most cases regarding several performance indicators, and it is successfully applied to three real-world problems, which further verifies the practicality of the proposed algorithm.
Abstract: In this paper, a competitive mechanism integrated whale optimization algorithm (CMWOA) is proposed to deal with multi-objective optimization problems. By introducing the novel competitive mechanism, a better leader can be generated to guide the update of the whale population, which benefits the convergence of the algorithm. It should also be highlighted that the competitive mechanism adopts an improved calculation of crowding distance, which substitutes the traditional addition operation with a multiplication operation, providing a more accurate depiction of population density. In addition, differential evolution (DE) is incorporated to diversify the population, and the key parameters of DE are assigned different adjusting strategies to further enhance the overall performance. The proposed CMWOA is evaluated comprehensively on a series of benchmark functions with different shapes of the true Pareto front. Results demonstrate that the proposed CMWOA outperforms three other methods in most cases regarding several performance indicators. The influences of model parameters are also discussed in detail. Finally, the proposed CMWOA is successfully applied to three real-world problems, which further verifies the practicality of the proposed algorithm.
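The modified crowding distance replaces the usual sum over per-objective gaps with a product. A small numpy sketch contrasting the two follows; this is a simplification of the idea, not the paper's exact definition.

```python
import numpy as np

def crowding_distance(F, multiplicative=True):
    """F: (n, m) objective values of a nondominated front, n >= 3."""
    n, m = F.shape
    per_obj = np.zeros((n, m))
    boundary = np.zeros(n, dtype=bool)
    for j in range(m):
        order = np.argsort(F[:, j])
        span = F[order[-1], j] - F[order[0], j] + 1e-12
        boundary[order[0]] = boundary[order[-1]] = True
        # normalized gap between each point's two neighbors along objective j
        per_obj[order[1:-1], j] = (F[order[2:], j] - F[order[:-2], j]) / span
    # Traditional NSGA-II sums the per-objective gaps; the CMWOA-style variant multiplies them.
    dist = per_obj.prod(axis=1) if multiplicative else per_obj.sum(axis=1)
    dist[boundary] = np.inf   # boundary solutions are always preferred
    return dist
```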
[...]
TL;DR: A novel adversarial knowledge-transfer method named SA-GAN (Subject Adaptor GAN) is proposed, which utilizes the Generative Adversarial Network framework to perform cross-subject transfer learning in the domain of wearable sensor-based Human Activity Recognition.
Abstract: The application of intelligent systems, especially in smart homes and health-related topics, has been drawing more attention in the last decades. Training Human Activity Recognition (HAR) models, as a major module, requires a fair amount of labeled data. Despite training with large datasets, most of the existing models face a dramatic performance drop when they are tested against unseen data from new users. Moreover, recording enough data for each new user is not viable due to the limitations and challenges of working with human users. Transfer learning techniques aim to transfer the knowledge learned from the source domain (subject) to the target domain in order to decrease the model's performance loss in the target domain. This paper presents a novel adversarial knowledge-transfer method named SA-GAN (Subject Adaptor GAN), which utilizes the Generative Adversarial Network framework to perform cross-subject transfer learning in the domain of wearable sensor-based Human Activity Recognition. SA-GAN outperformed other state-of-the-art methods in more than 66% of experiments and showed the second-best performance in the remaining 25% of experiments. In some cases, it reached up to 90% of the accuracy obtainable by supervised training on same-domain data.
[...]
TL;DR: Using a new variable transformation and differential inclusions theory, a new framework is provided to deal with the inertial neural networks with fuzzy logics and discontinuous activation functions and some sufficient criteria are derived for achieving fixed-time synchronization.
Abstract: This paper investigates fixed-time synchronization analysis for discontinuous fuzzy inertial neural networks in the presence of parameter uncertainties. By using a new variable transformation and differential inclusions theory, we first establish two kinds of drive-response differential inclusion systems. By designing some novel discontinuous control inputs and using the Lyapunov-Krasovskii functional approach, some sufficient criteria are derived for achieving fixed-time synchronization, and the corresponding settling times are estimated. The established results provide a new framework to deal with inertial neural networks with fuzzy logics and discontinuous activation functions. Some previous works in the literature are extended and complemented. Finally, two typical simulation examples are given to show the effectiveness of the developed main control schemes.
[...]
TL;DR: In this paper, a deep cascading network architecture (DCNA) is proposed to solve the SNR environment perception and modulation classification in sub-environments, which is composed of an SNR estimator network (SEN) and a modulation recognition cluster network (MRCN).
Abstract: BACKGROUND: Automatic modulation classification (AMC) plays a crucial role in cognitive radio applications such as industrial automation, transmitter identification, and spectrum resource allocation. Recently, deep learning (DL), as a new machine learning (ML) methodology, has been widely applied to AMC tasks. However, few studies have examined the robustness of DL models under varying signal-to-noise ratio (SNR) environments. OBJECTIVE: The primary objective of this paper is to design a robust DL-based AMC model that adapts to noise changes. METHODS: The AMC task is divided into two sub-problems: SNR environment perception and modulation classification in sub-environments. A deep cascading network architecture (DCNA) is proposed to solve these two problems. DCNA is composed of an SNR estimator network (SEN) and a modulation recognition cluster network (MRCN). SEN is designed to identify the SNR levels of samples, and MRCN is composed of several subnetworks for further modulation recognition under diverse SNR settings. In addition, a label-smoothing method is proposed to promote the integration between SEN and MRCN. An auxiliary data-segmenting method is also presented to deal with the contrasting data requirements of DCNA. Note that DCNA does not rely on a specific network structure and can be generalized to various deep learning models with advanced improvements. RESULTS: Experimental results on the RML2016.10b dataset show that the proposed DCNA can enhance the recognition performance of different network structures on AMC tasks. In particular, a combination of DCNA and a convolutional long short-term deep neural network (CLDNN) achieves a classification accuracy of 91.0%, outperforming previous research. CONCLUSION: The performance of the cascading network demonstrates the significant performance advantage and application feasibility of DCNA.
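The cascading idea (estimate the SNR level first, then route the sample to the sub-network trained for that level) can be sketched as follows; the network classes are placeholders, not the paper's architectures.

```python
import torch
import torch.nn as nn

class DeepCascadingClassifier(nn.Module):
    """Stage 1: an SNR estimator network (SEN) predicts a coarse SNR level.
    Stage 2: the matching sub-network in the recognition cluster (MRCN)
    predicts the modulation class for that sample."""
    def __init__(self, sen: nn.Module, mrcn_subnets: nn.ModuleList):
        super().__init__()
        self.sen = sen
        self.mrcn = mrcn_subnets  # one sub-network per SNR level

    def forward(self, x):
        snr_level = self.sen(x).argmax(dim=1)   # (batch,) predicted SNR bins
        logits = torch.stack([self.mrcn[int(l)](xi.unsqueeze(0)).squeeze(0)
                              for xi, l in zip(x, snr_level)])
        return logits
```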
[...]
TL;DR: Both delay-independent and delay-dependent criteria to guarantee the existence, uniqueness and global stability of equilibrium point for the considered FOQVNNs are derived in the form of linear matrix inequality (LMI).
Abstract: This paper focuses on the robust stability analysis of fractional-order quaternion-valued neural networks (FOQVNNs) with neutral delay and parameter uncertainties. Without transforming the FOQVNNs into two equivalent complex-valued systems or four real-valued systems, and based on the homeomorphism principle, matrix inequality techniques and the Lyapunov method, both delay-independent and delay-dependent criteria guaranteeing the existence, uniqueness and global stability of the equilibrium point of the considered FOQVNNs are derived in the form of linear matrix inequalities (LMIs). Two examples with simulations are provided to demonstrate the theoretical results.
[...]
TL;DR: In this paper, the authors propose a dual memory system which separates continual learning from reinforcement learning and a pseudo-rehearsal system that "recalls" items representative of previous tasks via a deep generative network.
Abstract: Neural networks can achieve excellent results in a wide variety of applications. However, when they attempt to learn sequentially, they tend to learn the new task while catastrophically forgetting previous ones. We propose a model that overcomes catastrophic forgetting in sequential reinforcement learning by combining ideas from continual learning in both the image classification domain and the reinforcement learning domain. This model features a dual memory system, which separates continual learning from reinforcement learning, and a pseudo-rehearsal system that “recalls” items representative of previous tasks via a deep generative network. Our model sequentially learns Atari 2600 games without demonstrating catastrophic forgetting and continues to perform above human level on all three games. This result is achieved without additional storage requirements as the number of tasks increases, without storing raw data, and without revisiting past tasks. In comparison, previous state-of-the-art solutions are substantially more vulnerable to forgetting on these complex deep reinforcement learning tasks.
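The pseudo-rehearsal component replays samples drawn from a generative model of earlier tasks alongside data from the current task, so past behavior is preserved without storing raw data. A highly simplified sketch of that mixing step is shown below; the object names and `sample` method are placeholders, not the authors' implementation.

```python
import torch

def build_rehearsal_batch(current_batch, generator, old_model, n_pseudo=32):
    """Mix real items from the current task with generated 'pseudo' items
    whose targets come from the previous model's own outputs."""
    x_new, y_new = current_batch
    with torch.no_grad():
        x_pseudo = generator.sample(n_pseudo)   # placeholder: recall old-task-like inputs
        y_pseudo = old_model(x_pseudo)          # targets = old model's outputs (e.g. Q-values)
    x = torch.cat([x_new, x_pseudo], dim=0)
    y = torch.cat([y_new, y_pseudo], dim=0)
    return x, y
```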
[...]
TL;DR: This chapter proposes a systematic feature learning method that adopts a deep convolutional neural network to automate feature learning from the raw inputs, resulting in mutual enhancements in both feature learning and classification.
Abstract: This chapter focuses on the problem of human activity recognition (HAR), in which inputs in the form of multichannel time series signals are acquired from a set of body-worn wearable sensors and outputs are predefined human activities. In this problem, extracting effective features for identifying activities is a critical but challenging task. Most existing work relies on heuristic hand-crafted feature design and shallow feature learning architectures, which cannot find very discriminative features to accurately classify different activities. In this chapter, we propose a systematic feature learning method for the HAR problem. This method adopts a deep convolutional neural network (CNN) to automate feature learning from the raw inputs in a systematic way. Through the deep architecture, higher level abstract representations of low level raw time series signals are learned as effective features without the need for hand-crafting features. By leveraging the labeled information via supervised learning, the learned features are endowed with more discriminative power. Such a unification of feature learning and classification results in mutual enhancements in both. These unique advantages of the CNN lead to a mutually enhanced outcome of HAR, as verified in the experiments on multiple HAR datasets and comparisons with several state-of-the-art techniques.
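A minimal PyTorch sketch of the kind of network described (1D convolutions over multichannel wearable-sensor time series, followed by a classifier); the layer sizes are illustrative, not the chapter's exact architecture.

```python
import torch.nn as nn

class HARConvNet(nn.Module):
    """Input: (batch, channels, time) raw sensor windows."""
    def __init__(self, in_channels=9, num_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),            # collapse the time axis
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):
        h = self.features(x).squeeze(-1)        # (batch, 128) learned features
        return self.classifier(h)
```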
[...]
TL;DR: Wang et al. as mentioned in this paper proposed a defense method called Backdoor Keyword Identification (BKI) to mitigate backdoor attacks which the adversary performs against LSTM-based text classification by data poisoning.
Abstract: It has been shown that deep neural networks face a new threat called backdoor attacks, where the adversary can inject backdoors into a neural network model by poisoning the training dataset. When the input contains a special pattern called the backdoor trigger, the backdoored model will carry out a malicious task, such as a misclassification specified by the adversary. In text classification systems, backdoors inserted in the models can cause spam or malicious speech to escape detection. Previous work has mainly focused on defending against backdoor attacks in computer vision, and little attention has been paid to defense methods against RNN backdoor attacks in text classification. In this paper, by analyzing the changes in inner LSTM neurons, we propose a defense method called Backdoor Keyword Identification (BKI) to mitigate backdoor attacks that the adversary performs against LSTM-based text classification via data poisoning. This method can identify and exclude, from the training data, poisoning samples crafted to insert a backdoor into the model, without requiring a verified and trusted dataset. We evaluate our method on four different text classification datasets: IMDB, DBpedia ontology, 20 newsgroups and Reuters-21578. It achieves good performance on all of them regardless of the trigger sentences.
[...]
TL;DR: All the algorithms studied in this paper will be evaluated with exhaustive testing in order to analyze their capabilities in standard classification problems, particularly considering dimensionality reduction and kernelization.
Abstract: Distance metric learning is a branch of machine learning that aims to learn distances from the data, which enhances the performance of similarity-based algorithms. This tutorial provides a theoretical background and foundations on this topic and a comprehensive experimental analysis of the most-known algorithms. We start by describing the distance metric learning problem and its main mathematical foundations, divided into three main blocks: convex analysis, matrix analysis and information theory. Then, we will describe a representative set of the most popular distance metric learning methods used in classification. All the algorithms studied in this paper will be evaluated with exhaustive testing in order to analyze their capabilities in standard classification problems, particularly considering dimensionality reduction and kernelization. The results, verified by Bayesian statistical tests, highlight a set of outstanding algorithms. Finally, we will discuss several potential future prospects and challenges in this field. This tutorial will serve as a starting point in the domain of distance metric learning from both a theoretical and practical perspective.
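The core object these methods learn is a Mahalanobis-type distance, i.e. a Euclidean distance in a learned linear projection, d_L(x, y) = ||Lx - Ly||. A tiny numpy illustration follows; L here is random for demonstration, whereas a metric learning algorithm would fit it to the data.

```python
import numpy as np

def learned_distance(L, x, y):
    """Mahalanobis-type distance: Euclidean distance after projecting by L.
    Equivalently d^2 = (x - y)^T M (x - y) with M = L^T L positive semidefinite."""
    diff = L @ (x - y)
    return np.sqrt(diff @ diff)

rng = np.random.default_rng(0)
L = rng.normal(size=(2, 5))       # also performs dimensionality reduction (5 -> 2)
x, y = rng.normal(size=5), rng.normal(size=5)
print(learned_distance(L, x, y))
```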
[...]
TL;DR: A multi-task learning model for Chinese-oriented aspect-based sentiment analysis, namely LCF-ATEPC, is proposed, which is capable of extracting aspect terms and inferring aspect polarity synchronously and is effective for analyzing both Chinese and English comments simultaneously.
Abstract: The aspect-based sentiment analysis (ABSA) task is a fine-grained task of natural language processing and consists of two subtasks: aspect term extraction (ATE) and aspect polarity classification (APC). Most related works merely focus on the subtask of inferring Chinese aspect term polarity and fail to emphasize research on Chinese-oriented ABSA multi-task learning. Based on the local context focus (LCF) mechanism, this paper first proposes a multi-task learning model for Chinese-oriented aspect-based sentiment analysis, namely LCF-ATEPC. Compared with other models, this model is capable of extracting aspect terms and inferring aspect term polarity synchronously. The experimental results on four Chinese review datasets surpass state-of-the-art performance on the ATE and APC subtasks. By integrating the domain-adapted BERT model, LCF-ATEPC also achieves state-of-the-art performance for ATE and APC on the most commonly used SemEval-2014 Task 4 Restaurant and Laptop datasets. Moreover, this model is effective for analyzing both Chinese and English reviews collaboratively, and the experimental results on a multilingual mixed dataset prove its effectiveness.
[...]
TL;DR: An overview of the state-of-the-art attention models proposed in recent years is given and a unified model that is suitable for most attention structures is defined.
Abstract: Attention has arguably become one of the most important concepts in the deep learning field. It is inspired by the biological systems of humans that tend to focus on the distinctive parts when processing large amounts of information. With the development of deep neural networks, attention mechanism has been widely used in diverse application domains. This paper aims to give an overview of the state-of-the-art attention models proposed in recent years. Toward a better general understanding of attention mechanisms, we define a unified model that is suitable for most attention structures. Each step of the attention mechanism implemented in the model is described in detail. Furthermore, we classify existing attention models according to four criteria: the softness of attention, forms of input feature, input representation, and output representation. Besides, we summarize network architectures used in conjunction with the attention mechanism and describe some typical applications of attention mechanism. Finally, we discuss the interpretability that attention brings to deep learning and present its potential future trends.
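The unified view described by such surveys boils down to scoring queries against keys, normalizing the scores, and using them to weight values. The canonical scaled dot-product form is sketched below as a concrete reference point.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_q, d), K: (n_k, d), V: (n_k, d_v). Returns (n_q, d_v)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])            # compatibility scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the keys
    return weights @ V                                  # attention-weighted values
```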
[...]
TL;DR: An attention-based interframe compensation scheme that replaces frames in blurry sequences with newly restored frames, and estimates temporal patterns among the replaced sequence to restore the whole sequence and propose an adaptive residual block that dynamically fuses multi-level features via learning location-specific weights.
Abstract: Video deblurring is a challenging low-level vision task due to variant blur artifacts caused by factors such as depth variations, high-speed movements and camera shakes. Although significant efforts have been devoted to addressing this task, two challenges of capturing temporal patterns and spatial topologies still remain. In this paper, an attention-based interframe compensation scheme is proposed to address the first challenge. The proposed scheme replaces frames in blurry sequences with newly restored frames, and estimates temporal patterns among the replaced sequence to restore the whole sequence. After each replacement, an attention block is employed to exploit dependencies among restored and blurry frames to capture stable temporal patterns. To tackle the second challenge, we propose an adaptive residual block that dynamically fuses multi-level features via learning location-specific weights. Comprehensive experimental results demonstrate that the proposed method achieves state-of-the-art performance in terms of accuracy, visual effect and model size.
[...]
TL;DR: The number of times students have accessed the resources made available on VLE platforms has been identified as a key factor affecting student performance and has been analysed by conducting a real case study which has involved 120 students doing a masters degree over a VLE platform.
Abstract: Educational institutions continually strive to improve the services they offer; their aim is to have the best possible teaching staff and to increase the quality of teaching and the academic performance of their students. Knowledge of the factors that affect student learning could help universities and study centres adjust their curricula and teaching methods to the needs of their students. One of the first measures employed by teaching institutions was to create Virtual Learning Environments (VLEs). This type of environment makes it possible to attract a larger number of students because it enables them to study from wherever they are in the world, meaning that the student’s location is no longer a constraint. Moreover, VLEs facilitate access to teaching resources and make it easier to monitor the activity of the teaching staff and the interactions between students and teachers. Thus, online environments make it possible to assess the factors that cause students’ academic performance to increase or decrease. To understand the factors that influence the university learning process, this paper applies a series of machine learning techniques to a public dataset, including tree-based models and different types of Artificial Neural Networks (ANNs). Having applied these techniques to the dataset, the number of times students have accessed the resources made available on VLE platforms has been identified as a key factor affecting student performance. This factor has been analysed by conducting a real case study involving 120 students doing a masters degree over a VLE platform. Specifically, the case study participants were masters degree students in areas related to computer engineering at the University of Salamanca.
[...]
TL;DR: Recently, a large body of deep learning methods has been proposed and has shown great promise in handling the traditionally ill-posed problem of depth estimation, as discussed by the authors; depth estimation is of great significance for many applications such as augmented reality, target tracking and autonomous driving.
Abstract: Depth estimation is a classic task in computer vision and is of great significance for many applications such as augmented reality, target tracking and autonomous driving. Traditional monocular depth estimation methods are based on depth cues for depth prediction, with strict requirements, e.g. shape-from-focus/defocus methods require low depth of field in the scenes and images. Recently, a large body of deep learning methods has been proposed and has shown great promise in handling this traditionally ill-posed problem. This paper aims to review the state-of-the-art developments in deep learning-based monocular depth estimation. We give an overview of published papers between 2014 and 2020 in terms of training manners and task types. We first summarize the deep learning models for monocular depth estimation. Second, we categorize various deep learning-based methods in monocular depth estimation. Third, we introduce the publicly available datasets and the evaluation metrics. We also analyze the properties of these methods and compare their performance. Finally, we highlight the challenges in order to inform future research directions.
[...]
TL;DR: This work aims to review the most relevant studies from a technical point of view, focusing on the low-level details for the implementation of the DL models, covering aspects like DL architectures, training strategies, data augmentation, transfer learning, or the features of the datasets used and their impact on the performance of the models.
Abstract: Deep Learning (DL) has attracted a lot of attention in the field of medical image analysis because of its higher performance in image classification compared to previous state-of-the-art techniques. In addition, a recent meta-analysis found that the diagnostic performance of DL models is equivalent to that of health-care professionals. In this scenario, a lot of research using DL for polyp detection and classification has been published in the last five years, showing promising results. Our work aims to review the most relevant studies from a technical point of view, focusing on the low-level details of the implementation of the DL models. To do so, this review analyzes the published research covering aspects like DL architectures, training strategies, data augmentation, transfer learning, and the features of the datasets used and their impact on the performance of the models. Additionally, comparative tables summarizing the main aspects analyzed in this review are publicly available at https://github.com/sing-group/deep-learning-colonoscopy .
[...]
TL;DR: An end-to-end Haze Concentration Adaptive Network, including a pyramid feature extractor (PFE), a feature enhancement module (FEM), and a multi-scale feature attention module (MSFAM) for image dehazing is proposed.
Abstract: Learning-based methods have attracted considerable interest in image dehazing. However, most existing methods are not well adapted to different hazy conditions, especially when dealing with heavily hazy scenes. A significant amount of haze often remains in the images recovered by most methods. To address this issue, we propose an end-to-end Haze Concentration Adaptive Network (HCAN), including a pyramid feature extractor (PFE), a feature enhancement module (FEM), and a multi-scale feature attention module (MSFAM), for image dehazing. Specifically, PFE, based on the feature pyramid structure, leverages complementary features from different CNN layers to help the clear image prediction. Then, FEM fuses four kinds of images with different haze densities (i.e., three recovered images in the FEM with light haze density, and the input hazy image under strong haze conditions) to guide the network to adaptively perceive images under different haze conditions. Finally, MSFAM is designed under two principles, a multi-scale structure and an attention mechanism. It is used to help the network produce a clear image with more details and to ease network training. Comprehensive experiments demonstrate that the proposed HCAN performs favorably against state-of-the-art methods in terms of PSNR, SSIM, and visual effect. The results, pre-trained models and code are available at https://github.com/TaoWangzj/HCAN .
[...]
TL;DR: A new computational approach to predict lncRNA-disease associations using graph regularized nonnegative matrix factorization (LDGRNMF), which treats disease-associated lncRNA identification as a recommendation-system problem and can be regarded as an effective tool for predicting potential lncRNA-disease associations.
Abstract: Emerging evidence suggests that long non-coding RNAs (lncRNAs) play an important role in various biological processes and human diseases. Exploring the associations between lncRNAs and diseases can lead to a better understanding of complex disease mechanisms. However, because exploring them through biological experiments is expensive and time-consuming, it is imperative to develop more accurate and efficient computational approaches for predicting lncRNA-disease associations. In this work, we develop a new computational approach to predict lncRNA-disease associations using graph regularized nonnegative matrix factorization (LDGRNMF), which treats disease-associated lncRNA identification as a recommendation-system problem. More specifically, we calculate disease similarity based on the Gaussian interaction profile kernel and disease semantic information, and calculate lncRNA similarity based on the Gaussian interaction profile kernel. Second, weighted K nearest known neighbor interaction profiles are applied to reconstruct the lncRNA-disease association adjacency matrix. Finally, graph regularized nonnegative matrix factorization is exploited to predict the potential associations between lncRNAs and diseases. In five-fold cross-validation experiments, LDGRNMF achieves an AUC of 0.8985, which outperforms other compared methods. Moreover, in case studies of stomach cancer, breast cancer and lung cancer, 9, 8 and 6 of the top 10 candidate lncRNAs predicted by LDGRNMF are verified, respectively. Rigorous experimental results indicate that our method can be regarded as an effective tool for predicting potential lncRNA-disease associations.
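The graph-regularized NMF at the heart of such approaches can be illustrated with the standard multiplicative update rules for minimizing ||X - U V^T||_F^2 + lambda * Tr(V^T (D - W) V). This is the generic GNMF recipe, not the paper's full pipeline (which also includes similarity fusion and WKNKN preprocessing).

```python
import numpy as np

def gnmf(X, W, rank=20, lam=0.1, n_iter=200, eps=1e-9):
    """X: (n, m) nonnegative association matrix, W: (m, m) graph affinity.
    Returns nonnegative factors U (n, rank), V (m, rank) with X ~ U @ V.T."""
    n, m = X.shape
    D = np.diag(W.sum(axis=1))              # degree matrix of the affinity graph
    rng = np.random.default_rng(0)
    U = rng.random((n, rank))
    V = rng.random((m, rank))
    for _ in range(n_iter):
        U *= (X @ V) / (U @ V.T @ V + eps)
        V *= (X.T @ U + lam * W @ V) / (V @ U.T @ U + lam * D @ V + eps)
    return U, V

# Scores for unobserved pairs are read off the reconstruction U @ V.T.
```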
[...]
TL;DR: Experimental results demonstrate that the proposed self-supporting dehazing network (SSDN) outperforms state-of-the-art dehazing methods in terms of both quantitative accuracy and qualitative visual effect.
Abstract: As a pre-processing step of computer vision applications, single image dehazing remains challenging due to existing inefficiencies in the restoration of content and details. In this paper, the self-supporting dehazing network (SSDN) is proposed to overcome these two problems. For the restoration of image content, the self-filtering block is introduced to remove redundant features, hence improving the representation abilities of learned features. For the recovery of image details, a novel self-supporting module is proposed as a crucial component of the proposed SSDN. With this module, the complementary information among support images that are transformed from multi-level features is explored. By incorporating such information, the self-supporting module can learn more intrinsic image characteristics and generate fine-detail images. Experimental results demonstrate that the proposed SSDN outperforms state-of-the-art dehazing methods in terms of both quantitative accuracy and qualitative visual effect.
[...]
TL;DR: In this article, a novel end-to-end method based on the attention mechanism integrating convolutional and recurrent neural networks is proposed for joint entity and relation extraction, which can obtain rich semantics and takes full advantage of the associated information between entities and relations without introducing external complicated features.
Abstract: Extracting entities and relations from unstructured texts has become an important task in natural language processing (NLP), especially for knowledge graphs (KG). However, the relation classification (RC) and named entity recognition (NER) tasks are usually considered separately, which loses a lot of associated contextual information. Therefore, a novel end-to-end method based on an attention mechanism integrating convolutional and recurrent neural networks is proposed for joint entity and relation extraction, which can obtain rich semantics and takes full advantage of the associated information between entities and relations without introducing external complicated features. Convolutional operations are employed to obtain character-level and word-level embeddings, which are passed to the multi-head attention mechanism. The multi-head attention mechanism then encodes contextual semantics and embeddings to obtain an efficient semantic representation. Moreover, the rich semantics are decoded into the final tag sequence with recurrent neural networks. Finally, experiments are performed on the NYT10 and NYT11 benchmarks to evaluate the proposed method. Compared with current pipelined and joint approaches, the experimental results indicate that the proposed method achieves state-of-the-art performance in terms of the standard F1-score.