Author
Yu Biao Liu
Bio: Yu Biao Liu is an academic researcher from Zhejiang University of Science and Technology. The author has contributed to research on topics including computer science and artificial intelligence, and has co-authored 1 publication.
Papers
TL;DR: A novel method for EEG-based emotion recognition based on multi-task learning with a capsule network (CapsNet) and an attention mechanism, which learns multiple tasks simultaneously while exploiting commonalities and differences across tasks.
Abstract: Deep learning (DL) technologies have recently shown great potential in emotion recognition based on electroencephalography (EEG). However, existing DL-based EEG emotion recognition methods are built on single-task learning, i.e., learning arousal, valence, and dominance individually, which may ignore the complementary information across tasks. In addition, single-task learning requires a new round of training every time a new task appears, which is time-consuming. To this end, we propose a novel method for EEG-based emotion recognition based on multi-task learning with a capsule network (CapsNet) and an attention mechanism. First, multi-task learning learns multiple tasks simultaneously while exploiting commonalities and differences across tasks; it can also draw on more data from the different tasks, which improves generalization and robustness. Second, the innovative structure of the CapsNet enables it to effectively characterize the intrinsic relationships among EEG channels. Finally, the attention mechanism adjusts the weights of different channels to extract important information. On the DEAP dataset, the average accuracy reaches 97.25%, 97.41%, and 98.35% for arousal, valence, and dominance, respectively; on the DREAMER dataset, it reaches 94.96%, 95.54%, and 95.52%. Experimental results demonstrate the effectiveness of the proposed method for EEG emotion recognition.
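The multi-task setup described above can be illustrated with a minimal PyTorch sketch: a shared feature extractor feeds three task-specific heads (arousal, valence, dominance), and the three losses are summed for joint training. This is only a simplified stand-in with assumed layer sizes and names, not the authors' CapsNet-and-attention implementation.

```python
import torch
import torch.nn as nn

class MultiTaskEEGNet(nn.Module):
    """Minimal multi-task sketch: shared encoder + one head per emotion dimension.
    Layer sizes are illustrative placeholders, not the paper's CapsNet architecture."""
    def __init__(self, n_channels=32, n_features=128, n_classes=2):
        super().__init__()
        # Shared 1-D convolutional encoder over the EEG channel dimension
        self.shared = nn.Sequential(
            nn.Conv1d(n_channels, 64, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(64, n_features),
            nn.ReLU(),
        )
        # One classification head per task (arousal, valence, dominance)
        self.heads = nn.ModuleDict({
            task: nn.Linear(n_features, n_classes)
            for task in ("arousal", "valence", "dominance")
        })

    def forward(self, x):
        z = self.shared(x)                       # (batch, n_features)
        return {task: head(z) for task, head in self.heads.items()}

# Joint training: the three task losses are simply summed
model = MultiTaskEEGNet()
criterion = nn.CrossEntropyLoss()
x = torch.randn(8, 32, 256)                      # (batch, channels, time samples)
labels = {t: torch.randint(0, 2, (8,)) for t in ("arousal", "valence", "dominance")}
outputs = model(x)
loss = sum(criterion(outputs[t], labels[t]) for t in labels)
loss.backward()
```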
34 citations
TL;DR: A multimodal MRI volumetric data fusion method based on an end-to-end convolutional neural network that obtains more competitive results in both visual quality and objective assessment than representative 3-D and 2-D medical image fusion methods.
Abstract: Medical image fusion aims to integrate the complementary information captured by images of different modalities into a more informative composite image. However, current research on medical image fusion suffers from several drawbacks: 1) existing methods are mostly designed for 2-D slice fusion, and they tend to lose spatial contextual information when fusing medical images with volumetric structure slice by slice; 2) the few existing 3-D medical image fusion methods fail to sufficiently consider the characteristics of the source modalities, leading to the loss of important modality information; and 3) most existing works concentrate on pursuing good performance in visual perception and objective evaluation, while there is a severe lack of clinical problem-oriented study. In this article, to address these issues, we propose a multimodal MRI volumetric data fusion method based on an end-to-end convolutional neural network (CNN). In our network, an attention-based multimodal feature fusion (MMFF) module is presented for more effective feature learning. In addition, a specific loss function that considers the characteristics of different MRI modalities is designed to preserve the modality information. Experimental results demonstrate that the proposed method obtains more competitive results in both visual quality and objective assessment when compared with some representative 3-D and 2-D medical image fusion methods. We further verify the significance of the proposed method for brain tumor segmentation by enriching the input modalities, and the results show that it helps to improve the segmentation accuracy. The source code of our fusion method is available at https://github.com/yuliu316316/3D-CNN-Fusion.
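As a rough illustration of attention-based fusion of two volumetric feature maps, the sketch below predicts per-voxel modality weights from the concatenated features and mixes the two inputs accordingly. It is a conceptual stand-in with assumed shapes and names, not the MMFF module from the released 3D-CNN-Fusion code.

```python
import torch
import torch.nn as nn

class AttentionFusion3D(nn.Module):
    """Illustrative attention-weighted fusion of two volumetric feature maps.
    A rough stand-in for an attention-based multimodal feature fusion block,
    not the module released at github.com/yuliu316316/3D-CNN-Fusion."""
    def __init__(self, channels=16):
        super().__init__()
        # Predict per-voxel weights for the two modalities from their concatenation
        self.attn = nn.Sequential(
            nn.Conv3d(2 * channels, 2, kernel_size=3, padding=1),
            nn.Softmax(dim=1),                  # weights sum to 1 across modalities
        )

    def forward(self, feat_a, feat_b):
        w = self.attn(torch.cat([feat_a, feat_b], dim=1))   # (B, 2, D, H, W)
        return w[:, 0:1] * feat_a + w[:, 1:2] * feat_b

fusion = AttentionFusion3D(channels=16)
fa = torch.randn(1, 16, 32, 64, 64)             # features of one MRI modality
fb = torch.randn(1, 16, 32, 64, 64)             # features of another MRI modality
fused = fusion(fa, fb)                          # same shape as the inputs
```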
14 citations
TL;DR: A glioma segmentation-oriented multi-modal magnetic resonance (MR) image fusion method using an adversarial learning framework, which adopts a segmentation network as the discriminator to achieve more meaningful fusion results.
Abstract: Dear Editor, In recent years, multi-modal medical image fusion has received widespread attention in the image processing community. However, existing medical image fusion methods are mostly devoted to pursuing high performance on visual perception and objective fusion metrics, while ignoring the specific purpose in clinical applications. In this letter, we propose a glioma segmentation-oriented multi-modal magnetic resonance (MR) image fusion method using an adversarial learning framework, which adopts a segmentation network as the discriminator to achieve more meaningful fusion results from the perspective of the segmentation task. Experimental results demonstrate the advantage of the proposed method over some state-of-the-art medical image fusion methods.
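The core idea of using a segmentation network as a task-driven discriminator can be sketched as a single training step in PyTorch. Both networks below are hypothetical single-layer stand-ins with assumed shapes; the actual framework would also include image-level fusion losses and a proper adversarial schedule.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the two networks; any image-to-image fusion model
# and any segmentation model with compatible shapes would do here.
fusion_net = nn.Conv2d(2, 1, kernel_size=3, padding=1)        # fuses two MR modalities
seg_net = nn.Conv2d(1, 4, kernel_size=3, padding=1)           # 4 tissue/tumour classes

seg_loss = nn.CrossEntropyLoss()
opt = torch.optim.Adam(fusion_net.parameters(), lr=1e-4)

t1, t2 = torch.randn(2, 1, 1, 128, 128)                       # two MR modalities
mask = torch.randint(0, 4, (1, 128, 128))                     # glioma segmentation labels

# One training step: the segmentation network scores the fused image, so the
# fusion output is pushed toward images that segment well.
fused = fusion_net(torch.cat([t1, t2], dim=1))
loss = seg_loss(seg_net(fused), mask)                         # + pixel/structure losses in practice
opt.zero_grad()
loss.backward()
opt.step()
```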
12 citations
TL;DR: A Transformer Capsule Network (TC-Net) that mainly contains an EEG Transformer module to extract EEG features and an Emotion Capsule module to refine the features and classify the emotion states.
Abstract: Deep learning has recently achieved remarkable success in emotion recognition based on Electroencephalogram (EEG), in which convolutional neural networks (CNNs) are the most widely used models. However, due to their local feature learning mechanism, CNNs have difficulty capturing global contextual information across the temporal domain, the frequency domain, and intra- and inter-channel relationships. In this paper, we propose a Transformer Capsule Network (TC-Net), which mainly contains an EEG Transformer module to extract EEG features and an Emotion Capsule module to refine the features and classify the emotion states. In the EEG Transformer module, EEG signals are partitioned into non-overlapping windows. A Transformer block is adopted to capture global features among different windows, and we propose a novel patch merging strategy named EEG-PatchMerging (EEG-PM) to better extract local features. In the Emotion Capsule module, each channel of the EEG feature maps is encoded into a capsule to better characterize the spatial relationships among multiple features. Experimental results on two popular datasets (i.e., DEAP and DREAMER) demonstrate that the proposed method achieves state-of-the-art performance in the subject-dependent scenario. Specifically, on DEAP (DREAMER), our TC-Net achieves average accuracies of 98.76% (98.59%), 98.81% (98.61%) and 98.82% (98.67%) on the valence, arousal and dominance dimensions, respectively. Moreover, the proposed TC-Net also shows high effectiveness in multi-state emotion recognition tasks using the popular VA and VAD models. The main limitation of the proposed model is that it tends to obtain relatively low performance in the cross-subject recognition task, which is worthy of further study in the future.
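The first stage described above, splitting each EEG trial into non-overlapping temporal windows and letting a Transformer attend across them, can be sketched as follows. The EEG-PatchMerging strategy and the Emotion Capsule head are omitted, and all sizes are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class WindowedEEGTransformer(nn.Module):
    """Sketch of the windowing + Transformer stage: split each EEG trial into
    non-overlapping temporal windows and attend across them. The EEG-PatchMerging
    strategy and the capsule head are omitted; sizes are assumed."""
    def __init__(self, n_channels=32, win_len=16, d_model=64, n_layers=2):
        super().__init__()
        self.win_len = win_len
        self.embed = nn.Linear(n_channels * win_len, d_model)   # one token per window
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, x):                     # x: (batch, channels, time)
        b, c, t = x.shape
        n_win = t // self.win_len
        # Non-overlapping windows -> token sequence of length n_win
        tokens = x[:, :, :n_win * self.win_len].reshape(b, c, n_win, self.win_len)
        tokens = tokens.permute(0, 2, 1, 3).reshape(b, n_win, c * self.win_len)
        return self.encoder(self.embed(tokens))   # (batch, n_win, d_model)

model = WindowedEEGTransformer()
feats = model(torch.randn(8, 32, 256))        # 256 samples -> 16 windows of 16 samples
```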
9 citations
TL;DR: In this paper, a residual architecture that includes a multi-scale feature extraction module and a dual-attention module is designed as the basic unit of a deep convolutional network, which is first used to obtain an initial fused image from the source images.
Abstract: Multi-focus image fusion methods can be mainly divided into two categories: transform domain methods and spatial domain methods. Recently emerged deep learning (DL)-based methods fit this taxonomy as well. In this paper, we propose a novel DL-based multi-focus image fusion method that combines the complementary advantages of transform domain methods and spatial domain methods. Specifically, a residual architecture that includes a multi-scale feature extraction module and a dual-attention module is designed as the basic unit of a deep convolutional network, which is first used to obtain an initial fused image from the source images. Then, the trained network is further employed to extract features from the initial fused image and the source images for a similarity comparison, aiming to detect the focus property of each source pixel. The final fused image is obtained by selecting the corresponding pixels from the source images and the initial fused image according to the focus property map. Experimental results show that the proposed method can effectively preserve the original focus information from the source images and prevent visual artifacts around the boundary regions, leading to more competitive qualitative and quantitative performance when compared with state-of-the-art fusion methods.
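The pixel-selection step described above, comparing each source's features with those of the initial fused image to decide which pixels are in focus, can be illustrated with the sketch below. The distance measure, threshold, and fallback rule are assumptions for illustration, not the paper's exact decision procedure.

```python
import torch

def select_by_focus(src_a, src_b, init_fused, feat_a, feat_b, feat_fused, tau=0.05):
    """Illustrative pixel-selection step: compare each source's features with the
    features of the initial fused image and copy the pixel from whichever source is
    closer; fall back to the initial fused image where the two similarities are too
    close to call. The distance, threshold tau and fallback rule are assumptions."""
    # Per-pixel feature distance to the initial fusion result
    d_a = (feat_a - feat_fused).abs().mean(dim=1, keepdim=True)
    d_b = (feat_b - feat_fused).abs().mean(dim=1, keepdim=True)
    focus_a = d_a < d_b                              # source A is in focus here
    ambiguous = (d_a - d_b).abs() < tau              # boundary / uncertain regions
    fused = torch.where(focus_a, src_a, src_b)
    return torch.where(ambiguous, init_fused, fused)

# Toy shapes: grayscale images (B, 1, H, W) and feature maps (B, C, H, W)
src_a, src_b, init_fused = (torch.rand(1, 1, 64, 64) for _ in range(3))
feat_a, feat_b, feat_fused = (torch.rand(1, 8, 64, 64) for _ in range(3))
result = select_by_focus(src_a, src_b, init_fused, feat_a, feat_b, feat_fused)
```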
8 citations
Cited by
TL;DR: A brain tumor segmentation method based on the fusion of deep semantics and edge information in multimodal MRI, aiming at a more sufficient utilization of multi-modal information for accurate segmentation.
Abstract: Brain tumor segmentation in multimodal MRI has great significance in clinical diagnosis and treatment. The utilization of multimodal information plays a crucial role in brain tumor segmentation. However, most existing methods focus on the extraction and selection of deep semantic features, while ignoring some features with specific meaning and importance to the segmentation problem. In this paper, we propose a brain tumor segmentation method based on the fusion of deep semantics and edge information in multimodal MRI, aiming to achieve a more sufficient utilization of multimodal information for accurate segmentation. The proposed method mainly consists of a semantic segmentation module, an edge detection module and a feature fusion module. In the semantic segmentation module, the Swin Transformer is adopted to extract semantic features and a shifted patch tokenization strategy is introduced for better training. The edge detection module is designed based on convolutional neural networks (CNNs) and an edge spatial attention block (ESAB) is presented for feature enhancement. The feature fusion module aims to fuse the extracted semantic and edge features, and we design a multi-feature inference block (MFIB) based on graph convolution to perform feature reasoning and information dissemination for effective feature fusion. The proposed method is validated on the popular BraTS benchmarks. The experimental results verify that the proposed method outperforms a number of state-of-the-art brain tumor segmentation methods. The source code of the proposed method is available at https://github.com/HXY-99/brats.
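A much-simplified view of combining semantic and edge feature maps is sketched below as a learned gate followed by a segmentation head. The paper's actual design uses a Swin Transformer encoder, an edge spatial attention block, and a graph-convolution-based multi-feature inference block; everything here, including the shapes, is an illustrative assumption.

```python
import torch
import torch.nn as nn

class SemanticEdgeFusion(nn.Module):
    """Simplified fusion of semantic and edge feature maps via a learned gate.
    Only a conceptual stand-in for the paper's graph-convolution-based MFIB."""
    def __init__(self, channels=32, n_classes=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        self.head = nn.Conv2d(channels, n_classes, kernel_size=1)

    def forward(self, sem_feat, edge_feat):
        g = self.gate(torch.cat([sem_feat, edge_feat], dim=1))
        fused = g * sem_feat + (1.0 - g) * edge_feat   # edge cues refine semantic features
        return self.head(fused)                        # per-pixel tumour-class logits

model = SemanticEdgeFusion()
logits = model(torch.randn(1, 32, 96, 96), torch.randn(1, 32, 96, 96))
```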
71 citations
TL;DR: A novel image registration and fusion method, named SuperFusion, which incorporates image registration, image fusion, and the semantic requirements of high-level vision tasks into a single framework.
Abstract: Image fusion aims to integrate complementary information in source images to synthesize a fused image comprehensively characterizing the imaging scene. However, existing image fusion algorithms are only applicable to strictly aligned source images and cause severe artifacts in the fusion results when input images have slight shifts or deformations. In addition, the fusion results typically only have a good visual effect but neglect the semantic requirements of high-level vision tasks. This study incorporates image registration, image fusion, and the semantic requirements of high-level vision tasks into a single framework and proposes a novel image registration and fusion method, named SuperFusion. Specifically, we design a registration network to estimate bidirectional deformation fields to rectify geometric distortions of input images under the supervision of both photometric and end-point constraints. The registration and fusion are combined in a symmetric scheme in which mutual promotion is achieved by optimizing the naive fusion loss and is further enhanced by a mono-modal consistency constraint on the symmetric fusion outputs. In addition, the image fusion network is equipped with a global spatial attention mechanism to achieve adaptive feature integration. Moreover, a semantic constraint based on a pre-trained segmentation model and the Lovasz-Softmax loss is deployed to guide the fusion network to focus more on the semantic requirements of high-level vision tasks. Extensive experiments on image registration, image fusion, and semantic segmentation tasks demonstrate the superiority of our SuperFusion compared to the state-of-the-art alternatives. The source code and pre-trained model are publicly available at https://github.com/Linfeng-Tang/SuperFusion.
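To make the registration stage concrete, the sketch below applies a predicted dense deformation field to a source image with grid_sample, the kind of warping a registration network's output would drive before fusion. The flow convention, shapes, and the zero field used in the example are assumptions, not SuperFusion's released implementation.

```python
import torch
import torch.nn.functional as F

def warp(image, flow):
    """Warp an image with a dense deformation field, as a registration network's
    output would be applied before fusion. `flow` holds per-pixel (dx, dy) offsets
    in pixels; shapes and conventions here are illustrative assumptions."""
    b, _, h, w = image.shape
    # Base identity sampling grid in normalised [-1, 1] coordinates
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
    # Convert pixel offsets to normalised offsets and add to the identity grid
    norm_flow = torch.stack((flow[:, 0] * 2 / (w - 1), flow[:, 1] * 2 / (h - 1)), dim=-1)
    return F.grid_sample(image, base + norm_flow, align_corners=True)

img = torch.rand(1, 1, 128, 128)          # one source image
flow = torch.zeros(1, 2, 128, 128)        # a registration network would predict this
aligned = warp(img, flow)                 # identical to img for a zero field
```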
50 citations
TL;DR: In this paper, a self-training maximum classifier discrepancy method for EEG classification is proposed, which detects samples from a new subject that lie beyond the support of the existing source subjects by maximising the discrepancy between two classifiers' outputs.
Abstract: Even with an unprecedented breakthrough of deep learning in electroencephalography (EEG), collecting adequate labelled samples is a critical problem due to laborious and time-consuming labelling. Recent studies have proposed solving the limited-label problem via domain adaptation methods. However, they mainly focus on reducing domain discrepancy without considering task-specific decision boundaries, which may lead to feature distribution overmatching and therefore makes it hard to achieve a complete match across a large domain gap. A novel self-training maximum classifier discrepancy method for EEG classification is proposed in this study. The proposed approach detects samples from a new subject beyond the support of the existing source subjects by maximising the discrepancy between two classifiers' outputs. In addition, a self-training strategy that uses unlabelled test data is proposed to fully exploit knowledge from the new subject and further reduce the domain gap. Finally, a 3-D cube that incorporates the spatial and frequency information of the EEG data is constructed to create the input features of a convolutional neural network (CNN). Extensive experiments on SEED and SEED-IV show that the proposed method effectively deals with the domain transfer problem and achieves better performance.
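The discrepancy signal at the heart of the method can be sketched as the mean absolute difference between two classifiers' softmax outputs over a shared feature extractor, as in standard maximum-classifier-discrepancy training. The networks, feature size, and training schedule below are placeholders assumed for illustration; the paper additionally uses a CNN over 3-D cube inputs and a self-training stage that are not shown.

```python
import torch
import torch.nn.functional as F

def classifier_discrepancy(logits1, logits2):
    """Mean absolute difference between the two classifiers' softmax outputs,
    the usual discrepancy measure in maximum-classifier-discrepancy training."""
    return (F.softmax(logits1, dim=1) - F.softmax(logits2, dim=1)).abs().mean()

# Shared feature extractor and two classifier heads (all sizes are placeholders;
# the paper actually feeds 3-D cube features to a CNN).
feat = torch.nn.Linear(310, 64)
clf1, clf2 = torch.nn.Linear(64, 3), torch.nn.Linear(64, 3)

x_target = torch.randn(16, 310)              # unlabelled samples from a new subject
z = feat(x_target)
d = classifier_discrepancy(clf1(z), clf2(z))
# Typical adversarial schedule (sketch): first update only the classifiers to
# maximise d on the target data, exposing samples outside the source support;
# then update the feature extractor to minimise d and shrink the domain gap.
(-d).backward()                              # gradient of the maximisation objective
```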
16 citations
TL;DR: In this article, the authors present a comprehensive review of brain disease detection from the fusion of neuroimaging modalities using DL models such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), pretrained models, generative adversarial networks (GANs), and autoencoders (AEs).
Abstract: Brain diseases, including tumors and mental and neurological disorders, seriously threaten the health and well-being of millions of people worldwide. Structural and functional neuroimaging modalities are commonly used by physicians to aid the diagnosis of brain diseases. In clinical settings, specialist doctors typically fuse magnetic resonance imaging (MRI) data with other neuroimaging modalities for brain disease detection. As these two approaches offer complementary information, fusing these neuroimaging modalities helps physicians accurately diagnose brain diseases. Typically, fusion is performed between a functional and a structural neuroimaging modality, because the functional modality can complement the structural information and thus improve specialists' diagnostic performance. However, analyzing the fusion of neuroimaging modalities is difficult for specialist doctors. Deep learning (DL) is a branch of artificial intelligence that has shown superior performance compared with more conventional methods in tasks such as brain disease detection from neuroimaging modalities. This work presents a comprehensive review of brain disease detection from the fusion of neuroimaging modalities using DL models such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), pretrained models, generative adversarial networks (GANs), and autoencoders (AEs). First, neuroimaging modalities and the need for fusion are discussed. Then, review papers published in the field of neuroimaging multimodalities using AI techniques are explored. Moreover, DL-based fusion levels, including input-, layer-, and decision-level fusion, are discussed together with related studies on diagnosing brain diseases. Other sections present the most important challenges for diagnosing brain diseases from the fusion of neuroimaging modalities. In the discussion section, the details of previous research on the fusion of neuroimaging modalities based on MRI and DL models are reported. The most important future directions, including datasets, DA, imbalanced data, DL models, explainable AI, and hardware resources, are then presented. Finally, the main findings of this study are summarized in the conclusion section.
14 citations