
Showing papers by "Fuji Ren" published in 2020


Journal ArticleDOI
TL;DR: Experimental studies demonstrate that the proposed PFLSCM algorithm achieves improved segmentation performance in comparison with related FCM-based algorithms.

72 citations


Journal ArticleDOI
TL;DR: This research highlights the existing technologies of listening, speaking, reading, writing, and other senses, which are widely used in human interaction, and introduces some intelligent robot systems and platforms.
Abstract: In the field of artificial intelligence, human–computer interaction (HCI) technology and its related intelligent robot technologies are essential and interesting areas of research. From the perspectives of software algorithms and hardware systems, research on these technologies attempts to build a natural HCI environment. The purpose of this research is to provide an overview of HCI and intelligent robots. This research highlights the existing technologies of listening, speaking, reading, writing, and other senses, which are widely used in human interaction. Based on these technologies, this research introduces some intelligent robot systems and platforms. This paper also forecasts some vital challenges of researching HCI and intelligent robots. The authors hope that this work will help researchers in the field acquire the necessary information and technologies to conduct more advanced research.

65 citations


Journal ArticleDOI
TL;DR: The proposed CGMVQA model, including classification and answer generation capabilities, is effective in medical visual question answering and can better assist doctors in clinical analysis and diagnosis.
Abstract: Medical images play an important role in the medical domain. A mature medical visual question answering system can aid diagnosis, but no satisfactory method has solved this comprehensive problem so far. Considering that there are many different types of questions, we propose a model called CGMVQA, with both classification and answer generation capabilities, to turn this complex problem into multiple simple problems. We adopt data augmentation on images and tokenization on texts. We use a pre-trained ResNet152 to extract image features and add three kinds of embeddings together to handle texts. We reduce the parameters of the multi-head self-attention transformer to cut the computational cost. We adjust the masking and output layers to change the functions of the model. The model establishes new state-of-the-art results: a classification accuracy of 0.640, a word-matching score of 0.659, and a semantic similarity of 0.678 on the ImageCLEF 2019 VQA-Med dataset. This suggests that CGMVQA is effective in medical visual question answering and can better assist doctors in clinical analysis and diagnosis.
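
The "three kinds of embeddings added together" follows the standard BERT-style input construction (token + segment + position). A minimal PyTorch sketch of that idea, with hypothetical sizes, not the authors' code:

```python
import torch
import torch.nn as nn

class TextEmbedding(nn.Module):
    """Sum of token, segment, and position embeddings (BERT-style input)."""
    def __init__(self, vocab_size=30522, max_len=128, n_segments=2, dim=768):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, dim)
        self.seg = nn.Embedding(n_segments, dim)
        self.pos = nn.Embedding(max_len, dim)

    def forward(self, token_ids, segment_ids):
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        return self.tok(token_ids) + self.seg(segment_ids) + self.pos(positions)

ids = torch.randint(0, 30522, (2, 16))       # batch of 2 questions, 16 tokens each
segs = torch.zeros(2, 16, dtype=torch.long)  # single-segment input
print(TextEmbedding()(ids, segs).shape)      # torch.Size([2, 16, 768])
```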

58 citations


Journal ArticleDOI
TL;DR: A novel Energy Optimized Congestion Control based on Temperature Aware Routing Algorithm (EOCC-TARA) using Enhanced Multi-objective Spider Monkey Optimization (EMSMO) for SDN-based WBAN overcomes the vital challenges of energy efficiency, congestion-free communication, and adverse thermal effects in WBAN routing.
Abstract: Wireless Body Area Network (WBAN) is a promising cost-effective technology for privacy-confined military applications and healthcare applications like remote health monitoring, telemedicine, and e-health services. The use of a Software-Defined Network (SDN) approach improves the control and management processes of complex structured WBANs and also provides higher flexibility and a dynamic network structure. To achieve seamless routing performance in SDN-based WBAN, energy-efficiency problems must be tackled effectively. The main contribution of this paper is to develop a novel Energy Optimized Congestion Control based on Temperature Aware Routing Algorithm (EOCC-TARA) using Enhanced Multi-objective Spider Monkey Optimization (EMSMO) for SDN-based WBAN. This algorithm overcomes the vital challenges of energy efficiency, congestion-free communication, and adverse thermal effects in WBAN routing. First, the proposed EOCC-TARA routing algorithm considers the effects of temperature due to the thermal dissipation of sensor nodes and formulates a strategy to adaptively select the forwarding nodes based on temperature and energy. Then the congestion avoidance concept is combined with energy efficiency, link reliability, and path loss to model the cost function, based on which the EMSMO provides the optimal routing. Simulations were performed, and the evaluation results showed that the proposed EOCC-TARA routing algorithm outperforms traditional routing approaches in terms of energy consumption, network lifetime, throughput, temperature control, congestion overhead, delay, and successful transmission rate.
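
As a rough illustration of the kind of multi-objective forwarding cost EOCC-TARA optimizes, the sketch below scores candidate next hops by residual energy, temperature, congestion, and path loss. The fixed weights and [0, 1] normalization are assumptions; the paper optimizes such an objective with EMSMO rather than hand-set weights:

```python
def route_cost(node, w_energy=0.4, w_temp=0.3, w_congestion=0.2, w_loss=0.1):
    """Weighted cost of forwarding through a node; lower is better.
    All metrics are assumed pre-normalized to [0, 1] (an assumption)."""
    return (w_energy * (1.0 - node["residual_energy"])   # prefer high residual energy
            + w_temp * node["temperature"]               # avoid hot nodes
            + w_congestion * node["queue_occupancy"]     # avoid congested nodes
            + w_loss * node["path_loss"])                # prefer reliable links

neighbors = [
    {"residual_energy": 0.9, "temperature": 0.2, "queue_occupancy": 0.1, "path_loss": 0.3},
    {"residual_energy": 0.5, "temperature": 0.6, "queue_occupancy": 0.4, "path_loss": 0.2},
]
best = min(neighbors, key=route_cost)  # pick the cheapest next hop
print(best)
```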

48 citations


Journal ArticleDOI
TL;DR: An electroencephalogram (EEG) emotion recognition method based on the LIBSVM classifier, where EEG features are calculated to represent the characteristics associated with emotion states and the classification results of each channel are fused by the Takagi-Sugeno fuzzy model.

39 citations


Journal ArticleDOI
TL;DR: A triplet training framework based on the multiclass classification approach to conduct the training for the intention detection task is proposed and a Siamese neural network architecture with metric learning is utilized to construct a robust and discriminative utterance feature embedding model.
Abstract: Understanding the user's intention is an essential task for the spoken language understanding (SLU) module in a dialogue system, as it provides vital information for managing and generating future actions and responses. In this paper, we propose a triplet training framework based on the multiclass classification approach to conduct the training for the intention detection task. Specifically, we utilize a Siamese neural network architecture with metric learning to construct a robust and discriminative utterance feature embedding model. We modify the RMCNN model and fine-tune the BERT model as Siamese encoders to train utterance triplets from different semantic aspects. The triplet loss can effectively distinguish the details of two inputs by learning a mapping from sequence utterances to a compact Euclidean space. After generating the mapping, the intention detection task can be easily implemented using standard techniques with the pre-trained embeddings as feature vectors. In addition, we use a fusion strategy to enhance the utterance feature representation in the downstream intention detection task. We conduct experiments on several benchmark datasets for intention detection: the Snips, ATIS, Facebook multilingual task-oriented, Daily Dialogue, and MRDA datasets. The results illustrate that the proposed method effectively improves recognition performance on these datasets and achieves new state-of-the-art results on the single-turn task-oriented datasets (Snips, Facebook) and a multi-turn dataset (Daily Dialogue).
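
The triplet objective described here is the standard margin-based triplet loss over embedded utterances. A minimal PyTorch sketch, with the margin and dimensions chosen for illustration and the Siamese encoder stubbed with random tensors:

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet loss: pull same-intent utterances together and push
    different-intent utterances apart in the embedding space."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return torch.clamp(d_pos - d_neg + margin, min=0).mean()

# Embeddings would come from a Siamese encoder (e.g., fine-tuned BERT); random here.
a, p, n = (torch.randn(8, 256) for _ in range(3))
print(triplet_loss(a, p, n))
```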

34 citations


Journal ArticleDOI
16 Nov 2020
TL;DR: This paper proposes a feature fusion residual block for the insect pest recognition task and constructs the Deep Feature Fusion Residual Network (DFF-ResNet), which outperforms the original ResNet and other state-of-the-art methods.
Abstract: Insect pest control is considered a significant factor in the yield of commercial crops. Thus, to avoid economic losses, we need a valid method for insect pest recognition. In this paper, we propose a feature fusion residual block to perform the insect pest recognition task. Based on the original residual block, we fuse the feature from a previous layer between two 1×1 convolution layers in the residual signal branch to improve the capacity of the block. Furthermore, we explore the contribution of each residual group to the model performance. We find that adding the residual blocks of earlier residual groups promotes the model performance significantly, improving the generalization capacity of the model. By stacking the feature fusion residual block, we construct the Deep Feature Fusion Residual Network (DFF-ResNet). To prove the validity and adaptivity of our approach, we construct it with two common residual networks (Pre-ResNet and the Wide Residual Network (WRN)) and validate these models on the Canadian Institute For Advanced Research (CIFAR) and Street View House Number (SVHN) benchmark datasets. The experimental results indicate that our models have a lower test error than the baseline models. We then apply our models to insect pest recognition and validate them on the IP102 benchmark dataset. The experimental results show that our models outperform the original ResNet and other state-of-the-art methods.
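
A rough PyTorch sketch of the described block, fusing an earlier feature map between the two 1×1 convolutions of a bottleneck residual branch; the channel sizes and additive fusion are assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn

class FeatureFusionResidualBlock(nn.Module):
    """Bottleneck residual branch whose middle is fused (here by addition)
    with a feature map carried over from an earlier layer."""
    def __init__(self, channels, mid_channels):
        super().__init__()
        self.reduce = nn.Sequential(nn.Conv2d(channels, mid_channels, 1),
                                    nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True))
        self.conv3 = nn.Sequential(nn.Conv2d(mid_channels, mid_channels, 3, padding=1),
                                   nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True))
        self.expand = nn.Sequential(nn.Conv2d(mid_channels, channels, 1),
                                    nn.BatchNorm2d(channels))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x, earlier_feat):
        out = self.reduce(x)
        out = out + earlier_feat   # fusion between the two 1x1 convolutions
        out = self.conv3(out)
        out = self.expand(out)
        return self.relu(out + x)  # residual connection

block = FeatureFusionResidualBlock(64, 16)
x = torch.randn(1, 64, 32, 32)
earlier = torch.randn(1, 16, 32, 32)  # feature carried from an earlier layer
print(block(x, earlier).shape)        # torch.Size([1, 64, 32, 32])
```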

31 citations


Journal ArticleDOI
TL;DR: A novel non-split condition with an easily set hyperparameter, which focuses more on the minority classes of the current node, is proposed and applied in the BDTKS model, avoiding ignoring the minority classes in class-imbalance cases and speeding up the classification process.

30 citations


Journal ArticleDOI
TL;DR: The key idea of Sleepy is that the energy feature of the wireless channel follows a Gaussian Mixture Model derived from the accumulated channel data over a long period, leading to a low-cost yet promising solution for sleep monitoring.
Abstract: Sleep is a major event of our daily lives. Its quality constitutes a critical indicator of people's health conditions, both mentally and physically. Existing sensor-based or vision-based sleep monitoring systems either are obstructive to use or fail to provide adequate coverage. With the fast expansion of wireless infrastructures nowadays, channel data, which is pervasive and transparent, emerges as another alternative. To this end, we propose Sleepy, a wireless channel data driven sleep monitoring system leveraging commercial WiFi devices. The key idea of Sleepy is that the energy feature of the wireless channel follows a Gaussian Mixture Model (GMM) derived from the accumulated channel data over a long period. Therefore, a GMM based foreground extraction method has been designed to adaptively distinguish motions like rollovers (foreground) from background (stationary postures), leading to certain major merits, e.g., no calibrations or target-dependent training needed. We prototype Sleepy and evaluate it in two real environments. In the short-term controlled experiments, Sleepy achieves 95.65 percent detection accuracy (DA) and 2.16 percent false negative rate (FNR) on average. In the 60-minute real sleep studies, Sleepy demonstrates strong stability, i.e., 0 percent FNR and 98.22 percent DA. Considering that Sleepy is compatible with existing WiFi infrastructure, it constitutes a low-cost yet promising solution for sleep monitoring.
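
The GMM-based foreground extraction idea can be sketched with scikit-learn: fit a mixture to long-term channel-energy samples (the background), then flag low-likelihood windows as motion. The feature dimensions, component count, and threshold below are illustrative assumptions:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in for long-term channel-energy features of the stationary background.
background = rng.normal(loc=[0.0, 1.0], scale=0.1, size=(5000, 2))

gmm = GaussianMixture(n_components=3, random_state=0).fit(background)
threshold = np.percentile(gmm.score_samples(background), 1)  # 1st-percentile log-likelihood

def is_motion(sample):
    """A window whose energy feature is unlikely under the background GMM
    is treated as foreground motion (e.g., a rollover)."""
    return gmm.score_samples(sample.reshape(1, -1))[0] < threshold

print(is_motion(np.array([0.8, 0.2])))  # far from the background modes -> True
```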

27 citations


Journal ArticleDOI
01 Jun 2020
TL;DR: In this paper, a first-of-its-kind wireless emotion sensing system driven by computational intelligence is presented, where the basic methodology is to explore the physical expression of emotions from wireless channel response via data mining.
Abstract: Emotion is well recognized as a distinguished symbol of human beings, and it plays a crucial role in our daily lives. Existing vision-based or sensor-based solutions are either obstructive to use or rely on specialized hardware, hindering their applicability. This paper introduces EmoSense, a first-of-its-kind wireless emotion sensing system driven by computational intelligence. The basic methodology is to explore the physical expression of emotions from wireless channel response via data mining. The design and implementation of EmoSense faces two major challenges—extracting physical expression from wireless channel data and recovering emotion from the corresponding physical expression. For the former, we present a Fresnel zone-based theoretical model depicting the fingerprint of the physical expression on channel response. For the latter, we design an efficient computational intelligence driven mechanism to recognize emotion from the corresponding fingerprints. We prototyped EmoSense on the commodity WiFi infrastructure and compared it with mainstream sensor-based and vision-based approaches in the real-world scenario. The numerical study over 3360 cases confirms that EmoSense achieves a comparable performance to the vision-based and sensor-based rivals under different scenarios. EmoSense only leverages the low-cost and prevalent WiFi infrastructures and thus, constitutes a tempting solution for emotion sensing.

20 citations


Journal ArticleDOI
TL;DR: A Multi-label Emotion Detection Architecture (MEDA) is proposed to detect all associated emotions expressed in a given piece of text and achieves better performance than state-of-the-art methods on this task.
Abstract: Textual emotion detection is an attractive task, yet previous studies mainly focused on polarity or single-emotion classification. However, human expressions are complex, and multiple emotions often occur simultaneously with non-negligible emotion correlations. In this paper, a Multi-label Emotion Detection Architecture (MEDA) is proposed to detect all associated emotions expressed in a given piece of text. MEDA is mainly composed of two modules: a Multi-Channel Emotion-Specified Feature Extractor (MC-ESFE) and an Emotion Correlation Learner (ECorL). MEDA first captures underlying emotion-specified features through the MC-ESFE module, which is composed of multiple channel-wise ESFE networks. Each channel is devoted to the feature extraction of a specified emotion from the sentence level to the context level through a hierarchical structure. Based on the obtained features, emotion correlation learning is implemented through an emotion sequence predictor in ECorL. For model training, we define a new loss function called multi-label focal loss. With this loss function, the model can focus more on misclassified positive-negative emotion pairs and improve overall performance by balancing the prediction of positive and negative emotions. The evaluation of the proposed MEDA architecture is carried out on two emotional corpora: the Ren-CECps and NLPCC2018 datasets. The experimental results indicate that the proposed method achieves better performance than state-of-the-art methods on this task.
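
The multi-label focal loss described here follows the focal-loss idea applied per emotion label. A hedged PyTorch sketch, with gamma and alpha values chosen for illustration rather than taken from the paper:

```python
import torch

def multilabel_focal_loss(logits, targets, gamma=2.0, alpha=0.5):
    """Focal-style loss over independent emotion labels: well-classified labels
    are down-weighted so training focuses on hard positive/negative emotion
    pairs. gamma and alpha here are illustrative, not the paper's values."""
    p = torch.sigmoid(logits)
    pt = p * targets + (1 - p) * (1 - targets)         # prob assigned to the true label
    w = alpha * targets + (1 - alpha) * (1 - targets)  # positive/negative balance
    return (-w * (1 - pt) ** gamma * torch.log(pt.clamp(min=1e-8))).mean()

logits = torch.randn(4, 8)                    # batch of 4 texts, 8 emotion labels
targets = torch.randint(0, 2, (4, 8)).float()
print(multilabel_focal_loss(logits, targets))
```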

Journal ArticleDOI
Xiaohua Wang, Jianqiao Gong, Min Hu, Yu Gu, Fuji Ren
TL;DR: The experimental results demonstrate that the improved StarGAN model can alleviate some flaws in the face generated by the original StarGAN, and can generate person images with better quality with different poses and expressions.
Abstract: In the field of facial expression recognition, deep learning is extensively used. However, insufficient and unbalanced facial training data in available public databases is a major challenge for improving the expression recognition rate. Generative Adversarial Networks (GANs) can produce more one-to-one faces with different expressions, which can be used to enhance databases. StarGAN can perform one-to-many translations for multiple expressions; compared with original GANs, it increases the efficiency of sample generation. Nevertheless, there are defects in essential areas of the generated faces, such as the mouth, and in fuzzy side-face image generation. To address these limitations, we improved StarGAN to alleviate the defects of image generation by modifying the reconstruction loss and adding a Contextual loss. Meanwhile, we added Attention U-Net to StarGAN's generator, replacing the original generator; we therefore propose the Contextual loss and Attention U-Net (LAUN) improved StarGAN. The U-shaped structure and skip connections in Attention U-Net can effectively integrate the details and semantic features of images, and the network's attention structure can attend to the essential areas of the human face. The experimental results demonstrate that the improved model can alleviate some flaws in the faces generated by the original StarGAN, so it can generate person images of better quality with different poses and expressions. Experiments were conducted on the Karolinska Directed Emotional Faces database, where the facial expression recognition accuracy is 95.97%, 2.19% higher than that using StarGAN, and on the MMI Facial Expression Database, where the expression recognition accuracy is 98.30%, 1.21% higher than that using StarGAN. Moreover, experiments based on databases enhanced by the LAUN-improved StarGAN perform better than those without enhancement.

Journal ArticleDOI
TL;DR: The key idea is to visualize the channel data affected by human movements into time-series heat-map images, which are processed by a Convolutional Neural Network to understand the corresponding user behaviors.

Journal ArticleDOI
TL;DR: A real-time FEI method for a humanoid robot is proposed based on a smooth-constraint reversed mechanical model (SRMM), combining a sequence-to-sequence deep learning model and a motion-smoothing constraint to improve the space–time similarity and motion smoothness of facial expression imitation.
Abstract: To improve the space–time similarity and motion smoothness of facial expression imitation (FEI), a real-time FEI method for a humanoid robot is proposed based on a smooth-constraint reversed mechanical model (SRMM) that combines a sequence-to-sequence deep learning model with a motion-smoothing constraint. First, on the basis of facial data from a Kinect capture device, a facial feature vector is characterized by 3 head postures, 17 facial animation units, and facial geometric deformation cascaded by Laplace coordinates. Second, a reversed mechanical model is constructed via a multilayer long short-term memory neural network to accomplish direct mapping from facial feature sequences to motor position sequences. Additionally, to overcome the motor chattering phenomenon during real-time FEI, a high-order polynomial is constructed to fit the position sequence of the motors, and an SRMM is designed based on the deviations of position, velocity, and acceleration. Finally, to imitate the real-time facial feature sequences of a performer captured by Kinect, the optimal position sequences generated by the SRMM are sent to the hardware system to keep the space–time characteristics consistent with those of the performer. The experimental results demonstrate that the motor position deviation of the SRMM is less than 8%, the space–time similarity between the robot and the performer is greater than 85%, and the motion smoothness of online FEI exceeds 90%. Compared with other related methods, the proposed method achieves a remarkable improvement in motor position deviation, space–time similarity, and motion smoothness.
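
The chattering-suppression step, fitting a high-order polynomial to the motor position sequence, can be sketched with NumPy; the polynomial degree and the absence of explicit velocity/acceleration deviation terms are simplifications of the SRMM:

```python
import numpy as np

def smooth_motor_sequence(positions, degree=5):
    """Fit a high-order polynomial to a motor position sequence to suppress
    chattering; degree is illustrative, not the paper's value."""
    t = np.arange(len(positions))
    coeffs = np.polyfit(t, positions, degree)
    return np.polyval(coeffs, t)

raw = np.sin(np.linspace(0, 3, 60)) + 0.05 * np.random.randn(60)  # chattering positions
smooth = smooth_motor_sequence(raw)
print(float(np.max(np.abs(raw - smooth))))  # residual deviation stays small
```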

Proceedings ArticleDOI
12 Oct 2020
TL;DR: A fusion model for text based on self-attention and topic clustering is proposed for multi-label emotion classification, which outperforms several strong baselines and related works.
Abstract: As one of the most critical tasks of natural language processing (NLP), emotion classification has a wide range of applications in many fields. However, restricted by corpus size, semantic ambiguity, and other constraints, researchers in emotion classification face many difficulties, and the accuracy of multi-label emotion classification is not ideal. In this paper, to improve the accuracy of multi-label emotion classification, especially when semantic ambiguity occurs, we propose a fusion model for text based on self-attention and topic clustering. We use pre-trained BERT to extract the hidden emotional representations of a sentence and use an improved LDA topic model to cluster the topics of different levels of text. We then fuse the hidden representations of the sentence and use a classification neural network to calculate the multi-label emotional intensity of the sentence. In tests on the Chinese emotion corpus Ren-CECps, extensive experimental results demonstrate that our model outperforms several strong baselines and related works. The F1-score of our model reaches 0.484, which is 0.064 higher than the best results in similar studies.

Journal ArticleDOI
TL;DR: An emotion expression extraction method is proposed to process millions of user-generated opinionated sentences automatically, and experimental results demonstrate the effectiveness of the algorithms in the proposed method.
Abstract: With the rapid spread of Chinese microblogs, a large number of microblog topics are generated in real time. More and more users pay attention to the emotion expressions of opinionated sentences on different topics, and it is challenging to label these emotion expressions manually. To this end, an emotion expression extraction method is proposed in this paper to process millions of user-generated opinionated sentences automatically. Specifically, the proposed method mainly contains two tasks: emotion classification and opinion target extraction. We first use a lexicon-based emotion classification method to compute the different emotion values in the emotion label vectors of opinionated sentences. The emotion label vectors are then revised by an unsupervised emotion label propagation algorithm. After extracting candidate opinion targets from the opinionated sentences, the opinion target extraction task is performed by a random walk-based ranking algorithm, which ranks the candidate opinion targets by considering both the connections between candidate opinion targets and the textual similarity between opinionated sentences. Experimental results demonstrate the effectiveness of the algorithms in the proposed method.
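
The random walk-based ranking step is in the spirit of PageRank-style power iteration over a graph of candidate opinion targets. A generic sketch: the paper's transition weights combine target connections and sentence similarity, while a plain adjacency matrix stands in here:

```python
import numpy as np

def rank_targets(adjacency, damping=0.85, iters=50):
    """PageRank-style power iteration over a graph of candidate opinion
    targets; a plain adjacency matrix stands in for the paper's weights."""
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    M = A / col_sums                          # column-stochastic transitions
    n = M.shape[0]
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - damping) / n + damping * M @ r
    return r

print(rank_targets([[0, 1, 1], [1, 0, 0], [1, 1, 0]]))  # ranking score per candidate
```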


Journal ArticleDOI
TL;DR: A multi-reservoir feature-coding continuous-label-fusion semi-supervised Generative Adversarial Network (MCLFS-GAN) is proposed, using permutation phase transfer entropy as the EEG signal feature, to effectively improve recognition performance.

Journal ArticleDOI
TL;DR: A novel approach for KG entity typing is proposed which is trained by jointly utilizing local typing knowledge from existing entity type assertions and global triple knowledge in KGs, and two distinct knowledge-driven effective mechanisms of entity type inference are presented.
Abstract: Knowledge graph (KG) entity typing aims at inferring possible missing entity type instances in a KG, which is a very significant but still under-explored subtask of knowledge graph completion. In this paper, we propose a novel approach for KG entity typing which is trained by jointly utilizing local typing knowledge from existing entity type assertions and global triple knowledge in KGs. Specifically, we present two distinct knowledge-driven mechanisms of entity type inference and build two novel embedding models to realize them. Afterward, a joint model connecting the two is used to infer missing entity type instances, favoring inferences that agree with both the entity type instances and the triple knowledge in KGs. Experimental results on two real-world datasets (Freebase and YAGO) demonstrate the effectiveness of our proposed mechanisms and models for improving KG entity typing.

Posted Content
TL;DR: A hybrid emotion recognition system is proposed, leveraging two emotion-rich and tightly-coupled modalities, i.e., facial expression and body gesture, together with a signal sensitivity enhancement method based on Rician K-factor theory.
Abstract: Emotion is an essential part of Artificial Intelligence (AI) and human mental health. Current emotion recognition research mainly focuses on a single modality (e.g., facial expression), while human emotion expressions are multi-modal in nature. In this paper, we propose a hybrid emotion recognition system leveraging two emotion-rich and tightly-coupled modalities, i.e., facial expression and body gesture. However, unbiased and fine-grained facial expression and gesture recognition remain a major problem. To this end, unlike our rivals relying on contact or even invasive sensors, we explore the commodity WiFi signal for device-free and contactless gesture recognition, while adopting a vision-based approach for facial expression. There exist two design challenges, i.e., how to improve the sensitivity of WiFi signals and how to process the large-volume, heterogeneous, and non-synchronous data contributed by the two modalities. For the former, we propose a signal sensitivity enhancement method based on Rician K-factor theory; for the latter, we combine CNN and RNN to mine the high-level features of the bi-modal data and perform a score-level fusion for fine-grained recognition. To evaluate the proposed method, we build a first-of-its-kind Vision-CSI Emotion Database (VCED) and conduct extensive experiments. Empirical results show the superiority of the bi-modality approach, achieving 83.24% recognition accuracy for seven emotions, compared with 66.48% and 66.67% recognition accuracy for the gesture-only and facial-only solutions, respectively. The VCED database download link is this https URL.
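
The score-level fusion of the two modality classifiers can be sketched as a weighted combination of per-class scores; equal weighting is an assumption, not the paper's setting:

```python
import numpy as np

def score_level_fusion(p_face, p_gesture, w=0.5):
    """Late fusion: combine per-class scores from the vision branch and the
    WiFi gesture branch, then pick the argmax. Equal weighting is assumed."""
    fused = w * np.asarray(p_face) + (1 - w) * np.asarray(p_gesture)
    return int(np.argmax(fused))

p_face = [0.1, 0.6, 0.3]     # e.g., CNN softmax over three emotion classes
p_gesture = [0.2, 0.3, 0.5]  # e.g., RNN softmax over the same classes
print(score_level_fusion(p_face, p_gesture))  # -> 1
```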

Journal ArticleDOI
TL;DR: A lip matching scheme based on vowel priority is proposed, together with a similarity evaluation model based on the Manhattan distance over computer-vision lip features, which quantifies lip shape similarity on a 0–1 scale and provides an effective evaluation standard.
Abstract: At present, the significance of humanoid robots has increased dramatically, yet such robots rarely enter human life because of their immature development. The lip shape of a humanoid robot is crucial in the speech process since it makes the robot look like a real human. Many studies show that vowels are the essential elements of pronunciation in all the world's languages. Building on traditional viseme research, we increase the priority of smooth lip transitions between vowels and propose a lip matching scheme based on vowel priority. Additionally, we design a similarity evaluation model based on the Manhattan distance over computer-vision lip features, which quantifies lip shape similarity on a scale of 0–1 and provides an effective evaluation standard, compensating for the shortcomings of lip shape similarity evaluation criteria in this field. We applied this lip-matching scheme to the Ren-Xin humanoid robot and performed robot teaching experiments as well as a similarity comparison experiment on 20 sentences with two male and two female participants and the robot. Notably, all the experiments achieved excellent results.
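
A minimal sketch of a Manhattan-distance similarity mapped to [0, 1]; the exponential mapping is an assumption, since the abstract only states that similarity is quantified between 0 and 1:

```python
import numpy as np

def lip_similarity(features_a, features_b):
    """Map the Manhattan distance between two lip-feature vectors to a
    similarity in [0, 1]: identical shapes give 1, larger distances decay
    toward 0. The exponential mapping is a hypothetical choice."""
    d = np.sum(np.abs(np.asarray(features_a) - np.asarray(features_b)))
    return float(np.exp(-d))

print(lip_similarity([0.2, 0.5, 0.1], [0.25, 0.45, 0.1]))  # close shapes -> near 1
```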

Journal ArticleDOI
TL;DR: The authors agree that the AsI model was first developed by Panagiotis Petrantonakis, per references [11] and [18] in their thesis.
Abstract: The authors quite agree that the AsI model was first developed by Panagiotis Petrantonakis, per references [11] and [18] in their thesis.

Proceedings ArticleDOI
13 Oct 2020
TL;DR: In this paper, a GEI with noise removal is created using Mask R-CNN, and the CNN used for gait recognition is strengthened by applying Batch Normalization.
Abstract: Currently, biometric authentication is being actively researched as a personal authentication technology for security, and gait recognition, which uses Convolutional Neural Networks (CNN) to recognize human walking, is one such technology. When creating a Gait Energy Image (GEI) using background subtraction, noises such as shadows and illumination fluctuations often hinder the accuracy of the method. In this paper, a GEI with noise removal is created using Mask R-CNN, and the CNN used for gait recognition is strengthened by applying Batch Normalization. The effectiveness of this method was confirmed by conducting experiments on two types of gaits, one without a bag and one with a bag.
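
A small PyTorch sketch of a BatchNorm-strengthened CNN over single-channel GEIs, in the spirit of the described network; the layer sizes and the 64×64 input are assumptions:

```python
import torch
import torch.nn as nn

class GaitCNN(nn.Module):
    """CNN over single-channel GEIs with Batch Normalization after each
    convolution; sizes are illustrative, not the paper's architecture."""
    def __init__(self, n_subjects=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_subjects)

    def forward(self, x):  # x: (batch, 1, 64, 64) GEIs
        return self.classifier(self.features(x).flatten(1))

print(GaitCNN()(torch.randn(2, 1, 64, 64)).shape)  # torch.Size([2, 10])
```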

Journal ArticleDOI
TL;DR: Investigations show that the proposed method outperforms state-of-the-art approaches on datasets for two tasks, namely semantic relatedness and Microsoft Research paraphrase identification, and also boosts similarity accuracy.
Abstract: Neural networks have received considerable attention in sentence similarity measuring systems due to their efficiency in dealing with semantic composition. However, existing neural network methods are not sufficiently effective in capturing the most significant semantic information buried in an input. To address this problem, a novel weighted-pooling attention layer is proposed to retain the most remarkable attention vector. It has already been established that long short-term memory and convolutional neural networks have a strong ability to accumulate enriched patterns of whole-sentence semantic representation. First, a sentence representation is generated by employing a siamese structure based on bidirectional long short-term memory and a convolutional neural network. Subsequently, a weighted-pooling attention layer is applied to obtain an attention vector. Finally, the attention vector pair information is leveraged to calculate the sentence similarity score. The amalgamation of bidirectional long short-term memory and a convolutional neural network has resulted in a model with enhanced information extracting and learning capacity. Investigations show that the proposed method outperforms state-of-the-art approaches on datasets for two tasks, namely semantic relatedness and Microsoft Research paraphrase identification. The new model improves the learning capability and also boosts the similarity accuracy.
Key words: sentence similarity, sentence embedding, deep learning, long short-term memory, convolutional neural network
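
The weighted-pooling attention layer can be sketched generically: score each encoder state, softmax the scores, and return the weighted sum as the attention vector. A PyTorch sketch of this idea, not the exact model:

```python
import torch
import torch.nn as nn

class WeightedPoolingAttention(nn.Module):
    """Attention pooling over encoder states: a learned score per time step,
    softmax-normalized, produces a weighted sum as the attention vector."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, states):                # states: (batch, seq, dim)
        weights = torch.softmax(self.score(states), dim=1)
        return (weights * states).sum(dim=1)  # attention vector: (batch, dim)

states = torch.randn(2, 12, 128)              # e.g., BiLSTM+CNN encoder outputs
print(WeightedPoolingAttention(128)(states).shape)  # torch.Size([2, 128])
```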

Journal ArticleDOI
TL;DR: This article presents a novel research method for mining the actionable intents for search users, by generating a ranked list of the potentially most informative actions based on a massive pool of action samples.
Abstract: Understanding search engine users' intents has been a popular study in information retrieval, which directly affects the quality of retrieved information. One of the fundamental problems in this field is to find a connection between the entity in a query and the potential intents of the users, the latter of which would further reveal important information for facilitating the users' future actions. In this article, we present a novel research method for mining the actionable intents for search users, by generating a ranked list of the potentially most informative actions based on a massive pool of action samples. We compare different search strategies and their combinations for retrieving the action pool and develop three criteria for measuring the informativeness of the selected action samples, that is, the significance of an action sample within the pool, the representativeness of an action sample for the other candidate samples, and the diverseness of an action sample with respect to the selected actions. Our experiment, based on the Action Mining (AM) query entity data set from the Actionable Knowledge Graph (AKG) task at NTCIR-13, suggests that the proposed approach is effective in generating an informative and early-satisfying ranking of potential actions for search users.

Proceedings ArticleDOI
13 Oct 2020
TL;DR: An attention-based Bi-LSTM-CRF network is proposed to integrate both the contextual information and the latent semantic relations of the emotion expression and candidate clause, in order to identify the causes behind an emotion expressed in a document.
Abstract: Emotion cause extraction is the task of identifying the causes behind an emotion expressed in a document, a challenging problem for fine-grained emotion analysis in natural language processing. Most existing methods regard the task as an independent clause classification problem, ignoring the relationships among multiple clauses in the same document. Moreover, the relative position of the candidate clause and the emotion clause provides a critical emotion cause clue. In this paper, an attention-based Bi-LSTM-CRF network is proposed to integrate the above information. In this network, a bi-directional long short-term memory is first used to capture both the contextual information and the latent semantic relations of the emotion expression and the candidate clause. Then, two attention mechanisms are designed to encode the mutual influence between the emotion expression and the candidate clause, and between the relative position and the candidate clause, creating better-distributed representations. Finally, these representations are fed into a Conditional Random Fields layer for labeling. Results on a benchmark Chinese emotion cause dataset prove the effectiveness of our method, which achieves an F-score of 88.40%.
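
A hedged PyTorch sketch of an attention-re-weighted Bi-LSTM producing per-clause emission scores that a CRF layer (omitted here) would decode; the dimensions and the dot-product attention form are illustrative, not the paper's exact design:

```python
import torch
import torch.nn as nn

class AttnBiLSTMEmitter(nn.Module):
    """Bi-LSTM over clause embeddings with a simple attention re-weighting
    by the emotion-expression vector; emits label scores for a CRF."""
    def __init__(self, dim=128, n_labels=2):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim // 2, bidirectional=True, batch_first=True)
        self.emit = nn.Linear(dim, n_labels)

    def forward(self, clauses, emotion_vec):   # (b, n, dim), (b, dim)
        h, _ = self.lstm(clauses)               # contextual clause states
        attn = torch.softmax((h * emotion_vec.unsqueeze(1)).sum(-1), dim=1)
        h = h * attn.unsqueeze(-1)              # emotion-aware re-weighting
        return self.emit(h)                     # emission scores for the CRF

scores = AttnBiLSTMEmitter()(torch.randn(2, 6, 128), torch.randn(2, 128))
print(scores.shape)  # torch.Size([2, 6, 2])
```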

Book ChapterDOI
07 Nov 2020
TL;DR: A multi-view weighted kernel fuzzy clustering method with collaborative evident and concealed views (MV-Co-KFCM) is put forward; the proposed algorithm proves superior on 5 clustering validity indexes.
Abstract: With the development of media technology, the data types that cluster analysis needs to face have become more and more complicated. One of the more typical problems is the clustering of multi-view data sets, which existing clustering methods find difficult to handle well. To remedy this deficiency, a multi-view weighted kernel fuzzy clustering method with collaborative evident and concealed views (MV-Co-KFCM) is put forward. To begin with, the hidden shared information is extracted from several different views of the data set by means of non-negative matrix factorization and then applied in the iterative clustering process. This not only takes advantage of the difference information in distinct views but also utilizes the consistency knowledge across them, yielding a pre-processing algorithm for extracting hidden information from multiple views (EHI-MV). Furthermore, in order to coordinate the different views during the iteration, a weight is assigned to each view, and a Shannon entropy regularization term is introduced to regulate the weights adaptively. Entropy is maximized as far as possible while minimizing the objective function, yielding the MV-Co-KFCM algorithm. On 5 multi-view databases and in comparison with 6 current leading algorithms, the algorithm we put forward proves superior on 5 clustering validity indexes.
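
Under Shannon-entropy regularization, per-view weights typically take a softmax-like closed form: views with lower clustering cost receive higher weight, and the regularization strength controls how far weights deviate from uniform. A generic NumPy sketch of that adaptive weighting idea, not the paper's exact objective:

```python
import numpy as np

def view_weights(view_costs, lam=1.0):
    """Entropy-regularized view weights: lower-cost views get larger weight;
    lam (regularization strength) is an assumed hyperparameter."""
    costs = np.asarray(view_costs, dtype=float)
    w = np.exp(-costs / lam)
    return w / w.sum()

print(view_weights([2.0, 1.0, 3.5]))  # the lowest-cost view dominates
```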

Proceedings ArticleDOI
16 Oct 2020
TL;DR: This paper introduces ELMo representations and adds a gated self-attention layer to the Bi-Directional Attention Flow network (BIDAF); it also employs a feature reuse method and modifies the linear function of the answer layer to further improve the model, and experiments prove the model's validity.
Abstract: Machine reading comprehension (MRC) has always been a significant part of artificial intelligence and a focus in the field of natural language processing (NLP). Given a context paragraph, to answer a query about it we need to encode the complex interaction between the question and the context. In recent years, with the rapid progress of neural network models and attention theory, MRC has made great advances; in particular, attention mechanisms have been widely used in MRC. However, the accuracy of previous classic baseline models leaves room for improvement, and some of them did not take long-range context dependence and polysemy into account. In this paper, to resolve these problems and further improve the model, we introduce ELMo representations and add a gated self-attention layer to the Bi-Directional Attention Flow network (BIDAF). In addition, we employ a feature reuse method and modify the linear function of the answer layer to further improve performance. In experiments on SQuAD, we show that this model greatly exceeds the baseline BIDAF model and that its performance is close to the average human level, which proves the model's validity.
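
A gated self-attention layer of the kind added to BIDAF can be sketched as self-attention over the passage followed by a sigmoid gate that mixes attended context into each position; the details below differ from the paper's exact formulation:

```python
import torch
import torch.nn as nn

class GatedSelfAttention(nn.Module):
    """Self-attention plus a sigmoid gate deciding how much attended context
    to mix into each position; a generic sketch, not the paper's layer."""
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, x):                 # x: (batch, seq, dim)
        ctx, _ = self.attn(x, x, x)       # self-attention context
        g = torch.sigmoid(self.gate(torch.cat([x, ctx], dim=-1)))
        return g * ctx + (1 - g) * x      # gated mixture

x = torch.randn(2, 20, 128)
print(GatedSelfAttention(128)(x).shape)   # torch.Size([2, 20, 128])
```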

Proceedings ArticleDOI
01 May 2020
TL;DR: The purpose of this research is to create an environmental recognition system, specifically on emotion expressions and human states, for a humanoid robot aiming for interpersonal services using a Mask R-CNN model.
Abstract: The purpose of this research is to create an environmental recognition system, specifically for emotion expressions and human states, for a humanoid robot aimed at interpersonal services. Region Convolutional Neural Networks (R-CNN) are often used for detecting objects in the environment. We employ a Mask R-CNN model to detect the emotions and states of a target person from the robot's field of view. The model was trained using various images of a human body in several emotional states. Experiments were conducted to validate the effectiveness of the model in detecting the states of surrounding people from the robot's camera. Although the set of human states assumed in the experiment was limited, the results imply the potential of the proposed method to serve as the basis of a recognition model for an intelligent humanoid robot for interpersonal services.

Proceedings ArticleDOI
09 Aug 2020
TL;DR: This paper prototypes a new method for user authentication leveraging commodity WiFi and explores four classifiers, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest, and Decision Tree, for recognizing users; results show that KNN provides the best performance.
Abstract: User authentication is a major area of interest within the field of Human Computer Interaction (HCI), as it prevents unauthorized access and ensures the security of data. Personal Identification Numbers (PINs) and biometrics are the main approaches for identifying a user on the basis of his/her identity. However, a PIN can be easily leaked to others, and biometrics usually require specialized devices. In this paper, we prototype a new method for user authentication by leveraging commodity WiFi. The basic methodology is to explore the typing habits of users from Channel State Information (CSI). The design and implementation of our system face two challenges, i.e., extracting keystroke features from wireless channel data and authenticating the user via typing habits from the corresponding keystroke features. For the former, we capture signal fluctuations caused by micro movements like typing and extract the keystroke features from the channel response obtained from commodity WiFi devices. For the latter, we design a computational intelligence driven mechanism to authenticate users from the corresponding keystroke features. We prototype our system on low-cost off-the-shelf WiFi devices and evaluate its performance in real-world experiments. We explored four classifiers, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest, and Decision Tree, for recognizing users. Empirical results show that KNN provides the best performance, i.e., 85.2% authentication accuracy, 12.8% false accept rate, and 11.2% false reject rate on average over 9 participants.
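
The classification stage can be sketched with scikit-learn's KNN on keystroke feature vectors; the synthetic features below are stand-ins, so the printed accuracy is near chance rather than the ~85% reported on real CSI data:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Illustrative pipeline: rows are keystroke feature vectors (here random
# stand-ins for CSI-derived features), labels are user IDs.
rng = np.random.default_rng(0)
X = rng.normal(size=(900, 30))       # 9 users x 100 samples, 30-dim features
y = np.repeat(np.arange(9), 100)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
print("accuracy:", knn.score(X_te, y_te))  # near chance on random features
```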