Bio: Jin Wang is an academic researcher from Chongqing University of Posts and Telecommunications. The author has contributed to research on topics including feature extraction and convolutional neural networks, has an h-index of 6, and has co-authored 23 publications receiving 85 citations.
TL;DR: The change rules of the Euclidean similarity degree under different knowledge granularities are discussed, and these rules accord with human cognitive mechanisms in a multi-granularity knowledge space.
Abstract: The vague set is a further generalization of the fuzzy set. In rough set theory, a target concept may be a defined (crisp) set, a fuzzy set, or a vague set. The cases in which the target concept is a defined set or a fuzzy set were analyzed in detail in our other papers. In general, when rough sets are used to deal with uncertain problems, we can only obtain two boundaries of an uncertain concept and cannot obtain a usable approximation set, i.e. a union of granules in Pawlak's approximation space. To overcome this shortcoming, this paper mainly discusses the approximation set of a vague set in Pawlak's approximation space. Firstly, preliminary concepts and definitions related to vague sets and rough sets are briefly reviewed. Then, several new notions, such as the 0.5-crisp set, step-vague set, and average-step-vague set, are defined one by one. The Euclidean similarity degrees between a vague set and its 0.5-crisp set, step-vague set, and average-step-vague set are analyzed in detail. It is concluded that the Euclidean similarity degree between a vague set and its 0.5-crisp set is higher than that between the vague set and any other defined set in the approximation space (U, R). Afterward, it is proved that the average-step-vague set is an optimal step-vague set, because the Euclidean similarity degree between a vague set and its average-step-vague set in the approximation space (U, R) reaches the maximum value. Finally, the change rules of the Euclidean similarity degree under different knowledge granularities are discussed; these rules accord with human cognitive mechanisms in a multi-granularity knowledge space.
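A minimal numerical sketch of the quantities involved, assuming one common form of the Euclidean similarity degree over membership intervals [t(x), 1 - f(x)] and a median-based construction of the 0.5-crisp set; the paper's exact definitions may differ:

```python
import numpy as np

# One common form of the Euclidean similarity degree between two vague
# sets A and B over a universe of n objects, where each object x carries
# a membership interval [t(x), 1 - f(x)] (an assumption, not the paper's
# verbatim definition).
def euclidean_similarity(tA, fA, tB, fB):
    n = len(tA)
    d = np.sum((tA - tB) ** 2 + (fA - fB) ** 2) / (2 * n)
    return 1.0 - np.sqrt(d)

# Assumed 0.5-crisp construction: an object belongs fully when its median
# membership (t + 1 - f) / 2 reaches 0.5, and not at all otherwise.
def crisp_05(t, f):
    member = ((t + 1.0 - f) / 2.0 >= 0.5).astype(float)
    return member, 1.0 - member

t = np.array([0.7, 0.2, 0.5, 0.9])  # true-membership function
f = np.array([0.1, 0.6, 0.3, 0.0])  # false-membership function
tc, fc = crisp_05(t, f)
print(euclidean_similarity(t, f, tc, fc))
```

The similarity is 1 for identical vague sets and decreases as the membership functions diverge, which is what makes it usable as a closeness measure between a vague set and its crisp approximations.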
TL;DR: An undulatory locomotion model of C. elegans that achieves chemotaxis behaviors based on the biological neuronal and neuromuscular structure; quantitative comparisons with experiments verify the realism and effectiveness of the model, which could serve as a prototype for other limbless animals.
Abstract: This paper provides an undulatory locomotion model of C. elegans that achieves chemotaxis behaviors based on the biological neuronal and neuromuscular structure. The on-cell and off-cell mechanism, as well as the proprioceptive mechanism, are incorporated into the locomotion model. The nervous system of C. elegans is modeled by a dynamic neural network (DNN) comprising two parts: a head DNN and motor neurons. The head DNN perceives outside concentrations and generates the undulatory wave for the body; the motor neurons are responsible for transmitting the undulatory wave along the body. The body of C. elegans is represented as a multi-joint rigid-link model with 11 links. The undulatory locomotion behavior is achieved by using the DNN to control the lengths of the muscles on the ventral and dorsal sides, and then using the muscle lengths to control the angles between consecutive links. In this work, the relations between the DNN outputs and the muscle lengths, as well as between the muscle lengths and the angles between consecutive links, are determined. Furthermore, owing to the learning capability of the DNN, a set of nonlinear functions designed to represent the chemotaxis behaviors of C. elegans is learned by the head DNN. The testing results show good performance of the locomotion model for the chemotaxis behaviors of finding food and avoiding toxin, as well as slight and Ω turns. Finally, quantitative analyses comparing with experimental results are provided to verify the realism and effectiveness of the locomotion model, which could serve as a prototype for other limbless animals.
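The multi-joint rigid-link body can be sketched as follows. This is a minimal illustration of an 11-link chain driven by a sinusoidal travelling wave of joint angles; the amplitude, frequency, and phase lag are illustrative values, not the paper's learned DNN outputs:

```python
import numpy as np

# 11 links of equal length; joint angles follow a wave travelling from
# head to tail (illustrative parameters, not from the paper).
N_LINKS, LINK_LEN = 11, 0.1
AMP, OMEGA, PHASE_LAG = 0.5, 2 * np.pi, np.pi / 5

def body_shape(t):
    joints = np.arange(N_LINKS - 1)
    angles = AMP * np.sin(OMEGA * t - PHASE_LAG * joints)  # joint angles
    heading = np.concatenate([[0.0], np.cumsum(angles)])   # direction of each link
    dx = LINK_LEN * np.cos(heading)
    dy = LINK_LEN * np.sin(heading)
    x = np.concatenate([[0.0], np.cumsum(dx)])
    y = np.concatenate([[0.0], np.cumsum(dy)])
    return np.stack([x, y], axis=1)  # 12 endpoints for 11 links

pts = body_shape(0.25)
print(pts.shape)  # (12, 2)
```

Advancing t and re-evaluating `body_shape` makes the bend propagate down the chain, which is the kinematic core of undulatory locomotion.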
11 Oct 2013
TL;DR: Experimental results show that TSMOTE-AB outperforms SMOTE and other previously known algorithms.
Abstract: The synthetic minority over-sampling technique (SMOTE) is an effective over-sampling technique specifically designed for learning from imbalanced data sets. However, SMOTE generates synthetic samples somewhat blindly. This paper proposes a novel approach for the class-imbalance problem, based on a combination of the Threshold SMOTE (TSMOTE) and Attribute Bagging (AB) algorithms. TSMOTE takes full advantage of majority samples to adjust the neighbor-selection strategy of SMOTE and thereby control the quality of the new samples. Attribute Bagging, a well-known ensemble learning algorithm, is also used to improve the predictive power of the classifier. A comprehensive suite of experiments is conducted on 7 imbalanced data sets collected from the UCI Machine Learning Repository. Experimental results show that TSMOTE-AB outperforms SMOTE and other previously known algorithms.
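The Attribute Bagging half of the combination can be sketched as follows. Each ensemble member sees only a random subset of attributes, and predictions are combined by majority vote; the base learner here is a nearest-centroid classifier chosen for brevity, not the one used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_member(X, y, attrs):
    # one ensemble member: class centroids over a random attribute subset
    cents = {c: X[y == c][:, attrs].mean(axis=0) for c in np.unique(y)}
    return attrs, cents

def predict_member(member, X):
    attrs, cents = member
    labels = np.array(list(cents))
    d = np.stack([np.linalg.norm(X[:, attrs] - cents[c], axis=1) for c in labels])
    return labels[np.argmin(d, axis=0)]

def attribute_bagging(X, y, n_members=9, subset=2):
    return [fit_member(X, y, rng.choice(X.shape[1], subset, replace=False))
            for _ in range(n_members)]

def predict(ensemble, X):
    votes = np.stack([predict_member(m, X) for m in ensemble])
    return np.array([np.bincount(v).argmax() for v in votes.T])  # majority vote

X = np.array([[0, 0, 5.], [0, 1, 5.], [4, 4, 0.], [5, 4, 0.]])
y = np.array([0, 0, 1, 1])
ens = attribute_bagging(X, y)
print(predict(ens, X))
```

Randomizing over attributes rather than samples decorrelates the ensemble members, which is what gives Attribute Bagging its boost in predictive power.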
01 Oct 2017
TL;DR: A model named Multi-Scale Fusion Convolutional Neural Network (MSF-CNN) is proposed to train the face detector; it outperforms previous methods on several well-known face detection benchmark datasets.
Abstract: Nowadays, more and more methods have been proposed to solve face detection via computer implementation. Due to variations in background, illumination, pose, and facial expression, machine face detection is a complex problem. Recently, deep learning approaches have achieved impressive performance on face detection. In this paper, a model named Multi-Scale Fusion Convolutional Neural Network (MSF-CNN) is proposed to train the face detector. The model is trained as a convolutional neural network, and detection is based on the sliding-window structure of the Viola-Jones detector. In particular, during feature extraction we adopt a multi-scale feature fusion design with convolution kernels of different scales. The results are as follows. First, the fused multi-scale features are richer than single-scale features, and the classification accuracy is higher. Second, the model's complexity is lower than that of existing cascaded-CNN methods. Third, the model is learned end-to-end rather than through separate cascaded training. Overall, the proposed model outperforms previous methods on several well-known face detection benchmark datasets.
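The multi-scale fusion idea can be sketched numerically: the same input map is convolved with kernels of several sizes (with "same" padding) and the resulting maps are concatenated along the channel axis. The kernels below are simple mean filters for illustration; MSF-CNN's kernels are learned:

```python
import numpy as np

def conv2d_same(x, k):
    # naive 2-D convolution with zero "same" padding (clarity over speed)
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.empty_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def multi_scale_fuse(x, kernel_sizes=(3, 5, 7)):
    # one mean-filter response per scale, stacked channel-wise
    maps = [conv2d_same(x, np.ones((s, s)) / (s * s)) for s in kernel_sizes]
    return np.stack(maps)

img = np.arange(64, dtype=float).reshape(8, 8)
fused = multi_scale_fuse(img)
print(fused.shape)  # (3, 8, 8)
```

Each channel of the fused map responds to structure at a different receptive-field size, which is why the fusion is richer than any single scale alone.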
01 Oct 2014
TL;DR: Experimental results show that the proposed multi-feature fusion and sparse coding based framework for image retrieval is much more effective than state-of-the-art methods, not only on traditional image datasets but also on datasets with varying imaging conditions.
Abstract: In traditional image retrieval techniques, query results are severely affected when images vary in illumination and scale or suffer from occlusion and corrosion. To solve this problem, this paper proposes a novel multi-feature fusion and sparse coding based framework for image retrieval. In the framework, inherent features of an image are first extracted, and a dictionary learning method is then used to construct dictionary features from them. Finally, the framework introduces a sparse representation model to measure the similarity between two images. The merit is that a feature descriptor is coded as a sparse linear combination over the dictionary features, achieving both an efficient feature representation and a robust similarity measure. To validate the framework, two groups of experiments were conducted on the Corel-1000 image dataset and a Stirmark-benchmark-based database, respectively. Experimental results show that the proposed framework is much more effective than state-of-the-art methods, not only on the traditional image dataset but also on the varied image dataset.
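The sparse-coding step can be sketched as follows: a feature is coded as a sparse combination of dictionary atoms via a few greedy, matching-pursuit-style selections, and the reconstruction residual then serves as a dissimilarity score. Dictionary learning itself is omitted, and a small orthonormal dictionary stands in for a learned one:

```python
import numpy as np

def sparse_code(D, x, n_nonzero=2):
    # greedy sparse coding: pick the atom most correlated with the
    # residual, refit coefficients by least squares, repeat
    r, idx = x.copy(), []
    for _ in range(n_nonzero):
        idx.append(int(np.argmax(np.abs(D.T @ r))))
        sub = D[:, idx]
        coef, *_ = np.linalg.lstsq(sub, x, rcond=None)
        r = x - sub @ coef
    code = np.zeros(D.shape[1])
    code[idx] = coef
    return code, np.linalg.norm(r)  # sparse code and residual norm

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # orthonormal toy dictionary
x = 2.0 * Q[:, 4] - 1.5 * Q[:, 9]               # feature built from two atoms
code, resid = sparse_code(Q, x)
print(resid)
```

A small residual means the query feature is well explained by the dictionary, so the residual norm can rank database images by similarity.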
TL;DR: The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is considered the "de facto" standard in the framework of learning from imbalanced data because of the simplicity of its design, as well as its robustness when applied to different types of problems.
Abstract: The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is considered the "de facto" standard in the framework of learning from imbalanced data. This is due to the simplicity of its design, as well as its robustness when applied to different types of problems. Since its publication in 2002, SMOTE has proven successful in a variety of applications from several different domains. SMOTE has also inspired several approaches to counter the issue of class imbalance, and has significantly contributed to new supervised learning paradigms, including multilabel classification, incremental learning, semi-supervised learning, and multi-instance learning, among others. It is a standard benchmark for learning from imbalanced data and is featured in a number of software packages, from open source to commercial. In this paper, marking the fifteen-year anniversary of SMOTE, we reflect on the SMOTE journey, discuss the current state of affairs with SMOTE and its applications, and identify the next set of challenges to extend SMOTE for Big Data problems.
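The core SMOTE step is simple enough to sketch directly: for each synthetic point, pick a minority sample, pick one of its k nearest minority neighbours, and interpolate at a random position on the segment between them (illustrative data and parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def smote(X_min, n_new, k=3):
    # X_min: minority-class samples; returns n_new synthetic samples
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]   # skip the sample itself
        j = rng.choice(neighbours)
        gap = rng.random()                    # position along the segment
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

X_min = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.2]])
new = smote(X_min, n_new=6)
print(new.shape)  # (6, 2)
```

Because every synthetic point lies between two real minority samples, the new points stay inside the minority region rather than being mere duplicates, which is the property that makes SMOTE effective.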
TL;DR: The final experimental results indicate that MR-CNN is superior at detecting small traffic signs and achieves state-of-the-art performance compared with other methods.
Abstract: Small traffic sign recognition is a challenging problem in computer vision, and its accuracy is important to the safety of intelligent transportation systems (ITS). In this paper, we propose the multi-scale region-based convolutional neural network (MR-CNN). At the detection stage, MR-CNN uses a multi-scale deconvolution operation to up-sample the features of the deeper convolution layers and concatenates them with those of the shallow layer to construct the fused feature map. The fused feature map generates fewer region proposals and achieves higher recall values. At the classification stage, we leverage multi-scale contextual regions to exploit the information surrounding a given object proposal and construct the fused feature for the fully connected layers. The fused feature map inside the region proposal network (RPN) focuses primarily on improving the image resolution and semantic information for small traffic sign detection, while outside the RPN, the fused feature enhances the feature representation by leveraging contextual information. Finally, we evaluated MR-CNN on the largest dataset, Tsinghua-Tencent 100K, which is suitable for our problem and more challenging than the GTSDB and GTSRB datasets. The final experimental results indicate that MR-CNN is superior at detecting small traffic signs and achieves state-of-the-art performance compared with other methods.
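The up-sample-and-concatenate fusion can be sketched as follows. Nearest-neighbour up-sampling stands in for the learned deconvolution: a deeper, lower-resolution feature map is brought to the shallow layer's resolution and concatenated channel-wise (shapes are illustrative):

```python
import numpy as np

def upsample_nn(x, factor):
    # nearest-neighbour up-sampling of a (channels, H, W) map
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse(shallow, deep):
    # up-sample the deep map to the shallow resolution, then concat channels
    factor = shallow.shape[1] // deep.shape[1]
    return np.concatenate([shallow, upsample_nn(deep, factor)], axis=0)

shallow = np.random.rand(16, 32, 32)  # shallow layer: high res, few channels
deep = np.random.rand(64, 8, 8)       # deeper layer: low res, many channels
fused = fuse(shallow, deep)
print(fused.shape)  # (80, 32, 32)
```

The fused map keeps the shallow layer's spatial resolution (important for small objects) while inheriting the deep layer's semantic channels.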
TL;DR: The proposed method distinguishes the four health states well, namely the high deionized-water inlet temperature fault, hydrogen leakage fault, low air pressure fault, and the normal state, with an accuracy of 99.97% on the training sample and 100% on the test sample.
Abstract: The running state of the hybrid tram and the service life of fuel cell stacks are related to the fault diagnosis strategy of the proton exchange membrane fuel cell (PEMFC) system. To accurately detect various fault types, a novel method is proposed to classify the different health states, composed of simulated annealing genetic algorithm fuzzy c-means clustering (SAGAFCM) and a deep belief network (DBN) combined with the synthetic minority over-sampling technique (SMOTE). Operation data generated by the tram are clustered by the SAGAFCM algorithm, and valid data are selected as fault diagnosis samples, which comprise the training and test samples. However, the fault samples are usually imbalanced. To reduce the influence of imbalanced data on fault diagnosis accuracy, SMOTE is employed to form a new training sample by supplementing the data of the small classes. The DBN is then trained on the new training sample to obtain the fault diagnosis model. The proposed method distinguishes the four health states well, namely the high deionized-water inlet temperature fault, the hydrogen leakage fault, the low air pressure fault, and the normal state, with an accuracy of 99.97% on the training sample and 100% on the test sample.
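The clustering core that SAGAFCM builds on is plain fuzzy c-means; a minimal sketch follows (the simulated-annealing/genetic initialisation, and all PEMFC specifics, are omitted):

```python
import numpy as np

def fcm(X, c=2, m=2.0, iters=50, seed=0):
    # standard fuzzy c-means: alternate between weighted centers and
    # inverse-distance membership updates
    rng = np.random.default_rng(seed)
    U = rng.random((c, len(X)))
    U /= U.sum(axis=0)                       # fuzzy memberships, columns sum to 1
    for _ in range(iters):
        W = U ** m
        centers = (W @ X) / W.sum(axis=1, keepdims=True)
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))
        U /= U.sum(axis=0)
    return centers, U

# two well-separated toy clusters
X = np.vstack([np.zeros((5, 2)), np.ones((5, 2)) * 4])
centers, U = fcm(X)
print(np.round(centers))
```

The soft memberships U, rather than hard labels, are what let downstream steps keep only "valid" high-membership samples for the diagnosis set.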
TL;DR: An improved Single Shot Detector (SSD) algorithm via multi-feature fusion and enhancement, named MF-SSD, for traffic sign recognition; it achieves higher detection accuracy, better efficiency, and better robustness in complex traffic environments.
Abstract: Road traffic sign detection and recognition play an important role in advanced driver assistance systems (ADAS) by providing real-time road sign perception. In this paper, we propose an improved Single Shot Detector (SSD) algorithm via multi-feature fusion and enhancement, named MF-SSD, for traffic sign recognition. First, low-level features are fused into high-level features to improve the SSD's detection performance on small targets. We then enhance the features in different channels, amplifying effective channel features and suppressing invalid ones. The algorithm also achieves good results on domestic traffic signs in real time. The proposed MF-SSD algorithm is evaluated on the German Traffic Sign Recognition Benchmark (GTSRB) dataset. The experimental results show that MF-SSD has advantages in detecting small traffic signs; compared with existing methods, it achieves higher detection accuracy, better efficiency, and better robustness in complex traffic environments.
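The channel-enhancement idea can be sketched in the spirit of squeeze-and-excitation gating: per-channel global statistics are squashed to (0, 1) gates that amplify informative channels and suppress weak ones. MF-SSD's actual gating weights are learned; here the gate is simply a sigmoid of the centred channel means:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def enhance_channels(feat):
    # feat: (channels, H, W)
    squeeze = feat.mean(axis=(1, 2))           # global average pool per channel
    gate = sigmoid(squeeze - squeeze.mean())   # centred, squashed to (0, 1)
    return feat * gate[:, None, None]          # rescale each channel

feat = np.random.rand(8, 4, 4)
out = enhance_channels(feat)
print(out.shape)  # (8, 4, 4)
```

Channels with above-average activation get gates near 1 and pass through almost unchanged, while below-average channels are attenuated.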
TL;DR: A method for measuring the uncertainty of probabilistic rough sets is proposed, the change rules of uncertainty under changing knowledge spaces are presented and analyzed, and a comparative analysis of the uncertainty of a target concept in the rough sets and probabilistic rough sets models is given.
Abstract: Pawlak's rough sets model describes an uncertain target set (concept) with two crisp boundaries (the lower and upper approximation sets) and, as an effective tool, has been used successfully to deal with uncertain information systems. Based on the rough sets model, a probabilistic rough sets model with a pair of thresholds was proposed to improve the fault-tolerance ability of rough sets. The uncertainty of Pawlak's rough sets model is rooted in the objects contained in the boundary region of the target concept, while the uncertainty of the probabilistic rough sets model comes from three regions, because the objects in the positive or negative regions are probably uncertain, and the membership degrees of these objects are not necessarily equal to 1 or 0. In this paper, a method for measuring the uncertainty of probabilistic rough sets is proposed, and the change rules of uncertainty under changing knowledge spaces are presented and analyzed. Then, for an uncertain target concept, the uncertainties of the three regions are discussed, and the related change rules of uncertainty under changing knowledge spaces are revealed and proved. Finally, a comparative analysis of the uncertainty of a target concept in the rough sets and probabilistic rough sets models is presented. These results are important to further enrich and improve probabilistic rough sets theory and to effectively promote the development of uncertainty-oriented artificial intelligence.
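The three probabilistic regions can be sketched directly: each equivalence class [x] is assigned to the positive, boundary, or negative region of a target set X by comparing the conditional probability P(X | [x]) with a threshold pair (alpha, beta). The threshold values and the toy partition below are illustrative:

```python
def three_regions(classes, target, alpha=0.7, beta=0.3):
    # classes: list of equivalence classes (each a list of objects)
    # target: the target concept X as a set of objects
    pos, bnd, neg = [], [], []
    for block in classes:
        p = len(set(block) & target) / len(block)  # P(X | [x])
        (pos if p >= alpha else neg if p <= beta else bnd).append(block)
    return pos, bnd, neg

classes = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
target = {1, 2, 3, 4, 9}
pos, bnd, neg = three_regions(classes, target)
print(pos, bnd, neg)
```

Setting alpha = 1 and beta = 0 recovers Pawlak's model, where only the boundary region carries uncertainty; with 0 < beta < alpha < 1, objects in the positive and negative regions may also have membership degrees strictly between 0 and 1, which is exactly why the paper measures uncertainty over all three regions.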