
Showing papers by "Qinghua Hu published in 2011"


Journal ArticleDOI
TL;DR: This study integrates kernel functions with fuzzy rough set models, proposes two types of kernelized fuzzy rough sets, and extends the measures of classical rough sets to evaluate the approximation quality and approximating abilities of attributes.
Abstract: Kernel machines and rough sets are two classes of commonly exploited learning techniques. Kernel machines enhance traditional learning algorithms by bringing opportunities to deal with nonlinear classification problems, while rough sets introduce a human-focused way to deal with uncertainty in learning problems. Granulation and approximation play a pivotal role in rough sets-based learning and reasoning. However, how to effectively generate fuzzy granules from data has not been fully studied so far. In this study, we integrate kernel functions with fuzzy rough set models and propose two types of kernelized fuzzy rough sets (KFRS). Kernel functions are employed to compute the fuzzy T-equivalence relations between samples, thus generating fuzzy information granules in the approximation space. Subsequently, the fuzzy granules are used to approximate the classification based on the concepts of fuzzy lower and upper approximations. Based on the models of kernelized fuzzy rough sets, we extend the measures existing in classical rough sets to evaluate the approximation quality and approximating abilities of the attributes. We discuss the relationship between these measures and the feature evaluation function ReliefF, and augment the ReliefF algorithm to enhance the robustness of the proposed measures. Finally, we apply these measures to evaluate and select features for classification problems. The experimental results help quantify the performance of the KFRS.
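As a rough sketch of the construction described above (not the paper's exact operators): a Gaussian kernel yields a fuzzy T-equivalence relation, a min-based lower approximation gives each sample's membership in the lower approximation of its own class, and averaging those memberships yields a dependency-style quality measure for a feature subset. The kernel width sigma and the specific implicator are assumptions.

```python
import numpy as np

def gaussian_kernel_relation(X, sigma=0.5):
    # Fuzzy T-equivalence relation between samples from a Gaussian kernel;
    # X is an (n_samples, n_features) array restricted to candidate features.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fuzzy_lower_approximation(R, y):
    # Membership of each sample in the fuzzy lower approximation of its own
    # class, using the common min-implicator form: the infimum over
    # other-class samples of 1 - R(x, x').
    low = np.empty(len(y))
    for i in range(len(y)):
        other = y != y[i]
        low[i] = (1 - R[i, other]).min() if other.any() else 1.0
    return low

def fuzzy_dependency(X, y, sigma=0.5):
    # Dependency-style quality of a feature subset: mean lower-approximation
    # membership over all samples (an illustrative aggregate; the paper
    # defines several related measures).
    return fuzzy_lower_approximation(gaussian_kernel_relation(X, sigma), y).mean()
```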

153 citations


Journal ArticleDOI
TL;DR: It is shown that the proposed measure is a natural extension of classical mutual information, reducing to the classical one if features are discrete; thus the new measure can also be used to compute the relevance between discrete variables.
Abstract: Measures of relevance between features play an important role in classification and regression analysis. Mutual information has proved an effective measure for decision tree construction and feature selection. However, there is a limitation in computing relevance between numerical features with mutual information due to problems of estimating probability density functions in high-dimensional spaces. In this work, we generalize Shannon's information entropy to neighborhood information entropy and propose a measure of neighborhood mutual information. It is shown that the new measure is a natural extension of classical mutual information, which reduces to the classical one if features are discrete; thus the new measure can also be used to compute the relevance between discrete variables. In addition, the new measure introduces a parameter delta to control the granularity in analyzing data. With numerical experiments, we show that neighborhood mutual information produces nearly the same outputs as mutual information. However, unlike mutual information, no discretization is required when computing relevance with the proposed measure. We combine the proposed measure with four classes of evaluation strategies used for feature selection. Finally, the proposed algorithms are tested on several benchmark data sets. The results show that neighborhood mutual information based algorithms yield better performance than some classical ones.
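A minimal sketch of the idea, assuming a Chebyshev (maximum) metric for the delta-neighborhoods and base-2 logarithms; the paper's exact metric and normalization may differ:

```python
import numpy as np

def neighborhood_counts(X, delta):
    # Size of the delta-neighborhood of each sample under the Chebyshev
    # metric; X is an (n_samples, n_features) array (a single feature is
    # one column).
    d = np.abs(X[:, None, :] - X[None, :, :]).max(-1)
    return (d <= delta).sum(1)

def neighborhood_entropy(X, delta):
    # Neighborhood generalization of Shannon entropy:
    # NH_delta(X) = -(1/n) * sum_i log2(|delta(x_i)| / n).
    n = len(X)
    return -np.mean(np.log2(neighborhood_counts(X, delta) / n))

def neighborhood_mutual_information(A, B, delta):
    # NMI(A; B) = NH(A) + NH(B) - NH(A, B), mirroring the additive
    # decomposition of classical mutual information.
    joint = np.hstack([A, B])
    return (neighborhood_entropy(A, delta) + neighborhood_entropy(B, delta)
            - neighborhood_entropy(joint, delta))
```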

136 citations


Journal ArticleDOI
TL;DR: This work extends Pawlak's rough set theory to numerical feature spaces by replacing the partition of the universe with a neighborhood covering, and derives a neighborhood covering reduction based approach to extracting rules from numerical data.

113 citations


Journal ArticleDOI
TL;DR: A heuristic algorithm is designed to compute reducts with Gaussian kernel fuzzy rough sets, and parameterized attribute reduction with the derived model of fuzzy rough sets is introduced.

88 citations


Journal ArticleDOI
Daren Yu, Xiao Yu, Qinghua Hu, Jinfu Liu, Anqi Wu
TL;DR: The global path constraint of dynamic time warping (DTW) is learned for a nearest neighbor (NN) classifier to optimize the alignment of time series by maximizing the nearest neighbor hypothesis margin.
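For context, a minimal DTW with a Sakoe-Chiba global path constraint looks like the sketch below; the quantity the paper learns is the band width, which is assumed here to be given. An NN classifier would assign a query series the label of the training series with the smallest banded DTW cost; the margin-based learning of `band` is not shown.

```python
import numpy as np

def dtw_banded(a, b, band):
    # DTW alignment cost between 1-D series a and b, restricted to a
    # Sakoe-Chiba band |i - j| <= band (the global path constraint).
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - band), min(m, i + band) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]  # accumulated squared-difference cost
```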

72 citations


Journal ArticleDOI
TL;DR: This work introduces a new fuzzy rough set model, called soft fuzzy rough sets, and designs a robust classification algorithm based on the model; experimental results show the effectiveness of the proposed algorithm.

52 citations


Journal ArticleDOI
TL;DR: A set of experimental results shows that the model can select very few features and samples for training, while the classification performance is preserved or even improved.

51 citations


Journal ArticleDOI
Liping Zhu, Shanshan Song, Yubo Pi, Yang Yu, Weibin She, Hong Ye, Yuan Su, Qinghua Hu
TL;DR: Molecular assays demonstrated that, whether generated in 'artificial' models alone, under physiologically simulated conditions, or by repetitive pulses of agonist exposure, [Ca2+]i oscillation regulates NFκB transcriptional activity, phosphorylation of IκBα, and Ca2+-dependent gene expression, all in a way actually dependent on cumulated [Ca2+]i spike duration whether or not frequency varies.
Abstract: [Ca2+]i oscillations drive downstream events, like transcription, in a frequency-dependent manner. Why [Ca2+]i oscillation frequency regulates transcription has not been clearly revealed. A variation in [Ca2+]i oscillation frequency apparently leads to a variation in the time duration of cumulated [Ca2+]i elevations, or cumulated [Ca2+]i spike duration. By manipulating [Ca2+]i spike duration, we generated a series of [Ca2+]i oscillations with the same frequency but different cumulated [Ca2+]i spike durations, as well as [Ca2+]i oscillations with different frequencies but the same cumulated [Ca2+]i spike duration. Molecular assays demonstrated that, whether generated in 'artificial' models alone, under physiologically simulated conditions, or by repetitive pulses of agonist exposure, [Ca2+]i oscillation regulates NFκB transcriptional activity, phosphorylation of IκBα, and Ca2+-dependent gene expression, all in a way actually dependent on cumulated [Ca2+]i spike duration whether or not frequency varies. This study underlines that [Ca2+]i oscillation frequency regulates NFκB transcriptional activity through cumulated [Ca2+]i spike-duration-mediated IκBα phosphorylation.

43 citations


Proceedings ArticleDOI
06 Nov 2011
TL;DR: A novel LSL approach by sparse coding and feature grouping is proposed to learn the desired subspace, where the MDP is preserved and the LDP is suppressed simultaneously.
Abstract: Linear subspace learning (LSL) is a popular approach to image recognition; it aims to reveal the essential features of high-dimensional data, e.g., facial images, in a lower dimensional space by linear projection. Most LSL methods compute directly the statistics of original training samples to learn the subspace. However, these methods do not effectively exploit the different contributions of different image components to image recognition. We propose a novel LSL approach by sparse coding and feature grouping. A dictionary is learned from the training dataset, and it is used to sparsely decompose the training samples. The decomposed image components are grouped into a more discriminative part (MDP) and a less discriminative part (LDP). An unsupervised criterion and a supervised criterion are then proposed to learn the desired subspace, where the MDP is preserved and the LDP is suppressed simultaneously. The experimental results on benchmark face image databases validate that the proposed methods outperform many state-of-the-art LSL schemes.
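A hedged sketch of the decomposition-and-grouping stage, using scikit-learn's DictionaryLearning as a stand-in for the paper's dictionary learner; the Fisher-style score used to split atoms into MDP and LDP is an illustrative choice, not the paper's criterion.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

def sparse_decompose(X, n_atoms=64):
    # X: (n_samples, n_pixels) flattened training images.
    # Returns sparse codes and the learned dictionary atoms.
    dl = DictionaryLearning(n_components=n_atoms, transform_algorithm="lasso_lars")
    codes = dl.fit_transform(X)      # (n_samples, n_atoms)
    return codes, dl.components_     # atoms: (n_atoms, n_pixels)

def atom_discriminability(code_column, y):
    # Illustrative supervised score for one atom: between-class scatter of
    # its coefficients over within-class scatter (Fisher-style).
    classes = np.unique(y)
    overall = code_column.mean()
    between = sum((code_column[y == c].mean() - overall) ** 2 for c in classes)
    within = sum(code_column[y == c].var() for c in classes) + 1e-12
    return between / within

def split_mdp_ldp(codes, y, keep=0.5):
    # Rank atoms by discriminability; the top fraction forms the more
    # discriminative part (MDP), the rest the less discriminative part (LDP).
    scores = np.array([atom_discriminability(codes[:, j], y)
                       for j in range(codes.shape[1])])
    order = np.argsort(-scores)
    k = int(keep * len(order))
    return order[:k], order[k:]
```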

39 citations


Journal ArticleDOI
TL;DR: This paper introduces fuzzy information entropy and fuzzy mutual information for computing relevance between numerical or fuzzy features and the decision, and combines them with the "min-Redundancy-Max-Relevance", "Max-Dependency" and "min-Redundancy-Max-Dependency" algorithms.
Abstract: Feature selection is an important preprocessing step in pattern classification and machine learning, and mutual information is widely used to measure relevance between features and the decision. However, it is difficult to directly calculate relevance between continuous or fuzzy features using mutual information. In this paper, we introduce fuzzy information entropy and fuzzy mutual information for computing relevance between numerical or fuzzy features and the decision. The relationship between fuzzy information entropy and differential entropy is also discussed. Moreover, we combine fuzzy mutual information with the "min-Redundancy-Max-Relevance", "Max-Dependency" and "min-Redundancy-Max-Dependency" algorithms. The performance and stability of the proposed algorithms are tested on benchmark data sets. Experimental results show that the proposed algorithms are effective and stable.
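The min-Redundancy-Max-Relevance search itself is independent of the information measure; the sketch below is a generic greedy mRMR loop that accepts any pairwise measure mi(f, g), such as the fuzzy mutual information proposed here. The difference form of the criterion is one common variant.

```python
def mrmr_select(mi, features, decision, k):
    # Greedy min-Redundancy-Max-Relevance selection.
    # mi(f, g): mutual-information-like measure between two variables,
    # e.g., the fuzzy mutual information of this paper.
    selected, candidates = [], list(features)
    while candidates and len(selected) < k:
        def score(f):
            relevance = mi(f, decision)
            redundancy = (sum(mi(f, s) for s in selected) / len(selected)
                          if selected else 0.0)
            return relevance - redundancy  # difference-form mRMR criterion
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```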

34 citations


Journal ArticleDOI
TL;DR: An approach to learning sample weights for enlarging the margin is proposed, using a gradient descent algorithm to minimize a margin-based classification loss; it consistently outperforms nearest neighbor classification and some other state-of-the-art methods.

Journal ArticleDOI
TL;DR: This work introduces a kernelized fuzzy rough sets based technique to evaluate the quality of candidate features and select a useful subset, and constructs an algorithm for feature evaluation and selection based on the fuzzy rough set model.
Abstract: Driver fatigue detection based on computer vision is considered one of the most promising applications of image recognition technology. The key issue is to extract and select useful features from driver images. In this work, we use the properties of image sequences to describe the states of drivers. In addition, we introduce a kernelized fuzzy rough sets based technique to evaluate the quality of candidate features and select a useful subset. Fuzzy rough sets are widely discussed for dealing with uncertainty in data analysis. We construct an algorithm for feature evaluation and selection based on the fuzzy rough set model. Two classification algorithms are introduced to validate the selected features. The experimental results show the effectiveness of the proposed techniques.
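The selection step can be sketched as a greedy forward search over features, driven by any subset-quality function such as a kernelized fuzzy rough dependency; the stopping threshold eps and the wrapper interface are assumptions.

```python
def greedy_forward_selection(n_features, quality, eps=1e-4):
    # Greedy forward search: repeatedly add the feature that most increases
    # quality(subset) until the gain falls below eps.
    # quality: callable mapping a list of feature indices to a score,
    # e.g., a fuzzy rough dependency computed on those columns.
    selected, best = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        q, f = max((quality(selected + [g]), g) for g in remaining)
        if q - best < eps:
            break
        selected.append(f)
        remaining.remove(f)
        best = q
    return selected
```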

Journal ArticleDOI
TL;DR: In this article, the authors introduced support vector machine algorithms to acquire the classifier model of hypersonic inlet start/unstart, where the minimum total cost of the start/unstart classifier can be obtained through maximum classifier utility theory.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the dependence of granular continuum intensity, mean Doppler velocity, and magnetic fields on granular diameter, and found that diameter is a dominant parameter in classifying the granules, from which two families of granules are derived.
Abstract: The normal mode observations of seven quiet regions obtained by the Hinode spacecraft are analyzed to study the physical properties of granules. An artificial intelligence technique is introduced to automatically find the spatial distribution of granules in feature spaces. In this work, we investigate the dependence of granular continuum intensity, mean Doppler velocity, and magnetic fields on granular diameter. We recognized 71,538 granules by an automatic segmentation technique and then extracted five properties to describe the granules: diameter, continuum intensity, Doppler velocity, and longitudinal and transverse magnetic flux density. To automatically explore the intrinsic structures of the granules in the five-dimensional parameter space, the X-means clustering algorithm and a one-rule classifier are introduced to define the rules for classifying the granules. It is found that diameter is a dominant parameter in classifying the granules, and two families of granules are derived: small granules with diameters smaller than 144, and large granules with diameters larger than 144. Based on statistical analysis of the detected granules, the following results are derived: (1) the averages of diameter, continuum intensity, and upward Doppler velocity of large granules are larger than those of small granules; (2) the averages of absolute longitudinal, transverse, and unsigned flux density of large granules are smaller than those of small granules; (3) for small granules, the average continuum intensity increases with diameter, while the averages of Doppler velocity and of transverse, absolute longitudinal, and unsigned magnetic flux density decrease with diameter; the mean properties of large granules, however, are stable; (4) the intensity distributions of all granules and of small granules do not follow a Gaussian distribution, while that of large granules almost agrees with a normal distribution, with a peak at 1.04 I₀.
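As a stand-in for the X-means step (scikit-learn ships k-means but not X-means), the sketch below clusters the standardized five-property vectors into two groups and reads off a one-rule classifier as a diameter threshold, since the paper reports diameter as the dominant parameter. The column order and the midpoint threshold rule are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def two_family_diameter_rule(F):
    # F: (n_granules, 5) array of [diameter, continuum intensity, Doppler
    # velocity, longitudinal flux density, transverse flux density].
    Z = StandardScaler().fit_transform(F)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(Z)
    d0 = F[labels == 0, 0].mean()
    d1 = F[labels == 1, 0].mean()
    # One-rule classifier: threshold on raw diameter midway between clusters.
    return (d0 + d1) / 2.0
```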

Journal ArticleDOI
TL;DR: A new notion of knowledge granularity is introduced, called conditional knowledge granularity, reflecting the relationship between conditional attributes and the decision attribute; an evaluation function to measure the significance of conditional attributes is proposed and an equivalent characterization of attribute reduction is established.
Abstract: Feature selection is an important technique for dimension reduction in the machine learning and pattern recognition communities. Feature evaluation functions play essential roles in constructing feature selection algorithms. This paper introduces a new notion of knowledge granularity, called conditional knowledge granularity, reflecting the relationship between conditional attributes and the decision attribute. An evaluation function to measure the significance of conditional attributes is proposed, and an equivalent characterization of attribute reduction is established based on the conditional knowledge granularity. An optimal algorithm for feature selection is developed on the basis of the proposed evaluation function. Furthermore, a novel approach to performing feature selection in an inconsistent decision system is put forward by establishing a rough communication between the inconsistent decision system and a consistent decision system. Simulated experiments verify the feasibility and efficiency of the proposed technique.
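A plausible reading of the measure, sketched in Python: the classical knowledge granularity of a partition is the normalized sum of squared block sizes, and one natural conditional form subtracts the granularity of the partition refined by the decision attribute. The conditional definition here is an assumption; the paper's exact formula may differ.

```python
from collections import defaultdict

def partition(table, attrs):
    # Equivalence classes of object indices under equal values on attrs;
    # table is a list of rows indexable by attribute position.
    blocks = defaultdict(list)
    for i, row in enumerate(table):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return blocks.values()

def knowledge_granularity(table, attrs):
    # GK(P) = sum_i |X_i|^2 / |U|^2 over the blocks X_i of the partition.
    n = len(table)
    return sum(len(b) ** 2 for b in partition(table, attrs)) / (n * n)

def conditional_granularity(table, cond_attrs, dec_attr):
    # Assumed conditional form: GK(C) - GK(C union {d}); larger values
    # indicate coarser condition classes relative to the decision.
    return (knowledge_granularity(table, cond_attrs)
            - knowledge_granularity(table, list(cond_attrs) + [dec_attr]))
```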

Book
01 Jan 2011
TL;DR: Before the advent of fuzzy and rough sets, some authors in the 1960s studied three-valued logics and pairs of sets with a meaning similar to those encountered nowadays in modern theories such as rough sets, decision theory and granular computing; these studies are revised here using modern terminology.
Abstract: Before the advent of fuzzy and rough sets, some authors in the 1960s studied three-valued logics and pairs of sets with a meaning similar to those we can encounter nowadays in modern theories such as rough sets, decision theory and granular computing. We revise these studies using the modern terminology and making reference to the present literature. Finally, we put forward some future directions of investigation.

Journal ArticleDOI
TL;DR: In this article, the physical properties of granules with mean upward and downward Doppler velocity were analyzed using normal-mode observations of seven quiet regions obtained by the Hinode spacecraft.
Abstract: Normal-mode observations of seven quiet regions, obtained by the Hinode spacecraft, are used to analyze the physical properties of granules with mean upward and downward Doppler velocity. We identify 75,146 granules from the observations with a granule-detection method. Then the granules are divided into two subsets: one with negative mean Doppler velocity (granule-upflows), and the other with positive mean Doppler velocity (granule-downflows). Next, the statistical properties and distributions of these two subsets of granules are measured and discussed. We also study the relation between the Doppler velocity of granules and other properties. Several conclusions are drawn from the statistical analysis: i) The majority (73.5%) of granules have negative mean Doppler velocity (blueshift). ii) The continuum-intensity distribution of granule-upflows reaches a peak at 1.05, while that of granule-downflows reaches a peak at 0.99. iii) Granule-upflows are greater than granule-downflows if transverse, absolute longitudinal and unsigned flux density are smaller than 100 G, while granule-upflows are less than granule-downflows if the flux densities are greater than 100 G. iv) Granule-downflows are, on average, slightly smaller and fainter than granule-upflows. Also, the flux densities of granule-downflows are slightly higher. v) The mean Doppler velocity within intergranular lanes is the most highly correlated with that within granules among the eight properties of granules.

Book ChapterDOI
09 Oct 2011
TL;DR: This work designs a case-based classifier with fuzzy rough set theory that takes the lower approximation of fuzzy rough sets as the theoretical foundation for selecting cases and for case reasoning.
Abstract: Fuzzy rough sets have been widely studied and applied in machine learning and data mining in recent years. In this work, we design a case-based classifier with fuzzy rough set theory. The new classifier takes the lower approximation of fuzzy rough sets as the theoretical foundation for selecting cases and for case reasoning. Numerical experiments are conducted to show the effectiveness of the proposed algorithm.
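A minimal sketch of the case-selection idea, assuming a precomputed fuzzy similarity matrix R and a min-based lower approximation; the retention fraction and the nearest-case reasoning rule are illustrative choices.

```python
import numpy as np

def select_cases(R, y, keep=0.3):
    # Retain the cases whose membership in the fuzzy lower approximation of
    # their own class is highest; R[i, j] is the fuzzy similarity of samples.
    low = np.array([(1 - R[i, y != y[i]]).min() if (y != y[i]).any() else 1.0
                    for i in range(len(y))])
    return np.argsort(-low)[: max(1, int(keep * len(y)))]

def classify_by_case(sim_to_train, case_idx, y):
    # Case reasoning: assign the label of the most similar retained case;
    # sim_to_train[j] is the similarity of the query to training sample j.
    return y[case_idx[np.argmax(sim_to_train[case_idx])]]
```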

Proceedings ArticleDOI
16 Dec 2011
TL;DR: This paper investigates bagging of one-class support vector machines (OCSVM), which use just one class of objects for training, and shows that the bagging method performs better than a single OCSVM.
Abstract: A large number of training samples is required in developing visual object recognition systems. However, the number of available samples is sometimes limited. This paper investigates bagging of one-class support vector machines (OCSVM), which use just one class of objects for training. Experiments are performed on the Caltech101 database. Our findings show that the bagging method performs better than a single OCSVM. Furthermore, bagging of OCSVM maintains good performance with a limited number of training samples.
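A compact sketch of the setup with scikit-learn's OneClassSVM: each base learner is trained on a bootstrap resample of the single (positive) class, and the ensemble averages decision values. Hyperparameters nu and gamma are placeholders.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def bagged_ocsvm(X_pos, n_estimators=10, nu=0.1, gamma="scale", seed=0):
    # Train OCSVMs on bootstrap resamples of the one available class.
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(X_pos), len(X_pos))  # bootstrap indices
        models.append(OneClassSVM(nu=nu, gamma=gamma).fit(X_pos[idx]))
    return models

def ensemble_score(models, X):
    # Average decision values; positive scores indicate the target class.
    return np.mean([m.decision_function(X) for m in models], axis=0)
```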

Journal Article
TL;DR: The real-time PCR assay was rapid, sensitive, and specific, and could be applied to the rapid diagnosis of S. choleraesuis in food and stool samples from food poisoning cases and to the identification of Salmonella C to guarantee food safety.
Abstract: Objective To develop a real-time PCR assay based on a modified molecular beacon for simultaneous detection of S. choleraesuis and S. paratyphi C, to apply the established method to the rapid detection of S. choleraesuis in food and stool samples from food poisoning cases, and then to the identification of Salmonella C. Methods Based on the sequence (CP000857.1) published in GenBank, two sets of primers and a modified molecular beacon were designed. The real-time PCR assay for the simultaneous detection of S. paratyphi C and S. choleraesuis was developed with optimized PCR procedures and components, with 11 other bacterial species as controls. The sensitivity and specificity of the assay were then tested using 77 Salmonella strains, and the assay was applied to the detection of 70 food samples. Results The limit of detection achieved was 10 fg/reaction or 20 CFU/reaction. Only Salmonella paratyphi C and Salmonella choleraesuis strains generated fluorescent signals; no cross-reaction was observed with the other 11 bacterial species, and the sensitivity and specificity were both 100%. No Salmonella-positive samples were found among the 70 food samples by either the real-time PCR assay or the traditional culture method. Detection could be finished within 2 hours from template preparation, and the overall test within one day. Conclusion The real-time PCR assay was rapid, sensitive and specific. It could be applied to the rapid diagnosis of S. paratyphi C and S. choleraesuis in food and stool samples from food poisoning cases and to the identification of Salmonella C to guarantee food safety.

Book ChapterDOI
09 Oct 2011
TL;DR: This work focuses on several fatigue-indicating areas, extracts three types of features from them, and introduces a neighborhood rough set technique to evaluate the quality of candidate features and select an effective subset.
Abstract: Driver fatigue recognition based on computer vision is considered a challenging issue. Though the human face carries most of the information related to human state, the information is redundant and overlapping. In this work, we concentrate on several fatigue-indicating areas and extract three types of features from them. Then a neighborhood rough set technique is introduced to evaluate the quality of candidate features and select an effective subset. A rule learning classifier based on neighborhood covering reduction is employed for the classification task. Compared with classic classifiers, the designed recognition system performs well. Experiments are presented to show the effectiveness of the proposed technique.
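The core neighborhood rough set quantity behind such feature evaluation can be sketched briefly: the dependency of the decision on a feature subset is the fraction of samples whose delta-neighborhood is consistent in class. The Chebyshev metric is an assumption.

```python
import numpy as np

def neighborhood_dependency(X, y, delta):
    # Fraction of samples whose delta-neighborhood (Chebyshev metric)
    # contains only samples of the same class, i.e., the size of the
    # positive region relative to the universe.
    d = np.abs(X[:, None, :] - X[None, :, :]).max(-1)
    consistent = [(y[d[i] <= delta] == y[i]).all() for i in range(len(y))]
    return float(np.mean(consistent))
```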