Journal ArticleDOI

Structural Minimax Probability Machine

TL;DR: This paper uses two finite mixture models to capture the structural information of the data in binary classification and proposes a structural MPM, which can be interpreted as a large margin classifier and can be transformed into the support vector machine and the maxi-min margin machine under certain special conditions.
Abstract: Minimax probability machine (MPM) is an interesting discriminative classifier based on generative prior knowledge. It can directly estimate the probabilistic accuracy bound by minimizing the maximum probability of misclassification. The structural information of data is an effective way to represent prior knowledge, and has been found to be vital for designing classifiers in real-world problems. However, MPM only considers the prior probability distribution of each class with a given mean and covariance matrix, which does not efficiently exploit the structural information of data. In this paper, we use two finite mixture models to capture the structural information of the data from binary classification. For each subdistribution in a finite mixture model, only its mean and covariance matrix are assumed to be known. Based on the finite mixture models, we propose a structural MPM (SMPM). SMPM can be solved effectively by a sequence of the second-order cone programming problems. Moreover, we extend a linear model of SMPM to a nonlinear model by exploiting kernelization techniques. We also show that the SMPM can be interpreted as a large margin classifier and can be transformed to support vector machine and maxi–min margin machine under certain special conditions. Experimental results on both synthetic and real-world data sets demonstrate the effectiveness of SMPM.
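The bound that MPM optimizes has a compact closed form. As an illustration (a minimal pure-Python sketch, not code from the paper), the worst-case probability that a fixed hyperplane misclassifies a class, over all distributions sharing a given mean and covariance, is 1/(1 + d²), where d is the Mahalanobis-style distance from the class mean to the hyperplane:

```python
import math

def quad_form(w, S):
    # w^T S w for a covariance matrix S given as nested lists
    return sum(w[i] * S[i][j] * w[j]
               for i in range(len(w)) for j in range(len(w)))

def worst_case_error(w, b, mu, Sigma):
    """Upper bound on P(w.x + b <= 0) over ALL distributions with
    mean mu and covariance Sigma (the Chebyshev-type bound MPM
    optimizes): 1/(1 + d^2), d = (w.mu + b)/sqrt(w^T Sigma w)."""
    margin = sum(wi * mi for wi, mi in zip(w, mu)) + b
    if margin <= 0:
        return 1.0  # hyperplane does not even separate the class mean
    d = margin / math.sqrt(quad_form(w, Sigma))
    return 1.0 / (1.0 + d * d)

# Toy example: class centred at (2, 0) with unit covariance,
# hyperplane x1 = 0, i.e. w = (1, 0), b = 0
bound = worst_case_error([1.0, 0.0], 0.0, [2.0, 0.0],
                         [[1.0, 0.0], [0.0, 1.0]])
print(round(bound, 3))  # 1/(1 + 2^2) = 0.2
```

SMPM applies a bound of this kind per mixture component, each with its own mean and covariance, which is what leads to the sequence of second-order cone programs mentioned in the abstract.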
Citations
More filters
Journal ArticleDOI
Wu Deng, Rui Yao, Huimin Zhao, Xinhua Yang, Guangyu Li
01 Apr 2019
TL;DR: The fuzzy information entropy can accurately and more completely extract the characteristics of the vibration signal, the improved PSO algorithm can effectively improve the classification accuracy of LS-SVM, and the proposed fault diagnosis method outperforms the other mentioned methods.
Abstract: To address the problem that most existing fault diagnosis methods cannot effectively recognize early faults in rotating machinery, empirical mode decomposition (EMD), fuzzy information entropy, an improved particle swarm optimization algorithm, and least squares support vector machines (LS-SVM) are combined into a novel intelligent diagnosis method, which is applied in this paper to diagnose faults of the motor bearing. In the proposed method, the vibration signal is decomposed into a set of intrinsic mode functions (IMFs) using EMD. The fuzzy information entropy values of the IMFs are calculated to reveal the intrinsic characteristics of the vibration signal and are used as feature vectors. Then a diversity mutation strategy, a neighborhood mutation strategy, a learning factor strategy, and an inertia weight strategy are applied to the basic particle swarm optimization (PSO) algorithm to obtain an improved PSO algorithm, which is used to optimize the parameters of the LS-SVM in order to construct an optimal LS-SVM classifier for fault classification. Finally, the proposed fault diagnosis method is fully evaluated through experiments and comparative studies on a motor bearing. The experimental results indicate that fuzzy information entropy can accurately and completely extract the characteristics of the vibration signal, the improved PSO algorithm can effectively improve the classification accuracy of the LS-SVM, and the proposed fault diagnosis method outperforms the other methods mentioned in the paper and published in the literature. It provides a new approach for fault diagnosis of rotating machinery.
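Of the ingredients above, fuzzy information entropy is the easiest to make concrete. The sketch below is an illustrative pure-Python implementation of the standard FuzzyEn(m, r) definition, not the authors' code: it measures signal complexity by comparing how often length-m and length-(m+1) templates remain similar under an exponential membership function.

```python
import math

def _phi(x, m, r):
    # average fuzzy similarity over all pairs of length-m templates
    n = len(x) - m + 1
    templates = []
    for i in range(n):
        seg = x[i:i + m]
        mu = sum(seg) / m
        templates.append([v - mu for v in seg])  # remove local baseline
    total, count = 0.0, 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
            total += math.exp(-(d ** 2) / r)  # exponential (fuzzy) membership
            count += 1
    return total / count

def fuzzy_entropy(x, m=2, r=0.2):
    """FuzzyEn(m, r): near zero for regular signals, larger for irregular ones."""
    return math.log(_phi(x, m, r) / _phi(x, m + 1, r))
```

A perfectly periodic signal yields a value near zero, while an irregular one (e.g. a chaotic logistic-map sequence) yields a clearly larger value, which is why these values work as discriminative fault features.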

365 citations

Journal ArticleDOI
01 Oct 2017
TL;DR: The experiment results show that the DOADAPO algorithm can improve the convergence speed and enhance the local search ability and global search ability, and the multi-objective optimization model of gate assignment can improved the comprehensive service of gate assignments.
Abstract: An improved adaptive PSO based on the Alpha-stable distribution and dynamic fractional calculus is studied. A new multi-objective optimization model of the gate assignment problem is proposed. Actual data are used to demonstrate the effectiveness of the proposed method. Gates are a key resource in an airport: they enable rapid and safe docking, ensure effective connections between flights, and improve the capacity and service efficiency of the airport. Taking the minimum walking distance of passengers, the minimum idle-time variance of each gate, the minimum number of flights at the parking apron, and the most reasonable utilization of large gates as the optimization objectives, an efficient multi-objective optimization model of the gate assignment problem is proposed in this paper. Then an improved adaptive particle swarm optimization (DOADAPO) algorithm is developed, making full use of the advantages of the Alpha-stable distribution and dynamic fractional calculus. The dynamic fractional calculus, with its memory characteristic, is used to incorporate the trajectory information of particle updates in order to improve convergence speed. The Alpha-stable distribution replaces the uniform distribution in order to escape from local minima with a certain probability and improve global search ability. Next, the DOADAPO algorithm is used to solve the constructed multi-objective gate assignment model in order to assign gates to different flights at different times quickly and effectively. Finally, actual flight data from one domestic airport are used to verify the effectiveness of the proposed method. The experimental results show that the DOADAPO algorithm improves convergence speed and enhances both local and global search ability, and that the multi-objective gate assignment model improves the comprehensive service of gate assignment. It can effectively provide a valuable reference for assigning gates in a hub airport.
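The DOADAPO modifications sit on top of the canonical PSO update. For reference, here is a minimal sketch of that baseline update (hypothetical parameter values; per the abstract, the paper replaces the uniform random draws with Alpha-stable ones and gives the inertia term a fractional-calculus memory):

```python
import random

def pso_step(pos, vel, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """One canonical PSO update for a single particle: inertia term w*v,
    cognitive pull toward the particle's own best position (pbest),
    and social pull toward the swarm's best position (gbest)."""
    new_vel = [w * v
               + c1 * rng.random() * (pb - x)   # cognitive component
               + c2 * rng.random() * (gb - x)   # social component
               for x, v, pb, gb in zip(pos, vel, pbest, gbest)]
    new_pos = [x + v for x, v in zip(pos, new_vel)]
    return new_pos, new_vel
```

With zero initial velocity, a step pulls the particle toward a blend of its personal best and the global best; adaptive variants tune w, c1, and c2 over time.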

324 citations

Journal ArticleDOI
TL;DR: This work proposes a framework for privacy-preserving outsourced classification in cloud computing (POCC), and proves that the scheme is secure in the semi-honest model.
Abstract: Classifiers have been widely applied in machine learning, for example in pattern recognition, medical diagnosis, credit scoring, banking, and weather prediction. Because of the limited local storage at the user side, data and classifiers have to be outsourced to the cloud for storage and computation. However, due to privacy concerns, it is important to preserve the confidentiality of data and classifiers in cloud computing because cloud servers are usually untrusted. In this work, we propose a framework for privacy-preserving outsourced classification in cloud computing (POCC). Using POCC, an evaluator can securely train a classification model over data encrypted with different public keys, outsourced from multiple data providers. We prove that our scheme is secure in the semi-honest model.
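The abstract does not name the underlying cryptosystem, but outsourced-classification schemes of this kind typically rely on an additively homomorphic encryption such as Paillier. Below is a toy sketch (deliberately tiny, insecure parameters, for illustration only) of the key property: an evaluator can add two values while seeing only their ciphertexts.

```python
import math, random

def paillier_demo(p=1789, q=1861):
    """Tiny Paillier cryptosystem (toy primes, NOT secure) illustrating
    the additive homomorphism: Enc(m1) * Enc(m2) mod n^2 decrypts to
    m1 + m2 mod n."""
    n = p * q
    n2 = n * n
    lam = math.lcm(p - 1, q - 1)
    g = n + 1                    # standard choice; then L(g^lam) = lam mod n
    mu = pow(lam, -1, n)         # modular inverse of lam

    def enc(m):
        r = random.randrange(1, n)
        while math.gcd(r, n) != 1:
            r = random.randrange(1, n)
        return (pow(g, m, n2) * pow(r, n, n2)) % n2

    def dec(c):
        l = (pow(c, lam, n2) - 1) // n   # the "L" function L(x) = (x-1)/n
        return (l * mu) % n

    return n, enc, dec

n, enc, dec = paillier_demo()
c_sum = (enc(20) * enc(22)) % (n * n)    # multiply ciphertexts...
print(dec(c_sum))                        # ...to add plaintexts: 42
```

Real deployments use 2048-bit-plus moduli and additional machinery (e.g. proxy re-encryption) to handle data encrypted under different public keys, as the POCC framework requires.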

252 citations

Journal ArticleDOI
TL;DR: Results show that the EWT outperforms empirical mode decomposition for decomposing the signal into multiple components, and the proposed EWTFSFD method can accurately and effectively achieve the fault diagnosis of motor bearing.
Abstract: A motor bearing is subjected to the joint effects of loads, transmissions, and shocks that cause bearing faults and machinery breakdown. Vibration signal analysis is the most popular technique used to monitor and diagnose motor bearing faults, yet its application in engineering practice remains limited. In this paper, after comparing fault feature extraction using the empirical wavelet transform (EWT) and the Hilbert transform against theoretical calculations, a new motor bearing fault diagnosis method called EWTFSFD is proposed, integrating EWT, fuzzy entropy, and a support vector machine (SVM). In the proposed method, EWT, a novel signal processing technique, is used to decompose the vibration signal into multiple components in order to extract a series of amplitude-modulated/frequency-modulated (AM-FM) components with a supporting Fourier spectrum under an orthogonal basis. Then, fuzzy entropy is utilized to measure the complexity of the vibration signal, reflect the complexity changes of the intrinsic oscillations, and compute the fuzzy entropy values of the AM-FM components, which serve as inputs for training an SVM classifier that performs fault pattern recognition. Finally, the effectiveness of the proposed method is validated using a simulated signal and real motor bearing vibration signals. The experimental results show that the EWT outperforms empirical mode decomposition for decomposing the signal into multiple components, and that the proposed EWTFSFD method can accurately and effectively diagnose motor bearing faults.

225 citations


Cites methods from "Structural Minimax Probability Mach..."

  • ...[25] proposed a structural minimax probability machine for constructing a margin classifier....


Journal ArticleDOI
TL;DR: A novel hybrid text classification model based on deep belief network and softmax regression that can converge at fine-tuning stage and perform significantly better than the classical algorithms, such as SVM and KNN.
Abstract: In this paper, we propose a novel hybrid text classification model based on a deep belief network and softmax regression. To solve the sparse high-dimensional matrix computation problem of text data, a deep belief network is introduced. After feature extraction with the DBN, softmax regression is employed to classify the text in the learned feature space. In the pre-training procedure, the deep belief network and the softmax regression are first trained separately. Then, in the fine-tuning stage, they are combined into a coherent whole and the system parameters are optimized with the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm. The experimental results on the Reuters-21578 and 20-Newsgroups corpora show that the proposed model converges at the fine-tuning stage and performs significantly better than classical algorithms such as SVM and KNN.
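The softmax-regression layer on top of the DBN features is straightforward to sketch. Below is a minimal illustrative implementation (not the paper's code, which optimizes with L-BFGS rather than the plain gradient step shown here) of one cross-entropy gradient update:

```python
import math

def softmax(z):
    m = max(z)                    # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def sgd_step(W, b, x, y, lr=0.5):
    """One gradient step of softmax (multinomial logistic) regression.
    W: K x D weight rows, b: K biases, x: D features, y: true class index.
    Uses the cross-entropy gradient dL/dz_k = p_k - [k == y]."""
    K = len(W)
    p = softmax([sum(wi * xi for wi, xi in zip(W[k], x)) + b[k]
                 for k in range(K)])
    for k in range(K):
        err = p[k] - (1.0 if k == y else 0.0)
        b[k] -= lr * err
        W[k] = [w - lr * err * xi for w, xi in zip(W[k], x)]
    return W, b

# Tiny demo: two classes, two (already extracted) features
W, b = [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]
for _ in range(50):
    for x, y in [([1.0, 0.0], 0), ([0.0, 1.0], 1)]:
        sgd_step(W, b, x, y)
```

In the paper's setting, x would be the DBN-learned feature vector of a document rather than raw term counts.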

209 citations

References
Journal ArticleDOI
Lawrence R. Rabiner
01 Feb 1989
TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.
Abstract: This tutorial provides an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and gives practical details on methods of implementation of the theory along with a description of selected applications of the theory to distinct problems in speech recognition. Results from a number of original sources are combined to provide a single source of acquiring the background required to pursue further this area of research. The author first reviews the theory of discrete Markov chains and shows how the concept of hidden states, where the observation is a probabilistic function of the state, can be used effectively. The theory is illustrated with two simple examples, namely coin-tossing, and the classic balls-in-urns system. Three fundamental problems of HMMs are noted and several practical techniques for solving these problems are given. The various types of HMMs that have been studied, including ergodic as well as left-right models, are described.
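Rabiner's first fundamental problem, computing the probability of an observation sequence, is solved by the forward algorithm. A minimal sketch on a two-coin example in the spirit of the tutorial (toy parameters chosen here for illustration, not taken from the text):

```python
def forward(A, B, pi, obs):
    """Forward algorithm: P(observation sequence | HMM).
    A[i][j]: transition prob i -> j, B[i][k]: P(symbol k | state i),
    pi[i]: initial state probability, obs: list of symbol indices."""
    N = len(pi)
    # initialise with the first observation
    alpha = [pi[i] * B[i][obs[0]] for i in range(N)]
    # induct over the remaining observations
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][o]
                 for j in range(N)]
    return sum(alpha)

# Two hidden coins: a fair one and a heads-biased one, switching with prob 0.1
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.5, 0.5], [0.9, 0.1]]   # symbol 0 = heads, 1 = tails
pi = [0.5, 0.5]
print(forward(A, B, pi, [0, 0, 1]))
```

The same alpha recursion, run in O(N²T) time, replaces the exponential sum over all hidden state paths, which is the point of the algorithm.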

21,819 citations

Journal ArticleDOI
TL;DR: In this paper, a procedure for forming hierarchical groups of mutually exclusive subsets, each of which has members that are maximally similar with respect to specified characteristics, is suggested for use in large-scale (n > 100) studies when a precise optimal solution for a specified number of groups is not practical.
Abstract: A procedure for forming hierarchical groups of mutually exclusive subsets, each of which has members that are maximally similar with respect to specified characteristics, is suggested for use in large-scale (n > 100) studies when a precise optimal solution for a specified number of groups is not practical. Given n sets, this procedure permits their reduction to n − 1 mutually exclusive sets by considering the union of all possible n(n − 1)/2 pairs and selecting a union having a maximal value for the functional relation, or objective function, that reflects the criterion chosen by the investigator. By repeating this process until only one group remains, the complete hierarchical structure and a quantitative estimate of the loss associated with each stage in the grouping can be obtained. A general flowchart helpful in computer programming and a numerical example are included.
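The merge criterion in Ward's method has a simple closed form: joining clusters a and b increases the within-cluster sum of squares by n_a·n_b/(n_a + n_b) times the squared distance between their means. An illustrative (naive, O(n³)) sketch on 1-D points:

```python
def ward_cluster(points, k):
    """Naive Ward agglomeration on 1-D points: repeatedly merge the
    pair of clusters whose union minimises the increase in
    within-cluster sum of squares,
    delta = n_a * n_b / (n_a + n_b) * (mean_a - mean_b)^2."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                ma, mb = sum(a) / len(a), sum(b) / len(b)
                delta = len(a) * len(b) / (len(a) + len(b)) * (ma - mb) ** 2
                if best is None or delta < best[0]:
                    best = (delta, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge the cheapest pair
        del clusters[j]
    return clusters

print(ward_cluster([1.0, 1.1, 0.9, 5.0, 5.2, 4.8], 2))
```

This is the same objective-function-driven pairwise union the abstract describes; the SMPM paper uses Ward's clustering in exactly this role to find clusters per class before estimating each cluster's mean and covariance.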

17,405 citations


"Structural Minimax Probability Mach..." refers methods in this paper

  • ...More specifically, we first use Ward’s hierarchical clustering [29] to detect the clusters for each class on the training set, and then compute the mean and covariance matrix for each cluster....


Book ChapterDOI
TL;DR: The chapter discusses two important directions of research to improve learning algorithms: the dynamic node generation, which is used by the cascade correlation algorithm; and designing learning algorithms where the choice of parameters is not an issue.
Abstract: Publisher Summary This chapter provides an account of different neural network architectures for pattern recognition. A neural network consists of several simple processing elements called neurons. Each neuron is connected to some other neurons and possibly to the input nodes. Neural networks provide a simple computing paradigm to perform complex recognition tasks in real time. The chapter categorizes neural networks into three types: single-layer networks, multilayer feedforward networks, and feedback networks. It discusses the gradient descent and the relaxation method as the two underlying mathematical themes for deriving learning algorithms. A lot of research activity is centered on learning algorithms because of their fundamental importance in neural networks. The chapter discusses two important directions of research to improve learning algorithms: the dynamic node generation, which is used by the cascade correlation algorithm; and designing learning algorithms where the choice of parameters is not an issue. It closes with the discussion of performance and implementation issues.

13,033 citations


"Structural Minimax Probability Mach..." refers background in this paper

  • ...Typical discriminative approaches include support vector machine (SVM) [4], neural network [5], Gaussian processes [6], and so on....


Journal ArticleDOI
Jos F. Sturm
TL;DR: This paper describes how to work with SeDuMi, an add-on for MATLAB that lets you solve optimization problems with linear, quadratic, and semidefiniteness constraints, solving large-scale problems efficiently by exploiting sparsity.
Abstract: SeDuMi is an add-on for MATLAB, which lets you solve optimization problems with linear, quadratic and semidefiniteness constraints. It is possible to have complex valued data and variables in SeDuMi. Moreover, large scale optimization problems are solved efficiently, by exploiting sparsity. This paper describes how to work with this toolbox.

7,655 citations

Journal ArticleDOI
TL;DR: Decomposition implementations for two "all-together" multiclass SVM methods are given, and it is shown that for large problems, methods that consider all data at once generally need fewer support vectors.
Abstract: Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend it for multiclass classification is still an ongoing research issue. Several methods have been proposed where typically we construct a multiclass classifier by combining several binary classifiers. Some authors also proposed methods that consider all classes at once. As it is computationally more expensive to solve multiclass problems, comparisons of these methods using large-scale problems have not been seriously conducted. Especially for methods solving multiclass SVM in one step, a much larger optimization problem is required so up to now experiments are limited to small data sets. In this paper we give decomposition implementations for two such "all-together" methods. We then compare their performance with three methods based on binary classifications: "one-against-all," "one-against-one," and directed acyclic graph SVM (DAGSVM). Our experiments indicate that the "one-against-one" and DAG methods are more suitable for practical use than the other methods. Results also show that for large problems methods by considering all data at once in general need fewer support vectors.
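The "one-against-one" strategy the paper recommends is easy to state concretely: train one binary classifier per pair of classes, then predict by majority vote. A minimal sketch with hypothetical stand-in classifiers (simple thresholds here, in place of trained binary SVMs):

```python
from itertools import combinations

def one_vs_one_predict(x, binary_clfs, classes):
    """'One-against-one' combination: binary_clfs maps each
    (class_a, class_b) pair to a function returning the winning class
    for input x; the final label is the one with the most votes."""
    votes = {c: 0 for c in classes}
    for pair in combinations(classes, 2):
        votes[binary_clfs[pair](x)] += 1
    return max(classes, key=lambda c: votes[c])

# Toy 1-D example: three classes roughly separated by thresholds
clfs = {
    (0, 1): lambda x: 0 if x < 1.5 else 1,
    (0, 2): lambda x: 0 if x < 2.5 else 2,
    (1, 2): lambda x: 1 if x < 2.5 else 2,
}
print(one_vs_one_predict(2.0, clfs, [0, 1, 2]))  # 1
```

For K classes this trains K(K-1)/2 small binary problems rather than one large joint problem, which is why the paper finds it practical at scale.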

6,562 citations


"Structural Minimax Probability Mach..." refers background or methods in this paper


  • ...In the future, we plan to extend SMPM to one-class learning [15], [33], ordinal-class learning [31], [34], multiclass learning [35], [36], and other learning tasks....
