Journal Article

Representational learning with ELMs for big data

TL;DR: This paper presents ELM-AE, which can be seen as a special case of ELM in which the output is equal to the input and the randomly generated weights are chosen to be orthogonal.
Abstract: Geoffrey Hinton and Pascal Vincent showed that restricted Boltzmann machines (RBMs) and auto-encoders (AEs) could be used for feature engineering. These engineered features could then be used to train multiple-layer neural networks, or deep networks. Two types of deep networks based on RBMs exist: the deep belief network (DBN) and the deep Boltzmann machine (DBM). Guang-Bin Huang and colleagues introduced the extreme learning machine (ELM) as a single-layer feed-forward neural network (SLFN) with a fast learning speed and good generalization capability. The ELM for SLFNs shows that hidden nodes can be randomly generated. The output weights of an ELM-based auto-encoder (ELM-AE) can be determined analytically, unlike those of RBMs and traditional auto-encoders, which require iterative algorithms. ELM-AE can be seen as a special case of ELM, where the output is equal to the input and the randomly generated weights are chosen to be orthogonal.
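The analytic training step described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the sigmoid activation, the ridge parameter `C`, the QR-based orthogonalization, and the function names `elm_ae_fit` and `elm_ae_reconstruct` are all choices made for this sketch.

```python
import numpy as np


def _hidden(X, A, b):
    """Sigmoid hidden-layer activations (activation choice is an assumption)."""
    return 1.0 / (1.0 + np.exp(-(X @ A + b)))


def elm_ae_fit(X, n_hidden, C=1e3, seed=0):
    """Fit an ELM auto-encoder sketch and return (A, b, beta)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]

    # Random input weights, made orthogonal with a QR decomposition.
    A = rng.standard_normal((n_features, n_hidden))
    if n_features >= n_hidden:
        A, _ = np.linalg.qr(A)        # orthonormal columns: A.T @ A = I
    else:
        Q, _ = np.linalg.qr(A.T)      # orthonormal rows:    A @ A.T = I
        A = Q.T

    # Random bias, scaled to unit norm.
    b = rng.standard_normal(n_hidden)
    b /= np.linalg.norm(b)

    # Hidden layer is random and never tuned.
    H = _hidden(X, A, b)

    # Output weights in closed form (ridge regression); the target is X itself.
    beta = np.linalg.solve(np.eye(n_hidden) / C + H.T @ H, H.T @ X)
    return A, b, beta


def elm_ae_reconstruct(X, A, b, beta):
    """Reconstruct the input from the hidden activations."""
    return _hidden(X, A, b) @ beta
```

No gradient descent or contrastive divergence appears anywhere: the only trained parameters, `beta`, come from a single linear solve, which is the contrast with RBMs and traditional auto-encoders that the abstract draws.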
Citations
Journal Article
TL;DR: The applications of deep learning in machine health monitoring systems are reviewed mainly from the following aspects: Auto-encoder and its variants, Restricted Boltzmann Machines, Convolutional Neural Networks, and Recurrent Neural Networks.

1,569 citations

Journal Article
TL;DR: In this paper, the authors report the current state of the theoretical research and practical advances on this subject and provide a comprehensive view of these advances in ELM together with its future perspectives.

1,289 citations

Journal Article
TL;DR: Extensive experiments on various widely used classification data sets show that the proposed algorithm achieves better and faster convergence than the existing state-of-the-art hierarchical learning methods, and multiple applications in computer vision further confirm the generality and capability of the proposed learning scheme.
Abstract: Extreme learning machine (ELM) is an emerging learning algorithm for the generalized single hidden layer feedforward neural networks, of which the hidden node parameters are randomly generated and the output weights are analytically computed. However, due to its shallow architecture, feature learning using ELM may not be effective for natural signals (e.g., images/videos), even with a large number of hidden nodes. To address this issue, in this paper, a new ELM-based hierarchical learning framework is proposed for multilayer perceptrons. The proposed architecture is divided into two main components, self-taught feature extraction followed by supervised feature classification, which are bridged by randomly initialized hidden weights. The novelties of this paper are as follows: 1) unsupervised multilayer encoding is conducted for feature extraction, and an ELM-based sparse autoencoder is developed via an $\ell_1$ constraint. By doing so, it achieves more compact and meaningful feature representations than the original ELM; 2) by exploiting the advantages of ELM random feature mapping, the hierarchically encoded outputs are randomly projected before final decision making, which leads to better generalization with faster learning speed; and 3) unlike the greedy layerwise training of deep learning (DL), the hidden layers of the proposed framework are trained in a forward manner. Once the previous layer is established, the weights of the current layer are fixed without fine-tuning. Therefore, it has much better learning efficiency than DL. Extensive experiments on various widely used classification data sets show that the proposed algorithm achieves better and faster convergence than the existing state-of-the-art hierarchical learning methods. Furthermore, multiple applications in computer vision further confirm the generality and capability of the proposed learning scheme.

1,166 citations

Journal Article
TL;DR: Compared with existing deep neural networks, experimental results on the Modified National Institute of Standards and Technology database and NYU NORB object recognition dataset benchmark data demonstrate the effectiveness of the proposed Broad Learning System.
Abstract: This paper proposes the Broad Learning System (BLS), which aims to offer an alternative way of learning in deep structure. Deep structure and learning suffer from a time-consuming training process because of a large number of connecting parameters in filters and layers. Moreover, a deep structure requires complete retraining if it is not sufficient to model the system. The BLS is established in the form of a flat network, where the original inputs are transferred and placed as “mapped features” in feature nodes and the structure is expanded in a wide sense in the “enhancement nodes.” Incremental learning algorithms are developed for fast remodeling in broad expansion without a retraining process if the network needs to be expanded. Two incremental learning algorithms are given, one for the increment of the feature nodes (or filters in deep structure) and one for the increment of the enhancement nodes. The designed model and algorithms are very versatile for selecting a model rapidly. In addition, another incremental learning algorithm is developed for the case in which a system that has already been modeled encounters a new incoming input. Specifically, the system can be remodeled in an incremental way without retraining entirely from the beginning. Model reduction using singular value decomposition is applied to simplify the final structure, with satisfactory results. Compared with existing deep neural networks, experimental results on the Modified National Institute of Standards and Technology database and NYU NORB object recognition dataset benchmark data demonstrate the effectiveness of the proposed BLS.

1,061 citations

Journal Article
TL;DR: An insight into ELMs in three aspects, viz: random neurons, random features and kernels is provided and it is shown that in theory ELMs (with the same kernels) tend to outperform support vector machine and its variants in both regression and classification applications with much easier implementation.
Abstract: Extreme learning machines (ELMs) basically give answers to two fundamental learning problems: (1) Can fundamentals of learning (i.e., feature learning, clustering, regression and classification) be made without tuning hidden neurons (including biological neurons) even when the output shapes and function modeling of these neurons are unknown? (2) Does there exist a unified framework for feedforward neural networks and feature space methods? ELMs that have built some tangible links between machine learning techniques and biological learning mechanisms have recently attracted increasing attention of researchers in widespread research areas. This paper provides an insight into ELMs in three aspects, viz: random neurons, random features and kernels. This paper also shows that in theory ELMs (with the same kernels) tend to outperform support vector machine and its variants in both regression and classification applications with much easier implementation.

871 citations

References
Journal Article
TL;DR: A new learning algorithm called ELM is proposed for single-hidden layer feedforward neural networks (SLFNs). It randomly chooses hidden nodes and analytically determines the output weights of SLFNs, and it tends to provide good generalization performance at extremely fast learning speed.

10,217 citations

Journal Article
01 Apr 2012
TL;DR: ELM provides a unified learning platform with a widespread type of feature mappings and can be applied in regression and multiclass classification applications directly; in theory, ELM can approximate any target continuous function and classify any disjoint regions.
Abstract: Due to the simplicity of their implementations, least square support vector machine (LS-SVM) and proximal support vector machine (PSVM) have been widely used in binary classification applications. The conventional LS-SVM and PSVM cannot be used in regression and multiclass classification applications directly, although variants of LS-SVM and PSVM have been proposed to handle such cases. This paper shows that both LS-SVM and PSVM can be simplified further and that a unified learning framework of LS-SVM, PSVM, and other regularization algorithms, referred to as extreme learning machine (ELM), can be built. ELM works for the “generalized” single-hidden-layer feedforward networks (SLFNs), but the hidden layer (also called the feature mapping) in ELM need not be tuned. Such SLFNs include but are not limited to SVM, polynomial network, and the conventional feedforward neural networks. This paper shows the following: 1) ELM provides a unified learning platform with a widespread type of feature mappings and can be applied in regression and multiclass classification applications directly; 2) from the optimization method point of view, ELM has milder optimization constraints compared to LS-SVM and PSVM; 3) in theory, compared to ELM, LS-SVM and PSVM achieve suboptimal solutions and require higher computational complexity; and 4) in theory, ELM can approximate any target continuous function and classify any disjoint regions. As verified by the simulation results, ELM tends to have better scalability and achieve similar (for regression and binary class cases) or much better (for multiclass cases) generalization performance at much faster learning speed (up to thousands of times) than traditional SVM and LS-SVM.
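For orientation, the regularized problem that this kind of unified framework rests on can be written out as follows. The abstract itself gives no formulas, so the notation is an assumption introduced here: $H$ is the $N \times L$ hidden-layer output matrix with rows $h(x_i)$, $T$ is the target matrix with rows $t_i^{\top}$, $\xi_i$ is the training error on sample $i$, and $C$ is the regularization parameter.

```latex
\min_{\beta}\; \frac{1}{2}\lVert \beta \rVert^{2}
  + \frac{C}{2}\sum_{i=1}^{N} \lVert \xi_i \rVert^{2}
\quad \text{subject to} \quad
h(x_i)\,\beta = t_i^{\top} - \xi_i^{\top}, \qquad i = 1, \dots, N

% Closed-form solutions (the two forms are algebraically equivalent):
\beta = H^{\top}\left(\frac{I}{C} + H H^{\top}\right)^{-1} T
\qquad \text{or} \qquad
\beta = \left(\frac{I}{C} + H^{\top} H\right)^{-1} H^{\top} T
```

The first form inverts an $N \times N$ matrix and the second an $L \times L$ matrix, so the cheaper one depends on whether there are fewer training samples or fewer hidden nodes. Because the targets $t_i$ may be vectors, the same solution covers regression and multiclass classification without modification.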

4,835 citations

Journal Article
TL;DR: This paper proves in an incremental constructive method that in order to let SLFNs work as universal approximators, one may simply randomly choose hidden nodes and then only need to adjust the output weights linking the hidden layer and the output layer.
Abstract: According to conventional neural network theories, single-hidden-layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes are universal approximators when all the parameters of the networks are allowed to be adjusted. However, as observed in most neural network implementations, tuning all the parameters of the networks may make learning complicated and inefficient, and it may be difficult to train networks with nondifferentiable activation functions such as threshold networks. Unlike conventional neural network theories, this paper proves in an incremental constructive method that in order to let SLFNs work as universal approximators, one may simply randomly choose hidden nodes and then only need to adjust the output weights linking the hidden layer and the output layer. In such SLFN implementations, the activation functions for additive nodes can be any bounded nonconstant piecewise continuous functions $g: R \to R$, and the activation functions for RBF nodes can be any integrable piecewise continuous functions $g: R \to R$ with $\int_R g(x)\,dx \neq 0$. The proposed incremental method is efficient not only for SLFNs with continuous (including nondifferentiable) activation functions but also for SLFNs with piecewise continuous (such as threshold) activation functions. Compared to other popular methods, such a new network is fully automatic, and users need not intervene in the learning process by manually tuning control parameters.
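The incremental construction described in this abstract can be illustrated with a short pure-Python sketch for one-dimensional regression. It assumes sigmoid additive nodes, a uniform sampling range of [-5, 5] for the random parameters, and the usual least-squares choice of each new output weight (the projection of the current residual onto the new node's output); the function names `i_elm_fit` and `i_elm_predict` are hypothetical and not taken from the paper.

```python
import math
import random


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def i_elm_fit(xs, ts, max_nodes=50, seed=0):
    """Add random hidden nodes one at a time; only output weights are computed."""
    rng = random.Random(seed)
    residual = list(ts)                 # the empty network outputs 0 everywhere
    nodes = []                          # one (a, b, beta) triple per hidden node
    errors = [sum(e * e for e in residual)]

    for _ in range(max_nodes):
        # Hidden-node parameters are drawn at random and never tuned.
        a = rng.uniform(-5.0, 5.0)
        b = rng.uniform(-5.0, 5.0)
        h = [_sigmoid(a * x + b) for x in xs]

        # Output weight for the new node: least-squares fit to the residual.
        beta = sum(e * hi for e, hi in zip(residual, h)) / sum(hi * hi for hi in h)

        # Earlier nodes stay fixed; only the residual is updated.
        residual = [e - beta * hi for e, hi in zip(residual, h)]
        nodes.append((a, b, beta))
        errors.append(sum(e * e for e in residual))

    return nodes, errors


def i_elm_predict(nodes, x):
    """Network output: weighted sum of the random hidden-node activations."""
    return sum(beta * _sigmoid(a * x + b) for a, b, beta in nodes)
```

Because each output weight is the least-squares projection of the residual onto the new node, the squared error recorded in `errors` cannot increase as nodes are added, and nothing in the loop requires the activation to be differentiable.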

2,413 citations

Journal Article
TL;DR: Under the ELM learning framework, SVM's maximal margin property and the minimal norm of weights theory of feedforward neural networks are actually consistent and ELM for classification tends to achieve better generalization performance than traditional SVM.

814 citations