Journal ArticleDOI

Extreme learning machine: Theory and applications

01 Dec 2006 - Neurocomputing (Elsevier) - Vol. 70, Iss. 1, pp. 489-501
TL;DR: A new learning algorithm called the extreme learning machine (ELM) is proposed for single-hidden-layer feedforward neural networks (SLFNs); it randomly chooses the hidden nodes, analytically determines the output weights of the SLFN, and tends to provide good generalization performance at extremely fast learning speed.
About: This article is published in Neurocomputing. The article was published on 2006-12-01 and has received 10,217 citations to date. The article focuses on the topics: Extreme learning machine & Wake-sleep algorithm.
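The procedure summarized above, i.e. random hidden-node parameters followed by an analytic (least-squares) solution for the output weights, can be sketched in a few lines. This is a minimal illustrative NumPy sketch under assumed names (elm_fit, elm_predict) and an assumed tanh activation, not the authors' reference code.

    import numpy as np

    def elm_fit(X, T, n_hidden, rng=np.random.default_rng(0)):
        """Minimal ELM sketch: random hidden nodes, analytic output weights."""
        # Randomly chosen hidden-node parameters (never tuned afterwards)
        W = rng.standard_normal((X.shape[1], n_hidden))
        b = rng.standard_normal(n_hidden)
        H = np.tanh(X @ W + b)            # hidden-layer output matrix
        beta = np.linalg.pinv(H) @ T      # output weights via Moore-Penrose pseudo-inverse
        return W, b, beta

    def elm_predict(X, W, b, beta):
        return np.tanh(X @ W + b) @ beta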
Citations
Journal ArticleDOI
01 Apr 2012
TL;DR: ELM provides a unified learning platform with a wide range of feature mappings and can be applied to regression and multiclass classification applications directly; in theory, ELM can approximate any target continuous function and classify any disjoint regions.
Abstract: Due to the simplicity of their implementations, least square support vector machine (LS-SVM) and proximal support vector machine (PSVM) have been widely used in binary classification applications. The conventional LS-SVM and PSVM cannot be used in regression and multiclass classification applications directly, although variants of LS-SVM and PSVM have been proposed to handle such cases. This paper shows that both LS-SVM and PSVM can be simplified further and that a unified learning framework of LS-SVM, PSVM, and other regularization algorithms, referred to as extreme learning machine (ELM), can be built. ELM works for “generalized” single-hidden-layer feedforward networks (SLFNs), but the hidden layer (also called the feature mapping) in ELM need not be tuned. Such SLFNs include but are not limited to SVM, polynomial networks, and the conventional feedforward neural networks. This paper shows the following: 1) ELM provides a unified learning platform with a wide range of feature mappings and can be applied to regression and multiclass classification applications directly; 2) from the optimization point of view, ELM has milder optimization constraints compared with LS-SVM and PSVM; 3) in theory, compared with ELM, LS-SVM and PSVM achieve suboptimal solutions and require higher computational complexity; and 4) in theory, ELM can approximate any target continuous function and classify any disjoint regions. As verified by the simulation results, ELM tends to have better scalability and achieve similar (for regression and binary-class cases) or much better (for multiclass cases) generalization performance at much faster learning speed (up to thousands of times) than traditional SVM and LS-SVM.
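The unified formulation in this citing paper leads to a closed-form, ridge-regression-like solution for the output weights. A hedged sketch of that regularized least-squares solution, beta = H^T (I/C + H H^T)^{-1} T, follows; H, T, and C denote the hidden-layer output matrix, the targets, and the regularization parameter, and the function name is an assumption of mine.

    import numpy as np

    def regularized_elm_beta(H, T, C=1.0):
        """Output weights for the regularized (unified) ELM formulation:
        beta = H^T (I/C + H H^T)^{-1} T."""
        n = H.shape[0]
        return H.T @ np.linalg.solve(np.eye(n) / C + H @ H.T, T)

Replacing H H^T with a kernel matrix gives the kernelized variant discussed in the abstract.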

4,835 citations


Cites background or methods from "Extreme learning machine: Theory an..."

  • ...ELM [12], [13] and its variants [14]–[16], [24]–[28] mainly focus on the regression applications....

  • ...ELM is to minimize the training error as well as the norm of the output weights [12], [13]...

  • ...The original solutions (21) of ELM [12], [13], [26], TERELM [22], and the weighted regularized ELM [21] are not able to apply kernels in their implementations....

  • ...The minimal norm least square method instead of the standard optimization method was used in the original implementation of ELM [12], [13]...

Journal ArticleDOI
TL;DR: This paper proves in an incremental constructive method that in order to let SLFNs work as universal approximators, one may simply randomly choose hidden nodes and then only need to adjust the output weights linking the hidden layer and the output layer.
Abstract: According to conventional neural network theories, single-hidden-layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes are universal approximators when all the parameters of the networks are allowed to be adjustable. However, as observed in most neural network implementations, tuning all the parameters of the networks may make learning complicated and inefficient, and it may be difficult to train networks with nondifferentiable activation functions such as threshold networks. Unlike conventional neural network theories, this paper proves with an incremental constructive method that, in order to let SLFNs work as universal approximators, one may simply randomly choose hidden nodes and then only needs to adjust the output weights linking the hidden layer and the output layer. In such SLFN implementations, the activation functions for additive nodes can be any bounded nonconstant piecewise continuous functions g: R → R, and the activation functions for RBF nodes can be any integrable piecewise continuous functions g: R → R with ∫_R g(x) dx ≠ 0. The proposed incremental method is efficient not only for SLFNs with continuous (including nondifferentiable) activation functions but also for SLFNs with piecewise continuous (such as threshold) activation functions. Compared to other popular methods, such a new network is fully automatic and users need not intervene in the learning process by manually tuning control parameters.
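A minimal sketch of the incremental constructive idea described above (the function name and the tanh additive node are assumptions of mine, not the paper's reference code): hidden nodes are added one at a time with random parameters, and only the new node's output weight is fitted to the current residual error.

    import numpy as np

    def ielm_fit(X, t, max_nodes=50, rng=np.random.default_rng(0)):
        """Incremental ELM sketch for a scalar target t:
        add random hidden nodes one by one, fit only the new output weight."""
        nodes, betas = [], []
        e = t.astype(float).copy()            # residual error
        for _ in range(max_nodes):
            w = rng.standard_normal(X.shape[1])
            b = rng.standard_normal()
            h = np.tanh(X @ w + b)            # output vector of the new hidden node
            beta = (e @ h) / (h @ h)          # least-squares weight for this node only
            e -= beta * h                     # update residual
            nodes.append((w, b))
            betas.append(beta)
        return nodes, np.array(betas)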

2,413 citations

Journal ArticleDOI
TL;DR: The challenges of using deep learning for remote-sensing data analysis are analyzed, recent advances are reviewed, and resources are provided that the authors hope will make deep learning in remote sensing seem ridiculously simple.
Abstract: Central to the looming paradigm shift toward data-intensive science, machine-learning techniques are becoming increasingly important. In particular, deep learning has proven to be both a major breakthrough and an extremely powerful tool in many fields. Shall we embrace deep learning as the key to everything? Or should we resist a black-box solution? These are controversial issues within the remote-sensing community. In this article, we analyze the challenges of using deep learning for remote-sensing data analysis, review recent advances, and provide resources we hope will make deep learning in remote sensing seem ridiculously simple. More importantly, we encourage remote-sensing scientists to bring their expertise into deep learning and use it as an implicit general model to tackle unprecedented, large-scale, influential challenges, such as climate change and urbanization.

2,095 citations


Cites methods from "Extreme learning machine: Theory an..."

  • ...The ELM was introduced for efficient feature pooling and classification, making the ship detection accurate and fast....

  • ...Tang et al. [106] offered a compressed-domain ship detection framework combined with SDA and an extreme learning machine (ELM) [107] for optical spaceborne images....

Journal ArticleDOI
TL;DR: The results show that the OS-ELM is faster than the other sequential algorithms and produces better generalization performance on benchmark problems drawn from the regression, classification and time series prediction areas.
Abstract: In this paper, we develop an online sequential learning algorithm for single hidden layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes in a unified framework. The algorithm is referred to as the online sequential extreme learning machine (OS-ELM) and can learn data one-by-one or chunk-by-chunk (a block of data) with fixed or varying chunk size. The activation functions for additive nodes in OS-ELM can be any bounded nonconstant piecewise continuous functions, and the activation functions for RBF nodes can be any integrable piecewise continuous functions. In OS-ELM, the parameters of the hidden nodes (the input weights and biases of additive nodes or the centers and impact factors of RBF nodes) are randomly selected and the output weights are analytically determined based on the sequentially arriving data. The algorithm uses the ideas of the ELM of Huang et al., developed for batch learning, which has been shown to be extremely fast with generalization performance better than other batch training methods. Apart from selecting the number of hidden nodes, no other control parameters have to be manually chosen. A detailed performance comparison of OS-ELM with other popular sequential learning algorithms is done on benchmark problems drawn from the regression, classification and time series prediction areas. The results show that the OS-ELM is faster than the other sequential algorithms and produces better generalization performance.
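The chunk-by-chunk update described above has the form of recursive least squares on the hidden-layer outputs. A hedged sketch follows (function names are my own; H0/T0 are the hidden-layer outputs and targets of the initial chunk, Hk/Tk those of a newly arriving chunk); it is not the authors' reference implementation.

    import numpy as np

    def oselm_init(H0, T0):
        """Initialization phase: ordinary batch ELM on the first chunk.
        Assumes H0 has at least as many rows as hidden nodes (full column rank)."""
        P = np.linalg.inv(H0.T @ H0)
        beta = P @ H0.T @ T0
        return P, beta

    def oselm_update(P, beta, Hk, Tk):
        """Sequential phase: recursive least-squares update for one new chunk."""
        n = Hk.shape[0]
        K = P @ Hk.T @ np.linalg.inv(np.eye(n) + Hk @ P @ Hk.T)
        P = P - K @ Hk @ P
        beta = beta + P @ Hk.T @ (Tk - Hk @ beta)
        return P, beta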

1,800 citations


Cites background or methods from "Extreme learning machine: Theory an..."

  • ...Huang et al. [27], the basic idea of the proof can be summarized as follows....

  • ...These have been formally stated in the following theorems [27]....

  • ...In real applications, the number of hidden nodes will always be less than the number of training samples and, hence, the training error cannot be made exactly zero but can approach a nonzero training error. The following theorem formally states this fact [27]....

  • ...OS-ELM originates from the batch learning extreme learning machine (ELM) [20]–[22], [27], [30] developed for SLFNs with additive and RBF nodes....

  • ...Huang et al. [20]–[22], [27], [30] to provide the necessary background for the development of OS-ELM in Section III....

Journal ArticleDOI
TL;DR: A survey on the extreme learning machine (ELM) and its variants, especially on (1) the batch learning mode of ELM, (2) fully complex ELM, (3) online sequential ELM, (4) incremental ELM, and (5) ensembles of ELM.
Abstract: Computational intelligence techniques have been used in a wide range of applications. Out of numerous computational intelligence techniques, neural networks and support vector machines (SVMs) have been playing the dominant roles. However, it is known that both neural networks and SVMs face some challenging issues such as: (1) slow learning speed, (2) trivial human intervention, and/or (3) poor computational scalability. The extreme learning machine (ELM), an emergent technique which overcomes some of the challenges faced by other techniques, has recently attracted attention from more and more researchers. ELM works for generalized single-hidden-layer feedforward networks (SLFNs). The essence of ELM is that the hidden layer of SLFNs need not be tuned. Compared with those traditional computational intelligence techniques, ELM provides better generalization performance at a much faster learning speed and with the least human intervention. This paper gives a survey on ELM and its variants, especially on (1) the batch learning mode of ELM, (2) fully complex ELM, (3) online sequential ELM, (4) incremental ELM, and (5) ensembles of ELM.

1,767 citations


Cites background from "Extreme learning machine: Theory an..."

  • ...The hidden layer of ELM need not be iteratively tuned [5, 6]....

  • ...[6–9]....

  • ...The ith row of H is the hidden layer feature mapping with respect to the ith input xi: h(xi). It has been proved [6] that, from the interpolation capability point of view, if the activation function g is infinitely differentiable in any interval, the hidden layer parameters can be randomly generated....

  • ...1 [6] Given any small positive value ε > 0, an activation function g : R → R which is infinitely differentiable in any interval, and N arbitrary distinct samples (xi, ti) ∈ R^n × R^m, there exists L ≤ N such that for any ...

  • ...The learning capability of extreme learning machines has been studied in two aspects: interpolation capability [6] and universal approximation capability [7–9]....

References
Book
16 Jul 1998
TL;DR: Thorough, well-organized, and completely up to date, this book examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks.
Abstract: From the Publisher: This book represents the most comprehensive treatment available of neural networks from an engineering perspective. Thorough, well-organized, and completely up to date, it examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks. Written in a concise and fluid manner, by a foremost engineering textbook author, to make the material more accessible, this book is ideal for professional engineers and graduate students entering this exciting field. Computer experiments, problems, worked examples, a bibliography, photographs, and illustrations reinforce key concepts.

29,130 citations

01 Jan 1998

12,940 citations


"Extreme learning machine: Theory an..." refers methods in this paper

  • ...The forest cover type [2] for 30 × 30 m cells was obtained from US Forest Service (USFS) Region 2 Resource Information System (RIS) data....

  • ...Medium size classification applications: The ELM performance has also been tested on the Banana database(7) and some other multiclass databases from the Statlog collection [2]: Landsat satellite image (SatImage), Image segmentation (Segment) and Shuttle landing control database....

Proceedings Article
Yoav Freund1, Robert E. Schapire1
03 Jul 1996
TL;DR: This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
Abstract: In an earlier paper, we introduced a new "boosting" algorithm called AdaBoost which, theoretically, can be used to significantly reduce the error of any learning algorithm that consistently generates classifiers whose performance is a little better than random guessing. We also introduced the related notion of a "pseudo-loss", which is a method for forcing a learning algorithm of multi-label concepts to concentrate on the labels that are hardest to discriminate. In this paper, we describe experiments we carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems. We performed two sets of experiments. The first set compared boosting to Breiman's "bagging" method when used to aggregate various classifiers (including decision trees and single attribute-value tests). We compared the performance of the two methods on a collection of machine-learning benchmarks. In the second set of experiments, we studied in more detail the performance of boosting using a nearest-neighbor classifier on an OCR problem.

7,601 citations


"Extreme learning machine: Theory an..." refers methods in this paper

  • ...For this problem, as usually done in the literature [20,21,5,25] 75% and 25% samples are randomly chosen for training and testing at each trial, respectively....

  • ...57% with 20 nodes, which is obviously higher than all the results so far reported in the literature using various popular algorithms such as SVM [20], SAOCIF [21], Cascade-Correlation algorithm [21], bagging and boosting methods [5], C4....

01 Jan 1996

7,386 citations

Journal ArticleDOI
TL;DR: Decomposition implementations for two "all-together" multiclass SVM methods are given, and it is shown that for large problems, methods that consider all data at once generally need fewer support vectors.
Abstract: Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend them for multiclass classification is still an ongoing research issue. Several methods have been proposed in which a multiclass classifier is typically constructed by combining several binary classifiers. Some authors have also proposed methods that consider all classes at once. As it is computationally more expensive to solve multiclass problems, comparisons of these methods using large-scale problems have not been seriously conducted. Especially for methods solving multiclass SVM in one step, a much larger optimization problem is required, so up to now experiments have been limited to small data sets. In this paper we give decomposition implementations for two such "all-together" methods. We then compare their performance with three methods based on binary classifications: "one-against-all," "one-against-one," and directed acyclic graph SVM (DAGSVM). Our experiments indicate that the "one-against-one" and DAG methods are more suitable for practical use than the other methods. Results also show that for large problems, methods that consider all data at once in general need fewer support vectors.
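For illustration only (not code from the cited paper), the two binary-decomposition strategies it compares can be exercised with scikit-learn; the dataset and parameter values here are arbitrary assumptions.

    from sklearn.datasets import load_iris
    from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    # "one-against-one": k(k-1)/2 binary SVMs, combined by voting
    ovo = OneVsOneClassifier(SVC(kernel="rbf", C=1.0, gamma="scale")).fit(X, y)
    # "one-against-all": k binary SVMs, highest decision value wins
    ovr = OneVsRestClassifier(SVC(kernel="rbf", C=1.0, gamma="scale")).fit(X, y)
    print(ovo.score(X, y), ovr.score(X, y))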

6,562 citations


"Extreme learning machine: Theory an..." refers methods in this paper

  • ...As proposed by Hsu and Lin [8], for each problem, we estimate the generalized accuracy using different combinations of cost parameters C and kernel parameters γ: C = [2^12, 2^11, ..., 2^-1, 2^-2] and γ = [2^4, 2^3, ..., 2^-9, 2^-10]....