Author

Hava T. Siegelmann

Bio: Hava T. Siegelmann is an academic researcher from the University of Massachusetts Amherst. The author has contributed to research in the topics of artificial neural networks and recurrent neural networks. The author has an h-index of 34 and has co-authored 172 publications receiving 7,092 citations. Previous affiliations of Hava T. Siegelmann include Harvard University and the Technion – Israel Institute of Technology.


Papers
Journal ArticleDOI
TL;DR: In this paper, a clustering method based on support vector machines is proposed: data points are mapped by a Gaussian kernel to a high-dimensional feature space, where the minimal enclosing sphere is sought; mapped back to data space, this sphere can separate into several components, each enclosing a separate cluster of points.
Abstract: We present a novel clustering method using the approach of support vector machines. Data points are mapped by means of a Gaussian kernel to a high-dimensional feature space, where we search for the minimal enclosing sphere. This sphere, when mapped back to data space, can separate into several components, each enclosing a separate cluster of points. We present a simple algorithm for identifying these clusters. The width of the Gaussian kernel controls the scale at which the data is probed, while the soft margin constant helps cope with outliers and overlapping clusters. The structure of a dataset is explored by varying the two parameters, maintaining a minimal number of support vectors to assure smooth cluster boundaries. We demonstrate the performance of our algorithm on several datasets.

1,389 citations
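As a concrete illustration of the algorithm described above, the following sketch fits the minimal enclosing sphere using scikit-learn's OneClassSVM (which, for an RBF kernel, is equivalent to the enclosing-sphere formulation) and then labels clusters by testing whether the straight segment between two points stays inside the sphere, in the spirit of the paper's cluster-labeling step. The helper name support_vector_clustering, the synthetic data, and all parameter values are illustrative assumptions, not the authors' code.

import numpy as np
from sklearn.svm import OneClassSVM
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def support_vector_clustering(X, gamma=1.0, nu=0.1, n_interp=10):
    # Fit the sphere in feature space; decision_function(x) >= 0 means x lies
    # inside the sphere when mapped back to data space.
    svm = OneClassSVM(kernel="rbf", gamma=gamma, nu=nu).fit(X)

    n = len(X)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            # Two points share a cluster if the segment between them never
            # leaves the sphere (sampled at n_interp points).
            ts = np.linspace(0.0, 1.0, n_interp)[:, None]
            segment = (1 - ts) * X[i] + ts * X[j]
            adj[i, j] = adj[j, i] = np.all(svm.decision_function(segment) >= 0)

    # Clusters are the connected components of this adjacency graph.
    _, labels = connected_components(csr_matrix(adj), directed=False)
    return labels

# Usage: two well-separated blobs should come out as two clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])
print(support_vector_clustering(X, gamma=2.0, nu=0.1))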

Journal ArticleDOI
TL;DR: It is proved that finite recurrent neural nets with rational weights can simulate all Turing machines, and any multi-stack Turing machine in real time; in particular, there is a net of 886 processors that computes a universal partial recursive function.

837 citations

Journal ArticleDOI
01 Apr 1997
TL;DR: It is constructively proved that the NARX networks with a finite number of parameters are computationally as strong as fully connected recurrent networks and thus Turing machines, raising the issue of what amount of feedback or recurrence is necessary for any network to be Turing equivalent and what restrictions on feedback limit computational power.
Abstract: Recently, fully connected recurrent neural networks have been proven to be computationally rich: at least as powerful as Turing machines. This work focuses on another network which is popular in control applications and has been found to be very effective at learning a variety of problems. These networks are based upon Nonlinear AutoRegressive models with eXogenous Inputs (NARX models), and are therefore called NARX networks. As opposed to other recurrent networks, NARX networks have a limited feedback which comes only from the output neuron rather than from hidden states. They are formalized by y(t) = Ψ(u(t−n_u), ..., u(t−1), u(t), y(t−n_y), ..., y(t−1)), where u(t) and y(t) represent the input and output of the network at time t, n_u and n_y are the input and output orders, and the function Ψ is the mapping performed by a multilayer perceptron. We constructively prove that NARX networks with a finite number of parameters are computationally as strong as fully connected recurrent networks, and thus as Turing machines. We conclude that, in theory, one can use NARX models rather than conventional recurrent networks without any computational loss, even though their feedback is limited. Furthermore, these results raise the issue of what amount of feedback or recurrence is necessary for any network to be Turing equivalent, and what restrictions on feedback limit computational power.

462 citations
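A minimal sketch of the NARX recurrence defined above, with Ψ realized as a one-hidden-layer MLP. The weights are random here, purely to show the dataflow in which feedback comes only from past outputs rather than hidden states; the function names and layer sizes are illustrative assumptions.

import numpy as np

def mlp(z, W1, b1, W2, b2):
    # Psi: a one-hidden-layer perceptron with tanh units.
    return np.tanh(W2 @ np.tanh(W1 @ z + b1) + b2)

def narx_run(u, n_u=2, n_y=2, hidden=8, seed=0):
    rng = np.random.default_rng(seed)
    d_in = (n_u + 1) + n_y                # u(t-n_u..t) plus y(t-n_y..t-1)
    W1 = rng.normal(size=(hidden, d_in)); b1 = rng.normal(size=hidden)
    W2 = rng.normal(size=hidden);         b2 = rng.normal()

    y = np.zeros(len(u))
    for t in range(len(u)):
        # Build the tapped-delay windows; out-of-range taps are zero-padded.
        u_win = [u[t - k] if t - k >= 0 else 0.0 for k in range(n_u, -1, -1)]
        y_win = [y[t - k] if t - k >= 1 else 0.0 for k in range(n_y, 0, -1)]
        # Feedback enters only through the past outputs y(t-n_y)..y(t-1).
        y[t] = mlp(np.array(u_win + y_win), W1, b1, W2, b2)
    return y

print(narx_run(np.sin(np.linspace(0, 6, 50))))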

Book
01 Mar 1999
TL;DR: This book relates neural networks to automata and Turing machines, including the construction of networks from the explicit specification of a discrete-time Turing machine, and extends the analysis to real weights, stochastic dynamics, and computation beyond the Turing limit.
Abstract: Table of contents:
1 Computational Complexity
  1.1 Neural Networks
  1.2 Automata: A General Introduction
    1.2.1 Input Sets in Computability Theory
  1.3 Finite Automata
    1.3.1 Neural Networks and Finite Automata
  1.4 The Turing Machine
    1.4.1 Neural Networks and Turing Machines
  1.5 Probabilistic Turing Machines
    1.5.1 Neural Networks and Probabilistic Machines
  1.6 Nondeterministic Turing Machines
    1.6.1 Nondeterministic Neural Networks
  1.7 Oracle Turing Machines
    1.7.1 Neural Networks and Oracle Machines
  1.8 Advice Turing Machines
    1.8.1 Circuit Families
    1.8.2 Neural Networks and Advice Machines
  1.9 Notes
2 The Model
  2.1 Variants of the Network
    2.1.1 A "System Diagram" Interpretation
  2.2 The Network's Computation
  2.3 Integer Weights
3 Networks with Rational Weights
  3.1 The Turing Equivalence Theorem
  3.2 Highlights of the Proof
    3.2.1 Cantor-like Encoding of Stacks
    3.2.2 Stack Operations
    3.2.3 General Construction of the Network
  3.3 The Simulation
    3.3.1 P-Stack Machines
  3.4 Network with Four Layers
    3.4.1 A Layout of the Construction
  3.5 Real-Time Simulation
    3.5.1 Computing in Two Layers
    3.5.2 Removing the Sigmoid from the Main Layer
    3.5.3 One-Layer Network Simulates TM
  3.6 Inputs and Outputs
  3.7 Universal Network
  3.8 Nondeterministic Computation
4 Networks with Real Weights
  4.1 Simulating Circuit Families
    4.1.1 The Circuit Encoding
    4.1.2 A Circuit Retrieval
    4.1.3 Circuit Simulation by a Network
    4.1.4 The Combined Network
  4.2 Network Simulation by Circuits
    4.2.1 Linear Precision Suffices
    4.2.2 The Network Simulation by a Circuit
  4.3 Networks versus Threshold Circuits
  4.4 Corollaries
5 Kolmogorov Weights: Between P and P/poly
  5.1 Kolmogorov Complexity and Reals
  5.2 Tally Oracles and Neural Networks
  5.3 Kolmogorov Weights and Advice Classes
  5.4 The Hierarchy Theorem
6 Space and Precision
  6.1 Equivalence of Space and Precision
  6.2 Fixed Precision Variable Sized Nets
7 Universality of Sigmoidal Networks
  7.1 Alarm Clock Machines
    7.1.1 Adder Machines
    7.1.2 Alarm Clock and Adder Machines
  7.2 Restless Counters
  7.3 Sigmoidal Networks Are Universal
    7.3.1 Correctness of the Simulation
  7.4 Conclusions
8 Different-Limits Networks
  8.1 At Least Finite Automata
  8.2 Proof of the Interpolation Lemma
9 Stochastic Dynamics
  9.1 Stochastic Networks
    9.1.1 The Model
  9.2 The Main Results
    9.2.1 Integer Networks
    9.2.2 Rational Networks
    9.2.3 Real Networks
  9.3 Integer Stochastic Networks
  9.4 Rational Stochastic Networks
    9.4.1 Rational Set of Choices
    9.4.2 Real Set of Choices
  9.5 Real Stochastic Networks
  9.6 Unreliable Networks
  9.7 Nondeterministic Stochastic Networks
10 Generalized Processor Networks
  10.1 Generalized Networks: Definition
  10.2 Bounded Precision
  10.3 Equivalence with Neural Networks
  10.4 Robustness
11 Analog Computation
  11.1 Discrete Time Models
  11.2 Continuous Time Models
  11.3 Hybrid Models
  11.4 Dissipative Models
12 Computation Beyond the Turing Limit
  12.1 The Analog Shift Map
  12.2 Analog Shift and Computation
  12.3 Physical Relevance
  12.4 Conclusions

407 citations
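The Turing-equivalence proof outlined in Chapter 3 rests on the Cantor-like encoding of stacks (Section 3.2.1): a binary stack is stored as a single rational in [0, 1) whose base-4 digits are 1 (encoding bit 0) and 3 (encoding bit 1), so push, pop, and top each reduce to an affine map plus a threshold, exactly the operations a saturated-linear neuron computes. The sketch below illustrates that encoding with exact rationals rather than neurons; it is a didactic reconstruction, not the book's construction verbatim.

from fractions import Fraction

def push(q, bit):
    # Prepend the digit 2*bit + 1 in base 4 (1 encodes 0, 3 encodes 1).
    return q / 4 + Fraction(2 * bit + 1, 4)

def top(q):
    # The top bit is 1 iff q >= 3/4 (leading digit 3); an empty stack is q == 0.
    return 1 if q >= Fraction(3, 4) else 0

def pop(q):
    # Strip the leading digit: an affine map using the threshold above.
    return 4 * q - (2 * top(q) + 1)

q = Fraction(0)                # empty stack
for b in [1, 0, 1]:
    q = push(q, b)
print(top(q), q)               # top is 1 (the last bit pushed), q = 55/64
q = pop(q)
print(top(q), q)               # now 0, q = 7/16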

Journal ArticleDOI
TL;DR: The existence of a finite neural network of sigmoidal neurons that simulates a universal Turing machine is shown; the network is composed of fewer than 10^5 synchronously evolving processors, interconnected linearly.

388 citations


Cited by
Book
18 Nov 2016
TL;DR: Deep learning, as presented in this book, is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts; it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and video games.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations

Journal ArticleDOI
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.

14,635 citations

Book ChapterDOI
01 Jan 2014
TL;DR: This chapter provides an overview of the fundamentals of algorithms and their links to self-organization, exploration, and exploitation.
Abstract: Algorithms are important tools for solving problems computationally. All computation involves algorithms, and the efficiency of an algorithm largely determines its usefulness. This chapter provides an overview of the fundamentals of algorithms and their links to self-organization, exploration, and exploitation. A brief history of recent nature-inspired algorithms for optimization is outlined in this chapter.

8,285 citations

Journal ArticleDOI
TL;DR: In this article, a Support Vector Machine (SVM) method based on recursive feature elimination (RFE) was proposed to select a small subset of genes from broad patterns of gene expression data, recorded on DNA micro-arrays.
Abstract: DNA micro-arrays now permit scientists to screen thousands of genes simultaneously and determine whether those genes are active, hyperactive or silent in normal or cancerous tissue. Because these new micro-array devices generate bewildering amounts of raw data, new analytical methods must be developed to sort out whether cancer tissues have distinctive signatures of gene expression over normal tissues or other types of cancer tissues. In this paper, we address the problem of selection of a small subset of genes from broad patterns of gene expression data, recorded on DNA micro-arrays. Using available training examples from cancer and normal patients, we build a classifier suitable for genetic diagnosis, as well as drug discovery. Previous attempts to address this problem select genes with correlation techniques. We propose a new method of gene selection utilizing Support Vector Machine methods based on Recursive Feature Elimination (RFE). We demonstrate experimentally that the genes selected by our techniques yield better classification performance and are biologically relevant to cancer. In contrast with the baseline method, our method eliminates gene redundancy automatically and yields better and more compact gene subsets. In patients with leukemia our method discovered 2 genes that yield zero leave-one-out error, while 64 genes are necessary for the baseline method to get the best result (one leave-one-out error). In the colon cancer database, using only 4 genes our method is 98% accurate, while the baseline method is only 86% accurate.

7,939 citations
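In the spirit of the paper's SVM-RFE procedure, the sketch below uses scikit-learn's RFE wrapper around a linear SVM: at each iteration the features with the smallest-magnitude weights in the SVM's weight vector are dropped and the classifier is refit. The synthetic data stands in for the paper's gene-expression matrices; the sample and feature counts are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# 200 "patients" by 500 "genes", only 10 of which are informative.
X, y = make_classification(n_samples=200, n_features=500,
                           n_informative=10, n_redundant=0, random_state=0)

# Drop the lowest-|w| 10% of remaining features per iteration, then refit:
# the RFE ranking criterion with a linear SVM as the estimator.
selector = RFE(estimator=SVC(kernel="linear"), n_features_to_select=10,
               step=0.1).fit(X, y)

print("selected feature indices:", list(selector.get_support(indices=True)))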

Journal ArticleDOI
TL;DR: Clustering algorithms for data sets appearing in statistics, computer science, and machine learning are surveyed, and their applications are illustrated in some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts.
Abstract: Data analysis plays an indispensable role in understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. This diversity, on the one hand, equips us with many tools; on the other hand, the profusion of options causes confusion. We survey clustering algorithms for data sets appearing in statistics, computer science, and machine learning, and illustrate their applications in some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts. Several tightly related topics, namely proximity measures and cluster validation, are also discussed.

5,744 citations