
ORIGINAL RESEARCH
published: 08 November 2016
doi: 10.3389/fnins.2016.00508
Edited by: Bernabe Linares-Barranco, Instituto de Microelectrónica de Sevilla, Spain
Reviewed by: Tara Julia Hamilton, Western Sydney University, Australia; Thomas Nowotny, University of Sussex, UK
*Correspondence: Jun Haeng Lee, junhaeng.lee@gmail.com; junhaeng2.lee@samsung.com
Specialty section: This article was submitted to Neuromorphic Engineering, a section of the journal Frontiers in Neuroscience
Received: 30 August 2016; Accepted: 24 October 2016; Published: 08 November 2016
Citation: Lee JH, Delbruck T and Pfeiffer M (2016) Training Deep Spiking Neural Networks Using Backpropagation. Front. Neurosci. 10:508. doi: 10.3389/fnins.2016.00508
Training Deep Spiking Neural Networks Using Backpropagation

Jun Haeng Lee 1,2*, Tobi Delbruck 2 and Michael Pfeiffer 2

1 Samsung Advanced Institute of Technology, Samsung Electronics, Suwon, South Korea
2 Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
Deep spiking neural networks (SNNs) hold the potential for improving the latency and
energy efficiency of deep neural networks through data-driven event-based computation.
However, training such networks is difficult due to the non-differentiable nature of spike
events. In this paper, we introduce a novel technique, which treats the membrane
potentials of spiking neurons as differentiable signals, where discontinuities at spike
times are considered as noise. This enables an error backpropagation mechanism for
deep SNNs that follows the same principles as in conventional deep networks, but
works directly on spike signals and membrane potentials. Compared with previous
methods relying on indirect training and conversion, our technique has the potential to
capture the statistics of spikes more precisely. We evaluate the proposed framework
on artificially generated events from the original MNIST handwritten digit benchmark,
and also on the N-MNIST benchmark recorded with an event-based dynamic vision
sensor, in which the proposed method reduces the error rate by a factor of more than
three compared to the best previous SNN, and also achieves a higher accuracy than a
conventional convolutional neural network (CNN) trained and tested on the same data.
We demonstrate in the context of the MNIST task that thanks to their event-driven
operation, deep SNNs (both fully connected and convolutional) trained with our method
achieve accuracy equivalent to conventional neural networks. In the N-MNIST example,
equivalent accuracy is achieved with about five times fewer computational operations.
Keywords: spiking neural network, deep neural network, backpropagation, neuromorphic, DVS, MNIST, N-MNIST
1. INTRODUCTION
Deep learning is achieving outstanding results in various machine learning tasks (He et al., 2015a; LeCun et al., 2015
), but for applications that require real-time interaction with the real
environment, the repeated and often redundant update of large numbers of units becomes a
bottleneck for efficiency. An alternative has been proposed in the form of spiking neural networks
(SNNs), a major research topic in theoretical neuroscience and neuromorphic engineering. SNNs
exploit event-based, data-driven updates to gain efficiency, especially if they are combined with
inputs from event-based sensors, which reduce redundant information based on asynchronous
event processing (Camunas-Mesa et al., 2012; O’Connor et al., 2013; Merolla et al., 2014; Neil and
Liu, 2016). This feature makes spiking systems attractive for real-time applications where speed
and power consumption are important factors, especially once adequate neuromorphic hardware
platforms become more widely available. Even though in theory (
Maass and Markram, 2004) SNNs
have been shown to be as computationally powerful as conventional artificial neural networks
(ANNs; this term will be used to describe conventional deep
neural networks in contrast with SNNs), practically SNNs have
not quite reached the same accuracy levels of ANNs in traditional
machine learning tasks. A major reason for this is the lack of
adequate training algorithms for deep SNNs, since spike signals
(i.e., discrete events produced by a spiking neuron whenever its
internal state crosses a threshold condition) are not differentiable,
but differentiable activation functions are fundamental for using
error backpropagation, which is still by far the most widely used
algorithm for training deep neural networks.
A recently proposed solution is to use different data
representations between training and processing, i.e., training a
conventional ANN and developing conversion algorithms that
transfer the weights into equivalent deep SNNs (O’Connor et al.,
2013; Diehl et al., 2015; Esser et al., 2015; Hunsberger and
Eliasmith, 2015
). However, in these methods, statistics of spike trains that go beyond ideal mean-rate modeling, such as those required for processing practical event-based sensor data, cannot be precisely represented by the signals used for training. It is
therefore desirable to devise learning rules operating directly on
spike trains, but so far it has only been possible to train single
layers, and use unsupervised learning rules, which leads to a
deterioration of accuracy (
Masquelier and Thorpe, 2007; Neftci
et al., 2014; Diehl and Cook, 2015). An alternative approach has
recently been introduced by O’Connor and Welling (2016), in
which a SNN learns from spikes, but requires keeping statistics
for computing stochastic gradient descent (SGD) updates in
order to approximate a conventional ANN.
In this paper we introduce a novel supervised learning method
for SNNs, which closely follows the successful backpropagation
algorithm for deep ANNs, but here is used to train general
forms of deep SNNs directly from spike signals. This framework
includes both fully connected and convolutional SNNs, SNNs
with leaky membrane potential, and layers implementing spiking
winner-takes-all (WTA) circuits. The key idea of our approach
is to generate a continuous and differentiable signal on which
SGD can work, using low-pass filtered spiking signals added
onto the membrane potential and treating abrupt changes of
the membrane potential as noise during error backpropagation.
Additional techniques are presented that address particular
challenges of SNN training: Spiking neurons typically require
large thresholds to achieve stability and reasonable firing rates,
but large thresholds may result in many “dead” neurons, which
do not participate in the optimization during training. Novel
regularization and normalization techniques are proposed that
contribute to stable and balanced learning. Our techniques lay
the foundations for closing the performance gap between SNNs
and ANNs, and promote their use for practical applications.
1.1. Related Work
Gradient descent methods for SNNs have not been deeply
investigated because both spike trains and the underlying
membrane potentials are not differentiable at the time of spikes.
The most successful approaches to date have used indirect
methods, such as training a network in the continuous rate
domain and converting it into a spiking version.
O’Connor et al. (2013) pioneered this area by training a spiking deep belief network based on the Siegert event-rate approximation model. However, on the MNIST handwritten digit classification
task (
LeCun et al., 1998), which is nowadays almost perfectly
solved by ANNs (0.21% error rate in
Wan et al., 2013), their
approach only reached an accuracy around 94.09%.
Hunsberger
and Eliasmith (2015) used the softened rate model, in which a
hard threshold in the response function of the leaky integrate-and-fire (LIF) neuron is replaced with a continuous differentiable
function to make it amenable to use in backpropagation. After
training an ANN with the rate model they converted it into a SNN
consisting of LIF neurons. With the help of pre-training based on
denoising autoencoders they achieved 98.6% in the permutation-
invariant (PI) MNIST task (see Section 3.1). Diehl et al. (2015)
trained deep neural networks with conventional deep learning
techniques and additional constraints necessary for conversion
to SNNs. After training, the ANN units were converted into
non-leaky spiking neurons and the performance was optimized
by normalizing weight parameters. This approach resulted in
the current state-of-the-art accuracy for SNNs of 98.64% in
the PI MNIST task.
Esser et al. (2015) used a differentiable
probabilistic spiking neuron model for training and statistically
sampled the trained network for deployment. In all of these
methods, training was performed indirectly using continuous
signals, which may not capture important statistics of spikes
generated by real sensors used during processing. Even though
SNNs are well-suited for processing signals from event-based sensors such as the Dynamic Vision Sensor (DVS) (Lichtsteiner et al., 2008), the previous SNN training models require removing time information and generating image frames from the event
streams. Instead, in this article we use the same signal format
for training and processing deep SNNs, and can thus train SNNs
directly on spatio-temporal event streams considering non-ideal
factors such as pixel variation in sensors. This is demonstrated
on the neuromorphic N-MNIST benchmark dataset (
Orchard
et al., 2015), achieving higher accuracy with a smaller number of
neurons than all previous attempts that ignored spike timing by
using event-rate approximation models for training.
2. MATERIALS AND METHODS
2.1. Spiking Neural Networks
In this article we study two types of networks: Fully connected
SNNs with multiple hidden layers and convolutional SNNs. Let
M and N be the number of synapses of a neuron and the number
of neurons in a layer, respectively. On the other hand, m and n
are the number of active synapses (i.e., synapses receiving spike
inputs) of a neuron and the number of active neurons (sending
spike outputs) in a layer during the presentation of an input
sample. We will also use the simplified form of indices for active
synapses and neurons throughout the paper as
Active synapses: {v_1, ..., v_m} ≡ {1, ..., m}
Active neurons: {u_1, ..., u_n} ≡ {1, ..., n}

Thus, if an index i, j, or k is used for a synapse over [1, m] or a neuron over [1, n] (e.g., in Equation 5), then it actually represents an index of an active synapse (v_i) or an active neuron (u_j).
2.1.1. Leaky Integrate-and-Fire (LIF) Neuron
The LIF neuron is one of the simplest models used for describing
dynamics of spiking neurons (
Gerstner and Kistler, 2002). Since
the states of LIF neurons can be updated asynchronously based
solely on the timing of input events (i.e., without timestepped
integration), LIF is computationally efficient. For a given input
spike the membrane potential of a LIF neuron can be updated as

$$V_{mp}(t_p) = V_{mp}(t_{p-1})\, e^{\frac{t_{p-1}-t_p}{\tau_{mp}}} + w_i^{(p)} w_{dyn}, \qquad (1)$$

where V_mp is the membrane potential, τ_mp is the membrane time constant, t_p and t_{p-1} are the present and previous input spike times, and w_i^(p) is the synaptic weight of the i-th synapse (through which the present p-th input spike arrives). We introduce here
a dynamic weight w_dyn, which controls the refractory period following

$$w_{dyn} = \begin{cases} (\Delta t / T_{ref})^2 & \text{if } \Delta t < T_{ref} \text{ and } w_{dyn} < 1 \\ 1 & \text{otherwise} \end{cases} \qquad (2)$$

where T_ref is the maximum duration of the refractory period, and Δt = t_p − t_out, where t_out is the time of the latest output spike produced by the neuron or an external trigger signal through lateral inhibition as discussed in Section 2.1.2. Thus, the effect of input spikes on V_mp is suppressed for a short period of time T_ref after an output spike. w_dyn recovers quadratically to 1 after the output spike at t_out. Since w_dyn is a neuron parameter and applied to all synapses identically, it is different from short-term plasticity, which is a synapse-specific mechanism. The
motivation to use dynamic weights instead of simpler refractory
mechanisms, such as simply blocking the generation of output
spikes, is that it allows controlling refractory states by external
mechanisms. One example is the introduction of WTA circuits
in Section 2.1.2, where lateral inhibition simultaneously puts
all neurons competing in a WTA into the refractory state.
This ensures that the winning neuron gets another chance to
win the competition, since otherwise another neuron could fire
while only the winner has to reset its membrane potential after
generating a spike.
When V_mp crosses the threshold value V_th, the LIF neuron generates an output spike and V_mp is decreased by the amount of the threshold:

$$V_{mp}(t_p^{+}) = V_{mp}(t_p) - V_{th}, \qquad (3)$$

where t_p^+ is the time right after the reset. A lower bound for the membrane potential is set at −V_th, and V_mp is clipped whenever it falls below this value. This strategy helps balancing the participation of neurons during training by preventing neurons from having highly negative membrane potentials. We will revisit this issue when we introduce threshold regularization in Section 2.3.2.
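To make the event-driven update of Equations (1)-(3) concrete, the following sketch implements a single LIF neuron updated only at input spike times; the class name and parameter values are illustrative assumptions, not code from the paper.

```python
import numpy as np

class LIFNeuron:
    """Event-driven LIF neuron sketch following Equations (1)-(3)."""

    def __init__(self, n_syn, v_th=1.0, tau_mp=20e-3, t_ref=1e-3):
        self.w = np.zeros(n_syn)   # synaptic weights w_i
        self.v_th = v_th           # threshold V_th
        self.tau_mp = tau_mp       # membrane time constant tau_mp
        self.t_ref = t_ref         # maximum refractory duration T_ref
        self.v_mp = 0.0            # membrane potential V_mp
        self.t_last = 0.0          # previous input spike time t_{p-1}
        self.t_out = -np.inf       # latest output spike time t_out

    def w_dyn(self, t):
        """Dynamic weight of Equation (2): quadratic recovery after t_out."""
        dt = t - self.t_out
        return (dt / self.t_ref) ** 2 if dt < self.t_ref else 1.0

    def on_input_spike(self, t, i):
        """Apply Equation (1) for an input spike arriving at synapse i at time t."""
        self.v_mp *= np.exp((self.t_last - t) / self.tau_mp)  # leak since last input spike
        self.v_mp += self.w[i] * self.w_dyn(t)                # refractory-scaled input
        self.v_mp = max(self.v_mp, -self.v_th)                # lower bound at -V_th
        self.t_last = t
        if self.v_mp >= self.v_th:                            # threshold crossing
            self.v_mp -= self.v_th                            # reset by subtraction (Equation 3)
            self.t_out = t
            return True                                       # output spike emitted
        return False
```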
2.1.2. Winner-Take-All (WTA) Circuit
We found that the accuracy of SNNs could be improved by
introducing a competitive recurrent architecture in the form of
adding WTA circuits in certain layers. In a WTA circuit, multiple
neurons form a group with lateral inhibitory connections. Thus,
as soon as any neuron produces an output spike, it inhibits all
other neurons in the circuit and prevents them from spiking
(
Rozell et al., 2008; Oster et al., 2009). In this work, all lateral
connections in a WTA circuit have the same strength, which
reduces memory and computational costs for implementing
them. The amount of lateral inhibition applied to the membrane
potential is proportional to the inhibited neuron’s membrane
potential threshold (the exact form is defined in Equation 5
in Section 2.2.2). With this scheme, lateral connections inhibit
neurons having small V_th weakly and those having large V_th strongly. This improves the balance of activities among neurons during training since neurons with higher activities have larger V_th due to the threshold regularization scheme described in Section 2.3.2. Furthermore, as described previously in Section 2.1.1, lateral inhibition is used to put the dynamic weights of all inhibited neurons in a WTA circuit into the refractory state. As shown in Figure 3 and discussed later in Section 3.1, we found
that adding WTA circuits both improves classification accuracy,
and improves the stability and speed of convergence during
training.
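As a rough illustration of this WTA mechanism (building on the hypothetical LIFNeuron sketch above; the function name and the shared inhibition strength are our own assumptions), the effect of a winner spike on the other neurons of the circuit can be written as:

```python
def apply_lateral_inhibition(neurons, winner_idx, kappa, t):
    """Inhibit all non-winning neurons of a WTA circuit when the winner spikes at time t.

    kappa is the shared (negative) lateral connection strength; the inhibition applied to
    each neuron is proportional to that neuron's own threshold, and the external trigger
    also puts the inhibited neurons into the refractory state.
    """
    for i, nrn in enumerate(neurons):
        if i == winner_idx:
            continue
        nrn.v_mp = max(nrn.v_mp + kappa * nrn.v_th, -nrn.v_th)  # threshold-proportional inhibition
        nrn.t_out = t                                           # refractory state via dynamic weight
```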
2.2. Using Backpropagation in SNNs
In order to derive and apply the backpropagation equations for
training SNNs, after summarizing the classical backpropagation
method (
Rumelhart and Zipser, 1985) we derive differentiable
transfer functions for spiking neurons in WTA configuration.
Furthermore, we introduce simple methods to initialize
parameters and normalize backpropagating errors to address
vanishing or exploding gradients, and to stabilize training. These
are variations of successful methods used commonly in deep
learning, but adapted to the specific requirements of SNNs.
2.2.1. Backpropagation Revisited
Neural networks are typically optimized by SGD, meaning that
the vector of network parameters or weights θ is moved in
the direction of the negative gradient of some loss function L
according to θ = θ − η ∂L/∂θ, where η is the learning rate.
The backpropagation algorithm uses the chain rule to compute
the partial derivatives ∂L/∂θ. For completeness we provide here
a summary of backprop for conventional fully-connected deep
neural networks:
1. Propagate inputs in the forward direction to compute the pre-activations (z^(l)) and activations (a^(l) = f^(l)(z^(l))) for all the layers up to the output layer l = n_l, where f is the transfer function of the units.
2. Calculate the error at the output layer:

$$\delta^{(n_l)} = \frac{\partial L(a^{(n_l)}, y)}{\partial z^{(n_l)}} = \frac{\partial L(a^{(n_l)}, y)}{\partial a^{(n_l)}} \cdot f'(z^{(n_l)})$$

where y is the label vector indicating the desired output activation and · is element-wise multiplication.
3. Backpropagate the error to lower layers l = n_l − 1, n_l − 2, ..., 2:

$$\delta^{(l)} = \left( (W^{(l)})^{T} \delta^{(l+1)} \right) \cdot f'(z^{(l)})$$
where W^(l) is the weight matrix of the layer l.
4. Compute the partial derivatives for the update:

$$\nabla_{W^{(l)}} L = \delta^{(l+1)} (a^{(l)})^{T}, \qquad \nabla_{b^{(l)}} L = \delta^{(l+1)}$$

where b^(l) is the bias vector of the layer l.
5. Update the parameters:

$$W^{(l)} = W^{(l)} - \eta\, \nabla_{W^{(l)}} L, \qquad b^{(l)} = b^{(l)} - \eta\, \nabla_{b^{(l)}} L$$
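A compact NumPy sketch of these five steps for a fully connected network may be useful as a reference point before the spiking version; the function and argument names are ours, and f, f_prime, and loss_grad stand for the transfer function, its derivative, and ∂L/∂a at the output:

```python
import numpy as np

def sgd_step(weights, biases, x, y, f, f_prime, loss_grad, eta):
    """One conventional backpropagation/SGD step (steps 1-5 above), as a sketch."""
    # 1. forward pass: pre-activations z^(l) and activations a^(l)
    activations, zs = [x], []
    for W, b in zip(weights, biases):
        z = W @ activations[-1] + b
        zs.append(z)
        activations.append(f(z))
    # 2. error at the output layer
    delta = loss_grad(activations[-1], y) * f_prime(zs[-1])
    # 3.-4. backpropagate errors and collect the partial derivatives
    grads_W, grads_b = [], []
    for l in reversed(range(len(weights))):
        grads_W.insert(0, np.outer(delta, activations[l]))
        grads_b.insert(0, delta)
        if l > 0:
            delta = (weights[l].T @ delta) * f_prime(zs[l - 1])
    # 5. gradient-descent update of the parameters
    weights = [W - eta * gW for W, gW in zip(weights, grads_W)]
    biases = [b - eta * gb for b, gb in zip(biases, grads_b)]
    return weights, biases
```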
2.2.2. Transfer Function and Derivatives
Starting from the event-based update of the membrane potentials
in Equation (1), we can define the accumulated effect (normalized by synaptic weight) of the k-th active input synapse onto the membrane potential of a target neuron as x_k(t). Similarly, the generation of spikes in neuron i acts on its own membrane potential via the term a_i, which is due to the reset in Equation (3) (normalized by V_th). Both x_k and a_i can be expressed as sums of exponentially decaying terms

$$x_k(t) = \sum_{p} \exp\!\left(\frac{t_p - t}{\tau_{mp}}\right), \qquad a_i(t) = \sum_{q} \exp\!\left(\frac{t_q - t}{\tau_{mp}}\right), \qquad (4)$$

where the first sum is over all input spike times t_p < t at the k-th input synapse, and the second sum is over the output spike times t_q < t for a_i. The accumulated effects of lateral inhibitory signals in WTA circuits can be expressed analogously to Equation (4). The activities in Equation (4) are real-valued and continuous except for the time points where spikes occur and the activities jump up. We use these numerically computed lowpass-filtered activities for backpropagation instead of directly using spike signals.
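Numerically, the low-pass filtered activities of Equation (4) are just sums of exponential kernels evaluated at the current time; the short sketch below (our own helper, with an assumed τ_mp of 20 ms) shows one way to compute them from recorded spike times:

```python
import numpy as np

def spike_trace(spike_times, t, tau_mp=20e-3):
    """Low-pass filtered spike activity of Equation (4) at time t."""
    past = np.asarray([tp for tp in spike_times if tp < t])  # spikes before t
    return float(np.exp((past - t) / tau_mp).sum())

# x_k(t): trace of the k-th input synapse; a_i(t): trace of the neuron's own output spikes
x_k = spike_trace([0.005, 0.012, 0.020], t=0.025)
a_i = spike_trace([0.015], t=0.025)
```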
Ignoring the effect of refractory periods for now, the membrane potential of the i-th active neuron in a WTA circuit can be written in terms of x_k and a_i defined in Equation (4) as

$$V_{mp,i}(t) = \sum_{k=1}^{m} w_{ik}\, x_k(t) - V_{th,i}\, a_i(t) + \sigma V_{th,i} \sum_{j=1,\, j \neq i}^{n} \kappa_{ij}\, a_j(t). \qquad (5)$$

The terms on the right side represent the input, membrane potential resets, and lateral inhibition, respectively. κ_ij is the strength of lateral inhibition (−1 ≤ κ_ij ≤ 0) from the j-th active neuron to the i-th active neuron, and σ is the expected efficacy of lateral inhibition. σ should be smaller than 1, since lateral inhibition can affect the membrane potential only down to its lower bound (i.e., −V_th). We found a value of σ = 0.5 to work well in practice. Equation (5) reveals the relationship between inputs and outputs of spiking neurons which is not clearly shown in Equations (1) and (3). Nonlinear activation of neurons is considered in Equation (5) by including only active synapses and neurons. Figure 1 shows the relationship between signals presented in Equations (4) and (5). Since the output (a_i) of the current layer becomes the input (x_k) of the next layer if all the neurons have the same τ_mp, Equation (5) provides the basis for deriving the backpropagation algorithm via the chain rule.
Differentiation is not defined in Equation (4) at the moment of
each spike because there is a discontinuous step jump. However,
we propose here to ignore these fluctuations, and treat Equations
(4) and (5) as if they were differentiable continuous signals
to derive the necessary error gradients for backpropagation.
In previous works (
O’Connor et al., 2013; Diehl et al., 2015;
Esser et al., 2015; Hunsberger and Eliasmith, 2015), continuous
variables were introduced as a surrogate for x_k and a_i in Equation
(5) for backpropagation. In this work, however, we directly use
the contribution of spike signals to the membrane potential
as defined in Equation (4). Thus, the real statistics of spike
signals, including temporal effects such as synchrony between
inputs, can influence the training process. Ignoring the step
jumps caused by spikes in the calculation of gradients might of
course introduce errors, but as our results show, in practice this
seems to have very little influence on SNN training. A potential
explanation for this robustness of our training scheme is that
by treating the signals in Equation (4) as continuous signals
that fluctuate suddenly at times of spikes, we achieve a similar
positive effect as the widely used approach of noise injection
during training, which can improve the generalization capability
of neural networks (Vincent et al., 2008). In the case of SNNs,
several papers have used the trick of treating spike-induced
abrupt changes as noise for gradient descent optimization
(Bengio et al., 2015; Hunsberger and Eliasmith, 2015). However,
in these cases the model added Gaussian random noise instead
of spike-induced perturbations. In this work, we directly use the
actual contribution of spike signals to the membrane potential as
described in Equation (4) for training SNNs. Our results show
empirically that this approach works well for learning in SNNs
where information is encoded in spike rates. Importantly, the
presented framework also provides the basis for utilizing specific
spatio-temporal codes, which we demonstrate on a task using
inputs from event-based sensors.
For the backpropagation equations it is necessary to obtain
the transfer functions of LIF neurons in WTA circuits (which generalizes to non-WTA layers by setting κ_ij = 0 for all i and j). For this we set the residual V_mp term in the left side of Equation (5) to zero (since it is not relevant to the transfer function), resulting in the transfer function

$$a_i \approx \frac{s_i}{V_{th,i}} + \sigma \sum_{j=1,\, j \neq i}^{n} \kappa_{ij}\, a_j, \qquad \text{where } s_i = \sum_{k=1}^{m} w_{ik}\, x_k. \qquad (6)$$
Refractory periods are not considered here since the activity of
neurons in SNNs is rarely dominated by refractory periods in
a normal operating regime. For example, we used a refractory
period of 1 ms and the event rates of individual neurons were
kept within a few tens of events per second (eps). Equation (6)
is consistent with (4.9) in
Gerstner and Kistler (2002) without
WTA terms. The equation can also be simplified to a spiking
version of a rectified-linear unit by introducing a unit threshold
and non-leaky membrane potential as in
O’Connor and Welling
(2016)
.
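Because a_i appears on both sides of Equation (6) through the lateral term, the rates of the active neurons in a WTA circuit can be obtained from a small linear solve; the following sketch (our own illustration, with assumed array shapes) makes this explicit:

```python
import numpy as np

def wta_rates(W, x, v_th, kappa, sigma=0.5):
    """Solve Equation (6) for all active neurons of one WTA circuit.

    a = s / v_th + sigma * K a  with  s = W x, so  a = (I - sigma * K)^{-1} (s / v_th).
    Assumed shapes: W is (n, m), x is (m,), v_th is (n,), kappa is (n, n) with zero diagonal.
    For a non-WTA layer, kappa = 0 and this reduces to a = s / v_th.
    """
    s = W @ x
    n = s.shape[0]
    return np.linalg.solve(np.eye(n) - sigma * kappa, s / v_th)
```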
FIGURE 1 | Conceptual diagram showing the relationship between signals in the proposed spiking neural network model. Error gradients are
back-propagated through the components of the membrane potential defined in Equation (4).
Directly differentiating Equation (6) yields the
backpropagation equations
$$\frac{\partial a_i}{\partial s_i} \approx \frac{1}{V_{th,i}}, \qquad \frac{\partial a_i}{\partial w_{ik}} \approx \frac{\partial a_i}{\partial s_i}\, x_k, \qquad \frac{\partial a_i}{\partial V_{th,i}} \approx -\frac{\partial a_i}{\partial s_i}\Big(a_i + \sigma \sum_{j \neq i}^{n} \kappa_{ij}\, a_j\Big), \qquad \frac{\partial a_i}{\partial \kappa_{ih}} \approx \frac{\partial a_i}{\partial s_i}\big(\sigma V_{th,i}\, a_h\big), \qquad (7)$$

$$\begin{pmatrix} \partial a_1/\partial x_k \\ \vdots \\ \partial a_n/\partial x_k \end{pmatrix} \approx -\frac{1}{\sigma} \begin{pmatrix} q & \cdots & \kappa_{1n} \\ \vdots & \ddots & \vdots \\ \kappa_{n1} & \cdots & q \end{pmatrix}^{-1} \begin{pmatrix} w_{1k}/V_{th,1} \\ \vdots \\ w_{nk}/V_{th,n} \end{pmatrix} \qquad (8)$$
where q = −1/σ. When all the lateral inhibitory connections have the same strength (κ_ij = −µ, ∀i, j) and are not learned, ∂a_i/∂κ_ih is not necessary and Equation (8) can be simplified to

$$\frac{\partial a_i}{\partial x_k} \approx \frac{\partial a_i}{\partial s_i}\, \frac{1}{1 - \mu\sigma} \left( w_{ik} - \frac{\mu\sigma V_{th,i}}{1 + \mu\sigma(n-1)} \sum_{j=1}^{n} \frac{w_{jk}}{V_{th,j}} \right). \qquad (9)$$
By inserting the above derivatives in Equations (7) and (9) into
the standard error backpropagation algorithm, we obtain an
effective learning rule for SNNs. We consider only the first-order
effect of the lateral connections in the derivation of gradients.
Higher-order terms propagating back through multiple lateral
connections are neglected for simplicity. This is mainly because
all the lateral connections considered here are inhibitory. For
inhibitory lateral connections, the effect of small parameter
changes decays rapidly with connection distance. Thus, first-
order approximation saves a lot of computational cost without
loss of accuracy.
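As a sanity check of how these derivatives enter the backward pass, the sketch below evaluates ∂a_i/∂s_i from Equation (7) and the shared-strength form of ∂a_i/∂x_k from Equation (9); the function name and argument layout are our own assumptions:

```python
import numpy as np

def wta_backward_terms(w_col, v_th, mu, sigma=0.5):
    """First-order derivative terms for one input index k (Equations 7 and 9).

    w_col = W[:, k] is the weight column for input k, v_th the per-neuron thresholds,
    mu the shared lateral inhibition magnitude (kappa_ij = -mu), sigma its efficacy.
    """
    n = w_col.shape[0]
    da_ds = 1.0 / v_th                                             # da_i/ds_i (Equation 7)
    shared = mu * sigma / (1.0 + mu * sigma * (n - 1)) * np.sum(w_col / v_th)
    da_dx = da_ds * (w_col - v_th * shared) / (1.0 - mu * sigma)   # da_i/dx_k (Equation 9)
    return da_ds, da_dx
```

The weight gradient then follows from ∂a_i/∂w_ik ≈ (∂a_i/∂s_i)·x_k, with x_k taken from the low-pass filtered traces of Equation (4).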
2.2.3. Weight Initialization and Backprop Error
Normalization
Good initialization of weight parameters in supervised learning
is critical to handle the exploding or vanishing gradients problem
in deep neural networks (
Glorot and Bengio, 2010; He et al.,
2015b). The basic idea behind those methods is to maintain
the balance of forward activations and backward propagating
errors among layers. Recently, the batch normalization technique
has been proposed to make sure that such balance is
maintained through the whole training process (Ioffe and
Szegedy, 2015
). However, normalization of activities as in the
batch normalization scheme is difficult for SNNs, because
there is no efficient method for amplifying event rates above
the input rate. The initialization methods proposed in
Glorot
and Bengio (2010) or He et al. (2015b) are not appropriate
for SNNs either, because SNNs have positive thresholds
that are usually much larger than individual weight values.
In this work, we propose simple methods for initializing
parameters and normalizing backprop errors for training
deep SNNs. Even though the proposed technique does not
guarantee the balance of forward activations, it is effective for
addressing the exploding and vanishing gradients problems.
Error normalization is not critical for training SNNs with
a single hidden layer. However, we observed that training
deep SNNs without normalizing backprop errors mostly failed
due to exploding gradients. We describe here the method in
case of fully-connected deep networks for simplicity. However,

References
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. Proc. IEEE CVPR.
Kingma, D. P., and Ba, J. (2015). Adam: a method for stochastic optimization. Proc. ICLR.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature 521, 436–444.
LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958.