# Neuromorphic learning and recognition with one-transistor-one-resistor synapses and bistable metal oxide RRAM

Stefano Ambrogio, *Student Member, IEEE*, Simone Balatti, *Student Member, IEEE*, Valerio Milo, Roberto Carboni, ZhongQiang Wang, Alessandro Calderoni, Nirmal Ramaswamy, *Senior Member, IEEE*, Daniele Ielmini, *Senior Member, IEEE* 

Abstract: Resistive switching memory (RRAM) has been proposed as an artificial synapse in neuromorphic circuits due to its tunable resistance, low power operation, and scalability. For the development of high-density neuromorphic circuits, it is essential to validate state-of-the-art bistable RRAM and to introduce small-area building blocks serving as artificial synapses. This work introduces a new synaptic circuit consisting of a one-transistor/one-resistor (1T1R) structure, where the resistive element is a HfO<sub>2</sub> RRAM with bipolar switching. Spike-timing dependent plasticity (STDP) is demonstrated in both the deterministic and stochastic regimes of the RRAM. Finally, a fully-connected neuromorphic network is simulated, showing online unsupervised pattern learning and recognition for various voltages of the POST spike. The results support bistable RRAM for high-performance artificial synapses in neuromorphic circuits.

Keywords: resistive switching memory (RRAM), artificial synapse, neuromorphic network, memristive device, pattern learning.

## I. INTRODUCTION

Emerging memory devices, such as the resistive switching memory (RRAM), are currently being investigated for future memory generations featuring high density, low cost, high speed and nonvolatile retention [1], [2]. As the device size and the operating current are reduced, however, the statistical variations of device parameters increase [3], [4], raising the demand for control algorithms and the associated circuit overhead [5]. Statistical variations are generally detrimental to digital memory operation; however, they can be tolerated in some computing applications, such as the generation of random numbers [6]–[9] and neuromorphic networks [10]–[12]. Neuromorphic computing can even take advantage of stochastic variations, which contribute to the normal operation of fuzzy neural networks in animals and humans [13].

In this work, we present a new synapse circuit with one-transistor/one-resistor (1T1R) structure that is used as a tunable connection between a pre-synaptic neuron (PRE) and a post-synaptic neuron (POST). The RRAM synapse can, on the one hand, passively transmit spikes and, on the other hand, update its weight in accordance with a spike-timing dependent plasticity (STDP) protocol. The STDP characteristics are measured and modeled for deterministic and stochastic switching. Finally, we simulate a 2-layer neuromorphic network based on the experimentally observed STDP characteristics, taking into account resistance-dependent STDP [12], [14], [15] and demonstrating on-line pattern learning and recognition with deterministic and stochastic switching. These results support state-of-the-art RRAM for neuromorphic circuits capable of learning, updating and recognizing real-world visual and auditory patterns.

## II. RRAM SAMPLES AND CHARACTERISTICS

Our RRAM devices consist of a Si-doped HfO<sub>2</sub> layer with TiN bottom electrode (BE) and Ti top electrode (TE) [4]. 1T1R structures, as shown in Fig. 1a, were used to conduct pulsed experiments driving TE and gate nodes by an arbitrary waveform generator, while the TE voltage and RRAM current were monitored by an oscilloscope as in Fig. 1b [16]. Fig. 1c shows a typical I-V curve obtained in response to bipolar triangular pulses for set (positive voltage) and reset (negative voltage) [16]. The pulse width $t_P$ was 1 ms, while the compliance current $I_C$ was adjusted to 50 $\mu$A by proper tuning of the gate voltage $V_G$. The set transition from the high-resistance state (HRS) to the low-resistance state (LRS) takes place at $V_{set} \approx 1.5$ V. On the other hand, the onset of the reset transition from LRS to HRS is seen at $V_{reset} \approx -1$ V and is completed at $V_{stop} = -1.5$ V, which is the maximum voltage in the negative sweep, as shown in Fig. 1c [9]. Note that both set and reset transitions are rather abrupt, which contrasts with the gradual adjustment of synaptic weight observed in biological STDP [17], [18]. Complementary switching during the set process was avoided in our devices by use of an asymmetric RRAM structure with a Ti oxygen exchange layer at the TE, and of a relatively low $I_C$ [19], [20].

## III. 1T1R SYNAPSE

The 1T1R structure in Fig. 1a can be adopted as a synapse circuit as shown in Fig. 2a. This is a simplified version of the 2-transistor/one-resistor (2T1R) synapse [12], where one transistor could activate the communication of the PRE

S. Ambrogio, S. Balatti, R. Carboni, Z.Q. Wang and D. Ielmini are with the Dipartimento di Elettronica, Informazione e Bioingegneria and Italian Universities Nanoelectronics Team (IU.NET), Politecnico di Milano, piazza L. da Vinci 32, 20133 Milano, Italy. A. Calderoni and N. Ramaswamy are with Micron Technology Inc., Boise, Idaho. E-mail: daniele.ielmini@polimi.it. This work was supported in part by the ERC Consolidator Grant No. 648635 "Resistive-switch computing Beyond CMOS".



Fig. 1. Schematic illustration of the 1T1R structure used in this work (a), experimental setup (b) and measured I-V curve showing the definition of parameters  $V_{set}$ ,  $V_{reset}$ ,  $V_{stop}$ ,  $I_C$  and  $I_{reset}$  (c). The RRAM stack includes a HfO<sub>x</sub> switching layer, a Ti cap layer and TiN BE.

spike to the POST, while the other transistor was responsible for updating the synaptic weight according to STDP. The 1T1R circuit in Fig. 2a is capable of both functions with just one transistor, which alternately activates communication or plasticity in the synapse. As shown in Fig. 2a, the PRE spike controls the gate voltage $V_G$ of the transistor, while the TE voltage $V_{TE}$ is controlled by the POST and is generally biased to a relatively low constant voltage. As a result, every PRE spike activates a current which is inversely proportional to the 1T1R resistance. The 1T1R current is collected by the virtual ground input node of the POST neuron, which also collects the current from other synapses. As the integrated current exceeds an internal threshold, the POST experiences a fire event according to the typical 'integrate and fire' behavior of the neuron [21]. Upon fire, besides sending a spike pulse to the subsequent layer of neurons, the POST also delivers a pulse back to the top electrode (TE) of the 1T1R synapse according to the waveform in Fig. 2b. The TE waveform shows 2 phases: the first consists of a positive voltage pulse of 1 ms followed by a 9 ms pause, while the second consists of a negative pulse of 1 ms width followed by a 9 ms pause. Before and after the backward spike, the same low-amplitude $V_{TE}$ is maintained with the purpose of activating current spikes to the POST. In our experiments, the $V_G$ spike of the PRE consists of a first phase with positive voltage 2.1 V and width 10 ms followed by a second phase of zero voltage for 10 ms. The value of $V_G$ was chosen to give a compliance current $I_C = 50\ \mu$A, which is small enough to keep the power consumption during set/reset transitions relatively low. The TE voltage during communication was kept constant at a relatively low value $V_{TE} = 20$ mV, which is low enough to induce no change in the RRAM resistance. The positive and negative peaks during the fire events were $V_{TE+} = +2.5$ V and $V_{TE-} = -1.6$ V, respectively.
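
As a rough numerical illustration of the communication current, one can assume that the transistor on-resistance is negligible compared with the RRAM resistance (an assumption made here for simplicity, not a measured result). With $V_{TE} = 20$ mV and the approximate LRS and HRS values of 25 k$\Omega$ and 500 k$\Omega$ quoted elsewhere in the text, the two extreme synaptic currents would then be about:

$$I_{LRS} \approx \frac{20\ \mathrm{mV}}{25\ \mathrm{k}\Omega} = 0.8\ \mu\mathrm{A}, \qquad I_{HRS} \approx \frac{20\ \mathrm{mV}}{500\ \mathrm{k}\Omega} = 40\ \mathrm{nA}.$$

Under this assumption both currents remain far below the compliance current $I_C = 50\ \mu$A used during plasticity.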

The large values of $V_{TE+}$ and $V_{TE-}$, in contrast to the low value of $V_{TE} < V_{set}$ in the communication stage, make it possible to activate STDP according to the timing between the PRE and POST spikes. In fact, defining a relative delay $\Delta t$ given by:

$$\Delta t = t_{post} - t_{pre},\tag{1}$$

where $t_{pre}$ and $t_{post}$ are measured at the onset of the PRE and POST pulses, respectively, as shown in Fig. 2b, the sign of $\Delta t$ dictates the change of RRAM resistance. For $\Delta t > 0$, the positive $V_{TE}$ pulse overlaps with the $V_G$ spike, thus resulting in a set transition corresponding to long-term potentiation (LTP). On the other hand, for $\Delta t < 0$, the negative $V_{TE}$ peak overlaps with the $V_G$ spike, thus resulting in a reset transition and consequent long-term depression (LTD) [12].
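
The overlap argument can be checked with a small timing sketch. It assumes idealized rectangular pulses with the durations quoted in Section III (10 ms gate pulse, 1 ms TE pulses separated by a 9 ms pause); the function names and the rule that any nonzero overlap triggers a transition are illustrative assumptions, not part of the measured protocol.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def stdp_event(delta_t_ms, gate_width=10.0, pulse_width=1.0, pause=9.0):
    """Classify the plasticity event for a delay delta_t = t_post - t_pre (ms).

    Assumes rectangular pulses; the PRE gate pulse starts at t_pre = 0.
    """
    t_pre = 0.0
    t_post = t_pre + delta_t_ms
    gate = (t_pre, t_pre + gate_width)
    positive = (t_post, t_post + pulse_width)
    neg_start = t_post + pulse_width + pause
    negative = (neg_start, neg_start + pulse_width)

    if overlap(*gate, *positive) > 0.0:
        return "LTP"   # positive TE pulse while the transistor is on: set
    if overlap(*gate, *negative) > 0.0:
        return "LTD"   # negative TE pulse while the transistor is on: reset
    return "none"      # no overlap: the RRAM sees no programming voltage


print(stdp_event(5.0), stdp_event(-5.0), stdp_event(15.0))
```

In this simplified picture, delays shorter than the TE pulse width give partial overlaps, which the sketch treats like full ones; a real device would respond according to the actual overlap duration and pulse shape.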

## IV. STDP CHARACTERISTICS

To validate the proposed 1T1R synapse, we applied the  $V_G$ and  $V_{TE}$  pulse waveforms in Fig. 2b to a 1T1R device with variable  $\Delta t$  and initial resistance  $R_0$ , with the purpose of collecting the STDP characteristic. After every combined gate/TE pulse application, the new resistance R of the device was measured. Fig. 3a shows  $R_0/R$ , namely the relative increase of conductance induced by application of the 2 pulses, as a function of the pulse delay  $\Delta t$ . Various curves are reported corresponding to increasing initial resistance R<sub>0</sub>, which was changed in a range from 25 k $\Omega$  to 500 k $\Omega$  by initially preparing the device by a partial reset operation with variable voltage  $V_{stop}$  [22]. The curves show STDP with LTP and LTD at positive and negative delay  $\Delta t$ , respectively. As previously noted [12], the STDP depends on the initial resistance: for instance, virtually no LTP can be observed on LRS (R<sub>0</sub> = 25 k $\Omega$  in Fig. 3a), since this state already has a very low resistance. In fact, the resistance after set transition is controlled by the size of the conductive filament (CF) which is controlled by the compliance current  $I_C$  [23]. Since a constant  $V_G$  was used in the scheme of Fig. 2b, no variation in the maximum size of the CF could be obtained, thus resulting in no possible potentiation of LRS. Similarly, no substantial LTD is possible for HRS ( $R_0 = 500 \text{ k}\Omega$  in Fig. 3a). Note that a similar dependence on the initial state was observed in biological systems, where a synapse conductance change cannot exceed minimum and maximum values [24]. On the other hand, intermediate resistance states can achieve both LTP



Fig. 2. Scheme of the 1T1R synapse connected to PRE and POST (a) and typical spike signals at  $V_G$  and  $V_{TE}$  at the basis of STDP (b). A  $V_G$  spike from PRE induces a current which is integrated by POST, eventually leading to fire. At fire,  $V_{TE}$  induces potentiation ( $\Delta t > 0$ ) or depression ( $\Delta t < 0$ ), thus resulting in STDP.

and LTD. In any case, the STDP characteristics show constant  $R_0/R$  for  $\Delta t < 0$  and  $\Delta t > 0$ , as a result of the constant  $V_{TE+}$ ,  $V_{TE-}$  and  $V_G$  in Fig. 2b.

The STDP curves were reproduced by a Simulink circuit model able to simulate the 1T1R device. The RRAM in the 1T1R was described by our previous analytical model [25] where set transition consists of the growth of the CF diameter while the reset transition occurs via the formation and growth of a depleted gap, in agreement with the results of numerical simulations of set/reset processes [26]. In the simulations, we applied the same pulses shown in Fig. 2b and used in Fig. 3a, assuming variable  $\Delta t$  and variable R<sub>0</sub>, as in Fig. 3a. Fig. 3b shows the calculated R<sub>0</sub>/R as a function of  $\Delta t$  at increasing R<sub>0</sub>, indicating a close agreement with data. Fig. 4 shows the calculated STDP characteristics in a 3D plot, where R<sub>0</sub>/R in the z-axis is reported as a function of R<sub>0</sub> (x axis) and  $\Delta t$  (y axis). LTP occurs for  $\Delta t > 0$  and increases with R<sub>0</sub>, while LTD occurs for  $\Delta t < 0$  and is more pronounced for low R<sub>0</sub>.

Note that the maximum relative LTP is around a factor 20, while the maximum relative LTD is around a factor 1/20, corresponding to the overall resistance window between HRS (about 500 k $\Omega$ ) and LRS (about 25 k $\Omega$ ) in our device. This indicates that the synapse shows a bistable behavior where, starting from any arbitrary intermediate state, even one spike is enough to change the synaptic weight to either HRS (in case of LTD) or LRS (in case of LTP). The bistable behavior arises from the abrupt set/reset transitions (Fig. 1c) and the relatively large  $V_{TE+}$  and  $V_{TE-}$  values used in our STDP protocol (Fig. 2b), and contrasts with the generally assumed analog behavior of biological synapses [17], [18], [24].
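
The bistable picture can be condensed into a minimal update rule. The sketch below assumes ideal switching to the nominal LRS and HRS values quoted above and a 10 ms timing window taken from the gate pulse width; the names `bistable_stdp` and `conductance_ratio` are hypothetical and the rule ignores all variability.

```python
R_LRS = 25e3      # ohm, approximate LRS value from the text
R_HRS = 500e3     # ohm, approximate HRS value from the text
WINDOW_MS = 10.0  # assumed STDP window, equal to the gate pulse width


def bistable_stdp(r0, delta_t_ms):
    """Return the resistance after one PRE/POST pair for an ideal bistable synapse."""
    if 0.0 < delta_t_ms < WINDOW_MS:
        return R_LRS   # LTP: set transition to the low-resistance state
    if -WINDOW_MS < delta_t_ms < 0.0:
        return R_HRS   # LTD: reset transition to the high-resistance state
    return r0          # outside the window the weight is unchanged


def conductance_ratio(r0, delta_t_ms):
    """Relative conductance change R0/R, the quantity plotted in the STDP curves."""
    return r0 / bistable_stdp(r0, delta_t_ms)


for r0 in (25e3, 100e3, 500e3):
    print(r0, conductance_ratio(r0, 5.0), conductance_ratio(r0, -5.0))
```

With these nominal values the ratio spans from 1/20 to 20, and a synapse already in LRS (HRS) shows no further potentiation (depression), consistent with the resistance dependence discussed above.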

## V. PATTERN LEARNING WITH DETERMINISTIC STDP

To demonstrate the functionality of the bistable 1T1R synapse for unsupervised pattern learning, we simulated a 2-layer neuromorphic network with 64 PRE in the first layer and one POST connected to the first layer with 64 synapses [12]. As schematically shown in Fig. 5a, the first layer acts as a retina, emitting spikes corresponding to a visual pattern, e.g., an 'X' as shown in Fig. 5b, alternated with random noise (Fig. 5c). The currents generated in each activated 1T1R synapse are collected by the POST, which is modeled as a leaky integrate-and-fire (LIF) neuron, integrating the currents



Fig. 3. STDP characteristics, namely change of conductance  $R_0/R$  as a function of  $\Delta t$  defined in Fig. 2b, obtained from data (a) and calculations (b). Data were collected from 1T1R RRAM devices as in Fig. 1, while calculations were done with a Simulink model. The change of conductance was measured/calculated for increasing initial resistance  $R_0$ .



Fig. 4. Calculated STDP characteristics as in Fig. 3b, but showing the 3D map of  $R_0/R$  as a function of  $\Delta t$  and  $R_0$ . Note that high (low)  $R_0$  preferentially displays potentiation (depression).

and delivering a spike as the internal potential exceeds a fixed threshold. Either the pattern or noise was presented by the PRE layer at every epoch, corresponding to a period of 10 ms. Pattern and noise were presented with equal probabilities of 50%, and noise had an average density of 9% activated PREs in the first layer. The RC time constant of the LIF was $\tau = 45$ ms.

Fig. 5d shows the spiking activity of the first layer, reporting the active channel (PRE) as a function of discrete time (epoch). Either noise or pattern events occur randomly at each epoch. Fig. 5e shows the corresponding internal potential in the POST, namely the output potential of the leaky integrator, which is the equivalent of the membrane potential in biological neurons [24]. The internal potential increases due to the integration of spiking currents, then eventually exceeds the threshold resulting in a POST fire event. This dictates the generation of a POST spike and the discharge of the internal potential.
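
A minimal sketch of the input generation and of one LIF update per epoch is given below. It uses the quoted values ($\tau = 45$ ms, 10 ms epochs, 50% noise probability, 9% noise density, $V_{TE} = 20$ mV), while the two-diagonal shape of the 'X', the threshold, the arbitrary units of the internal potential and the function names are assumptions made for illustration only.

```python
import math
import random

TAU_MS = 45.0         # LIF time constant from the text
EPOCH_MS = 10.0       # epoch period from the text
NOISE_DENSITY = 0.09  # average fraction of active PREs in a noise frame
P_NOISE = 0.5         # probability that an epoch presents noise
V_TE = 0.02           # V, TE bias during communication
SIDE = 8              # 8x8 retina, 64 PREs

# Assumed 'X' pattern: both diagonals of the 8x8 array (16 active PREs).
X_PATTERN = [r == c or r + c == SIDE - 1 for r in range(SIDE) for c in range(SIDE)]


def make_frame(rng):
    """Return (active_flags, is_pattern) for one epoch."""
    if rng.random() < P_NOISE:
        return [rng.random() < NOISE_DENSITY for _ in X_PATTERN], False
    return list(X_PATTERN), True


def lif_epoch(v, active, resistances, threshold):
    """Leak the potential over one epoch, add the synaptic currents, test for fire."""
    current = sum(V_TE / r for a, r in zip(active, resistances) if a)
    v = v * math.exp(-EPOCH_MS / TAU_MS) + current
    if v >= threshold:
        return 0.0, True   # fire event: the internal potential is discharged
    return v, False


rng = random.Random(0)
weights = [rng.uniform(25e3, 500e3) for _ in X_PATTERN]  # random initial resistances
v, threshold = 0.0, 5e-6                                 # hypothetical threshold
for epoch in range(5):
    frame, is_pattern = make_frame(rng)
    v, fired = lif_epoch(v, frame, weights, threshold)
    print(epoch, "pattern" if is_pattern else "noise", fired)
```

The plasticity step after a fire event is deliberately left out; it would apply the STDP rule to every synapse according to the timing of its last PRE spike.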

Fig. 6a shows the evolution of the calculated synapse conductance 1/R of the 64 synapses as a function of epoch number. Red and blue lines represent the weight for synapses within the pattern and the background, respectively. Color maps of the weights within the 8x8 synapse array at 0, 250 and 500 epochs are shown in Figs. 6b, c and d, respectively. All weights were initially prepared in a random state uniformly distributed between HRS and LRS. The weights corresponding to the input pattern show a fast potentiation due to LTP in the initial 50 epochs. On the other hand, background synapses display gradual depression toward low conductance



Fig. 5. Schematic layout of the 2-layer neuromorphic network to demonstrate pattern learning (a), input visual pattern (b) and typical noise (c), typical pattern/noise sequence from PRE (d) and corresponding internal voltage in POST showing fire events upon reaching threshold voltage (e).




Fig. 7. Voltage pulse sequence for partial reset (a) and random set (b) experiments of Fig. 8 and Fig. 9.

Fig. 6. Calculated evolution of weights 1/R for pattern synapses (red) and background synapses (blue) (a), and color maps of the weights in the initial state (b), after 250 epochs (c) and after 500 epochs (d). The average weight of pattern synapses reveals learning in around 50 epochs, while the average weight of background synapses shows a gradual depression in around 150 epochs.

due to LTD. Noise is functional in depressing background synapses since LTD generally takes place in synapses excited by noise soon after a fire event induced by presentation of the pattern. Because of uncorrelated noise behavior, depression of background synapses is relatively slow, taking approximately 150 epochs in Fig. 6a. These results support unsupervised pattern learning in RRAM-based synaptic network via STDP.

## VI. LEARNING WITH STOCHASTIC SYNAPSES

The abrupt set/reset processes in our RRAM device cause bistable STDP, in contrast with the gradual weight tuning which is believed to occur in biological STDP. It was previously reported that gradual switching can be mimicked in bistable synapses via stochastic switching, where the set/reset process is induced randomly (instead of deterministically) in the STDP protocol [14], [15]. To study the impact of stochastic switching on pattern learning, we changed the $V_{TE+}$ and $V_{TE-}$ voltages to explore both the random set transition and the partial reset transition of the RRAM.

### A. Partial reset characteristics

We characterized the partial-reset process in our RRAM by applying a sequence of triangular $V_{TE}$ pulses as shown in Fig. 7a. First, the device was initialized in the full reset state (HRS) by a reset pulse, then a set pulse was applied to induce set transition to the LRS. The compliance current was limited to 50 $\mu$A during the set pulse by properly limiting the gate voltage (not shown). Finally, a partial reset pulse with variable $V_{stop}$ was applied to induce transition to the partial reset state. The sequence was repeated 10<sup>3</sup> times for each value of $V_{stop}$ to gain sufficient statistics.

Fig. 8a shows the distribution of R measured after the partial reset pulse with $V_{stop} = -0.7$ V, -1 V, -1.1 V, -1.2 V, -1.3 V and -1.6 V. For $V_{stop} = -0.7$ V, the R distribution coincides with the LRS distribution, since the voltage is too small for the reset transition. As $|V_{stop}|$ is increased, first a high-R tail appears with increasing amplitude, then the full distribution moves toward high R [16]. The distribution dependence on $V_{stop}$



Fig. 8. Cumulative distributions of measured and calculated R after partial reset at increasing  $|V_{stop}|$  (a), corresponding average values of HRS and LRS subdistributions (b), and lognormal spread of R for HRS and LRS subdistributions (c).

can be captured by an empirical model, where we described each distribution by combining 2 sub-distributions, one for HRS and one for LRS. Both sub-distributions were modeled as log-normal distributions defined by an average value $\mu$ and a slope (or standard deviation) $\sigma$. We extracted the average value $\mu_{HRS}$ and its slope $\sigma_{HRS}$ on the lognormal scale, which are reported in Fig. 8b and c, respectively. We also extracted the average value $\mu_{LRS}$ of the set-state distribution (i.e., the one for $V_{stop} = -0.7$ V in Fig. 8a) and its slope $\sigma_{LRS}$ on the lognormal scale, which are also shown in Fig. 8b and c, respectively. Based on the extracted parameters in Figs. 8b and c, we obtained the partial reset distributions at any $V_{stop}$ by combining HRS and LRS distributions with a Monte Carlo approach, as shown by calculations in Fig. 8a. Note that $\mu_{HRS}$ can be smaller than $\mu_{LRS}$ in Fig. 8b, as a result of extrapolating the HRS tail to lower resistance in the lognormal scale. Such low values of $\mu_{HRS}$ have no physical meaning, but are functional to the accurate description of the overall R distribution in Fig. 8a.
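
One possible way to draw resistances from such a two-component model is sketched below. The text does not specify how the two sub-distributions are weighted, so the mixing fraction `hrs_fraction` is a hypothetical parameter, the example parameter values are placeholders rather than the extracted ones, and $\mu$ is interpreted here as the mean of $\ln R$.

```python
import math
import random


def sample_partial_reset(n, hrs_fraction, mu_lrs, sigma_lrs, mu_hrs, sigma_hrs, seed=0):
    """Draw n resistances from a mixture of two log-normal sub-distributions.

    hrs_fraction is the (assumed) probability that a cycle ends in the HRS
    sub-distribution; mu_* and sigma_* are the mean and standard deviation
    of ln(R) for each sub-distribution.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        if rng.random() < hrs_fraction:
            samples.append(rng.lognormvariate(mu_hrs, sigma_hrs))
        else:
            samples.append(rng.lognormvariate(mu_lrs, sigma_lrs))
    return samples


# Placeholder parameters for illustration only (not the values of Fig. 8b and c).
resistances = sample_partial_reset(
    1000, hrs_fraction=0.4,
    mu_lrs=math.log(25e3), sigma_lrs=0.2,
    mu_hrs=math.log(300e3), sigma_hrs=0.8,
)
resistances.sort()
print("median R:", resistances[len(resistances) // 2])
```

Sorting the samples and plotting them against their rank would give a cumulative distribution comparable in form to those in Fig. 8a.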

### B. Random set characteristics

Fig. 7b shows the triangular pulse sequence for studying random-set distributions, similar to partial reset distributions in Fig. 8. The  $V_{TE}$  waveform in Fig. 7b includes an initialization



Fig. 9. Cumulative distributions of measured and calculated R after random set at increasing $V_A$ (a), typical I-V curves for states A, B and C (b), and set probability at increasing $V_A$ (c). Due to random switching, the device may undergo set transition (A), or display no transition (B) or partial transition (C). Calculations by Eq. (2) are shown in (c).

set pulse, a full reset pulse with $V_{stop} = -1.6$ V, and a final pulse for random set transition with a variable voltage $V_A$ [9]. As a result of the large stochastic cycle-to-cycle fluctuation of the set voltage $V_{set}$, the voltage $V_A$ can be above or below the actual value of $V_{set}$ in a given cycle, thus inducing set transition in only a fraction of cycles. Fig. 9a shows the cycle-to-cycle distributions of measured R for the initial HRS and after random set transition at variable $V_A$. The application of $V_A$ might lead to set transition for $V_A > V_{set}$ (state A in the distribution of Fig. 9a), or the device might remain in the HRS for $V_A < V_{set}$ (state B). In some cases for $V_A \approx V_{set}$, the set transition might be stopped at the end of the $V_A$ pulse, thus resulting in an intermediate state as indicated by state C. Fig. 9b shows the I-V curves captured during the random set pulse corresponding to states A, B and C in Fig. 9a. As $V_A$ increases, the set probability increases as summarized in Fig. 9c, showing the fraction of cells with $R < 80$ k$\Omega$ in the distributions in Fig. 9a. We chose 80 k$\Omega$ as a threshold for separating LRS and HRS cells. Data in Fig. 9c can be described by the fraction of cells undergoing set transition, i.e., those with $V_{set}$ falling below $V_A$ in the Gaussian distribution $P(V_{set})$ of $V_{set}$. As a result, the set probability $P_{set}$ can be obtained



Fig. 10. Color maps of calculated learning efficiency  $P_{learn}$  and error probability  $P_{err}$  for 1 cell per synapse (a,b), 2 cells per synapse (c,d) and 4 cells per synapse (e,f).

as:

$$P_{set} = \int_{0}^{V_{TE+}} P(V_{set})\, dV_{set} = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{V_{TE+} - \mu_{Vset}}{\sqrt{2}\,\sigma_{Vset}}\right)\right],\tag{2}$$

where $\mu_{Vset} = 1.3$ V is the average value of $V_{set}$ and $\sigma_{Vset} = 0.193$ V is the standard deviation of $V_{set}$. Similar to partial reset, the random set distribution was modeled by a Monte Carlo approach combining the full (initial) HRS distribution and the full LRS distribution with a random-set probability given by Eq. (2). Calculations by Eq. (2) are shown in Fig. 9c, in good agreement with the observed $P_{set}$. Eq. (2) was used in STDP simulations by assuming $V_A = V_{TE+}$, namely the positive peak of the POST spike in Fig. 2.
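
The set probability of Eq. (2) can be evaluated numerically as sketched below, reading the argument of the error function in the usual Gaussian cumulative form $(V_{TE+} - \mu_{Vset})/(\sqrt{2}\,\sigma_{Vset})$. The closed form corresponds to a lower integration limit at $-\infty$, which differs negligibly from 0 here because $\mu_{Vset}$ is several standard deviations above zero. The function names and the single-cycle sampling helper are illustrative assumptions.

```python
import math
import random

MU_VSET = 1.3       # V, average set voltage from the text
SIGMA_VSET = 0.193  # V, standard deviation of the set voltage from the text


def p_set(v_te_plus):
    """Probability that V_set falls below the applied positive TE voltage."""
    z = (v_te_plus - MU_VSET) / (math.sqrt(2.0) * SIGMA_VSET)
    return 0.5 * (1.0 + math.erf(z))


def random_set(v_te_plus, rng):
    """One Monte Carlo cycle: draw V_set and report whether the cell sets."""
    return rng.gauss(MU_VSET, SIGMA_VSET) < v_te_plus


rng = random.Random(0)
for v in (1.0, 1.3, 1.6):
    hits = sum(random_set(v, rng) for _ in range(10000))
    print(v, round(p_set(v), 3), hits / 10000)
```

At $V_{TE+} = \mu_{Vset}$ the probability is exactly 0.5, and the sampled fractions approach the analytical curve as the number of cycles grows.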

## VII. PATTERN LEARNING WITH STOCHASTIC SYNAPSES

To study the impact of stochastic switching on pattern learning efficiency, we simulated the neuromorphic circuit of Fig. 5a by changing the values of  $V_{TE+}$  and  $V_{TE-}$  of the POST spike in Fig. 2b. An 'X' pattern was presented to the PRE layer with a noise occurrence probability of 50% and noise average density of 9%, as in the simulations of



Fig. 11. Calculated  $P_{learn}$  and  $P_{err}$  as a function of  $V_{TE+}$  for increasing number of cells (a) and as a function of noise pixel density (b). Noise is beneficial for learning, with an optimum efficiency around 9% of noise density.

Fig. 6. After each applied pulse at voltage  $V_{TE+}$  or  $V_{TE-}$ , the resistance was updated according to the Monte Carlo model for partial reset and random set of section VI. After 1000 simulated epochs starting from a random distribution of synaptic weights, we defined the learning efficiency  $P_{learn}$  as the ratio of the number  $n_{p,f}$  of fire events in correspondence of the presentation of a pattern, divided by the number  $n_p$ of total appearances of the pattern. Note that  $P_{learn}$  should be ideally one in the case of fire occurring systematically at the presentation of the pattern. We also calculated the error probability  $P_{err}$  as the ratio of the number  $n_{n,f}$  of fire events in correspondence of the presentation of noise, divided by the number  $n_n$  of total appearances of noise. Note that  $P_{err}$  should be ideally zero, i.e., the POST never fires in correspondence of the presentation of noise.
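
The two figures of merit follow directly from per-epoch bookkeeping. The sketch below assumes that each simulated epoch is recorded as a pair of booleans (pattern presented, POST fired); this record format and the function name are hypothetical.

```python
def learning_metrics(events):
    """Compute (P_learn, P_err) from an iterable of (is_pattern, fired) pairs."""
    n_p = n_pf = n_n = n_nf = 0
    for is_pattern, fired in events:
        if is_pattern:
            n_p += 1
            n_pf += bool(fired)   # fire at pattern presentation
        else:
            n_n += 1
            n_nf += bool(fired)   # fire at noise presentation (error)
    p_learn = n_pf / n_p if n_p else float("nan")
    p_err = n_nf / n_n if n_n else float("nan")
    return p_learn, p_err


# Toy record: 2 pattern epochs (1 fire) and 4 noise epochs (1 fire).
log = [(True, True), (True, False), (False, False),
       (False, True), (False, False), (False, False)]
print(learning_metrics(log))
```

For this toy record the function returns $P_{learn} = 0.5$ and $P_{err} = 0.25$.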

Fig. 10 shows the calculated  $P_{learn}$  (a) and  $P_{err}$  (b) in color maps as a function of  $V_{TE-}$  in the x-axis and  $V_{TE+}$  in the y-axis.  $V_{TE+}$  controls the probability of synapse potentiation, while  $V_{TE-}$  is responsible for synapse depression. From the maps, the region with the highest  $P_{learn}$  and lowest  $P_{err}$ is for  $V_{TE+}$  ranging between 1.2 V and 1.6 V and for  $|V_{TE-}|$  above 1.3 V. LTP is too weak for  $V_{TE+} < 1.1$  V, thus causing a generalized depression of all synapses. On the other hand, synapses cannot be depressed for  $|V_{TE-}| < 1.2$ V, thus causing a generalized potentiation of all synapses and systematic spiking in response to both pattern and noise.

We studied a possible improvement of learning by using multiple 1T1R cells for each synapse, each connecting the same PRE to the POST. Fig. 10c and d show the calculated $P_{learn}$ and $P_{err}$, respectively, for the case of 2 cells per synapse, while Fig. 10e and f show the calculated $P_{learn}$ and $P_{err}$, respectively, for the case of 4 cells per synapse. The learning/error performance slightly improves due to averaging within the resistance distributions after partial reset and random set. In fact, the regions of high $P_{learn}$ and the regions of low $P_{err}$ show an increasing area for increasing number of cells per synapse in Fig. 10. Fig. 11a shows the calculated $P_{learn}$ and $P_{err}$ as a function of $V_{TE+}$ for $V_{TE-} = -1.6$ V (full reset) and for variable number of cells per synapse, indicating a slight improvement obtained by redundant RRAM cells.
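
The averaging effect of redundant cells can be pictured with a toy model in which each cell of a synapse independently ends up in LRS with some probability and in HRS otherwise, and the parallel cells add their conductances. The purely binary cell states, the probability value and the function names are simplifying assumptions; the simulations in the text use the full resistance distributions instead.

```python
import random
import statistics

G_LRS = 1.0 / 25e3   # S, nominal LRS conductance
G_HRS = 1.0 / 500e3  # S, nominal HRS conductance


def mean_cell_conductance(n_cells, p_lrs, rng):
    """Average conductance per cell of a synapse made of n parallel binary cells."""
    total = sum(G_LRS if rng.random() < p_lrs else G_HRS for _ in range(n_cells))
    return total / n_cells


def conductance_spread(n_cells, p_lrs=0.5, trials=2000, seed=1):
    """Standard deviation of the per-cell average conductance over many synapses."""
    rng = random.Random(seed)
    values = [mean_cell_conductance(n_cells, p_lrs, rng) for _ in range(trials)]
    return statistics.pstdev(values)


for n in (1, 2, 4):
    print(n, conductance_spread(n))
```

In this toy model the spread shrinks as the number of cells per synapse grows, which is the kind of averaging invoked above.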

We also studied the impact of noise on learning efficiency. Fig. 11b shows $P_{learn}$ and $P_{err}$ as a function of the noise activity within the PRE array, namely the average fraction of firing PREs while presenting a noise image. In the simulations, noise was presented randomly in 50% of all epochs. For zero noise activity, $P_{learn}$ is around 60% due to the lack of background depression. As noise is increased, $P_{learn}$ increases, reaching a maximum value around 95.3% at 9% firing PREs. A further increase of noise activity leads to performance degradation, where $P_{learn}$ decreases and $P_{err}$ increases. This is because excessive noise may cause a noise-induced fire of the POST, immediately followed by pattern presentation, which results in the LTD of all pattern synapses. The results in Fig. 11b suggest that noise should be carefully tuned to maximize the learning efficiency in the neuromorphic network.

## VIII. CONCLUSIONS

We presented a novel 1T1R synapse using bipolar RRAM as tunable resistance for neuromorphic learning circuits. The STDP behavior in the synapse arises from the overlap of PRE and POST pulses across the RRAM. We demonstrated STDP characteristics by experiments and unsupervised learning in a fully-connected neuromorphic network of 64 PRE and 1 POST. The impact of stochastic switching was studied by implementing an empirical Monte Carlo model for switching variability during partial reset and random set processes. Stochastic switching simulations of learning show a large region of operation with optimum learning at large TE voltages. Optimization of noise for best learning efficiency is finally discussed.

## REFERENCES

- [1] H.-S. P. Wong, H.-Y. Lee, S. Yu, Y.-S. Chen, Y. Wu, P.-S. Chen, B. Lee, F. T. Chen, and M.-J. Tsai, "Metal-oxide RRAM," *Proc. IEEE*, vol. 100, no. 6, pp. 1951–1970, 2012.
- [2] J. J. Yang, D. B. Strukov, and D. R. Stewart, "Memristive devices for computing," *Nature Nanotechnology*, vol. 8, pp. 13–24, 2013.
- [3] S. Yu, Y. Wu, R. Jeyasingh, D. Kuzum, and H.-S. P. Wong, "An electronic synapse device based on metal oxide resistive switching memory for neuromorphic computation," *IEEE Trans. Electron Devices*, vol. 58, p. 2729, 2011.
- [4] S. Ambrogio, S. Balatti, A. Cubeta, A. Calderoni, N. Ramaswamy, and D. Ielmini, "Statistical fluctuations in HfO<sub>x</sub> resistive-switching memory (RRAM): Part I - Set/Reset variability," *IEEE Trans. Electron Devices*, vol. 61, no. 8, pp. 2912–2919, 2014.
- [5] T.-Y. Liu, T. H. Yan, R. Scheuerlein, Y. Chen, J. K. Lee, G. Balakrishnan, G. Yee, H. Zhang, A. Yap, J. Ouyang, T. Sasaki, A. Al-Shamma, C. Chen, M. Gupta, G. Hilton, A. Kathuria, V. Lai, M. Matsumoto, A. Nigam, A. Pai, J. Pakhale, C. H. Siau, X. Wu, Y. Yin, N. Nagel, Y. Tanaka, M. Higashitani, T. Minvielle, C. Gorla, T. Tsukamoto, T. Yamaguchi, M. Okajima, T. Okamura, S. Takase, H. Inoue, and L. Fasoli, "A 130-mm<sup>2</sup> 2-layer 32-Gb ReRAM memory device in 24nm technology," *Solid-State Circuits, IEEE Journal of*, vol. 49, no. 1, pp. 140–153, 2014.
- [6] S. Gaba, P. Sheridan, J. Zhou, S. Choi, and W. Lu, "Stochastic memristive devices for computing and neuromorphic applications," *Nanoscale*, vol. 5, no. 13, pp. 5872–5878, 2013.
- [7] C.-Y. Huang, W. C. Shen, Y.-H. Tseng, Y.-C. King, and C.-J. Lin, "A Contact-Resistive Random-Access-Memory-Based true random number generator," *IEEE Electron Device Lett.*, vol. 33, p. 1108, 2012.
- [8] W. Choi, L. Yang, J. Kim, A. Deshpande, G. Kang, J.-P. Wang, and C. Kim, "A magnetic tunnel junction based true random number generator with conditional perturb and real-time output probability tracking," *IEDM Tech. Dig.*, pp. 315–318, 2014.
- [9] S. Balatti, S. Ambrogio, Z. Wang, and D. Ielmini, "True random number generation by variability of resistive switching in oxide-based devices," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 5, no. 2, pp. 214–221, 2015.

- [10] G. Indiveri and S. Liu, "Memory and information processing in neuromorphic systems," *Proc. IEEE*, vol. 103, no. 8, pp. 1379–1397, 2015.
- [11] M. Suri, O. Bichler, D. Querlioz, and G. Palma, "CBRAM devices as binary synapses for low-power stochastic neuromorphic systems: Auditory (cochlea) and visual (retina) cognitive processing applications," *IEDM Tech. Dig.*, pp. 235–238, 2012.
- [12] Z.-Q. Wang, S. Ambrogio, S. Balatti, and D. Ielmini, "A 2-transistor/1-resistor artificial synapse capable of communication and stochastic learning for neuromorphic systems," *Frontiers in Neuroscience*, vol. 8, no. 438, 2015.
- [13] C. Stevens, "Quantal release of neurotransmitter and long-term potentiation," *Cell Suppl.*, vol. 72, pp. 55–63, 1993.
- [14] S. Yu, B. Gao, Z. Fang, H. Yu, J. Kang, and H.-S. Wong, "Stochastic learning in oxide binary synaptic device for neuromorphic computing," *Front. Neurosci.*, vol. 7, no. 186, pp. 1–9, 2013.
- [15] M. Suri, D. Querlioz, O. Bichler, G. Palma, E. Vianello, D. Vuillaume, C. Gamrat, and B. DeSalvo, "Bio-inspired stochastic computing using binary CBRAM synapses," *IEEE Transactions on Electron Devices*, vol. 60, no. 7, pp. 2402–2409, 2013.
- [16] S. Balatti, S. Ambrogio, Z.-Q. Wang, S. Sills, A. Calderoni, N. Ramaswamy, and D. Ielmini, "Voltage-controlled cycling endurance of HfO<sub>x</sub>-based resistive-switching memory (RRAM)," *IEEE Trans. Electron Devices*, vol. 62, no. 10, pp. 3365–3372, 2015.
- [17] G.-Q. Bi and M.-M. Poo, "Synaptic modifications in cultured hippocampal neurons: Dependence on spike timing, synaptic strength, and postsynaptic cell type," *Journal of Neuroscience*, vol. 18, no. 24, pp. 10464–10472, 1998.
- [18] G. Wittenberg and S. S.-H. Wang, "Malleability of spike-timing-dependent plasticity at the CA3-CA1 synapse," *J. Neurosci.*, vol. 26, pp. 6610–6617, 2006.
- [19] F. Nardi, S. Balatti, S. Larentis, D. Gilmer, and D. Ielmini, "Complementary switching in oxide-based bipolar resistive-switching random memory," *IEEE Trans. Electron Devices*, vol. 60, no. 1, pp. 70–77, 2013.
- [20] S. Balatti, S. Ambrogio, D. C. Gilmer, and D. Ielmini, "Set variability and failure induced by complementary switching in bipolar RRAM," *IEEE Electron Device Lett.*, vol. 34, no. 7, pp. 861–863, 2013.
- [21] E. Chicca, F. Stefanini, C. Bartolozzi, and G. Indiveri, "Neuromorphic electronic circuits for building autonomous cognitive systems," *Proc. IEEE*, vol. 102, no. 9, p. 1367, 2014.
- [22] F. Nardi, S. Larentis, S. Balatti, D. C. Gilmer, and D. Ielmini, "Resistive switching by voltage-driven ion migration in bipolar RRAM - Part I: Experimental study," *IEEE Trans. Electron Devices*, vol. 59, no. 9, pp. 2461–2467, 2012.
- [23] D. Ielmini, "Modeling the universal set/reset characteristics of bipolar RRAM by field- and temperature-driven filament growth," *IEEE Trans. Electron Devices*, vol. 58, no. 12, pp. 4309–4317, 2011.
- [24] P. J. Sjöström, G. G. Turrigiano, and S. B. Nelson, "Rate, timing, and cooperativity jointly determine cortical synaptic plasticity," *Neuron*, vol. 32, pp. 1149–1164, 2001.
- [25] S. Ambrogio, S. Balatti, D. Gilmer, and D. Ielmini, "Analytical modeling of oxide-based bipolar resistive memories and complementary resistive switches," *IEEE Trans. Electron Devices*, vol. 61, pp. 2378–2386, 2014.
- [26] S. Larentis, F. Nardi, S. Balatti, D. C. Gilmer, and D. Ielmini, "Resistive switching by voltage-driven ion migration in bipolar RRAM - Part II: Modeling," *IEEE Trans. Electron Devices*, vol. 59, no. 9, pp. 2468–2475, 2012.