Integration of STT-MRAM model into CACTI simulator

TL;DR: A system-level tool based on the CACTI simulator is presented to help memory system designers generate high-performance and low-power STT-MRAM cache memories and compare their performance, energy consumption, and area with traditional SRAM.
Abstract: In the last decade, academia and private companies have actively explored emerging memory technologies. STT-MRAM in particular is experiencing rapid development, but it faces several challenges in terms of performance and reliability. Several cell-level techniques have been proposed to mitigate such issues, but few tools and methodologies currently exist to support designers in evaluating the impact of specific micro-level design choices on the STT-MRAM macro design. In this paper we present a system-level tool based on the CACTI simulator to assist memory system designers. We use our tool to generate high-performance and low-power cache memories, comparing performance, energy consumption, and area with traditional SRAM.

Summary (3 min read)

Introduction

  • Smullen et al. present a methodology and tool-chain for evaluating and comparing MTJ designs [15].
  • CACTI is a widely used high-level cache and memory modeling tool [9] [10].
  • In order to prove the correctness of their tool, the authors generate STT-MRAM based cache memories of different sizes and compare the resulting performance with SRAM technology.
  • An overview of STT-MRAM technology in terms of operation principles and electrical model is given.

A. Basic Principles

  • STT-MRAM technology is built upon the magnetic tunneling junction (MTJ) device, which persistently stores logic data.
  • Commonly, an MTJ device is composed of two ferromagnetic layers (FLs) interleaved with one oxide barrier layer.
  • FLs are characterized by their magnetic orientation: one has a fixed magnetic orientation (fixed layer) and the other has a freely rotating magnetic orientation (free layer).
  • By applying a sufficiently dense current pulse through the MTJ device, the free layer magnetic direction can be dynamically switched.

B. Electrical Model

  • When the FLs exhibit the same magnetic orientation (parallel configuration), the MTJ has a low electrical resistance, whereas in the antiparallel configuration it has a high electrical resistance.
  • The electrical resistance of the MTJ thus depends on the relative magnetic orientations of the two layers.
  • The most popular STT-MRAM cell design is the 1T-1MTJ, whose structure is composed of one NMOS transistor and one MTJ device connected in series.

C. Writing Operation

  • Many device-related parameters (e.g., MTJ area, material properties) determine the write current amplitude that is required to change the free layer magnetic direction.
  • Moreover, the required current depends on the current pulse width.
  • Based on the trade-off between write current amplitude and write pulse width, three distinct switching modes were identified [12]: thermal activation (TH), precessional switching (PR), and dynamic reversal (DY) (Fig. 3).
  • Looking at Figure 3, it is evident that when operating in the precessional switching zone, small differences in write pulse width cause wide variations in current density.
  • On the other hand, in the thermal activation area the required switching current decreases only very slowly even though the current pulse width is dramatically increased.

D. Reading Operation

  • The read current (IR) is then compared against a reference value (IREF) to discriminate the stored logic state.
  • It is worth noting that the read currents corresponding to the two logic states have the same order of magnitude.
  • For this reason, a sense amplifier is commonly used to compare IR and IREF to determine the actual logic state of the cell.
  • Different circuit schemes can be implemented to generate the reference current.
  • In a scheme based on two reference MTJ cells, one of the reference cells is in the parallel (low resistance) state while the other is in the antiparallel (high resistance) state.

E. Data Retention

  • One of the most important parameters characterizing storage class memory devices is the amount of time the information is reliably stored in a cell.
  • The data retention time of an STT-MRAM bit-cell depends on the thermal stability of the MTJ.
  • It is usually evaluated by Equation (5): $R_t = \tau_0 e^{\Delta}$. The dependence of the retention time on Δ is exponential: the higher the thermal stability, the longer the retention time.
  • Nevertheless, designing the MTJ for higher thermal stability also increases the write energy.

F. CACTI

  • CACTI is a widely used open-source high-level cache and memory modeling tool [13] [14] supported by HP Labs.
  • CACTI models both traditional and non-uniform banked caches and memories using SRAM and DRAM, for which it can compute delay, power, and area.
  • For a user-specified set of input parameters (e.g., energy/delay, memory size), the tool performs an exhaustive design space exploration across different array sizes and on-chip interconnections to identify, if one exists, an optimal configuration that meets the input constraints.
  • The authors' research work aims at extending CACTI to support in-plane STT-MRAM technology.
  • By modeling bit-line, read circuitry, delay, area and energy consumption, additional parameters are combined with existing analytical models and seamlessly integrated with CACTI.

A. Array Modeling

  • By integrating analytical models along with parameters extracted from ITRS roadmaps [17], CACTI supports modeling of the arrays of the targeted cache or memory devices.
  • Each bank is composed of one or more subbanks, which consist of identical mats.
  • A mat has 4 subarrays, which share pre-decoding logic, and each subarray contains a set of wordlines and bitlines to access the basic memory cells.
  • To support STT-MRAM technology, the authors mainly focus on mat and subarray.

C. Read Latency Model

  • In order to estimate read latency, the authors model both the bitline and the sense amplifier (SA).
  • However, CACTI currently only has models for voltage-based SAs.
  • The circuit scheme involves two reference cells and three PMOS transistors to implement the current-to-voltage converter.
  • Interested readers can refer to [16] for further details.

D. Write Latency Model

  • The difference between read and write latency is significant in STT-MRAM memories.
  • Moreover, the required write voltage is between 1 and 2 volts, whereas a smaller bias voltage (0.1 V to 0.3 V) is needed for reading.
  • There exists a strong dependence between the write voltage and the expected write latency.
  • Moreover, since CACTI does not provide a mechanism to input a distribution of desired logic values to be written, the authors only consider switching from parallel to anti-parallel magnetization of the free layer, which is the worst case in terms of latency.
  • However, the MTJ write time alone is not sufficient to estimate the overall latency, as each MTJ is connected to an access transistor to mitigate write disturbs and to reduce the energy consumption.

E. Area Estimation Model

  • The area of STT-MRAM cell strongly depends on the design of the access transistor.
  • Determining the proper size of the access transistor is one of the most critical aspects of the cell design.
  • The analytical model integrated in CACTI for cell area estimation is given in Equation (6).
  • The cell area is inversely related to the equivalent resistance of the access transistor: a high resistance corresponds to a small cell area and high storage density, whereas a low resistance considerably increases the memory area.
  • Interconnections also have a considerable impact on the resulting memory size.

F. Energy Estimation Model

  • For the sake of completeness, the authors consider the write and read energy models individually.
  • A lower read voltage reduces the probability of read disturbs, while a higher value favors read latency.
  • The write energy is the sum of two main contributions: the energy dissipated by the current flowing through the MTJ device, and a capacitive term computed in the same way as the read energy.
  • In the write energy expression, Vwrite is the write voltage, RMTJ is the equivalent MTJ resistance, Racc is the equivalent NMOS resistance and τwrite is the MTJ switching time.
  • In the previous section, the authors described modeling and integration of in-plane STT-MRAM technology into the CACTI tool.

A. High-Performance Cache Memories

  • For this study the authors generate high-performance, eight-way set-associative cache memories with no error correction mechanism, ranging in size from 32 kB to 512 kB.
  • Transistors are modeled using high-performance cells (itrs-hp) for the data array, the tag array, and the peripheral circuits.
  • Figure 4 (h) compares the read latency of the three different MTJ configurations with respect to SRAM.
  • Among the MTJ configurations, configuration A shows the best timing, owing to its small cell area, which results from the high resistance of the access transistor.

B. Low-Power Cache Memories

  • Figure 4 (c) shows the read latency for low-power cache memories.
  • The observed trend is quite similar to the one previously described for Figure 4 (h).
  • The reason is that CACTI performs several optimizations, according to user constraints, that can change the internal partitioning of the array.
  • The density improvements that STT-MRAM arrays can attain over SRAM arrays make in-plane STT-MRAM a valid alternative to SRAM for designing low-power cache memories when read-intensive applications are targeted (see also Figure 4 (b)).

10 August 2022
POLITECNICO DI TORINO
Repository ISTITUZIONALE
Integration of STT-MRAM model into CACTI simulator / Indaco, M.; DI CARLO, Stefano; Vatajelu, E. I.; Prinetto, Paolo
Ernesto; Arcaro, S.; Pala, D. - ELECTRONIC. - (2014), pp. 67-72. (Paper presented at the 9th IEEE
International Design and Test Symposium (IDT), held in Algiers, DZ, 16-18 Dec. 2014)
[10.1109/IDT.2014.7038589].
Availability: This version is available at: 11583/2587977 since: 2016-10-07T16:48:52Z
Publisher: IEEE
DOI: 10.1109/IDT.2014.7038589
Terms of use: openAccess
This article is made available under terms and conditions as specified in the corresponding bibliographic description in
the repository

Integration of STT-MRAM model into CACTI
simulator
S. Arcaro, S. Di Carlo, M. Indaco, D. Pala, P. Prinetto, Elena I. Vatajelu
Politecnico di Torino
Dip. di Automatica e Informatica
Turin, Italy
{firstname.lastname}@polito.it
Abstract: In the last decade, academia and private companies
have actively explored emerging memory technologies. STT-MRAM
in particular is experiencing rapid development, but it
faces several challenges in terms of performance and reliability.
Several cell-level techniques have been proposed to mitigate
such issues, but few tools and methodologies currently exist to
support designers in evaluating the impact of specific micro-level
design choices on the STT-MRAM macro design. In this paper we
present a system-level tool based on the CACTI simulator to assist
memory system designers. We use our tool to generate
high-performance and low-power cache memories, comparing
performance, energy consumption, and area with traditional SRAM.
Keywords: STT-MRAM, CACTI, Emerging Memories
I. INTRODUCTION
The focus of emerging memories is placed on non-volatile
technologies that should meet the high demands of tomorrow's
applications. These include non-volatility, performance and
density similar to SRAM and DRAM, respectively, good
endurance, small device sizes, good integration, a low
power profile, resistance to radiation effects, and the ability to
scale below 20 nm.
One of the most promising candidates for embedded memory
is the spin-transfer torque magnetic random access memory
(STT-MRAM) [1], offering faster read and write access times
(nanoseconds) and better CMOS integration compared to other
proposed technologies such as Phase-Change RAM (PCRAM)
[2], Resistive RAM (RRAM) [3] and Ferroelectric RAM
(FeRAM) [4]. The key building block of an STT-MRAM cell is the
magnetic tunneling junction (MTJ), which is integrated with
CMOS circuitry using 3-D technology [5]. The smallest STT-MRAM
cell design is a 1T1MTJ (one transistor, one magnetic
tunneling junction) device. Logical data is stored by applying a
spin-polarized current through the MTJ element to switch the
memory state.
However, with scaling, the STT-MRAM cell faces a set of
challenges that strongly influence performance and reliability,
severely affecting the yield of the memory array. Such issues are
mainly related to a) process variations of MOS and MTJ devices,
involving variations of geometry, threshold voltage, and
magnetic materials [5], [6], b) the high write cost due to the high
switching current required to flip the MTJ state [7], and c) the
thermal fluctuations in the MTJ switching [8].
To tackle such issues, efficient cell-level design paradigms,
from a circuit and/or architecture perspective, have been proposed
to improve cell robustness and integration density.
However, the results achieved for STT-MRAM cell design may
not be directly applicable to high-level design requirements.
It is of utmost importance to quantify and assess the
performance degradation in terms of write/read latency, power
consumption, and area that can potentially affect the behavior of
the whole memory array when specific requirements-driven
designs at cell level are targeted.
For this reason, more comprehensive tools and
methodologies are necessary to provide flexibility for design
experiments. In this context, Smullen et al. present a
methodology and tool-chain for evaluating and comparing MTJ
designs [15]. In [11], the authors propose a fixed analytical
STT-MRAM model in CACTI to analyze the power reduction in
modern microprocessors when SRAM is replaced with STT-MRAM.
CACTI is a widely used high-level cache and memory
modeling tool [9] [10].
In this paper we present a system-level tool based on the CACTI
simulator to estimate area, energy consumption and write/read
latency of STT-MRAM based cache memories. The tool
supports a parameterizable interface where a wide set of physical
parameters of STT-MRAM technology can be specified. The
implemented extensions also enable our tool to be integrated with
system-level emulation tools such as QEMU. In order
to prove the correctness of our tool, we generate STT-MRAM
based cache memories of different sizes, comparing the
resulting performance with SRAM technology. The proposed
tool can thus support the design of cache or main memories by
evaluating the impact of specific micro-level design choices
on the STT-MRAM macro design. The tool is
freely available for download from the
website of our research group: http://www.testgroup.polito.it/.
The paper is organized as follows: Section II describes the
operation principles of STT-MRAM technology and briefly
introduces the CACTI tool. Section III discusses the modeling and
parameterization of STT-MRAM technology that we implemented in
CACTI, while Section IV compares three MTJ
configurations for each use case. Section V concludes
the paper.

II. BACKGROUND
In this section, an overview of STT-MRAM technology
in terms of operation principles and electrical model is given.
Finally, the main features of the CACTI tool are described.
A. Basic Principles
STT-MRAM technology is built upon the magnetic
tunneling junction (MTJ) device, which persistently stores
logic data. Commonly, an MTJ device is composed of two
ferromagnetic layers (FLs) interleaved with one oxide barrier
layer. FLs are characterized by their magnetic orientation: one
has a fixed magnetic orientation (fixed layer) and the other has a
freely rotating magnetic orientation (free layer). By applying a
sufficiently dense current pulse through the MTJ device, the free
layer magnetic direction can be dynamically switched.
B. Electrical Model
When the FLs exhibit the same magnetic orientation, the
MTJ has a low electrical resistance, whereas it exhibits a
high electrical resistance in the antiparallel
configuration. Typically, the low electrical resistance
($R_{MTJ} = R_L$) is associated with logic state ‘0’ and the high
electrical resistance ($R_{MTJ} = R_H$) is associated with logic
state ‘1’, as depicted in Fig. 1.
Figure 1: MTJ configurations
According to the relative magnetic orientations of the two
layers, the electrical resistance of the MTJ is different. The
tunneling magnetoresistance (TMR) is defined as the relative
resistance change between the two magnetized states. TMR is a
figure of merit of MTJ design and it is often evaluated using
Equation (1):
$$\mathrm{TMR} = \frac{R_H - R_L}{R_L} \qquad (1)$$
A higher TMR value is commonly preferred since it means
that a more robust read operation can be performed. Values
above 100% are typically preferred.
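As a quick numerical check of the TMR definition, TMR = (R_H - R_L) / R_L, the following Python sketch computes the ratio from the two resistance values. The function name `tmr` and the sample resistances are illustrative assumptions; only the ratio of the two values matters, so any consistent unit can be used.

```python
def tmr(r_low: float, r_high: float) -> float:
    """Tunneling magnetoresistance ratio: (R_H - R_L) / R_L."""
    if r_low <= 0:
        raise ValueError("r_low must be positive")
    return (r_high - r_low) / r_low


# Hypothetical example: R_H is twice R_L, which gives a TMR of 100%.
ratio = tmr(r_low=1.5, r_high=3.0)
print(f"TMR = {ratio:.0%}")
```

A cell whose high resistance is only 1.5 times its low resistance would give a TMR of 50%, which leaves a narrower margin for the read operation.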
Among the wide set of STT-MRAM cell designs, the most
popular is the 1T-1MTJ, whose structure is composed of one
NMOS transistor and one MTJ device connected in series. Due
to the wide set of technological information available in the
literature, in this paper we target the in-plane 1T-1MTJ cell, whose
equivalent electric circuit is provided in Fig. 2. The Bit Line (BL),
Source Line (SL), and Word Line (WL) are used to access the
cell.
The MTJ is modeled as a variable electrical resistance whose
value depends on the voltage applied across the device. Typically,
the free layer is connected to BL. In this topology, to force the
MTJ into the $R_L$ state, a positive voltage difference is applied
between BL and SL and the anti-parallel to parallel write current is
required. Conversely, to set the MTJ to the $R_H$ state, a
negative voltage difference is applied between BL and SL and
the parallel to anti-parallel write current is required.
Figure 2: STT-MRAM electrical model
C. Writing Operation
Many device-related parameters (e.g., MTJ area, material
properties) determine the write current amplitude that is required
to change the free layer magnetic direction. Moreover, the required
current depends on the current pulse width. Generally, if a
longer current pulse is applied, a lower current density is
required to switch the MTJ state. Based on the trade-off between
write current amplitude and write pulse width, three distinct
switching modes were identified [12]: thermal activation (TH),
precessional switching (PR), and dynamic reversal (DY) (Fig. 3).
The corresponding equations are as follows:
$$J_{C,TH}(\tau) = J_{C0}\left[1 - \frac{1}{\Delta}\ln\left(\frac{\tau}{\tau_0}\right)\right] \quad (\tau > 20\,\text{ns}) \qquad (2)$$
$$J_{C,PR}(\tau) = J_{C0} + \frac{C}{\tau^{\gamma}} \quad (\tau < 3\,\text{ns}) \qquad (3)$$
$$J_{C,DY}(\tau) = \frac{J_{C,TH}(\tau) + J_{C,PR}(\tau)\,e^{-A(\tau - \tau_C)}}{1 + e^{-A(\tau - \tau_C)}} \quad (3\,\text{ns} < \tau < 20\,\text{ns}) \qquad (4)$$
where $J_{C0}$ is the critical switching current density (i.e., the
current density at zero temperature) and $\tau_0$ is the inverse of the
attempt frequency (typically equal to 1 ns). $C$, $\gamma$, $A$, and
$\tau_C$ are fitting constants. The thermal stability Δ is a key factor
of the MTJ. It depends on the thickness or area of the free layer and on
the magnetic properties of the MTJ materials.
Figure 3: Dependence of switching current density on write pulse
width

Looking at Figure 3, it is evident that when operating in the
precessional switching zone, small differences in write pulse
width cause wide variations in current density. On the other
hand, in the thermal activation area the required switching
current decreases only very slowly even though the current pulse
width is dramatically increased.
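The three switching regimes can be sketched in Python as a single piecewise function. This is a minimal illustration, assuming the thermal-activation form J_C0 [1 - ln(τ/τ0)/Δ], the precessional form J_C0 + C/τ^γ, and a dynamic-reversal regime that blends the two with the weight e^{-A(τ - τ_C)}. The function name, the argument names, and every fitting-constant value are hypothetical, and the boundary points 3 ns and 20 ns are assigned to the dynamic regime by assumption.

```python
import math


def critical_current_density(tau_ns, jc0, delta, c, gamma, a, tau_c_ns, tau0_ns=1.0):
    """Critical switching current density for a write pulse of width tau_ns.

    c, gamma, a and tau_c_ns are fitting constants (hypothetical values in
    the example below); the result has the same unit as jc0.
    """
    if tau_ns <= 0:
        raise ValueError("pulse width must be positive")

    def j_thermal(t):
        # Thermal activation: logarithmic dependence on the pulse width.
        return jc0 * (1.0 - math.log(t / tau0_ns) / delta)

    def j_precessional(t):
        # Precessional switching: steep growth for very short pulses.
        return jc0 + c / t**gamma

    if tau_ns > 20.0:
        return j_thermal(tau_ns)
    if tau_ns < 3.0:
        return j_precessional(tau_ns)
    # Dynamic reversal: weighted blend of the two other regimes.
    w = math.exp(-a * (tau_ns - tau_c_ns))
    return (j_thermal(tau_ns) + j_precessional(tau_ns) * w) / (1.0 + w)


# Hypothetical constants, chosen only to exercise the three branches.
for tau in (1.0, 10.0, 100.0):
    j = critical_current_density(tau, jc0=2.0, delta=40.29, c=1.0,
                                 gamma=1.0, a=0.5, tau_c_ns=10.0)
    print(f"tau = {tau:6.1f} ns -> J_C = {j:.3f}")
```

With these sample constants the short pulse needs a current density above J_C0, while the long pulse needs less than J_C0, which mirrors the trade-off described in the text.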
D. Reading Operation
When a read operation is performed, a small bias voltage is
applied on the control lines, resulting in a current (IR). This
current is then compared against a reference value (IREF) to
discriminate the stored logic state. When IR is higher than
IREF, the cell stores a logic value ‘0’, whereas if IR
is lower than IREF the cell stores a logic value ‘1’.
It is worth noting that the read currents corresponding to the two
logic states have the same order of magnitude.
For this reason, a sense amplifier is commonly used to compare
IR and IREF to determine the actual logic state of the cell.
Different circuit schemes can be implemented to generate
the reference current. In [13] a pinned MTJ device is designed
to have an electrical resistance equal to the average value of
$R_L$ and $R_H$. Another approach to generating the reference
current uses two MTJ cells. One of the reference cells is in
the parallel (low resistance) state while the other is in the
anti-parallel (high resistance) state. In this case, the resulting
reference resistance is computed as the average of the
low and high resistance values [14].
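The read decision described above can be sketched in a few lines of Python. This is an idealized illustration, not the sensing circuit itself: it assumes a purely resistive cell (MTJ in series with the access transistor), a reference path whose MTJ resistance is the average of the low and high values, and hypothetical resistance and voltage values.

```python
def read_cell(v_read, r_mtj, r_low, r_high, r_access=0.0):
    """Return the logic value sensed from an idealized 1T-1MTJ cell.

    The reference current is generated from the average of r_low and
    r_high; r_access models the series access transistor (assumption).
    """
    i_r = v_read / (r_mtj + r_access)
    r_ref = (r_low + r_high) / 2.0
    i_ref = v_read / (r_ref + r_access)
    # A current above the reference means low resistance, i.e. logic '0'.
    return 0 if i_r > i_ref else 1


# Hypothetical values: 0.2 V read bias, resistances in ohms.
print(read_cell(0.2, r_mtj=1500.0, r_low=1500.0, r_high=3000.0, r_access=300.0))
print(read_cell(0.2, r_mtj=3000.0, r_low=1500.0, r_high=3000.0, r_access=300.0))
```

The first call models a cell in the parallel state and the second a cell in the anti-parallel state; the two read currents differ by less than a factor of two, which is why a sense amplifier is needed in practice.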
E. Data Retention
One of the most important parameters characterizing storage
class memory devices is the amount of time the information is
reliably stored in a cell. The data retention time of an STT-MRAM
bit-cell depends on the thermal stability of the MTJ. It is
usually evaluated by Equation (5):
$$R_t = \tau_0 e^{\Delta} \qquad (5)$$
The dependence of the retention time on Δ is exponential:
the higher the thermal stability, the longer the retention time.
Nevertheless, designing the MTJ for higher thermal stability
also increases the write energy.
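To give a feel for the exponential dependence, the sketch below evaluates the retention time as τ0·e^Δ in Python. It assumes τ0 = 1 ns (the typical value quoted earlier) and uses Δ = 40.29, a value that appears in Table 2 and is taken here to be the thermal stability; the function name is illustrative.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def retention_time_s(delta: float, tau0_s: float = 1e-9) -> float:
    """Retention time in seconds: tau0 * exp(delta)."""
    return tau0_s * math.exp(delta)


years = retention_time_s(40.29) / SECONDS_PER_YEAR
print(f"retention for delta = 40.29: about {years:.1f} years")

# Each additional unit of thermal stability multiplies retention by e.
print(retention_time_s(41.29) / retention_time_s(40.29))
```

Under these assumptions the retention time comes out at roughly ten years, and raising Δ by one unit lengthens it by a factor of about 2.7.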
F. CACTI
CACTI is a widely used open-source high-level cache and
memory modeling tool [13] [14] supported by HP Labs. CACTI
has analytical models for all the basic building blocks of a
memory: decoder, sense-amplifier, crossbar, on-chip wires,
DRAM/SRAM cell and latch. CACTI models both traditional
and non-uniform banked caches and memories using SRAM
and DRAM, for which it can compute delay, power, and area. For
a user-specified set of input parameters (e.g., energy/delay,
memory size), the tool performs an exhaustive design space
exploration across different array sizes and on-chip
interconnections to identify, if one exists, an optimal configuration
that meets the input constraints.
III. MODELING
Our research work aims at extending CACTI to support in-plane
STT-MRAM technology. By modeling bit-line, read
circuitry, delay, area and energy consumption, additional
parameters are combined with existing analytical models and
seamlessly integrated with CACTI. The first release supports the
simulation of set-associative cache memories.
A. Array Modeling
By integrating analytical models along with parameters
extracted from ITRS roadmaps [17], CACTI supports modeling
of the arrays of the targeted cache or memory devices. Memory is
divided into an array of banks. Each bank is composed of one or
more subbanks, which consist of identical mats. A mat
has 4 subarrays, which share pre-decoding logic, and each
subarray contains a set of wordlines and bitlines to access the
basic memory cells. To support STT-MRAM technology, we
mainly focus on the mat and subarray.
B. MTJ Model
The 1T-1MTJ cell is modeled by considering an NMOS
access transistor connected in series with an MTJ device. The MTJ is
then modeled as a resistance whose value depends on the
relative magnetization of the free layer. We provide a fully
parameterized MTJ model to make it possible to explore a
wide set of designs. Table 1 shows the model input parameters.
Table 1: MTJ parameters integrated into CACTI
| MTJ Parameter | Description |
|---|---|
| SttType | Type of MTJ. This version supports only in-plane |
| Jc0 | Critical current at zero temperature |
| Δ | Thermal Stability |
| MTJArea | Area of MTJ |
| Rp | MTJ resistance in parallel magnetization |
| Rap | MTJ resistance in anti-parallel magnetization |
| Vbitline | Write voltage |
| Raccess | Equivalent resistance of the access transistor |
The Delta parameter is used to compute the resulting
retention time by resorting to Eq. (5). The aforementioned MTJ
parameters are integrated in CACTI to model STT-MRAM cell
and to figure out read and write latency as described further on.
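The parameter set of Table 1 can be pictured as a small record type. The Python sketch below is only an illustration of how such a parameterized model might be held in memory: the class name, the field names, the validation rules, and the sample values are assumptions, and the actual input format of the extended CACTI tool is not described in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MTJParams:
    """Hypothetical container mirroring the parameters of Table 1."""
    stt_type: str     # SttType: type of MTJ
    jc0: float        # Jc0: critical current at zero temperature
    delta: float      # thermal stability
    mtj_area: float   # MTJArea: area of the MTJ
    rp: float         # Rp: resistance in parallel magnetization
    rap: float        # Rap: resistance in anti-parallel magnetization
    v_bitline: float  # Vbitline: write voltage
    r_access: float   # Raccess: access transistor equivalent resistance

    def __post_init__(self):
        # The paper states that only in-plane MTJs are supported.
        if self.stt_type != "in-plane":
            raise ValueError("only in-plane MTJs are supported")
        # Assumed sanity check: the parallel state is the low-resistance one.
        if not 0 < self.rp < self.rap:
            raise ValueError("expected 0 < Rp < Rap")


# Sample values in arbitrary but consistent units (assumption).
cfg = MTJParams("in-plane", jc0=2.0, delta=40.29, mtj_area=2e-10,
                rp=1.5, rap=3.0, v_bitline=1.8, r_access=1.5)
print(cfg)
```

Keeping the record immutable makes it easy to sweep several configurations side by side without one run accidentally modifying another.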
C. Read Latency Model
A read operation involves several phases. A specified
voltage is applied to a bitline and the resulting current passing
through the MTJ is compared to a reference value. In order to
estimate read latency we model both the bitline and the sense
amplifier (SA). In STT-MRAM memories, the sensing operation
is performed by means of a current-based SA. However,
CACTI currently only has models for voltage-based SAs.
Therefore, we adapt the current-based sensing operation of the
MTJ to the existing voltage-based SA. The circuit scheme
involves two reference cells and three PMOS transistors to
implement the current-to-voltage converter. Interested readers
can refer to [16] for further details. This circuit is modeled using
SPICE at 45 nm and it requires about 50 ps for stabilization. It is
included in CACTI as an additional delay on top of the existing SA.
The additional area and energy due to the MTJ reference cells are
also accounted for.

D. Write Latency Model
The difference between read and write latency is
significant in STT-MRAM memories. Performing a write
operation is typically slower. Moreover, the required write
voltage is between 1 and 2 volts, whereas a smaller bias voltage
(0.1 V to 0.3 V) is needed for reading.
There exists a strong dependence between the write voltage
and the expected write latency. This relationship is modeled
by Eq. (2), Eq. (3), and Eq. (4), which provide an accurate MTJ
write time estimation. The voltage used to estimate latency in
the analytical model is assumed to be constant during the write
operation and identical for both free layer orientations.
Moreover, since CACTI does not provide a mechanism to input
a distribution of desired logic values to be written, we only
consider switching from parallel to anti-parallel
magnetization of the free layer, which is the worst case in terms of
latency.
However, the MTJ write time alone is not sufficient to estimate the
overall latency, as each MTJ is connected to an access transistor
(see Figure 2) to mitigate write disturbs and to reduce the energy
consumption. Therefore, without losing accuracy, the
overall write latency for an STT-MRAM data
array is computed as the read latency plus the MTJ write time.
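The composition of the write latency can be sketched as follows. The second function simply adds the two terms named in the text. The first one is an illustrative derivation rather than the tool's code: it inverts the thermal-activation relation J = J_C0 [1 - ln(τ/τ0)/Δ] to obtain τ = τ0·exp(Δ(1 - J/J_C0)), and it is only meaningful when the result falls in that regime (τ > 20 ns). How the tool maps the write voltage to a current density is not detailed here, so the current density is taken as a given input; all names and values are hypothetical.

```python
import math


def thermal_write_time_ns(j, jc0, delta, tau0_ns=1.0):
    """MTJ write time from the inverted thermal-activation relation."""
    tau = tau0_ns * math.exp(delta * (1.0 - j / jc0))
    if tau <= 20.0:
        raise ValueError("result is outside the thermal-activation regime")
    return tau


def array_write_latency_ns(read_latency_ns, mtj_write_time_ns):
    """Overall write latency: array read latency plus MTJ write time."""
    return read_latency_ns + mtj_write_time_ns


# Hypothetical numbers: a current density slightly below Jc0 and a 1 ns
# array read latency.
t_mtj = thermal_write_time_ns(j=1.8, jc0=2.0, delta=40.29)
print(f"MTJ write time: {t_mtj:.1f} ns")
print(f"overall write latency: {array_write_latency_ns(1.0, t_mtj):.1f} ns")
```

Because the pulse width grows exponentially as the current density drops below J_C0, small reductions in write current translate into large increases in write latency in this regime.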
E. Area Estimation Model
The area of an STT-MRAM cell strongly depends on the design
of the access transistor. Let us consider a cell composed
of an access transistor and an MTJ stacked in a 3D structure. The
resulting area is dominated by the element that requires
the larger planar surface, which is generally the access transistor.
Determining the proper size of the access transistor is one of the
most critical aspects of the cell design. Due to technological
constraints, a small size improves read latency whereas a
large size enhances write performance. The analytical model
integrated in CACTI for cell area estimation is given in
Equation (6):
$$A_{cell} = 3\left(\frac{W}{L} + 1\right)F^2 \qquad (6)$$
where F is the minimum feature size and W and L are the width
and length of the access transistor, respectively. The equivalent
resistance of the access transistor influences the length. Resistance
and cell area are inversely related: a high resistance corresponds to
a small cell area and high storage density, whereas a low
resistance considerably increases the memory area.
The total area of the memory does not depend only on the
size of the cells. Interconnections also have a considerable
impact on the resulting memory size. For this
reason, according to user requirements, CACTI attempts to
optimize on-chip memory interconnections to meet latency or
energy constraints.
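The cell area model can be illustrated with a short Python sketch, assuming the area expression has the form A_cell = 3 (W/L + 1) F². The function name and the sample W/L ratio are hypothetical; the 45 nm feature size matches the technology node mentioned for the SPICE model.

```python
def cell_area(w_over_l: float, feature_size: float) -> float:
    """Cell area as 3 * (W/L + 1) * F**2, in squared units of feature_size."""
    if w_over_l <= 0 or feature_size <= 0:
        raise ValueError("W/L and F must be positive")
    return 3.0 * (w_over_l + 1.0) * feature_size**2


# Hypothetical access transistor with W/L = 4 at F = 45 nm.
area_nm2 = cell_area(w_over_l=4.0, feature_size=45.0)
print(f"cell area: {area_nm2:.0f} nm^2 ({area_nm2 / 45.0**2:.0f} F^2)")
```

A wider access transistor (larger W/L) lowers its equivalent resistance but enlarges the cell, which is the trade-off between write performance and density discussed above.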
F. Energy Estimation Model
For the sake of completeness, we consider the write and read energy
models individually. Read energy per operation is evaluated using
Equation (7):
$$E_{read} = C_{tot} V_{read}^2 \qquad (7)$$
where $C_{tot}$ is the total capacitance, which depends on the bitline,
on all wire contributions and on the access transistor, and $V_{read}$
is the read voltage. A lower read voltage reduces the probability of
read disturbs, while a higher value favors read latency.
The write energy can be divided into two main
contributions (see Equation (8)). The former is the
energy consumption due to the current flowing through the MTJ
device, while the latter is computed similarly to the read energy,
using the model in Eq. (7):
$$E_{write} = \frac{V_{write}^2}{R_{MTJ} + R_{acc}}\,\tau_{write} + C_{tot} V_{write}^2 \qquad (8)$$
where $V_{write}$ is the write voltage, $R_{MTJ}$ is the equivalent MTJ
resistance, $R_{acc}$ is the equivalent NMOS resistance and
$\tau_{write}$ is the MTJ switching time. It is worth noting that the
computation of write energy accounts for the
worst case: the MTJ switches from the parallel to the anti-parallel state.
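The two energy expressions can be sketched in Python as follows, assuming a read energy of C_tot·V_read² and a write energy made of a resistive term, V_write²/(R_MTJ + R_acc)·τ_write, plus a capacitive term, C_tot·V_write². Function names and all numerical values are hypothetical and use SI units.

```python
def read_energy(c_tot, v_read):
    """Read energy per operation: C_tot * V_read**2 (joules)."""
    return c_tot * v_read**2


def write_energy(v_write, r_mtj, r_acc, tau_write, c_tot):
    """Write energy: resistive term through MTJ and access transistor,
    plus the capacitive term of the array (joules)."""
    resistive = v_write**2 / (r_mtj + r_acc) * tau_write
    capacitive = c_tot * v_write**2
    return resistive + capacitive


# Hypothetical values: 100 fF total capacitance, 0.2 V read and 1.8 V
# write voltage, 3 kOhm MTJ, 1.5 kOhm access transistor, 10 ns pulse.
e_rd = read_energy(100e-15, 0.2)
e_wr = write_energy(1.8, 3000.0, 1500.0, 10e-9, 100e-15)
print(f"read energy:  {e_rd:.3e} J")
print(f"write energy: {e_wr:.3e} J")
```

With these sample numbers the resistive term dominates the write energy, which illustrates why the MTJ switching time has such a strong influence on the energy per write.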
IV. EXPERIMENTAL RESULTS
In the previous section, we described the modeling and
integration of in-plane STT-MRAM technology into the CACTI
tool. In order to prove the correctness of our tool we generate
high-performance and low-power cache memories for three
different MTJ configurations and compare them with SRAM technology.
The MTJ input parameters considered are listed in Table 2. The MTJ
configurations differ in terms of parallel and anti-parallel
resistance, write voltage, and equivalent resistance of the
access transistor.
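Table 2: MTJ configurations
| MTJ Parameter | A | B | C |
|---|---|---|---|
| SttType | In-Plane | In-Plane | In-Plane |
| Jc0 | 2 | 2 | 2 |
| Δ | 40.29 | 40.29 | 40.29 |
| MTJArea | 2·10^-10 | 2·10^-10 | 2·10^-10 |
| Rp | 1.5 | 1.5 | 1.2 |
| Rap | 3 | 3 | 1.8 |
| Vbitline | 1.8 | 1.3 | 1.8 |
| Raccess | 1.5 | 0.3 | 0.3 |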
A. High-Performance Cache Memories
For this study we generate high-performance, eight-way
set-associative cache memories with no error correction mechanism,
ranging in size from 32 kB to 512 kB. Each cache has a 64 b
IN/OUT data interface with a single read-write port. Transistors
are modeled using high-performance cells (itrs-hp) for
the data array, the tag array, and the peripheral circuits. The use of
itrs-hp maximizes performance at the expense of power
consumption.
Figure 4 (h) compares the read latency of the three different
MTJ configurations with respect to SRAM. The fastest read
latency is achieved by SRAM. Among the MTJ
configurations, configuration A shows the best timing.

Citations
01 Jan 2012
TL;DR: In this article, two 3D stacking structures built upon bipolar RRAM crossbars are proposed to enable multilayer accesses while avoiding the overwriting induced by the cross-layer disturbance.
Abstract: For its simple structure, high density, and good scalability, the resistive random access memory (RRAM) has emerged as one of the promising candidates for large data storage in computing systems. Moreover, building up RRAM in a 3-D stacking structure further boosts its advantage in array density. Conventionally, multiple bipolar RRAM layers are piled up vertically separated with isolation material to prevent signal interference between the adjacent memory layers. The process of the isolation material increases the fabrication cost and brings in the potential reliability issue. To alleviate the situation, we introduce two novel 3-D stacking structures built upon bipolar RRAM crossbars that eliminate the isolation layers. The bigroup operation scheme dedicated for the proposed designs to enable multilayer accesses while avoiding the overwriting induced by the cross-layer disturbance is also presented. Our simulation results show that the proposed designs can increase the capacity of a memory island to 8K-bits (i.e., eight layers of 32 × 32 crossbar arrays) while maintaining the sense margin in the worst case configuration greater than 20% of the maximal sensing voltage.

22 citations

Proceedings ArticleDOI
05 Jun 2016
TL;DR: A new member of NVSim family is introduced - NVSim-VXs, which enables statistical simulation of STT-RAM for write performance, errors, and energy consumption, and strongly supports the fast-growing needs of STT-RAM research on reliability analysis and enhancement.
Abstract: Spin-transfer torque random access memory (STT-RAM) recently received significant attentions for its promising characteristics in cache and memory applications. As an early-stage modeling tool, NVSim has been widely adopted for simulations of emerging nonvolatile memory technologies in computer architecture research, including STT-RAM, ReRAM, PCM, etc. In this work, we introduce a new member of NVSim family -- NVSim-VXs, which enables statistical simulation of STT-RAM for write performance, errors, and energy consumption. This enhanced model takes into account the impacts of parametric variabilities of CMOS and MTJ devices and the chip operating temperature. It is also calibrated with Monte-Carlo Simulations based on macro-magnetic and SPICE models, covering five technology nodes between 22nm and 90nm. NVSim-VXs strongly supports the fast-growing needs of STT-RAM research on reliability analysis and enhancement, announcing the next important stage of NVSim development.

17 citations


Cites methods from "Integration of STT-MRAM model into ..."

  • ...Arcaro et al. integrated a STT-RAM model into CACTI [1] – a tool was originally used for conventional memory modeling and design [8]....

    [...]

Journal ArticleDOI
TL;DR: This paper reduces static power consumption by using non-volatile spin-transfer torque random access memory (STT-RAM) buffers, and proposes allocation policies that reduce write variation to almost 0% and improve lifetime by 3.3 and 19.9 times for intra-VNet and inter-VNet, respectively.
Abstract: With multiple cores integrated on the same die, communication across cores is managed by an on-chip interconnect called the network-on-chip (NoC). The power and performance of this interconnect are significant factors, as the communication network consumes a considerable share of the power budget. In particular, the buffers used at every port of the NoC router consume considerable dynamic as well as static power. This paper attempts to reduce static power consumption by using non-volatile memory technology-based spin-transfer torque random access memory (STT-RAM) buffers. STT-RAM technology has the advantage of high density and low leakage but suffers from weaker write endurance. This impacts the lifetime of the router as a whole. The buffers in a router are allocated to virtual networks (VNets) and in turn to virtual channels (VCs) within each VNet. To reduce uneven writes across the buffers, we propose policies to reduce intra-VNet write variation and inter-VNet write variation. The former performs write variation aware VC allocation in each VNet, and the latter does write variation aware buffer assignments to each VNet. Experimental evaluation on a full-system simulator shows that the proposed policies reduce write variation to almost 0% and improve lifetime by 3.3 and 19.9 times for intra-VNet and inter-VNet, respectively. We also get significant gains in the energy delay product.

4 citations


Cites methods from "Integration of STT-MRAM model into ..."

  • ...We use Cacti-STT [35] and NVSim [36] to get SRAM and STTRAM latency, read-write energy, and leakage power....

    [...]

25 Jan 2018
TL;DR: This work introduces a new member of NVSim family – NVSim-VXs, which enables statistical simulation of STT-RAM for write performance, errors, and energy consumption and proposes two possible SHE-RAM designs from the aspects of two different write access operations.
Abstract: Developing Variation Aware Simulation Tools, Models, and Designs for STT-RAM. Enes Eken, PhD, University of Pittsburgh, 2017. In recent years, we have been witnessing the rise of spin-transfer torque random access memory (STT-RAM) technology. There are a couple of reasons which explain why STT-RAM has attracted a great deal of attention. Although conventional memory technologies like SRAM, DRAM and Flash memories are commonly used in the modern computer industry, they have major shortcomings, such as high leakage current, high power consumption and volatility. Although these drawbacks could have been overlooked in the past, they have become major concerns. Its characteristics, including low-power consumption, fast read-write access time and non-volatility, make STT-RAM a promising candidate to solve the problems of other memory technologies. However, like all other memory technologies, STT-RAM has some problems, such as the long switching time and large programming energy of the Magnetic Tunneling Junction (MTJ), which are waiting to be solved. In order to solve these long switching time and large programming energy problems, the Spin-Hall Effect (SHE) assisted STT-RAM structure (SHE-RAM) has been recently invented. In this work, I propose two possible SHE-RAM designs from the aspects of two different write access operations, namely, High Density SHE-RAM and Disturbance Free SHE-RAM, respectively. In addition to the SHE-RAM designs, I will also propose a simulation tool for STT-RAMs. As an early-stage modeling tool, NVSim has been widely adopted for simulations of emerging nonvolatile memory technologies in computer architecture research, including STT-RAM, ReRAM, PCM, etc. I will introduce a new member of the NVSim family, NVSim-VXs, which enables statistical simulation of STT-RAM for write performance, errors, and energy consumption.

3 citations


Cites methods from "Integration of STT-MRAM model into ..."

  • ...Arcaro et al. integrated a STT-RAM model into CACTI [1] – a tool was originally used for conventional memory modeling and design [19]....

    [...]

Journal ArticleDOI
TL;DR: This work proposes an entire flow for obtaining/calibrating the transistor characteristics from a commercial technology and uses these characteristics within CACTI for the first time, and extends it to support negative capacitance fin field effect transistor (NC-FinFET), an emerging technology depicting negative capacitance whose current and capacitive characteristics are very different compared to those of the FinFET.
Abstract: Cache memories are an indispensable component of many processor-based systems and contribute significantly to the overall area, power consumption, and delay. This leads to an important role played by modeling tools for estimating the area, power consumption, and access time of cache memories. However, existing modeling tools such as CACTI and its various extensions have been primarily designed using data from various projections. For the first time, we propose an entire flow for obtaining/calibrating the transistor characteristics from a commercial technology and use these characteristics within CACTI. We also improve the modeling approach to make them more fine-grained and follow recent manufacturing trends suitable for FinFET technology. Further, for the first time, we extend CACTI to support negative capacitance fin field effect transistor (NC-FinFET), an emerging technology depicting negative capacitance whose current and capacitive characteristics are very different compared to those of the FinFET. We use the proposed tool (FN-CACTI) to identify NC-FinFET-based caches to be significantly more energy-efficient than corresponding FinFET-based caches. We also study an application of FN-CACTI to determine optimal voltages corresponding to the lowest energy consumption for NC-FinFET and FinFET-based caches of various sizes.

3 citations

References
Proceedings ArticleDOI
18 Mar 2010
TL;DR: A 64Mb STT-MRAM with the P-TMR device and circuit techniques to maximize operational margin is described; the perpendicular tunnel magnetoresistance (P-TMR) device is proposed to reduce the cell size, and its high potential to achieve lower switching current is confirmed.
Abstract: In order to realize a sub-Giga bit scale NVRAM, the novel MRAM based on the spin-transfer-torque (STT) switching has been intensively investigated due to its excellent scalability compared with a conventional magnetic-field-induced switching MRAM [1]. However, the memory cell size of STT-MRAM reported so far is still over 1 µm², and the memory capacity is limited to 32 Mbit even in almost 100 mm² die size [2]. The large cell size is due to the large switching current of MRAM cells. In order to reduce the cell size, we have proposed the perpendicular tunnel magnetoresistance (P-TMR) device, and have confirmed its high potential to achieve lower switching current [3]. In this paper, a 64Mb STT-MRAM with the P-TMR device having the circuit techniques to maximize operational margin is described.

217 citations


"Integration of STT-MRAM model into ..." refers background or methods in this paper

  • ...In [13] a pinned MTJ device is designed to have an electrical resistance equal to the average value of RL and RH....

    [...]

  • ...CACTI is a widely used open-source high-level cache and memory modeling tool [13] [14] supported by HP Labs....

    [...]

Journal ArticleDOI
TL;DR: In this paper, a low-power 1-Mb magnetoresistive random access memory (MRAM) based on a one-transistor and one-magnetic tunnel junction (1T1MTJ) bit cell is demonstrated.
Abstract: A low-power 1-Mb magnetoresistive random access memory (MRAM) based on a one-transistor and one-magnetic tunnel junction (1T1MTJ) bit cell is demonstrated. This is the largest MRAM memory demonstration to date. In this circuit, the magnetic tunnel junction (MTJ) elements are integrated with CMOS using copper interconnect technology. The copper interconnects are cladded with a high-permeability layer which is used to focus magnetic flux generated by current flowing through the lines toward the MTJ devices and reduce the power needed for programming. The 25-mm² 1-Mb MRAM circuit operates with address access times of less than 50 ns, consuming 24 mW at 3.0 V and 20 MHz. The 1-Mb MRAM circuit is fabricated in a 0.6-µm CMOS process utilizing five layers of metal and two layers of poly.

195 citations


"Integration of STT-MRAM model into ..." refers background or methods in this paper

  • ...In this case, the resulting reference resistance is computed as the average between the low and high resistance values [14]....

    [...]

  • ...CACTI is a widely used open-source high-level cache and memory modeling tool [13] [14] supported by HP Labs....

    [...]

Journal ArticleDOI
TL;DR: This paper analyzed and modeled the failure probabilities of STT MRAM cells due to parameter variations and developed an efficient design paradigm, from a circuit and/or architecture perspective, to improve the robustness and integration density.
Abstract: Spin-torque transfer magnetic RAM (STT MRAM) is a promising candidate for future embedded applications. It combines the desirable attributes of current memory technologies such as SRAM, DRAM, and flash memories (fast access time, low cost, high density, and non-volatility). It also solves the critical drawbacks of conventional MRAM technology: poor scalability and high write current. However, variations in process parameters can cause a large number of cells to fail, severely affecting the yield of the memory array. In this paper, we analyzed and modeled the failure probabilities of STT MRAM cells due to parameter variations. Based on the model, we performed a thorough analysis of the impact of design parameters on parametric failures due to process variations. To achieve high memory yield without incurring expensive technology modification, we developed an efficient design paradigm, from a circuit and/or architecture perspective, to improve the robustness and integration density. The proposed technique effectively relaxes or completely decouples the conflicting design requirements for read stability, writability and cell area. It can be used at an early stage of the design cycle for yield enhancement.

131 citations


"Integration of STT-MRAM model into ..." refers background or methods in this paper

  • ...The key building block of STT-MRAM cell is the magnetic tunneling junction (MTJ) that is integrated with CMOS circuitry using 3-D technology [5]....

    [...]

  • ...Such issues are mainly related to a) process variations of MOS and MTJ devices involving the variation of geometry size, threshold voltage, and magnetic materials [5], [6], b) the high write cost due to high switching current required to flip the MTJ state [7], and c) the thermal fluctuations in the MTJ switching [8]....

    [...]

Proceedings ArticleDOI
01 Aug 2011
TL;DR: A physically based thermal noise model for simulating the statistical variations of MTJs and shows that Invert Coding provides a 7% average reduction in the total write energy for the SPEC CPU2006 benchmark suite without any performance overhead.
Abstract: Spin-Transfer Torque RAM (STT-RAM) has emerged as a potential candidate for Universal memory. However, there are two challenges to using STT-RAM in memory system design: (1) the intrinsic variation in the storage element, the Magnetic Tunnel Junction (MTJ), and (2) the high write energy. In this paper, we present a physically based thermal noise model for simulating the statistical variations of MTJs. We have implemented it in HSPICE and validated it against analytical results. We demonstrate its use in setting the write pulse width for a given write error rate. We then propose two write-energy reduction techniques. At the device level, we propose the use of a low-M_S ferromagnetic material that can reduce the write energy without sacrificing retention time. At the architecture level, we show that Invert Coding provides a 7% average reduction in the total write energy for the SPEC CPU2006 benchmark suite without any performance overhead.

112 citations


"Integration of STT-MRAM model into ..." refers background in this paper

  • ...Such issues are mainly related to a) process variations of MOS and MTJ devices involving the variation of geometry size, threshold voltage, and magnetic materials [5], [6], b) the high write cost due to high switching current required to flip the MTJ state [7], and c) the thermal fluctuations in the MTJ switching [8]....

    [...]

Journal ArticleDOI
TL;DR: In this article, the authors measured thermally activated magnetization reversal of the free layers in submicron magnetic tunnel junctions to be used for magnetoresistive random access memory.
Abstract: We have measured thermally activated magnetization reversal of the free layers in submicron magnetic tunnel junctions to be used for magnetoresistive random access memory. We applied magnetic field pulses to the bits with a pulse duration tp ranging from nanoseconds to 0.1 ms. We have measured the switching probability as a function of tp with a fixed field amplitude H, and as a function of H for fixed tp. For both cases, we find good agreement with the switching probability predicted by the Arrhenius–Néel theory for thermal activation over a single energy barrier.

100 citations