Author

Steven Burns

Bio: Steven Burns is an academic researcher at GlobalFoundries who has contributed to research on logic gates and eDRAM. The author has an h-index of 2 and has co-authored 2 publications that have received 18 citations.

Topics: Logic gate, eDRAM, Sense amplifier

Papers

Journal Article

A 14 nm 1.1 Mb Embedded DRAM Macro With 1 ns Access

Gregory J. Fredeman¹, Donald W. Plass¹, Abraham Mathews¹, Janakiraman Viraraghavan², Kenneth J. Reyer¹, Thomas J. Knips¹, Thomas R. Miller¹, Elizabeth L. Gerhard³, Dinesh Kannambadi¹, Chris Paone³, Dongho Lee¹, Daniel J. Rainey³, Michael A. Sperling¹, Michael Whalen¹, Steven Burns², Rajesh R. Tummuru², Herbert L. Ho², Alberto Cestero², Norbert Arnold², Babar A. Khan¹, Toshiaki Kirihata², Subramanian S. Iyer¹

IBM¹, GlobalFoundries², University of Rochester³

01 Jan 2016-IEEE Journal of Solid-state Circuits

TL;DR: A 1.1 Mb embedded DRAM (eDRAM) macro for next-generation IBM SOI processors uses 14 nm FinFET logic technology with a 0.0174 μm² deep-trench capacitor cell; a gated-feedback sense amplifier enables a high voltage gain of a power-gated inverter at mid-level input voltage.

Abstract: A 1.1 Mb embedded DRAM macro (eDRAM), for next-generation IBM SOI processors, employs 14 nm FinFET logic technology with 0.0174 μm² deep-trench capacitor cell. A gated-feedback sense amplifier enables a high voltage gain of a power-gated inverter at mid-level input voltage, while supporting 66 cells per local bit-line. A dynamic-and-gate-thin-oxide word-line driver that tracks standard logic process variation improves the eDRAM array performance with reduced area. The 1.1 Mb macro composed of 8 × 2 72 Kb subarrays is organized with a center interface block architecture, allowing 1 ns access latency and 1 ns bank interleaving operation using two banks, each having 2 ns random access cycle. 5 GHz operation has been demonstrated in a system prototype, which includes 6 instances of 1.1 Mb eDRAM macros, integrated with an array-built-in-self-test engine, phase-locked loop (PLL), and word-line high and word-line low voltage generators. The advantage of the 14 nm FinFET array over the 22 nm array was confirmed using direct tester control of the 1.1 Mb eDRAM macros integrated in 16 Mb inline monitor.
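
The capacity and interleaving figures in this abstract can be checked with a little arithmetic. The sketch below is only an illustration: it assumes 1 Mb = 1024 Kb and a simple round-robin schedule across the two banks, and the function name `interleaved_issue_times` is hypothetical, not taken from the paper.

```python
# Figures taken from the abstract: 8 x 2 subarrays of 72 Kb each,
# two banks, each with a 2 ns random access cycle.
SUBARRAYS = 8 * 2
KB_PER_SUBARRAY = 72
NUM_BANKS = 2
BANK_CYCLE_NS = 2

total_kb = SUBARRAYS * KB_PER_SUBARRAY   # 1152 Kb
total_mb = total_kb / 1024               # assumption: 1 Mb = 1024 Kb


def interleaved_issue_times(num_accesses):
    """Issue accesses round-robin across banks at a fixed interval.

    The interval is the bank cycle divided by the number of banks, so each
    bank sees a new access only once per full bank cycle.
    """
    interval = BANK_CYCLE_NS / NUM_BANKS
    bank_free_at = [0.0] * NUM_BANKS
    issue_times = []
    for i in range(num_accesses):
        bank = i % NUM_BANKS
        start = i * interval
        # The target bank must have finished its previous access.
        assert start >= bank_free_at[bank]
        bank_free_at[bank] = start + BANK_CYCLE_NS
        issue_times.append(start)
    return issue_times


print(total_kb, total_mb)
print(interleaved_issue_times(6))
```

Under these assumptions the macro holds 1152 Kb (about 1.1 Mb), and alternating between two 2 ns banks sustains one access per 1 ns, which is consistent with the 1 ns bank interleaving operation the abstract reports.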

18 citations

Proceedings Article

12.4 1.4Gsearch/s 2Mb/mm² TCAM using two-phase-precharge ML sensing and power-grid preconditioning to reduce Ldi/dt power-supply noise by 50%

Igor Arsovski¹, Michael T. Fragano¹, Robert M. Houle¹, Akhilesh Patil¹, Van Butler¹, Raymond Kim¹, Ramon Rodriguez¹, Thomas M. Maffitt², Joseph J. Oler¹, John R. Goss¹, Christopher Parkinson¹, Michael A. Ziegerhofer¹, Steven Burns¹

GlobalFoundries¹, IBM²

01 Feb 2017

TL;DR: The push for higher performance and increased memory density coupled with parallel TCAM array activation during search operation creates large Ldi/dt power supply noise challenges that could result in timing fails in both TCAM and its surrounding logic.

Abstract: Ternary Content Addressable Memory (TCAM) executes a fully parallel search of its entire memory contents and uses powerful wild-card pattern matching to return search results in a single clock cycle. This capability makes TCAM attractive for implementing fast hardware look-up tables in network routers, processor caches, and many pattern recognition applications. However, the push for higher performance and increased memory density coupled with parallel TCAM array activation during search operation creates large Ldi/dt power supply noise challenges that could result in timing fails in both TCAM and its surrounding logic.
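
The wild-card matching described in this abstract can be modelled in software as a value plus a "care" mask per entry. This is a minimal sketch under stated assumptions: the name `tcam_search`, the 8-bit example table, and the lowest-index-wins priority rule are illustrative choices, and the loop is sequential whereas the hardware compares all entries in parallel.

```python
def tcam_search(entries, key):
    """Return the index of the first entry that matches key, or None.

    entries is a list of (value, care_mask) pairs. A 0 bit in care_mask
    marks a wild-card position that matches either key bit.
    """
    for index, (value, care_mask) in enumerate(entries):
        # Bits that differ between key and value only count where care_mask is 1.
        if (key ^ value) & care_mask == 0:
            return index
    return None


# Hypothetical 8-bit lookup table, most specific entry first.
table = [
    (0b1010_0000, 0b1111_0000),  # matches 1010xxxx
    (0b1000_0000, 0b1100_0000),  # matches 10xxxxxx
    (0b0000_0000, 0b0000_0000),  # matches anything (default entry)
]

print(tcam_search(table, 0b1010_1111))
print(tcam_search(table, 0b1001_0001))
print(tcam_search(table, 0b0111_0000))
```

Ordering entries from most to least specific mirrors how such tables are commonly used for longest-prefix lookups in routers, one of the applications the abstract mentions.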

2 citations

Cited by

Proceedings Article

DRISA: a DRAM-based Reconfigurable In-Situ Accelerator

Shuangchen Li¹, Dimin Niu², Krishna T. Malladi², Hongzhong Zheng², Bob Brennan², Yuan Xie¹

University of California, Santa Barbara¹, Samsung²

14 Oct 2017

TL;DR: DRISA, a DRAM-based Reconfigurable In-Situ Accelerator architecture, is proposed to provide both powerful computing capability and large memory capacity/bandwidth to address the memory wall problem in traditional von Neumann architecture.

Abstract: Data movement between the processing units and the memory in traditional von Neumann architecture is creating the “memory wall” problem. To bridge the gap, two approaches, the memory-rich processor (more on-chip memory) and the compute-capable memory (processing-in-memory) have been studied. However, the first one has strong computing capability but limited memory capacity/bandwidth, whereas the second one is the exact opposite. To address the challenge, we propose DRISA, a DRAM-based Reconfigurable In-Situ Accelerator architecture, to provide both powerful computing capability and large memory capacity/bandwidth. DRISA is primarily composed of DRAM memory arrays, in which every memory bitline can perform bitwise Boolean logic operations (such as NOR). DRISA can be reconfigured to compute various functions with the combination of the functionally complete Boolean logic operations and the proposed hierarchical internal data movement designs. We further optimize DRISA to achieve high performance by simultaneously activating multiple rows and subarrays to provide massive parallelism, unblocking the internal data movement bottlenecks, and optimizing activation latency and energy. We explore four design options and present a comprehensive case study to demonstrate significant acceleration of convolutional neural networks. The experimental results show that DRISA can achieve 8.8× speedup and 1.2× better energy efficiency compared with ASICs, and 7.7× speedup and 15× better energy efficiency over GPUs with integer operations. CCS Concepts: Hardware → Dynamic memory; Computer systems organization → Reconfigurable computing; Neural networks.
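
The abstract relies on NOR being functionally complete, meaning every other Boolean operation can be composed from it. The sketch below shows that composition on 8-bit words as a stand-in for bitwise operations across bitlines. The word width and the helper names are assumptions made for illustration and say nothing about how DRISA itself sequences these operations.

```python
WIDTH = 8                    # assumed word width for the illustration
MASK = (1 << WIDTH) - 1


def nor(a, b):
    """Bitwise NOR, truncated to WIDTH bits."""
    return ~(a | b) & MASK


def not_(a):
    return nor(a, a)


def or_(a, b):
    return not_(nor(a, b))


def and_(a, b):
    # De Morgan: a AND b == NOR(NOT a, NOT b)
    return nor(not_(a), not_(b))


def xor_(a, b):
    # a XOR b == (a OR b) AND NOT (a AND b) == NOR(NOR(a, b), AND(a, b))
    return nor(nor(a, b), and_(a, b))


a, b = 0b1100_1010, 0b1010_0110
print(format(xor_(a, b), "08b"))
```

Each derived operation costs several NOR steps, which gives some intuition for why an in-memory design of this kind would lean on massive row and subarray parallelism rather than single-operation speed.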

315 citations

Proceedings Article

Enabling scientific computing on memristive accelerators

Ben Feinberg¹, Uday Kumar Reddy Vengalam¹, Nathan Whitehair¹, Shibo Wang¹, Engin Ipek¹

University of Rochester¹

02 Jun 2018

TL;DR: This paper presents the first proposal to enable scientific computing on memristive crossbars. Three techniques (reducing overheads by exploiting exponent range locality, early termination of fixed-point computation, and static operation scheduling) together enable a fixed-point memristive accelerator to perform high-precision floating point without the exorbitant cost of naive floating-point emulation on fixed-point hardware.

Abstract: Linear algebra is ubiquitous across virtually every field of science and engineering, from climate modeling to macroeconomics. This ubiquity makes linear algebra a prime candidate for hardware acceleration, which can improve both the run time and the energy efficiency of a wide range of scientific applications. Recent work on memristive hardware accelerators shows significant potential to speed up matrix-vector multiplication (MVM), a critical linear algebra kernel at the heart of neural network inference tasks. Regrettably, the proposed hardware is constrained to a narrow range of workloads: although the eight- to 16-bit computations afforded by memristive MVM accelerators are acceptable for machine learning, they are insufficient for scientific computing where high-precision floating point is the norm. This paper presents the first proposal to enable scientific computing on memristive crossbars. Three techniques are explored (reducing overheads by exploiting exponent range locality, early termination of fixed-point computation, and static operation scheduling) that together enable a fixed-point memristive accelerator to perform high-precision floating point without the exorbitant cost of naive floating-point emulation on fixed-point hardware. A heterogeneous collection of crossbars with varying sizes is proposed to efficiently handle sparse matrices, and an algorithm for mapping the dense subblocks of a sparse matrix to an appropriate set of crossbars is investigated. The accelerator can be combined with existing GPU-based systems to handle datasets that cannot be efficiently handled by the memristive accelerator alone. The proposed optimizations permit the memristive MVM concept to be applied to a wide range of problem domains, respectively improving the execution time and energy dissipation of sparse linear solvers by 10.3x and 10.9x over a purely GPU-based system.
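
To give some intuition for why a narrow exponent range helps when floating-point work is mapped onto fixed-point hardware, the sketch below uses a generic shared-exponent (block floating-point) dot product. It is not the paper's mechanism: the function names, the 40-bit mantissa width, and the per-vector scaling are all assumptions chosen for illustration.

```python
import math


def to_shared_exponent(values, mantissa_bits=40):
    """Quantize floats to integers that share one power-of-two scale.

    Returns (integers, shift) such that each value is approximately
    integer * 2**(-shift).
    """
    largest = max(abs(v) for v in values)
    if largest == 0.0:
        return [0] * len(values), 0
    # frexp gives largest == m * 2**exponent with 0.5 <= m < 1.
    _, exponent = math.frexp(largest)
    shift = mantissa_bits - exponent
    return [round(math.ldexp(v, shift)) for v in values], shift


def fixed_point_dot(xs, ys, mantissa_bits=40):
    """Dot product computed with integer multiply-accumulate only."""
    xi, x_shift = to_shared_exponent(xs, mantissa_bits)
    yi, y_shift = to_shared_exponent(ys, mantissa_bits)
    acc = sum(a * b for a, b in zip(xi, yi))   # exact integer arithmetic
    return math.ldexp(acc, -(x_shift + y_shift))


xs = [0.1, 0.2, 0.3]
ys = [0.4, 0.5, 0.6]
print(fixed_point_dot(xs, ys), sum(a * b for a, b in zip(xs, ys)))
```

When all values in a vector have similar magnitudes, a single shared scale wastes few mantissa bits. A vector that mixes very large and very small values would lose the small ones to rounding, which is the cost a naive fixed-point emulation has to pay with wider words.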

54 citations

Journal Article

Radiation Effects in Advanced and Emerging Nonvolatile Memories

Matthew J. Marinella¹

Sandia National Laboratories¹

29 Apr 2021-IEEE Transactions on Nuclear Science

TL;DR: This review discusses the material and device physics, fabrication, operational principles, and commercial status of scaled 2-D flash, 3-D flash, and emerging memory technologies, and describes the radiation effects relevant to each, including the physics of and errors caused by total ionizing dose, displacement damage, and single-event effects.

Abstract: Despite hitting major roadblocks in 2-D scaling, NAND flash continues to scale in the vertical direction and dominate the commercial nonvolatile memory market. However, several emerging nonvolatile technologies are under development by major commercial foundries or are already in small volume production, motivated by storage-class memory and embedded application drivers. These include spin-transfer torque magnetic random access memory (STT-MRAM), resistive random access memory (ReRAM), phase change random access memory (PCRAM), and conductive bridge random access memory (CBRAM). Emerging memories have improved resilience to radiation effects compared to flash, which is based on storing charge, and hence may offer an expanded selection from which radiation-tolerant system designers can choose in the future. This review discusses the material and device physics, fabrication, operational principles, and commercial status of scaled 2-D flash, 3-D flash, and emerging memory technologies. Radiation effects relevant to each of these memories are described, including the physics of and errors caused by total ionizing dose, displacement damage, and single-event effects, with an eye toward the future role of emerging technologies in radiation environments.

27 citations

Proceedings Article

Low Cost Ternary Content Addressable Memory Using Adaptive Matchline Discharging Scheme

Woong Choi¹, Kyeongho Lee¹, Jongsun Park¹

Korea University¹

26 Apr 2018

TL;DR: Simulation results in 65 nm CMOS technology show that the proposed adaptive ML discharging scheme improves sensing delay by up to 19% and saves 81% of ML power compared to the conventional approach.

Abstract: This paper presents an adaptive match-line (ML) discharging scheme for low-power, high-speed ternary content addressable memory (TCAM). In the proposed TCAM, a gated ML pulldown path and an ML boosting scheme eliminate redundant ML discharging and SL switching while improving the search speed. The ML discharging is adaptively controlled by considering the number of mismatches and the ML discharging speed. Simulation results in 65 nm CMOS technology show that the proposed adaptive ML discharging scheme improves sensing delay by up to 19% and saves 81% of ML power compared to the conventional approach. When compared with state-of-the-art work, post-layout simulations show a 10% improvement in FOM (energy/bit/search).

19 citations

Journal Article

80-kb Logic Embedded High-K Charge Trap Transistor-Based Multi-Time-Programmable Memory With No Added Process Complexity

Balaji Jayaraman¹, Derek H. Leu¹, Janakiraman Viraraghavan², Alberto Cestero¹, Ming Yin¹, John Golz¹, Rajesh R. Tummuru¹, Ramesh Raghavan¹, Dan Moy¹, Thejas Kempanna¹, Faraz Khan¹, Toshiaki Kirihata¹, Subramanian S. Iyer³

GlobalFoundries¹, Indian Institute of Technology Madras², University of California, Los Angeles³

09 Jan 2018-IEEE Journal of Solid-state Circuits

TL;DR: The design and implementation of an 80-kb logic-embedded non-volatile multi-time programmable memory (MTPM) with no added process complexity is described and high-temperature stress results show a projected data retention of 10 years at 125 °C.

Abstract: This paper describes the design and implementation of an 80-kb logic-embedded non-volatile multi-time programmable memory (MTPM) with no added process complexity. Charge trap transistors (CTTs) that exploit charge trapping and de-trapping behavior in high-K dielectric of 32-/22-nm Logic FETs are used as storage elements with logic-compatible programming voltages. A high-gain slew-sense amplifier (SA) is used to efficiently detect the threshold voltage difference (ΔV_DIF) between the true and complement FETs in the twin cell. Design-assist techniques including multi-step programming with over-write protection and block write algorithm are used to enhance the programming efficiency without causing a dielectric breakdown. High-temperature stress results show a projected data retention of 10 years at 125 °C with a signal loss of <30% that is margined in while programming, by employing a sense margining logic in the SA. Scalability of CTT has been established by the first demonstration of CTT-based MTPM in 14-nm bulk FinFET technology with read cycle time of 40 ns at 0.7-V VDD.

15 citations
