Book Chapter

3D-MAPS: 3D Massively Parallel Processor with Stacked Memory

TL;DR: 3D-MAPS (3D Massively Parallel Processor with Stacked Memory) is a two-tier 3D IC, where the logic die consists of 64 general-purpose processor cores running at 277 MHz, and the memory die contains 256 KB of SRAM.
Abstract
Several recent works have demonstrated the benefits of through-silicon-via (TSV) based 3D integration [1–4], but none of them involves a fully functioning multicore processor with stacked memory. 3D-MAPS (3D Massively Parallel Processor with Stacked Memory) is a two-tier 3D IC in which the logic die consists of 64 general-purpose processor cores running at 277 MHz and the memory die contains 256 KB of SRAM (see Fig. 10.6.1). The chip is fabricated in 130 nm GlobalFoundries device technology with Tezzaron TSV and bonding technology, and is packaged by Amkor. The processor contains 33M transistors, 50K TSVs, and 50K face-to-face connections in a 5×5 mm² footprint. It runs at 1.5 V and consumes up to 4 W, resulting in a power density of 16 W/cm². The core architecture was developed from scratch to benefit from single-cycle access to SRAM.
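The power-density figure in the abstract follows directly from the stated power and footprint. The short Python sketch below reproduces that arithmetic; it is an illustration only, not part of the original work. The function and constant names are hypothetical, and the per-core SRAM figure assumes the 256 KB is divided evenly among the 64 cores, which the abstract does not state explicitly.

```python
def power_density_w_per_cm2(power_w, width_mm, height_mm):
    """Return power density in W/cm^2 for a rectangular footprint given in mm."""
    area_cm2 = (width_mm / 10.0) * (height_mm / 10.0)  # 10 mm = 1 cm
    return power_w / area_cm2


# Figures quoted in the abstract.
MAX_POWER_W = 4.0          # "consumes up to 4 W"
FOOTPRINT_MM = (5.0, 5.0)  # "5x5 mm^2 footprint"
NUM_CORES = 64
SRAM_KB = 256

density = power_density_w_per_cm2(MAX_POWER_W, *FOOTPRINT_MM)

# Assumption (not stated in the abstract): SRAM is split evenly across cores.
sram_per_core_kb = SRAM_KB / NUM_CORES

print(f"Power density: {density:.1f} W/cm^2")
print(f"SRAM per core (assuming an even split): {sram_per_core_kb:.1f} KB")
```

A 5×5 mm footprint is 0.25 cm², so 4 W over that area gives the 16 W/cm² quoted above.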


Citations
Proceedings Article

NDC: Analyzing the impact of 3D-stacked memory+logic devices on MapReduce workloads

TL;DR: A number of key elements necessary in realizing efficient NDC operation are described and evaluated, including low-EPI cores, long daisy chains of memory devices, and the dynamic activation of cores and SerDes links.
Proceedings Article

NDA: Near-DRAM acceleration architecture leveraging commodity DRAM devices and standard memory modules

TL;DR: This paper proposes near-DRAM acceleration (NDA) architectures, which process data using accelerators 3D-stacked on DRAM devices comprising off-chip main memory modules, substantially reducing energy consumption and improving performance.
Journal Article

Transparent offloading and mapping (TOM): enabling programmer-transparent near-data processing in GPU systems

TL;DR: Extensive evaluations across a variety of modern memory-intensive GPU workloads show that TOM significantly improves performance compared to a baseline GPU system that cannot offload computation to 3D-stacked memories.
Proceedings Article

Data reorganization in memory using 3D-stacked DRAM

TL;DR: A two-pronged approach for efficient data reorganization is presented, which combines a proposed DRAM-aware reshape accelerator integrated within 3D-stacked DRAM and a mathematical framework that is used to represent and optimize the reorganization operations.
Journal Article

TSV stress-aware full-chip mechanical reliability analysis and optimization for 3D IC

TL;DR: An efficient and accurate full-chip thermomechanical stress and reliability analysis tool as well as a design optimization methodology to alleviate mechanical reliability issues in 3D ICs are discussed.
References
Journal Article

3D-Stacked Memory Architectures for Multi-core Processors

TL;DR: This work explores more aggressive 3D DRAM organizations that make better use of the additional die-to-die bandwidth provided by 3D stacking, as well as the additional transistor count, to achieve a 1.75x speedup over previously proposed 3D-DRAM approaches on memory-intensive multi-programmed workloads on a quad-core processor.
Journal Article

8 Gb 3-D DDR3 DRAM Using Through-Silicon-Via Technology

TL;DR: An 8 Gb 4-stack 3-D DDR3 DRAM with through-silicon vias is presented that overcomes the limits of conventional modules, and the proposed TSV check and repair scheme can increase the assembly yield up to 98%.

A cellular computer to implement the Kalman filter algorithm

TL;DR: The subject of this thesis is the development of the design for a specially-organized, general-purpose computer which performs matrix operations efficiently.
Journal Article

Bridging the processor-memory performance gap with 3D IC technology

TL;DR: It is shown that reducing memory latency by bringing main memory on chip gives near-perfect performance, and three-dimensional IC technology can provide the much needed bandwidth without the cost, design complexity, and power issues associated with a large number of off-chip pins.