Author

Donald W. Plass

Bio: Donald W. Plass is an academic researcher at IBM. The author has contributed to research on topics including Transistor and Microprocessor. The author has an h-index of 18 and has co-authored 85 publications receiving 1155 citations.

Topics: Transistor, Microprocessor, eDRAM, Static random-access memory, Clock signal

Papers published on a yearly basis (chart omitted; years shown: 1988, 1989, 1993, 1999-2011, 2013-2021)

Papers

Proceedings Article

Design of the Power6 Microprocessor

Joshua Friedrich¹, Bradley McCredie¹, Norman Karl James¹, B. Huott¹, Brian W. Curran¹, Eric Fluhr¹, Gaurav Mittal¹, E. Chan¹, Y.H. Chan¹, Donald W. Plass¹, Sam Gat-Shang Chu¹, Hung Le¹, L. Clark¹, J. Ripley¹, Scott A. Taylor¹, Jack DiLullo¹, M. Lanzerotti¹

IBM¹

18 Jun 2007

TL;DR: The POWER6™ microprocessor combines ultra-high frequency operation, aggressive power reduction, a highly scalable memory subsystem, and mainframe-like reliability, availability, and serviceability.

Abstract: The POWER6™ microprocessor combines ultra-high frequency operation, aggressive power reduction, a highly scalable memory subsystem, and mainframe-like reliability, availability, and serviceability. The 341 mm², 700M-transistor dual-core microprocessor is fabricated in a 65 nm SOI process with 10 levels of low-k copper interconnect. It operates at clock frequencies over 5 GHz in high-performance applications, and consumes under 100 W in power-sensitive applications.

120 citations

Proceedings Article

Design and implementation of the POWER5 microprocessor

Joachim Gerhard Clabes¹, Joshua Friedrich¹, Mark D. Sweet¹, Jack DiLullo¹, Sam Gat-Shang Chu¹, Donald W. Plass¹, James W. Dawson¹, Paul H. Muench¹, Larry Powell¹, Michael Stephen Floyd¹, Balaram Sinharoy¹, Mike Lee¹, Michael Normand Goulet¹, James Donald Wagoner¹, Nicole Schwartz¹, Steve Runyon¹, Gary Alan Gorman¹, Phillip J. Restle¹, Ronald Nick Kalla¹, Joseph McGill¹, Steve Dodson¹

IBM¹

07 Jun 2004

TL;DR: POWER5 offers significantly increased performance over previous POWER designs by incorporating simultaneous multithreading, an enhanced memory subsystem, and extensive RAS and power management support.

Abstract: POWER5 offers significantly increased performance over previous POWER designs by incorporating simultaneous multithreading, an enhanced memory subsystem, and extensive RAS and power management support. The 276M-transistor processor is implemented in 130 nm silicon-on-insulator technology with 8 levels of Cu metallization and operates at >1.5 GHz.

110 citations

Proceedings Article

5.1 POWER8™: A 12-core server-class processor in 22nm SOI with 7.6Tb/s off-chip bandwidth

Eric Fluhr¹, Joshua Friedrich¹, Daniel M. Dreps¹, Victor Zyuban¹, Gregory Scott Still¹, Christopher Gonzalez¹, Allen Hall¹, David Hogenmiller¹, Frank Malgioglio¹, Ryan Nett¹, Jose Angel Paredes¹, Juergen Pille¹, Donald W. Plass¹, Ruchir Puri¹, Phillip J. Restle¹, David Shan¹, Kevin Stawiasz¹, Zeynep Toprak Deniz¹, Dieter Wendel¹, Matt Ziegler¹

IBM¹

06 Mar 2014

TL;DR: The 12-core 649 mm² POWER8™ leverages IBM's 22nm eDRAM SOI technology and microarchitectural enhancements to deliver up to 2.5× the socket performance of its 32nm predecessor, POWER7+™ [3].

Abstract: The 12-core 649 mm² POWER8™ leverages IBM's 22nm eDRAM SOI technology [1] and microarchitectural enhancements to deliver up to 2.5× the socket performance [2] of its 32nm predecessor, POWER7+™ [3]. POWER8 contains 4.2B transistors and 31.5μF of deep-trench decoupling capacitance. Three thin-oxide transistor Vts are used for power/performance tuning, and thick-oxide transistors enable high-voltage I/O and analog designs. The 15-layer BEOL contains 5-80nm, 2-144nm, 3-288nm, and 3-640nm pitch layers for low-latency communication as well as 2-2400nm ultra-thick-metal (UTM) pitch layers for low-resistance distribution of power and clocks.

81 citations

Proceedings Article

6.6+ GHz Low Vmin, read and half select disturb-free 1.2 Mb SRAM

Rajiv V. Joshi¹, Robert M. Houle¹, Kevin A. Batson¹, Daniel Rodko¹, Pradip Patel¹, William V. Huott¹, R.L. Franch¹, Y.H. Chan¹, Donald W. Plass¹, S. Wilson¹, P. Wang¹

IBM¹

14 Jun 2007

TL;DR: A fully functional read and half select disturb-free 1.2 Mb SRAM is demonstrated, with a speed of 6.6+ GHz at 1 V, 25 °C and a yield of 90-100%.

Abstract: A fully functional read and half select disturb-free 1.2 Mb SRAM is demonstrated. Measured results show an operating range of 0.4 V to 1.5 V and -25 °C to 100 °C, a speed of 6.6+ GHz at 1 V, 25 °C, and a yield of 90-100%.

66 citations

Journal Article

A 45 nm SOI Embedded DRAM Macro for the POWER7™ Processor 32 MByte On-Chip L3 Cache

John E. Barth¹, Donald W. Plass¹, Erik A. Nelson¹, Chorng-Lii Hwang¹, Gregory J. Fredeman¹, Michael A. Sperling¹, Abraham Mathews¹, T. Kirihata¹, William Robert Reohr¹, K Nair¹, Nianzheng Caon¹

IBM¹

01 Jan 2011-IEEE Journal of Solid-State Circuits

TL;DR: A 1.35 ns random-access and 1.7 ns random-cycle SOI embedded-DRAM macro has been developed for the POWER7™ high-performance microprocessor, allowing the embedded DRAM to operate reliably without constraining the microprocessor voltage supply windows.

Abstract: A 1.35 ns random-access and 1.7 ns random-cycle SOI embedded-DRAM macro has been developed for the POWER7™ high-performance microprocessor. The macro employs a 6-transistor micro sense-amplifier architecture with an extended precharge scheme to enhance the sensing margin for product quality. The detailed study shows a 67% bit-line power reduction with only 1.7% area overhead, while improving the read-zero margin by more than 500 ps. The array voltage window is improved by the programmable BL voltage generator, allowing the embedded DRAM to operate reliably without constraining the microprocessor voltage supply windows. The 2.5 nm gate-oxide transistor cell with deep-trench capacitor is accessed by the 1.7 V wordline high voltage (VPP) with a wordline low voltage (VWL), and both are generated internally within the microprocessor. This results in a 32 MB on-chip L3 cache for 8 cores in a 567 mm² POWER7™ die.

63 citations

Cited by

Journal Article

Introduction to the Cell multiprocessor

J. A. Kahle¹, M. N. Day¹, Harm Peter Hofstee¹, Charles Ray Johns¹, T. R. Maeurer¹, David Shippy¹

IBM¹

01 Jul 2005-IBM Journal of Research and Development

TL;DR: This paper discusses the history of the project, the program objectives and challenges, the design concept, the architecture and programming models, and the implementation of the Cell multiprocessor.

Abstract: This paper provides an introductory overview of the Cell multiprocessor. Cell represents a revolutionary extension of conventional microprocessor architecture and organization. The paper discusses the history of the project, the program objectives and challenges, the design concept, the architecture and programming models, and the implementation.

1,077 citations

TILE64™ Processor: A 64-Core SoC with Mesh Interconnect

Shane L. Bell, Bruce S. Edwards, John Amann, Rich Conlin, Kevin Joyce, Vince Leung, John MacKay, Mike Reif, Liewei Bao, J.F. Brown, Matthew Mattina, Chyi-Chang Miao, Carl Ramey, David Wentzlaff, Walker Anderson, Ethan Berger, Nat Fairbanks, Durlov Khan, Froilan Montenegro, Jay Stickney, John Zook

01 Jan 2010

TL;DR: The TILE64™ processor is a multicore SoC targeting the high-performance demands of a wide range of embedded applications across networking and digital multimedia, with 64 tile processors arranged in an 8x8 array.

Abstract: The TILE64™ processor is a multicore SoC targeting the high-performance demands of a wide range of embedded applications across networking and digital multimedia. A figure shows a block diagram with 64 tile processors arranged in an 8x8 array. These tiles connect through a scalable 2D mesh network with high-speed I/Os on the periphery. Each general-purpose processor is identical and capable of running SMP Linux.

634 citations

Proceedings Article

TILE64™ Processor: A 64-Core SoC with Mesh Interconnect

Shane L. Bell, Bruce S. Edwards, John Amann, Richard Conlin, Kevin Joyce, V. Leung, J. MacKay, M. Reif, Liewei Bao, J.F. Brown, Matthew Mattina, Chyi-Chang Miao, Carl Ramey, David Wentzlaff, W. Anderson, E. Berger, N. Fairbanks, D. Khan, F. Montenegro, J. Stickney, J. Zook

01 Feb 2008

TL;DR: The TILE64™ processor is a multicore SoC targeting the high-performance demands of a wide range of embedded applications across networking and digital multimedia.

587 citations

Journal Article

Techniques for Multicore Thermal Management: Classification and New Exploration

James Donald¹, Margaret Martonosi¹

Princeton University¹

01 May 2006

TL;DR: This paper explores various thermal management techniques that exploit the distributed nature of multicore processors in terms of core throttling policy, whether that policy is applied locally to a core or to the processor as a whole, and process migration policies.

Abstract: Power density continues to increase exponentially with each new technology generation, posing a major challenge for thermal management in modern processors. Much past work has examined microarchitectural policies for reducing total chip power, but these techniques alone are insufficient if not aimed at mitigating individual hotspots. The industry's current trend has been toward multicore architectures, which provide additional opportunities for dynamic thermal management. This paper explores various thermal management techniques that exploit the distributed nature of multicore processors. We classify these techniques in terms of core throttling policy, whether that policy is applied locally to a core or to the processor as a whole, and process migration policies. We use Turandot and a HotSpot-based thermal simulator to simulate a variety of workloads under thermal duress on a 4-core PowerPC™ processor. Using benchmarks from the SPEC 2000 suite, we characterize workloads in terms of instruction throughput as well as their effective duty cycles. Among a variety of options, we find that distributed control-theoretic DVFS alone improves throughput by 2.5X under our test conditions. Our final design involves a PI-based core thermal controller and an outer control loop to decide process migrations. This policy avoids all thermal emergencies and yields an average of 2.6X speedup over the baseline across all workloads.

482 citations

Journal Article

A Survey of CPU-GPU Heterogeneous Computing Techniques

Sparsh Mittal¹, Jeffrey S. Vetter¹

Oak Ridge National Laboratory¹

21 Jul 2015-ACM Computing Surveys

TL;DR: This article surveys Heterogeneous Computing Techniques (HCTs) such as workload partitioning that enable utilizing both CPUs and GPUs to improve performance and/or energy efficiency and reviews both discrete and fused CPU-GPU systems.

Abstract: As both CPUs and GPUs become employed in a wide range of applications, it has been acknowledged that both of these Processing Units (PUs) have their unique features and strengths and hence, CPU-GPU collaboration is inevitable to achieve high-performance computing. This has motivated a significant amount of research on heterogeneous computing techniques, along with the design of CPU-GPU fused chips and petascale heterogeneous supercomputers. In this article, we survey Heterogeneous Computing Techniques (HCTs) such as workload partitioning that enable utilizing both CPUs and GPUs to improve performance and/or energy efficiency. We review heterogeneous computing approaches at runtime, algorithm, programming, compiler, and application levels. Further, we review both discrete and fused CPU-GPU systems and discuss benchmark suites designed for evaluating Heterogeneous Computing Systems (HCSs). We believe that this article will provide insights into the workings and scope of applications of HCTs to researchers and motivate them to further harness the computational powers of CPUs and GPUs to achieve the goal of exascale performance.

414 citations
