Author

Balaram Sinharoy

Other affiliations: Rensselaer Polytechnic Institute

Bio: Balaram Sinharoy is an academic researcher at IBM. The author has contributed to research on the topics Cache and Thread (computing). The author has an h-index of 32 and has co-authored 222 publications receiving 5427 citations. Previous affiliations of Balaram Sinharoy include Rensselaer Polytechnic Institute.

Papers published on a yearly basis (chart not reproduced; years shown: 1992–1997, 1999–2012, 2014–2015, 2017–2019, 2021)

Papers

Journal Article

POWER4 system microarchitecture

J. M. Tendler¹, John Steven Dodson¹, J. S. Fields¹, Hung Qui Le¹, Balaram Sinharoy¹ • Institutions (1)

IBM¹

01 Jan 2002-IBM Journal of Research and Development

TL;DR: The processor microarchitecture as well as the interconnection architecture employed to form systems up to a 32-way symmetric multiprocessor are described.

Abstract: The IBM POWER4 is a new microprocessor organized in a system structure that includes new technology to form systems. The name POWER4 as used in this context refers not only to a chip, but also to the structure used to interconnect chips to form systems. In this paper we describe the processor microarchitecture as well as the interconnection architecture employed to form systems up to a 32-way symmetric multiprocessor.

685 citations

Journal Article

IBM Power5 chip: a dual-core multithreaded processor

Ronald Nick Kalla¹, Balaram Sinharoy¹, Joel M. Tendler¹ • Institutions (1)

IBM¹

01 Mar 2004-IEEE Micro

TL;DR: Describes the approach used to improve chip-level performance of the Power5, whose design specified increased performance and functional enhancements in server virtualization, reliability, availability, and serviceability at both chip and system levels.

Abstract: IBM introduced Power4-based systems in 2001. The Power4 design integrates two processor cores on a single chip, a shared second-level cache, a directory for an off-chip third-level cache, and the necessary circuitry to connect it to other Power4 chips to form a system. The dual-processor chip provides natural thread-level parallelism at the chip level. The Power5 is the next-generation chip in this line. One of our key goals in designing the Power5 was to maintain both binary and structural compatibility with existing Power4 systems to ensure that binaries continue executing properly and all application optimizations carry forward to newer systems. With that base requirement, we specified increased performance and other functional enhancements of server virtualization, reliability, availability, and serviceability at both chip and system levels. We describe the approach we used to improve chip-level performance.

410 citations

Journal Article

POWER5 System microarchitecture

Balaram Sinharoy¹, Ronald Nick Kalla¹, Joel M. Tendler¹, Richard J. Eickemeyer², J. B. Joyner¹ • Institutions (2)

IBM¹, University of Rochester²

01 Jul 2005-IBM Journal of Research and Development

TL;DR: This paper describes the implementation of the IBM POWER5™ chip, a two-way simultaneous multithreaded dual-core chip, and systems based on it, and how it allows system scalability to 64 physical processors.

Abstract: This paper describes the implementation of the IBM POWER5™ chip, a two-way simultaneous multithreaded dual-core chip, and systems based on it. With a key goal of maintaining both binary and structural compatibility with POWER4™ systems, the POWER5 microprocessor allows system scalability to 64 physical processors. A POWER5 system allows both single-threaded and multithreaded execution modes. In single-threaded execution mode, a POWER5 system allows for higher performance than its predecessor POWER4 system at equivalent frequencies. In multithreaded execution mode, the POWER5 microprocessor implements dynamic resource balancing to ensure that each thread receives its fair share of system resources. Additionally, software-settable thread priority is enforced by the POWER5 hardware. To conserve power, the POWER5 chip implements dynamic power management that allows reduced power consumption without affecting performance.

337 citations

Journal Article

Power7: IBM's Next-Generation Server Processor

Ron Kalla¹, Balaram Sinharoy¹, William J. Starke¹, Michael Stephen Floyd¹ • Institutions (1)

IBM¹

01 Mar 2010-IEEE Micro

TL;DR: Power Systems™ continue strong with the 7th-generation Power chip: a balanced multicore design with eDRAM technology and SMT4, delivering greater than 4X the performance in the same power envelope as the previous generation.

Abstract: The Power7 is IBM's first eight-core processor, with each core capable of four-way simultaneous-multithreading operation. Its key architectural features include an advanced memory hierarchy with three levels of on-chip cache; embedded-DRAM devices used in the highest level of the cache; and a new memory interface. This balanced multicore design scales from 1 to 32 sockets in commercial and scientific environments.

259 citations

Journal Article

IBM POWER7 multicore server processor

Balaram Sinharoy¹, Ronald Nick Kalla¹, W. J. Starke¹, Hung Qui Le¹, Robert Alan Cargnoni¹, J. A. Van Norstrand¹, Bruce Joseph Ronchetti¹, Jeffrey A. Stuecheli¹, Jentje Leenstra¹, Guy Lynn Guthrie¹, Dung Quoc Nguyen¹, Bartholomew Blaner¹, Charles F. Marino¹, Eric E. Retter¹, Peter Williams¹ • Institutions (1)

IBM¹

01 May 2011-IBM Journal of Research and Development

TL;DR: The processor core and caches of the POWER7 processor chip are significantly enhanced to boost the performance of both single-threaded response-time-oriented, as well as multithreaded, throughput-oriented applications.

Abstract: The IBM POWER® processor is the dominant reduced instruction set computing microprocessor in the world today, with a rich history of implementation and innovation over the last 20 years. In this paper, we describe the key features of the POWER7® processor chip. On the chip is an eight-core processor, with each core capable of four-way simultaneous multithreaded operation. Fabricated in IBM's 45-nm silicon-on-insulator (SOI) technology with 11 levels of metal, the chip contains more than one billion transistors. The processor core and caches are significantly enhanced to boost the performance of both single-threaded response-time-oriented, as well as multithreaded, throughput-oriented applications. The memory subsystem contains three levels of on-chip cache, with SOI embedded dynamic random access memory (DRAM) devices used as the last level of cache. A new memory interface using buffered double-data-rate-three DRAM and improvements in reliability, availability, and serviceability are discussed.

167 citations

Cited by

Journal Article

Introduction to the cell multiprocessor

J. A. Kahle¹, M. N. Day¹, Harm Peter Hofstee¹, Charles Ray Johns¹, T. R. Maeurer¹, David Shippy¹ • Institutions (1)

IBM¹

01 Jul 2005-IBM Journal of Research and Development

TL;DR: This paper discusses the history of the project, the program objectives and challenges, the design concept, the architecture and programming models, and the implementation of the Cell multiprocessor.

Abstract: This paper provides an introductory overview of the Cell multiprocessor. Cell represents a revolutionary extension of conventional microprocessor architecture and organization. The paper discusses the history of the project, the program objectives and challenges, the design concept, the architecture and programming models, and the implementation.

1,077 citations

Journal Article

Single-chip microprocessor that communicates directly using light

Chen Sun¹,², Mark T. Wade³, Yunsup Lee², Jason S. Orcutt¹,⁴, Luca Alloatti¹, Michael Georgas¹, Andrew Waterman², Jeffrey M. Shainline³,⁴, Rimas Avizienis², Sen Lin², Benjamin Moss¹, Rajesh Kumar³, Fabio Pavanello³, Amir H. Atabaki¹, Henry Cook², Albert Ou², Jonathan Leu¹, Yu-Hsin Chen¹, Krste Asanovic², Rajeev J. Ram¹, Milos A. Popovic³, Vladimir Stojanovic² • Institutions (4)

Massachusetts Institute of Technology¹, University of California, Berkeley², University of Colorado Boulder³, National Institute of Standards and Technology⁴

24 Dec 2015-Nature

TL;DR: This demonstration could represent the beginning of an era of chip-scale electronic–photonic systems with the potential to transform computing system architectures, enabling more powerful computers, from network infrastructure to data centres and supercomputers.

Abstract: An electronic–photonic microprocessor chip manufactured using a conventional microelectronics foundry process is demonstrated; the chip contains 70 million transistors and 850 photonic components and directly uses light to communicate to other chips.

The rapid transfer of data between chips in computer systems and data centres has become one of the bottlenecks in modern information processing. One way of increasing speeds is to use optical connections rather than electrical wires and the past decade has seen significant efforts to develop silicon-based nanophotonic approaches to integrate such links within silicon chips, but incompatibility between the manufacturing processes used in electronics and photonics has proved a hindrance. Now Chen Sun et al. describe a 'system on a chip' microprocessor that successfully integrates electronics and photonics yet is produced using standard microelectronic chip fabrication techniques. The resulting microprocessor combines 70 million transistors and 850 photonic components and can communicate optically with the outside world. This result promises a way forward for new fast, low-power computing systems architectures.

Data transport across short electrical wires is limited by both bandwidth and power density, which creates a performance bottleneck for semiconductor microchips in modern computer systems, from mobile phones to large-scale data centres. These limitations can be overcome [1,2,3] by using optical communications based on chip-scale electronic–photonic systems [4,5,6,7] enabled by silicon-based nanophotonic devices [8]. However, combining electronics and photonics on the same chip has proved challenging, owing to microchip manufacturing conflicts between electronics and photonics. Consequently, current electronic–photonic chips [9,10,11] are limited to niche manufacturing processes and include only a few optical devices alongside simple circuits.

Here we report an electronic–photonic system on a single chip integrating over 70 million transistors and 850 photonic components that work together to provide logic, memory, and interconnect functions. This system is a realization of a microprocessor that uses on-chip photonic devices to directly communicate with other chips using light. To integrate electronics and photonics at the scale of a microprocessor chip, we adopt a ‘zero-change’ approach to the integration of photonics. Instead of developing a custom process to enable the fabrication of photonics [12], which would complicate or eliminate the possibility of integration with state-of-the-art transistors at large scale and at high yield, we design optical devices using a standard microelectronics foundry process that is used for modern microprocessors [13,14,15,16]. This demonstration could represent the beginning of an era of chip-scale electronic–photonic systems with the potential to transform computing system architectures, enabling more powerful computers, from network infrastructure to data centres and supercomputers.

1,058 citations

Journal Article

Photonic Networks-on-Chip for Future Generations of Chip Multiprocessors

Assaf Shacham, Keren Bergman¹, Luca P. Carloni¹ • Institutions (1)

Columbia University¹

01 Sep 2008-IEEE Transactions on Computers

TL;DR: Results confirm the unique benefits for future generations of CMPs that can be achieved by bringing optics into the chip in the form of photonic NoCs, as well as a comparative power analysis of a photonic versus an electronic NoC.

Abstract: The design and performance of next-generation chip multiprocessors (CMPs) will be bound by the limited amount of power that can be dissipated on a single die. We present photonic networks-on-chip (NoC) as a solution to reduce the impact of intra-chip and off-chip communication on the overall power budget. A photonic interconnection network can deliver higher bandwidth and lower latencies with significantly lower power dissipation. We explain why on-chip photonic communication has recently become a feasible opportunity and explore the challenges that need to be addressed to realize its implementation. We introduce a novel hybrid micro-architecture for NoCs combining a broadband photonic circuit-switched network with an electronic overlay packet-switched control network. We address the critical design issues including: topology, routing algorithms, deadlock avoidance, and path-setup/tear-down procedures. We present experimental results obtained with POINTS, an event-driven simulator specifically developed to analyze the proposed idea, as well as a comparative power analysis of a photonic versus an electronic NoC. Overall, these results confirm the unique benefits for future generations of CMPs that can be achieved by bringing optics into the chip in the form of photonic NoCs.

873 citations

Proceedings Article

An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget

Canturk Isci¹, Alper Buyuktosunoglu¹, C.-Y. Chen¹, Pradip Bose¹, Margaret Martonosi² • Institutions (2)

IBM¹, Princeton University²

09 Dec 2006

TL;DR: The results show that the best architected policies can come within 1% of the performance of an ideal oracle, while meeting a given chip-level power budget, and are significantly better than static management, even if static scheduling is given oracular knowledge.

Abstract: Chip-level power and thermal implications will continue to rule as one of the primary design constraints and performance limiters. The gap between average and peak power actually widens with increased levels of core integration. As such, if per-core control of power levels (modes) is possible, a global power manager should be able to dynamically set the modes suitably. This would be done in tune with the workload characteristics, in order to always maintain a chip-level power that is below the specified budget. Furthermore, this should be possible without significant degradation of chip-level throughput performance. We analyze and validate this concept in detail in this paper. We assume a per-core DVFS (dynamic voltage and frequency scaling) knob to be available to such a conceptual global power manager. We evaluate several different policies for global multi-core power management. In this analysis, we consider various different objectives such as prioritization and optimized throughput. Overall, our results show that in the context of a workload comprised of SPEC benchmark threads, our best architected policies can come within 1% of the performance of an ideal oracle, while meeting a given chip-level power budget. Furthermore, we show that these global dynamic management policies perform significantly better than static management, even if static scheduling is given oracular knowledge.

667 citations

The RISC-V Instruction Set Manual

Andrew Waterman, Yunsup Lee, David A. Patterson, Krste Asanović

01 Jan 2014

TL;DR: This draft specification may change before being accepted as standard by the RISC-V Foundation, and it remains possible that implementations made to this draft specification will not conform to the future standard.

Abstract: Volume II: Privileged Architecture, Version 1.10 (Document Version 1.10). Warning! This draft specification may change before being accepted as standard by the RISC-V Foundation. While the editors intend future changes to this specification to be forward compatible, it remains possible that implementations made to this draft specification will not conform to the future standard.

583 citations
