Author

Glenn Reinman

Other affiliations: University of California, San Diego, Northwestern University, University of California, Berkeley

Bio: Glenn Reinman is an academic researcher at the University of California, Los Angeles. The author has contributed to research on topics including cache and cache pollution. The author has an h-index of 38 and has co-authored 109 publications receiving 4062 citations. Previous affiliations of Glenn Reinman include University of California, San Diego and Northwestern University.

Topics: Cache, Cache pollution, Cache algorithms, Smart Cache, Cache coloring

Papers published on a yearly basis (chart; years shown: 1998 to 2017, 2019, 2020, 2023)

Papers

Proceedings Article

CMP network-on-chip overlaid with multi-band RF-interconnect

Mau-Chung Frank Chang¹, Jason Cong¹, Adam Kaplan¹, Mishali Naik¹, Glenn Reinman¹, Eran Socher¹, Sai-Wang Tam¹

University of California, Los Angeles¹

24 Oct 2008

TL;DR: This paper explores the use of multi-band radio frequency interconnect (RF-I), with signal propagation at the speed of light, to provide shortcuts in a many-core network-on-chip (NoC) mesh topology; it investigates the costs associated with this technology and examines the latency and bandwidth benefits it can provide.

Abstract: In this paper, we explore the use of multi-band radio frequency interconnect (or RF-I) with signal propagation at the speed of light to provide shortcuts in a many-core network-on-chip (NoC) mesh topology. We investigate the costs associated with this technology, and examine the latency and bandwidth benefits that it can provide. Assuming a 400 mm² die, we demonstrate that in exchange for 0.13% of area overhead on the active layer, RF-I can provide an average 13% (max 18%) boost in application performance, corresponding to an average 22% (max 24%) reduction in packet latency. We observe that RF access points may become traffic bottlenecks when many packets try to use the RF at once, and conclude by proposing strategies that adapt RF-I utilization at runtime to actively combat this congestion.

276 citations

Proceedings Article

A scalable micro wireless interconnect structure for CMPs

Suk-Bok Lee¹, Sai-Wang Tam¹, Ioannis Pefkianakis¹, Songwu Lu¹, M. Frank Chang¹, Chuanxiong Guo², Glenn Reinman¹, Chunyi Peng², Mishali Naik¹, Lixia Zhang¹, Jason Cong¹

University of California, Los Angeles¹, Microsoft²

20 Sep 2009

TL;DR: This paper proposes a recursive wireless interconnect structure called the WCube that features a single transmit antenna and multiple receive antennas at each micro wireless router and offers scalable performance in terms of latency and connectivity.

Abstract: This paper describes an unconventional way to apply wireless networking in emerging technologies. It makes the case for using a two-tier hybrid wireless/wired architecture to interconnect hundreds to thousands of cores in chip multiprocessors (CMPs), where current interconnect technologies face severe scaling limitations in excessive latency, long wiring, and complex layout. We propose a recursive wireless interconnect structure called the WCube that features a single transmit antenna and multiple receive antennas at each micro wireless router and offers scalable performance in terms of latency and connectivity. We show the feasibility of building miniature on-chip antennas, and simple transmitters and receivers that operate at 100-500 GHz sub-terahertz frequency bands. We also devise new two-tier wormhole-based routing algorithms that are deadlock-free and ensure a minimum-latency route on a 1000-core on-chip interconnect network. Our simulations show that our protocol suite can reduce the observed latency by 20% to 45%, and consumes power that is comparable to or less than current 2-D wired mesh designs.

220 citations

Proceedings Article

Selective value prediction

Brad Calder¹, Glenn Reinman¹, Dean M. Tullsen¹

University of California, San Diego¹

01 May 1999

TL;DR: This paper examines selective techniques for using value prediction in the presence of predictor capacity constraints and reasonable misprediction penalties, and filters which instructions put values into the value prediction table.

Abstract: Value Prediction is a relatively new technique to increase instruction-level parallelism by breaking true data dependence chains. A value prediction architecture produces values, which may be later consumed by instructions that execute speculatively using the predicted value. This paper examines selective techniques for using value prediction in the presence of predictor capacity constraints and reasonable misprediction penalties. We examine prediction and confidence mechanisms in light of these constraints, and we minimize capacity conflicts through instruction filtering. The latter technique filters which instructions put values into the value prediction table. We examine filtering techniques based on instruction type, as well as giving priority to instructions belonging to the longest data dependence path in the processor's active instruction window. We apply filtering both to the producers of predicted values and the consumers. In addition, we examine the benefit of using different confidence levels for instructions using predicted values on the longest dependence path.

199 citations

CACTI 2.0: An Integrated Cache Timing and Power Model

Glenn Reinman, Norman P. Jouppi

01 Jan 2002

129 citations

Proceedings Article

A quantitative analysis on microarchitectures of modern CPU-FPGA platforms

Young-kyu Choi¹, Jason Cong¹, Zhenman Fang¹, Yuchen Hao¹, Glenn Reinman¹, Peng Wei¹

University of California, Los Angeles¹

05 Jun 2016

TL;DR: This paper conducts a quantitative comparison and in-depth analysis on two representative platforms: QPI-based Intel-Altera HARP with coherent shared memory, and PCIe-based Alpha Data board with private device memory.

Abstract: CPU-FPGA heterogeneous acceleration platforms have shown great potential for continued performance and energy efficiency improvement for modern data centers, and have captured great attention from both academia and industry. However, it is nontrivial for users to choose the right platform among various PCIe and QPI based CPU-FPGA platforms from different vendors. This paper aims to find out what microarchitectural characteristics affect the performance, and how. We conduct our quantitative comparison and in-depth analysis on two representative platforms: QPI-based Intel-Altera HARP with coherent shared memory, and PCIe-based Alpha Data board with private device memory. We provide multiple insights for both application developers and platform designers.

128 citations

Cited by

Computer vision: a modern approach

David Forsyth, Jean Ponce

01 Jan 2004

TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.

Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image-based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudocode. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

3,627 citations

Proceedings Article

The PARSEC benchmark suite: characterization and architectural implications

Christian Bienia¹, Sanjeev Kumar², Jaswinder Pal Singh¹, Kai Li¹

Princeton University¹, Intel²

25 Oct 2008

TL;DR: This paper presents and characterizes the Princeton Application Repository for Shared-Memory Computers (PARSEC), a benchmark suite for studies of Chip-Multiprocessors (CMPs), and shows that the benchmark suite covers a wide spectrum of working sets, locality, data sharing, synchronization and off-chip traffic.

Abstract: This paper presents and characterizes the Princeton Application Repository for Shared-Memory Computers (PARSEC), a benchmark suite for studies of Chip-Multiprocessors (CMPs). Previously available benchmarks for multiprocessors have focused on high-performance computing applications and used a limited number of synchronization methods. PARSEC includes emerging applications in recognition, mining and synthesis (RMS) as well as systems applications which mimic large-scale multithreaded commercial programs. Our characterization shows that the benchmark suite covers a wide spectrum of working sets, locality, data sharing, synchronization and off-chip traffic. The benchmark suite has been made available to the public.

3,514 citations

The theory of affordances

博之三嶋

01 Nov 2008

2,686 citations

Proceedings Article

Achieving single channel, full duplex wireless communication

Jung-Il Choi¹, Mayank Jain¹, Kannan Srinivasan¹, Phil Levis¹, Sachin Katti¹

Stanford University¹

20 Sep 2010

TL;DR: In this paper, a single-channel full-duplex wireless transceiver is proposed, which uses a combination of RF and baseband techniques to achieve full duplexing with minimal effect on link reliability.

Abstract: This paper discusses the design of a single channel full-duplex wireless transceiver. The design uses a combination of RF and baseband techniques to achieve full-duplexing with minimal effect on link reliability. Experiments on real nodes show the full-duplex prototype achieves median performance that is within 8% of an ideal full-duplexing system. This paper presents Antenna Cancellation, a novel technique for self-interference cancellation. In conjunction with existing RF interference cancellation and digital baseband interference cancellation, antenna cancellation achieves the amount of self-interference cancellation required for full-duplex operation. The paper also discusses potential MAC and network gains with full-duplexing. It suggests ways in which a full-duplex system can solve some important problems with existing wireless systems including hidden terminals, loss of throughput due to congestion, and large end-to-end delays.

1,623 citations

Journal Article

ISAAC: a convolutional neural network accelerator with in-situ analog arithmetic in crossbars

Ali Shafiee¹, Anirban Nag¹, Naveen Muralimanohar², Rajeev Balasubramonian¹, John Paul Strachan², Miao Hu², R. Stanley Williams², Vivek Srikumar¹

University of Utah¹, Hewlett-Packard²

18 Jun 2016

TL;DR: This work explores an in-situ processing approach, where memristor crossbar arrays not only store input weights, but are also used to perform dot-product operations in an analog manner.

Abstract: A number of recent efforts have attempted to design accelerators for popular machine learning algorithms, such as those involving convolutional and deep neural networks (CNNs and DNNs). These algorithms typically involve a large number of multiply-accumulate (dot-product) operations. A recent project, DaDianNao, adopts a near data processing approach, where a specialized neural functional unit performs all the digital arithmetic operations and receives input weights from adjacent eDRAM banks. This work explores an in-situ processing approach, where memristor crossbar arrays not only store input weights, but are also used to perform dot-product operations in an analog manner. While the use of crossbar memory as an analog dot-product engine is well known, no prior work has designed or characterized a full-fledged accelerator based on crossbars. In particular, our work makes the following contributions: (i) We design a pipelined architecture, with some crossbars dedicated for each neural network layer, and eDRAM buffers that aggregate data between pipeline stages. (ii) We define new data encoding techniques that are amenable to analog computations and that can reduce the high overheads of analog-to-digital conversion (ADC). (iii) We define the many supporting digital components required in an analog CNN accelerator and carry out a design space exploration to identify the best balance of memristor storage/compute, ADCs, and eDRAM storage on a chip. On a suite of CNN and DNN workloads, the proposed ISAAC architecture yields improvements of 14.8×, 5.5×, and 7.5× in throughput, energy, and computational density (respectively), relative to the state-of-the-art DaDianNao architecture.

1,558 citations
