Author

Andreas Gerstlauer

Bio: Andreas Gerstlauer is an academic researcher from the University of Texas at Austin. The author has contributed to research in topics: Design space exploration & SpecC. The author has an h-index of 29 and has co-authored 176 publications receiving 3,093 citations. Previous affiliations of Andreas Gerstlauer include the University of California, Irvine.


Papers
Journal ArticleDOI
TL;DR: Proposes DeepThings, a framework for adaptively distributed execution of CNN-based inference applications on tightly resource-constrained IoT edge clusters, which employs a scalable Fused Tile Partitioning of convolutional layers to minimize memory footprint while exposing parallelism.
Abstract: Edge computing has emerged as a trend to improve scalability, reduce overhead, and preserve privacy by processing large-scale data, e.g., in deep learning applications, locally at the source. In IoT networks, edge devices are characterized by tight resource constraints and the often dynamic nature of data sources, where existing approaches for deploying Deep/Convolutional Neural Networks (DNNs/CNNs) can only meet IoT constraints by severely reducing accuracy or by using a static distribution that cannot adapt to dynamic IoT environments. In this paper, we propose DeepThings, a framework for adaptively distributed execution of CNN-based inference applications on tightly resource-constrained IoT edge clusters. DeepThings employs a scalable Fused Tile Partitioning (FTP) of convolutional layers to minimize memory footprint while exposing parallelism. It further realizes a distributed work-stealing approach to enable dynamic workload distribution and balancing at inference runtime. Finally, we employ a novel work-scheduling process to improve data reuse and reduce overall execution latency. Results show that our proposed FTP method can reduce memory footprint by more than 68% without sacrificing accuracy. Furthermore, compared to existing work-sharing methods, our distributed work stealing and work scheduling improve throughput by $1.7\times$–$2.2\times$ with multiple dynamic data sources. When combined, DeepThings provides scalable CNN inference speedups of $1.7\times$–$3.5\times$ on 2–6 edge devices with less than 23 MB of memory each.
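
The core of FTP is a dependency computation: each output tile of the last fused layer is traced backwards through the stack of convolutional layers to find the input region an edge device needs, so devices hold only a strip of the input rather than whole intermediate feature maps. Below is a minimal Python sketch of that backward tile-region computation under assumed layer parameters; the function names are illustrative and not the DeepThings API.

```python
# Minimal sketch of fused-tile dependency computation (illustrative, not the
# DeepThings API): given an output tile of the last fused conv layer, walk the
# layer stack backwards to find the input region each edge device must hold.

def input_range(out_lo, out_hi, kernel, stride, pad):
    """Input index range [lo, hi] needed to compute outputs [out_lo, out_hi]
    of one conv layer (1-D; apply independently to height and width)."""
    lo = out_lo * stride - pad
    hi = out_hi * stride - pad + kernel - 1
    return lo, hi

def fused_tile_region(tile_lo, tile_hi, layers):
    """Propagate an output tile backwards through all fused layers.
    `layers` is ordered first-to-last as (kernel, stride, pad) tuples."""
    lo, hi = tile_lo, tile_hi
    for kernel, stride, pad in reversed(layers):
        lo, hi = input_range(lo, hi, kernel, stride, pad)
    return lo, hi

# Example: three fused 3x3 conv layers, stride 1, pad 1. A 16-pixel-wide
# output tile needs a slightly wider input strip because the receptive
# field grows by (kernel - 1) per fused layer.
layers = [(3, 1, 1)] * 3
print(fused_tile_region(0, 15, layers))  # -> (-3, 18); clamp to image bounds
```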

323 citations

Proceedings ArticleDOI
03 Mar 2003
TL;DR: This paper proposes an RTOS model built on top of existing SLDLs which, by providing the key features typically available in any RTOS, allows the designer to model the dynamic behavior of multi-tasking systems at higher abstraction levels and to incorporate such models into existing design flows.
Abstract: System-level synthesis is widely seen as the solution for closing the productivity gap in system design. High-level system models are used in system-level design for early design exploration. While real-time operating systems (RTOS) are an increasingly important component in system design, specific RTOS implementations cannot be used directly in high-level models. On the other hand, existing system-level design languages (SLDL) lack support for RTOS modeling. In this paper, we propose an RTOS model built on top of existing SLDLs which, by providing the key features typically available in any RTOS, allows the designer to model the dynamic behavior of multi-tasking systems at higher abstraction levels and to incorporate such models into existing design flows. Experimental results show that our RTOS model is easy to use and efficient while providing accurate results.
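
The key idea is to emulate RTOS scheduling semantics on top of an SLDL's simulation kernel rather than running a real RTOS. As a rough analogue (not the paper's SpecC-based model), the Python sketch below models tasks as generators that yield simulated execution delays between scheduling points, with a static-priority scheduler serializing them onto one processor; all names are illustrative.

```python
# Minimal Python analogue (not the paper's SpecC-based model) of an abstract
# RTOS model: tasks yield their computation delays at scheduling points, and
# the scheduler serializes them by static priority, approximating the timing
# effects of multi-tasking on a single processor.

import heapq

class RTOSModel:
    def __init__(self):
        self.ready = []   # priority queue of (priority, seq, task generator)
        self.time = 0     # simulated time
        self._seq = 0     # unique tie-breaker for equal priorities

    def task_create(self, gen, priority):
        heapq.heappush(self.ready, (priority, self._seq, gen))
        self._seq += 1

    def run(self):
        # Always resume the highest-priority ready task (lower number = higher
        # priority); each yielded value is a simulated execution delay.
        while self.ready:
            prio, _, task = heapq.heappop(self.ready)
            try:
                self.time += next(task)
                self.task_create(task, prio)   # task remains ready
            except StopIteration:
                pass                           # task finished

def worker(name, segments):
    for d in segments:
        print(f"{name}: executing for {d} time units")
        yield d

os_model = RTOSModel()
os_model.task_create(worker("high", [2, 2]), priority=0)
os_model.task_create(worker("low", [5]), priority=1)
os_model.run()
print("total simulated time:", os_model.time)
```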

175 citations

Journal ArticleDOI
TL;DR: This paper develops and proposes a novel classification for ESL synthesis tools and presents six different academic approaches in this context, identifying the common principles and needs that are ultimately required for a true ESL synthesis solution.
Abstract: With ever-increasing system complexities, all major semiconductor roadmaps have identified the need for moving to higher levels of abstraction in order to increase productivity in electronic system design. Most recently, many approaches and tools that claim to realize and support a design process at the so-called electronic system level (ESL) have emerged. However, faced with the vast complexity challenges, in most cases at best only partial solutions are available. In this paper, we develop and propose a novel classification for ESL synthesis tools, and we present six different academic approaches in this context. Based on these observations, we identify the common principles and needs that lead toward, and are ultimately required for, a true ESL synthesis solution covering the whole design process from specification to implementation for complete systems across hardware and software boundaries.

174 citations

Journal ArticleDOI
TL;DR: This article presents a comprehensive design framework, the system-on-chip environment (SCE), which is based on the influential SpecC language and methodology and enables rapid design space exploration and efficient MPSoC implementation.
Abstract: The constantly growing complexity of embedded systems is a challenge that drives the development of novel design automation techniques. C-based system-level design addresses the complexity challenge by raising the level of abstraction and integrating the design processes for the heterogeneous system components. In this article, we present a comprehensive design framework, the system-on-chip environment (SCE), which is based on the influential SpecC language and methodology. SCE implements a top-down system design flow based on a specify-explore-refine paradigm with support for heterogeneous target platforms consisting of custom hardware components, embedded software processors, dedicated IP blocks, and complex communication bus architectures. Starting from an abstract specification of the desired system, models at various levels of abstraction are automatically generated through successive step-wise refinement, resulting in a pin- and cycle-accurate system implementation. The seamless integration of automatic model generation, estimation, and verification tools enables rapid design space exploration and efficient MPSoC implementation. Using a large set of industrial-strength examples with a wide range of target architectures, our experimental results demonstrate the effectiveness of our framework and show significant productivity gains in design time.

124 citations

Book
31 May 2001
TL;DR: System Design: A Practical Guide with SpecC will benefit designers and design managers of complex SOCs, or embedded systems in general, by allowing them to develop new methodologies from these results, in order to increase design productivity by orders of magnitude.
Abstract: From the Publisher: "System Design: A Practical Guide with SpecC will benefit designers and design managers of complex SOCs, or embedded systems in general, by allowing them to develop new methodologies from these results, in order to increase design productivity by orders of magnitude. Designers at RTL, logical or physical levels, who are interested in moving up to the system-level, will find a comprehensive overview within. The design models in the book define IP models and functions for IP exchange between IP providers and their users. A well-defined methodology like the one presented in this book will help product planning divisions to quickly develop new products or to derive completely new business models, like e-design or product-on-demand. Finally, researchers and students in the area of system design will find an example of a formal, well-structured design flow in this book."--BOOK JACKET.

123 citations


Cited by
Journal ArticleDOI
12 Jun 2019
TL;DR: A comprehensive survey of the recent research efforts on edge intelligence can be found in this paper, where the authors review the background and motivation for AI running at the network edge and provide an overview of the overarching architectures, frameworks, and emerging key technologies for deep learning models toward training/inference at the edge.
Abstract: With the breakthroughs in deep learning, recent years have witnessed a booming of artificial intelligence (AI) applications and services, spanning from personal assistants to recommendation systems to video/audio surveillance. More recently, with the proliferation of mobile computing and the Internet of Things (IoT), billions of mobile and IoT devices are connected to the Internet, generating zillions of bytes of data at the network edge. Driven by this trend, there is an urgent need to push the AI frontiers to the network edge so as to fully unleash the potential of edge big data. To meet this demand, edge computing, an emerging paradigm that pushes computing tasks and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new interdiscipline, edge AI or edge intelligence (EI), is beginning to receive a tremendous amount of interest. However, research on EI is still in its infancy, and a dedicated venue for exchanging the recent advances of EI is highly desired by both the computer systems and AI communities. To this end, we conduct a comprehensive survey of the recent research efforts on EI. Specifically, we first review the background and motivation for AI running at the network edge. We then provide an overview of the overarching architectures, frameworks, and emerging key technologies for deep learning models toward training/inference at the network edge. Finally, we discuss future research opportunities on EI. We believe that this survey will elicit escalating attention, stimulate fruitful discussions, and inspire further research ideas on EI.

977 citations

Proceedings ArticleDOI
27 May 2013
TL;DR: This paper reviews recent progress in the area, including design of approximate arithmetic blocks, pertinent error and quality measures, and algorithm-level techniques for approximate computing.
Abstract: Approximate computing has recently emerged as a promising approach to energy-efficient design of digital systems. Approximate computing relies on the ability of many systems and applications to tolerate some loss of quality or optimality in the computed result. By relaxing the need for fully precise or completely deterministic operations, approximate computing techniques allow substantially improved energy efficiency. This paper reviews recent progress in the area, including design of approximate arithmetic blocks, pertinent error and quality measures, and algorithm-level techniques for approximate computing.
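
As one concrete illustration of the kind of approximate arithmetic block and error measure the survey covers (not taken from the paper itself), the sketch below implements a lower-part-OR adder, a common approximate adder design from the literature, and estimates its mean error distance on random inputs.

```python
# Illustrative approximate arithmetic block and error measure: a lower-part-OR
# adder (LOA) replaces the k least-significant full adders with OR gates,
# trading accuracy for hardware cost; quality is quantified here by the mean
# error distance over random inputs.

import random

def loa_add(a, b, k, width=8):
    """Lower-part-OR adder: exact addition on the upper bits, bitwise OR
    (no carry propagation) on the k least-significant bits."""
    mask = (1 << k) - 1
    low = (a & mask) | (b & mask)        # approximate lower part
    high = ((a >> k) + (b >> k)) << k    # exact upper part, no carry-in
    return (high + low) & ((1 << (width + 1)) - 1)

def mean_error_distance(k, trials=10000, width=8):
    """Average |exact - approximate| over uniformly random operand pairs."""
    total = 0
    for _ in range(trials):
        a = random.randrange(1 << width)
        b = random.randrange(1 << width)
        total += abs((a + b) - loa_add(a, b, k, width))
    return total / trials

# Larger k means cheaper hardware but larger error; k=0 is exact.
for k in (0, 2, 4):
    print(f"k={k}: mean error distance ~ {mean_error_distance(k):.2f}")
```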

921 citations

Journal ArticleDOI
15 Jul 2019
TL;DR: This paper will provide an overview of applications where deep learning is used at the network edge, discuss various approaches for quickly executing deep learning inference across a combination of end devices, edge servers, and the cloud, and describe the methods for training deep learning models across multiple edge devices.
Abstract: Deep learning is currently widely used in a variety of applications, including computer vision and natural language processing. End devices, such as smartphones and Internet-of-Things sensors, are generating data that need to be analyzed in real time using deep learning or used to train deep learning models. However, deep learning inference and training require substantial computation resources to run quickly. Edge computing, where a fine mesh of compute nodes is placed close to end devices, is a viable way to meet the high computation and low-latency requirements of deep learning on edge devices and also provides additional benefits in terms of privacy, bandwidth efficiency, and scalability. This paper aims to provide a comprehensive review of the current state of the art at the intersection of deep learning and edge computing. Specifically, it will provide an overview of applications where deep learning is used at the network edge, discuss various approaches for quickly executing deep learning inference across a combination of end devices, edge servers, and the cloud, and describe the methods for training deep learning models across multiple edge devices. It will also discuss open challenges in terms of systems performance, network technologies and management, benchmarks, and privacy. The reader will take away the following concepts from this paper: understanding scenarios where deep learning at the network edge can be useful, understanding common techniques for speeding up deep learning inference and performing distributed training on edge devices, and understanding recent trends and opportunities.
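
One inference technique surveys in this area commonly discuss is computation partitioning: splitting a DNN between an end device and an edge server at the layer whose intermediate activations are cheapest to ship. The toy Python sketch below picks such a split point by minimizing estimated end-to-end latency; all layer timings, sizes, and names are made-up illustrative numbers, not from the paper.

```python
# Toy sketch of DNN split-point selection between an edge device and a server:
# minimize estimated total latency = device compute up to the split + transfer
# of the intermediate activations + server compute for the remaining layers.
# All numbers are illustrative assumptions.

def best_split(device_ms, server_ms, activation_kb, bandwidth_kbps):
    """Return (best split index, latency in ms). Split i runs layers [0, i)
    on the device and layers [i, n) on the server; i == 0 is server-only,
    i == n is device-only (activation_kb[n] is the final output size)."""
    n = len(device_ms)
    best = None
    for i in range(n + 1):
        latency = (sum(device_ms[:i])                          # edge compute
                   + activation_kb[i] / bandwidth_kbps * 1000  # upload time
                   + sum(server_ms[i:]))                       # server compute
        if best is None or latency < best[1]:
            best = (i, latency)
    return best

# Per-layer compute estimates (ms) and inter-layer activation sizes (KB);
# activation_kb[0] is the raw input, activation_kb[-1] the final output.
device_ms = [40, 120, 120, 60]
server_ms = [4, 3, 3, 1]
activation_kb = [600, 800, 200, 20, 1]

split, ms = best_split(device_ms, server_ms, activation_kb, bandwidth_kbps=2000)
print(f"split before layer {split}: ~{ms:.0f} ms end-to-end")
```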

793 citations