Author

Kubilay Atasu

Other affiliations: École Polytechnique Fédérale de Lausanne, Boğaziçi University, École Polytechnique

Bio: Kubilay Atasu is an academic researcher at IBM. The author has contributed to research on topics including instruction sets and regular expressions, has an h-index of 14, and has co-authored 57 publications that have received 1,318 citations. Previous affiliations of Kubilay Atasu include École Polytechnique Fédérale de Lausanne and Boğaziçi University.

Papers published on a yearly basis (chart omitted; years shown: 2002 to 2023, with no entry for 2021)

Papers

Proceedings Article

Automatic application-specific instruction-set extensions under microarchitectural constraints

Kubilay Atasu¹, Laura Pozzi², Paolo Ienne²

Boğaziçi University¹, École Polytechnique Fédérale de Lausanne²

02 Jun 2003

TL;DR: This paper introduces a more general algorithm that selects maximal-speedup convex subgraphs of the application dataflow graph under fundamental microarchitectural constraints and improves significantly on the state of the art.

Abstract: Many commercial processors now offer the possibility of extending their instruction set for a specific application - that is, to introduce customized functional units. There is a need to develop algorithms that decide automatically, from high-level application code, which operations are to be carried out in the customized extensions. A few algorithms exist but are severely limited in the type of operation clusters they can choose and hence reduce significantly the effectiveness of specialization. In this paper, we introduce a more general algorithm which selects maximal-speedup convex subgraphs of the application dataflow graph under fundamental microarchitectural constraints, and which improves significantly on the state of the art.

355 citations
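
A minimal sketch of the convexity condition mentioned in the abstract above, assuming the usual definition: a set of dataflow-graph nodes is convex if no path between two of its nodes passes through a node outside the set. The function names `reachable` and `is_convex` and the edge-list representation are illustrative assumptions, not taken from the paper. The sketch only checks convexity of a given node set; it does not implement the paper's selection algorithm or its other constraints.

```python
from collections import deque


def reachable(adj, sources):
    """Return every node reachable from `sources` by following at least one edge."""
    seen = set()
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_convex(edges, subgraph):
    """Check whether `subgraph` (a set of nodes) is convex in the DAG given by `edges`.

    A node outside the set that is both downstream and upstream of the set
    lies on a path that leaves the set and re-enters it, which breaks convexity.
    """
    succ, pred = {}, {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
        pred.setdefault(v, []).append(u)
    nodes = set(subgraph)
    descendants = reachable(succ, nodes)
    ancestors = reachable(pred, nodes)
    return not ((descendants & ancestors) - nodes)


# Hypothetical three-node dataflow graph: a feeds b and c, b feeds c.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
print(is_convex(edges, {"a", "b"}))  # no outside node sits between a and b
print(is_convex(edges, {"a", "c"}))  # b sits on the path a -> b -> c
```

In this toy graph, grouping `a` and `c` into one custom instruction would require the result of `b` in the middle of the instruction, which is why non-convex node sets are excluded.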

Journal Article

Exact and approximate algorithms for the extension of embedded processor instruction sets

Laura Pozzi¹, Kubilay Atasu², Paolo Ienne²

École Polytechnique¹, École Polytechnique Fédérale de Lausanne²

01 Jul 2006 - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: This paper provides a formal methodology and a set of exact and approximate algorithms for identifying instruction-set extensions (ISEs) for a given application; the solutions found are close to those a designer would obtain by a detailed study of the application code.

Abstract: In embedded computing, cost, power, and performance constraints call for the design of specialized processors, rather than for the use of the existing off-the-shelf solutions. While the design of these application-specific CPUs could be tackled from scratch, a cheaper and more effective option is that of extending the existing processors and toolchains. Extensibility is indeed a feature now offered in real designs, e.g., by processors such as Tensilica Xtensa [T. R. Halfhill, Microprocess Rep., 2003], ARC ARCtangent [T. R. Halfhill, Microprocess Rep., 2000], STMicroelectronics ST200 [P. Faraboschi, G. Brown, J. A. Fisher, G. Desoli, and F. Homewood, Proc. 27th Annu. Int. Symp. Computer Architecture, 2000, p. 203], and MIPS CorExtend [T. R. Halfhill, Microprocess Rep., 2003]. While all these processors provide development environments with simulation capabilities for efficiently evaluating hand-crafted solutions, the tools to identify automatically the best processor configuration for a given application are less common. In particular, solutions to choose specialized instruction-set extensions (ISEs) have been investigated in the past years but are still seldom part of commercial toolchains. This paper provides a formal methodology and a set of algorithms that help address the problem. It proposes exact algorithms to derive optimal ISEs; exact identification of a single ISE is applicable to basic blocks of up to 1500 assembler-like instructions. This paper also introduces approximate methods that can process basic blocks of larger size. Results show that the described algorithms find solutions close to those that a designer would obtain by a detailed study of the application code. Both heuristic and exact algorithms find ISEs able to speed up unextended processors up to 5.0x. State-of-the-art comparisons show that the presented algorithms outperform existing ones by up to 2.6x.

212 citations

Proceedings Article

An integer linear programming approach for identifying instruction-set extensions

Can Özturan¹, Gunhan Dundar¹, Kubilay Atasu¹

Boğaziçi University¹

19 Sep 2005

TL;DR: This paper presents an integer linear programming (ILP) approach to the instruction-set extension identification problem, built on an algorithm that iteratively generates and solves a set of ILP problems to produce a set of templates.

Abstract: This paper presents an Integer Linear Programming (ILP) approach to the instruction-set extension identification problem. An algorithm that iteratively generates and solves a set of ILP problems in order to generate a set of templates is proposed. A selection algorithm that ranks the generated templates based on isomorphism testing and potential evaluation is described. A Trimaran-based framework is used to evaluate the quality of the instructions generated by the technique. Speed-up results of up to 7.5 are observed.

88 citations

Proceedings Article

Introduction of local memory elements in instruction set extensions

Partha Biswas¹, Vinay Choudhary², Kubilay Atasu², Laura Pozzi², Paolo Ienne², Nikil Dutt¹

University of California, Irvine¹, École Polytechnique Fédérale de Lausanne²

07 Jun 2004

TL;DR: This paper introduces memory elements into custom units, resulting in ISEs closer to those sought by designers, and devises a genetic algorithm that specifically exploits opportunities to introduce memory elements during ISE generation.

Abstract: Automatic generation of Instruction Set Extensions (ISEs), to be executed on a custom processing unit or a coprocessor, is an important step towards processor customization. A typical goal of a manual designer is to combine a large number of atomic instructions into an ISE satisfying microarchitectural constraints. However, memory operations pose a challenge for previous ISE approaches by limiting the size of the resulting instruction. In this paper, we introduce memory elements into custom units, which results in ISEs closer to those sought after by the designers. We consider two kinds of memory elements for mapping to the specialized hardware: small hardware tables and architecturally-visible state registers. We devised a genetic algorithm to specifically exploit opportunities of introducing memory elements during ISE generation. Finally, we demonstrate the effectiveness of our approach by a detailed study of the variation in performance, area and energy in the presence of the generated ISEs, on a number of MediaBench, EEMBC and cryptographic applications. With the introduction of memory, the average speedup varied from 2.7X to 5X depending on the architectural configuration with a nominal area overhead. Moreover, we obtained an average energy reduction of 26% with respect to a 32-KB cache.

72 citations

Proceedings Article

Designing a Programmable Wire-Speed Regular-Expression Matching Accelerator

Jan van Lunteren¹, Christoph Hagleitner¹, Timothy Heil², Giora Biran¹, Uzi Shvadron¹, Kubilay Atasu¹

IBM¹, University of Rochester²

01 Dec 2012

TL;DR: The RegX accelerator in the IBM Power Edge of Network (PowerEN) processor supports these applications using a combination of fast programmable state machines and simple processing units to scan data streams against thousands of regular-expression patterns at state-of-the-art Ethernet link speeds.

Abstract: A growing number of applications rely on fast pattern matching to scan data in real-time for security and analytics purposes. The RegX accelerator in the IBM Power Edge of Network (PowerEN) processor supports these applications using a combination of fast programmable state machines and simple processing units to scan data streams against thousands of regular-expression patterns at state-of-the-art Ethernet link speeds. RegX employs a special rule cache and includes several new micro-architectural features that enable various instruction dispatch and execution options for the processing units. The architecture applies RISC philosophy to special-purpose computing: hardware provides fast, simple primitives, typically performed in a single cycle, which are exploited by an intelligent compiler and system software for high performance. This approach provides the flexibility required to achieve good performance across a wide range of workloads. As implemented in the PowerEN processor, the accelerator achieves a theoretical peak scan rate of 73.6 Gbit/s, and a measured scan rate of about 15 to 40 Gbit/s for typical intrusion detection workloads.

62 citations

Cited by

Matrix Factorization Techniques for Recommender Systems

Patrick Seemann

01 Jan 2014

2,080 citations

Journal Article

Interpolation and SAT-based model checking

Kenneth L. McMillan¹

Lawrence Berkeley National Laboratory¹

01 Jan 2003 - Lecture Notes in Computer Science

TL;DR: In benchmark studies using a set of large industrial circuit verification instances, this method is greatly more efficient than BDD-based symbolic model checking, and compares favorably to some recent SAT-based model checking methods on positive instances.

Abstract: We consider a fully SAT-based method of unbounded symbolic model checking based on computing Craig interpolants. In benchmark studies using a set of large industrial circuit verification instances, this method is greatly more efficient than BDD-based symbolic model checking, and compares favorably to some recent SAT-based model checking methods on positive instances.

775 citations

Journal Article

Multiprocessor System-on-Chip (MPSoC) Technology

Wayne Wolf¹, Ahmed Amine Jerraya¹, G. Martin

Georgia Institute of Technology¹

01 Oct 2008 - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: This paper surveys the history of MPSoCs to argue that they represent an important and distinct category of computer architecture, and also surveys computer-aided design problems relevant to the design of MPSoCs.

Abstract: The multiprocessor system-on-chip (MPSoC) uses multiple CPUs along with other hardware subsystems to implement a system. A wide range of MPSoC architectures have been developed over the past decade. This paper surveys the history of MPSoCs to argue that they represent an important and distinct category of computer architecture. We consider some of the technological trends that have driven the design of MPSoCs. We also survey computer-aided design problems relevant to the design of MPSoCs.

435 citations

Journal Article

LegUp: An open-source high-level synthesis tool for FPGA-based processor/accelerator systems

Andrew Canis¹, Jongsok Choi¹, Mark Aldham¹, Victor Zhang¹, Ahmed Kammoona¹, Tomasz Czajkowski², Stephen J. Brown¹, Jason H. Anderson¹

University of Toronto¹, Altera²

30 Sep 2013 - ACM Transactions on Embedded Computing Systems

TL;DR: LegUp is an open-source high-level synthesis tool that compiles a standard C program to a hybrid FPGA-based processor/accelerator architecture; results show that it produces hardware of comparable quality to a commercial high-level synthesis tool and can explore the hardware/software codesign space by varying how much of a program runs in software versus hardware.

Abstract: It is generally accepted that a custom hardware implementation of a set of computations will provide superior speed and energy efficiency relative to a software implementation. However, the cost and difficulty of hardware design is often prohibitive, and consequently, a software approach is used for most applications. In this article, we introduce a new high-level synthesis tool called LegUp that allows software techniques to be used for hardware design. LegUp accepts a standard C program as input and automatically compiles the program to a hybrid architecture containing an FPGA-based MIPS soft processor and custom hardware accelerators that communicate through a standard bus interface. In the hybrid processor/accelerator architecture, program segments that are unsuitable for hardware implementation can execute in software on the processor. LegUp can synthesize most of the C language to hardware, including fixed-sized multidimensional arrays, structs, global variables, and pointer arithmetic. Results show that the tool produces hardware solutions of comparable quality to a commercial high-level synthesis tool. We also give results demonstrating the ability of the tool to explore the hardware/software codesign space by varying the amount of a program that runs in software versus hardware. LegUp, along with a set of benchmark C programs, is open source and freely downloadable, providing a powerful platform that can be leveraged for new research on a wide range of high-level synthesis topics.

302 citations

Journal Article

Hardware/Software Codesign: The Past, the Present, and Predicting the Future

Jürgen Teich¹

University of Erlangen-Nuremberg¹

13 May 2012

TL;DR: This paper presents major achievements of two decades of research on methods and tools for hardware/software codesign by starting with a historical survey of its roots, highlighting its major research directions and achievements until today, and predicting in which direction research in codesign might evolve in the decades to come.

Abstract: Hardware/software codesign investigates the concurrent design of hardware and software components of complex electronic systems. It tries to exploit the synergy of hardware and software with the goal to optimize and/or satisfy design constraints such as cost, performance, and power of the final product. At the same time, it targets to reduce the time-to-market frame considerably. This paper presents major achievements of two decades of research on methods and tools for hardware/software codesign by starting with a historical survey of its roots, by highlighting its major research directions and achievements until today, and finally, by predicting in which direction research in codesign might evolve in the decades to come.

275 citations
