Paul Feautrier

Other affiliations: University of Paris, French Institute for Research in Computer Science and Automation, Versailles Saint-Quentin-en-Yvelines University

Bio: Paul Feautrier is an academic researcher from École normale supérieure de Lyon. The author has contributed to research on automatic parallelization and the polytope model. The author has an h-index of 24 and has co-authored 75 publications receiving 3992 citations. Previous affiliations of Paul Feautrier include University of Paris and the French Institute for Research in Computer Science and Automation.

Papers published on a yearly basis (chart not reproduced; the years shown run from 1986 to 2018)

Papers

Journal Article•DOI•

Dataflow analysis of array and scalar references

Paul Feautrier

01 Feb 1991-International Journal of Parallel Programming

TL;DR: This paper presents an algorithm for analyzing the patterns along which values flow as the execution proceeds, and discusses several applications of the method: conversion of a program to a set of recurrence equations, array and scalar expansion, program verification and parallel program construction.

Abstract: Given a program written in a simple imperative language (assignment statements, for loops, affine indices and loop limits), this paper presents an algorithm for analyzing the patterns along which values flow as the execution proceeds. For each array or scalar reference, the result is the name and iteration vector of the source statement as a function of the iteration vector of the referencing statement. The paper discusses several applications of the method: conversion of a program to a set of recurrence equations, array and scalar expansion, program verification and parallel program construction.

618 citations

Journal Article•DOI•

Some efficient solutions to the affine scheduling problem: I. One-dimensional time

Paul Feautrier

01 Oct 1992-International Journal of Parallel Programming

TL;DR: This paper deals with the problem of finding closed form schedules as affine or piecewise affine functions of the iteration vector and presents an algorithm which reduces the scheduling problem to a parametric linear program of small size, which can be readily solved by an efficient algorithm.

Abstract: Programs and systems of recurrence equations may be represented as sets of actions which are to be executed subject to precedence constraints. In many cases, actions may be labelled by integral vectors in some iteration domains, and precedence constraints may be described by affine relations. A schedule for such a program is a function which assigns an execution date to each action. Knowledge of such a schedule allows one to estimate the intrinsic degree of parallelism of the program and to compile a parallel version for multiprocessor architectures or systolic arrays. This paper deals with the problem of finding closed form schedules as affine or piecewise affine functions of the iteration vector. An algorithm is presented which reduces the scheduling problem to a parametric linear program of small size, which can be readily solved by an efficient algorithm.

614 citations

Journal Article•DOI•

Parametric integer programming

Paul Feautrier

01 Jan 1988-RAIRO-Operations Research

TL;DR: The semantic analysis of computer programs leads to the resolution of parametric integer programming problems; this paper constructs an algorithm for problems of this type.

Abstract: The semantic analysis of computer programs leads to the resolution of parametric integer programming problems. The article is accordingly devoted to the construction of an algorithm of this type.

454 citations

Journal Article•DOI•

Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time

Paul Feautrier

01 Dec 1992-International Journal of Parallel Programming

TL;DR: This paper extends the algorithms which were developed in Part I to cases in which there is no affine schedule, i.e. to problems whose parallel complexity is polynomial but not linear, and gives some experimental evidence for the applicability, performance and limitations of the algorithm.

Abstract: This paper extends the algorithms which were developed in Part I to cases in which there is no affine schedule, i.e. to problems whose parallel complexity is polynomial but not linear. The natural generalization is to multidimensional schedules with lexicographic ordering as temporal succession. Multidimensional affine schedules are, in a sense, equivalent to polynomial schedules, and are much easier to handle automatically. Furthermore, there is a strong connection between multidimensional schedules and loop nests, which allows one to prove that a static control program always has a multidimensional schedule. Roughly, a larger dimension indicates less parallelism. In the algorithm which is presented here, this dimension is computed dynamically, and is just sufficient for scheduling the source program. The algorithm lends itself to a “divide and conquer” strategy. The paper gives some experimental evidence for the applicability, performance and limitations of the algorithm.

445 citations

Book Chapter•DOI•

Automatic Parallelization in the Polytope Model

Paul Feautrier

01 Jan 1996

TL;DR: This paper explains the importance of polytopes and polyhedra in automatic parallelization, and shows that the semantics of parallel programs is best described geometrically, as properties of sets of integral points in n-dimensional spaces.

Abstract: The aim of this paper is to explain the importance of polytopes and polyhedra in automatic parallelization. We show that the semantics of parallel programs is best described geometrically, as properties of sets of integral points in n-dimensional spaces, where n is related to the maximum nesting depth of DO loops. The needed properties translate nicely to properties of polyhedra, for which many algorithms have been designed for the needs of optimization and operations research. We show how these ideas apply to scheduling, placement and parallel code generation.

217 citations

Cited by

Proceedings Article•DOI•

A data locality optimizing algorithm

Michael Wolf¹, Monica S. Lam¹

Stanford University¹

01 May 1991

TL;DR: An algorithm that improves the locality of a loop nest by transforming the code via interchange, reversal, skewing and tiling is proposed, and is successful in optimizing codes such as matrix multiplication, successive over-relaxation, LU decomposition without pivoting, and Givens QR factorization.

Abstract: This paper proposes an algorithm that improves the locality of a loop nest by transforming the code via interchange, reversal, skewing and tiling. The loop transformation algorithm is based on two concepts: a mathematical formulation of reuse and locality, and a loop transformation theory that unifies the various transforms as unimodular matrix transformations. The algorithm has been implemented in the SUIF (Stanford University Intermediate Format) compiler, and is successful in optimizing codes such as matrix multiplication, successive over-relaxation (SOR), LU decomposition without pivoting, and Givens QR factorization. Performance evaluation indicates that locality optimization is especially crucial for scaling up the performance of parallel code.

1,352 citations

Journal Article•DOI•

Compiler transformations for high-performance computing

David F. Bacon¹, Susan L. Graham¹, Oliver Sharp¹

University of California, Berkeley¹

01 Dec 1994-ACM Computing Surveys

TL;DR: This survey is a comprehensive overview of the important high-level program restructuring techniques for imperative languages, such as C and Fortran. It describes the purpose of each transformation, explains how to determine if it is legal, and gives an example of its application.

Abstract: In the last three decades a large number of compiler transformations for optimizing programs have been implemented. Most optimizations for uniprocessors reduce the number of instructions executed by the program using transformations based on the analysis of scalar quantities and data-flow techniques. In contrast, optimizations for high-performance superscalar, vector, and parallel processors maximize parallelism and memory locality with transformations that rely on tracking the properties of arrays using loop dependence analysis. This survey is a comprehensive overview of the important high-level program restructuring techniques for imperative languages, such as C and Fortran. Transformations for both sequential and various types of parallel architectures are covered in depth. We describe the purpose of each transformation, explain how to determine if it is legal, and give an example of its application. Programmers wishing to enhance the performance of their code can use this survey to improve their understanding of the optimizations that compilers can perform, or as a reference for techniques to be applied manually. Students can obtain an overview of optimizing compiler technology. Compiler writers can use this survey as a reference for most of the important optimizations developed to date, and as bibliographic reference for the details of each optimization. Readers are expected to be familiar with modern computer architecture and basic program compilation techniques.

946 citations

Proceedings Article•DOI•

A practical automatic polyhedral parallelizer and locality optimizer

Uday Bondhugula¹, Albert Hartono¹, J. Ramanujam², P. Sadayappan¹

Ohio State University¹, Louisiana State University²

07 Jun 2008

TL;DR: An automatic polyhedral source-to-source transformation framework that can optimize regular programs for parallelism and locality simultaneously, implemented as a tool that automatically generates OpenMP parallel code from C program sections.

Abstract: We present the design and implementation of an automatic polyhedral source-to-source transformation framework that can optimize regular programs (sequences of possibly imperfectly nested loops) for parallelism and locality simultaneously. Through this work, we show the practicality of analytical model-driven automatic transformation in the polyhedral model -- far beyond what is possible by current production compilers. Unlike previous works, our approach is an end-to-end fully automatic one driven by an integer linear optimization framework that takes an explicit view of finding good ways of tiling for parallelism and locality using affine transformations. The framework has been implemented into a tool to automatically generate OpenMP parallel code from C program sections. Experimental results from the tool show very high speedups for local and parallel execution on multi-cores over state-of-the-art compiler frameworks from the research community as well as the best native production compilers. The system also enables the easy use of powerful empirical/iterative optimization for general arbitrarily nested loop sequences.

930 citations

Journal Article•DOI•

SPIRAL: Code Generation for DSP Transforms

Markus Püschel¹, Jose M. F. Moura¹, Jeremy Johnson², David Padua³, Manuela Veloso¹, Bryan Singer, Jianxin Xiong⁴, Franz Franchetti¹, A. Gacic¹, Yevgen Voronenko¹, K. Chen⁵, R. W. Johnson, Nick Rizzolo³

Carnegie Mellon University¹, Drexel University², University of Illinois at Urbana–Champaign³, University of Cambridge⁴, STMicroelectronics⁵

27 Jun 2005

TL;DR: SPIRAL generates high-performance code for a broad set of DSP transforms, including the discrete Fourier transform, other trigonometric transforms, filter transforms, and discrete wavelet transforms.

Abstract: Fast changing, increasingly complex, and diverse computing platforms pose central problems in scientific computing: How to achieve, with reasonable effort, portable optimal performance? We present SPIRAL, which considers this problem for the performance-critical domain of linear digital signal processing (DSP) transforms. For a specified transform, SPIRAL automatically generates high-performance code that is tuned to the given platform. SPIRAL formulates the tuning as an optimization problem and exploits the domain-specific mathematical structure of transform algorithms to implement a feedback-driven optimizer. Similar to a human expert, for a specified transform, SPIRAL "intelligently" generates and explores algorithmic and implementation choices to find the best match to the computer's microarchitecture. The "intelligence" is provided by search and learning techniques that exploit the structure of the algorithm and implementation space to guide the exploration and optimization. SPIRAL generates high-performance code for a broad set of DSP transforms, including the discrete Fourier transform, other trigonometric transforms, filter transforms, and discrete wavelet transforms. Experimental results show that the code generated by SPIRAL competes with, and sometimes outperforms, the best available human tuned transform library code.

853 citations

Journal Article•DOI•

Advanced compiler optimizations for supercomputers

David Padua¹, Michael Wolfe¹

University of Illinois at Urbana–Champaign¹

01 Dec 1986-Communications of the ACM

TL;DR: Compilers for vector or multiprocessor computers must have certain optimization features to successfully generate code that runs in parallel on those systems.

Abstract: Compilers for vector or multiprocessor computers must have certain optimization features to successfully generate parallel code.

758 citations
