Parametric Query Optimization for Linear and Piecewise Linear Cost Functions

Arvind Hulgeri    S. Sudarshan
Indian Institute of Technology, Bombay
{aru, sudarsha}@cse.iitb.ac.in

Proceedings of the 28th VLDB Conference, Hong Kong, China, 2002
Abstract

The cost of a query plan depends on many parameters, such as predicate selectivities and available memory, whose values may not be known at optimization time. Parametric query optimization (PQO) optimizes a query into a number of candidate plans, each optimal for some region of the parameter space.

We first propose a solution for the PQO problem for the case when the cost functions are linear in the given parameters. This solution is minimally intrusive in the sense that an existing query optimizer can be used with minor modifications: the solution invokes the conventional query optimizer multiple times, with different parameter values.

We then propose a solution for the PQO problem for the case when the cost functions are piecewise-linear in the given parameters. The solution is based on modification of an existing query optimizer. This solution is quite general, since arbitrary cost functions can be approximated to piecewise linear form. Both the solutions work for an arbitrary number of parameters.
1 Introduction

The cost of a query plan depends on various database and system parameters. The database parameters include selectivities of the predicates and sizes of the relations. The system parameters include available memory, disk bandwidth and latency. The exact values of
these parameters may not be known at compile time. For example, in the case of embedded SQL queries containing unbound variables, the values of the variables are known only at run time. In general, the available memory is not known until runtime. Optimizing a query into a single plan may result in a substantially sub-optimal plan if the actual values are different from those assumed at optimization time [GW89]. To overcome this problem, parametric query optimization (PQO) optimizes a query into a number of candidate plans, each optimal for some region of the parameter space [CG94, INSS97, INSS92, GK94, Gan98]. At run time, when the actual parameter values are known, the appropriate plan can be chosen.
The contributions of this paper lie in providing two novel solutions for the parametric query optimization problem:

- We provide a novel parametric query optimization algorithm for the case when the plan cost functions are linear in the parameters. The algorithm works for an arbitrary number of parameters and is minimally intrusive in the sense that it does not modify the conventional query optimizer, and merely uses it as a subroutine (invoking it with different parameter values). To the best of our knowledge, no exact solution published so far works for an arbitrary number of parameters; however, there is a related work [Gan01], currently unpublished, that handles an arbitrary number of parameters; we describe the connections in Section 6. Our solution is simple and efficient, unlike earlier solutions to the PQO problem.

- In general, the cost function of an operation may be non-linear and discontinuous in the parameters involved. The cost function of a plan, which is the sum total of the cost functions of the operations involved, will then also be non-linear and discontinuous. It is, in general, difficult and costly to deal with such nonlinear functions, and this is particularly true when the functions involve many parameters. However, nonlinear cost functions can be approximated by piecewise linear cost functions.

  We propose an approach for parametric query optimization with piecewise linear cost functions, based on extending existing optimization algorithms to use cost functions in place of costs. We show how to extend the System-R query optimization algorithm [SAC+79] to perform parametric query optimization with piecewise linear cost functions. We have also extended the Volcano query optimization algorithm [GM93] in a similar fashion. The solution works for an arbitrary number of parameters.
The rest of the paper is organized as follows. Section 2 formally defines the parametric query optimization problem and provides background material on polytopes. Section 3 describes non-intrusive algorithms for PQO with linear cost functions. Section 4 presents definitions related to piecewise linear cost functions. Section 5 describes (intrusive) algorithms for PQO with piecewise linear cost functions. Related work is described in Section 6. We conclude the paper in Section 7.
2 Definitions

In this section we formally define the parametric query optimization problem and provide some background material on polytopes.
2.1 Problem Definition

The parametric query optimization (PQO) problem is defined as follows [Gan98]: Let s_1, s_2, ..., s_n denote n parameters, where each s_i quantifies some cost parameter. Let the cost of a plan p be a function of these n parameters and let it be denoted by C_p(s_1, s_2, ..., s_n). For every legal value of the parameters, there is some plan that is optimal for that value. Given a query and n parameters, the maximum parametric set of plans (MPSP) is the set of plans, each member of which is optimal for some point in the n-dimensional parameter space. The MPSP may be defined as:

    MPSP = {p | p is optimal for some point in the parameter space}

For every legal value of the parameters there is a plan in the MPSP that is optimal for that value and vice-versa. The region of optimality for a plan p is denoted by r(p) and is the set defined as

    r(p) = {(s_1, s_2, ..., s_n) | p is optimal at (s_1, s_2, ..., s_n)}

A parametric optimal set of plans (POSP) is a minimal subset of the MPSP that includes at least one optimal plan for each point in the parameter space. The parametric query optimization (PQO) problem is to find a POSP and the region of optimality for each plan in the POSP.
[Figure 1: (a) a polytope and (b) a lower convex polytope in 2 dimensions]
2.2 Polytopes

In the proposed solutions, we need to represent and manipulate parameter space partitions. For parametric query optimization with linear cost functions, the regions of optimality are convex; if the parameter space of interest is a convex polytope, the regions of optimality are also convex polytopes. In this section we define polytopes and describe a special type of polytope, the lower convex polytope.

A convex polytope in ℝ^d is a nonempty region that can be obtained by intersecting a finite set of closed halfspaces. Each halfspace is defined as the solution set of a linear inequality of the form a_1 x_1 + a_2 x_2 + ... + a_d x_d ≤ a_0, where each a_j is a constant, the x_j's denote the coordinates in ℝ^d, and a_1, a_2, ..., a_d are not all zero. The boundary of this halfspace is the hyperplane defined by a_1 x_1 + a_2 x_2 + ... + a_d x_d = a_0. We denote the bounding hyperplane of a halfspace M_i by ∂M_i.

Let P = ∩_i M_i be any convex polytope in ℝ^d, where each M_i is a halfspace. A halfspace M_i is called redundant if it can be thrown away without affecting P; this means that the intersection of the remaining halfspaces is also P. Otherwise, the halfspace is called non-redundant. The hyperplanes bounding the non-redundant halfspaces are said to be the bounding hyperplanes of P. A facet of P is defined to be the intersection of P with one of its bounding hyperplanes. Each facet of P is a (d-1)-dimensional convex polytope. In general, an i-face of P is the (non-empty) intersection of P with d-i of its bounding hyperplanes; a facet is thus a (d-1)-face. For example, in three dimensions, a side (facet) of the polytope is a 2-face, an edge of the polytope is a 1-face, and a vertex is a 0-face.

Figure 1(a) shows a polygon abcdef in ℝ^2 (a polytope in ℝ^2 is a polygon). It is defined by the halfspaces h_1, h_2, ..., h_7. On which side of the bounding hyperplane the corresponding halfspace lies is shown by an arrow. Note that the halfspace h_7 is redundant.

Let the set of halfspaces defining P be M. Lower convex polytopes are a special class of convex polytopes where all halfspaces in M extend to infinity in the negative x_d direction. Then each element in M can be viewed as a hyperplane that implicitly stands for the halfspace bounded by it and extending in the negative x_d direction. We say that P is the lower convex polytope formed by such hyperplanes in M. Figure 1(b) shows a lower convex polygon.
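In one dimension of parameters, a lower convex polytope is just the lower envelope of a set of lines. The following Python sketch (ours, not from the paper; the inputs lines, lo and hi are hypothetical) computes such an envelope by evaluating every line at midpoints between candidate breakpoints:

    from itertools import combinations

    def lower_envelope(lines, lo, hi):
        """lines: list of (a, b) pairs, each the line y = a*x + b.
        Returns [(start, line_index)] segments of the lower envelope
        over [lo, hi]."""
        # Candidate breakpoints: interval ends plus pairwise intersections.
        xs = {lo, hi}
        for (a1, b1), (a2, b2) in combinations(lines, 2):
            if a1 != a2:
                x = (b2 - b1) / (a1 - a2)
                if lo < x < hi:
                    xs.add(x)
        xs = sorted(xs)
        env = []
        for left, right in zip(xs, xs[1:]):
            mid = (left + right) / 2.0       # sample strictly inside segment
            best = min(range(len(lines)),
                       key=lambda i: lines[i][0] * mid + lines[i][1])
            if not env or env[-1][1] != best:
                env.append((left, best))
        return env

    print(lower_envelope([(1.0, 0.0), (0.5, 2.0), (-0.2, 3.0)], 0.0, 10.0))
    # -> [(0.0, 0), (2.5, 2)]: line 0 is lowest until x = 2.5, then line 2

Each segment of the envelope is a facet of the lower convex polytope; this one-dimensional picture is exactly the shape the cost polytope of Section 3.3 takes for a single parameter.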
3 Parametric Query Optimization for Linear Cost Functions

In this section we propose minimally intrusive solutions for linear cost functions. First we review some basic properties of linear cost functions; we give a brief outline of a naive recursive decomposition algorithm and then we present our main algorithm, the cost polytope algorithm.
Conventional query optimizers return an optimal plan along with its cost. For parametric query optimization, the cost of a plan is a function of the parameters, and the cost function of a plan is required to compare it with other plans. We can extend the statistics/cost-estimation component of the optimizer to make it return the cost function of a given plan; one way to do so is to do conventional cost estimation on the given plan at n+1 non-degenerate points (i.e., points not contained in a common hyperplane) in the parameter space, where n is the number of parameters, and thereby infer its cost function. The optimizer itself is not modified in any way, and continues to use the original statistics/cost-estimation code.
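A minimal sketch of this inference step (our illustration, not the paper's code; cost_at is a hypothetical callback into the optimizer's cost-estimation component): costing the plan at n+1 non-degenerate points yields a linear system whose solution is the coefficient vector of the plan's cost function.

    import numpy as np

    def infer_cost_function(cost_at, points):
        """cost_at(p): conventional cost estimate for the plan at point p.
        points: (n+1) x n array of non-degenerate parameter points.
        Returns (c_1, ..., c_n, c_{n+1}) of the plan's linear cost
        function c_1*s_1 + ... + c_n*s_n + c_{n+1}."""
        points = np.asarray(points, dtype=float)
        # Append a column of ones for the constant term c_{n+1}.
        A = np.hstack([points, np.ones((points.shape[0], 1))])
        b = np.array([cost_at(p) for p in points])
        return np.linalg.solve(A, b)  # singular iff the points are degenerate

    # Example with a known linear cost 3*s1 + 2*s2 + 5 and n = 2:
    coeffs = infer_cost_function(lambda p: 3 * p[0] + 2 * p[1] + 5,
                                 [(0, 0), (1, 0), (0, 1)])
    print(coeffs)  # -> [3. 2. 5.]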
In general we are not interested in the whole parameter space ℝ^n, as only a part of it would constitute legal combinations of the parameter values. We assume that the parameter space of interest is a closed convex polytope, which we call the parameter space polytope, and that it is provided to the optimizer. Typically, the parameter space polytope is a hyper-rectangle defined by a range of legal values specified for each parameter.
3.1 Properties of Linear Cost Functions

We state the following properties regarding linear cost functions from [Gan98]:

- If two points in the parameter space have the same optimal plan, then the plan is optimal along the line segment connecting the two points.

- Each plan in a POSP has only one region of optimality and the region is a convex polytope.

- If all the vertices of a polytope in the parameter space have the same optimal plan, then the plan is optimal within that polytope.
Thus the partitioning of the parameter space is convex and the solution will divide the parameter space into convex polytopes.

Note that for linear cost functions, the decomposition of the parameter space induced by any POSP is the same, and the POSP is unique if no two plans have the same cost function. Details may be found in the full version of the paper [HS02]. Without loss of generality, we assume that the POSP is unique.
3.2 The Recursive Decomposition Algorithm

This solution is based on the observation that if all the vertices of a polytope in the parameter space have the same optimal plan, then the plan is optimal within that polytope. We recursively decompose the parameter space into convex polytopes.

We find the optimal plans at the vertices of each polytope, starting with the parameter space polytope, using a conventional query optimizer. If two of the vertices of a polytope have two different optimal plans (or more precisely, optimal plans with different cost functions), then we partition the polytope into two polytopes: the dividing hyperplane is derived by equating the cost functions of the plans. As a result, one plan is better in one of the polytopes, and the other plan is better in the other. We then recursively apply the above test to each of the two polytopes. A polytope region is not decomposed further when all its vertices have the same optimal plan. (We can devise an approximate version of the algorithm, which does not partition the polytope if the cost of the optimal plan at one vertex is within a small percentage of the cost of the optimal plan at each of the remaining vertices.) The detailed algorithm may be found in [HS02].
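To make the recursion concrete, here is a Python sketch of the algorithm specialized to one parameter (our illustration under stated assumptions, not the paper's code): optimize(s) is a hypothetical wrapper around the conventional optimizer returning (plan_id, (a, b)) with the plan's cost a*s + b, and the dividing point is found by equating the two endpoint cost functions.

    def decompose(optimize, lo, hi):
        """Recursively partition [lo, hi]; returns [(lo, hi, plan_id)]."""
        p_lo, f_lo = optimize(lo)
        p_hi, f_hi = optimize(hi)
        if f_lo == f_hi:                      # same cost function: one region
            return [(lo, hi, p_lo)]
        (a1, b1), (a2, b2) = f_lo, f_hi
        if a1 == a2:                          # parallel lines: lower one wins
            return [(lo, hi, p_lo if b1 <= b2 else p_hi)]
        s = (b2 - b1) / (a1 - a2)             # equate a1*s + b1 = a2*s + b2
        if not (lo < s < hi):                 # crossing outside the interval
            mid = (lo + hi) / 2.0
            return [(lo, hi,
                     p_lo if a1 * mid + b1 <= a2 * mid + b2 else p_hi)]
        return decompose(optimize, lo, s) + decompose(optimize, s, hi)

    # Toy optimizer over two hypothetical plans:
    plans = {"pl1": (1.0, 0.0), "pl2": (-0.2, 3.0)}
    opt = lambda s: min(plans.items(), key=lambda kv: kv[1][0] * s + kv[1][1])
    print(decompose(opt, 0.0, 10.0))
    # -> [(0.0, 2.5, 'pl1'), (2.5, 10.0, 'pl2')]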
This solution has two shortcomings: it may form more than one region for a plan and may need to merge them in a post-pass; and the number of calls made to the conventional optimizer may be more than necessary.

In fact, we can combine the decompose and merge phases by noticing that the optimality region for an optimal plan at a point may surround the point. (This may not be the case, though, if more than one plan is optimal at the point; in that case, the point lies on the boundary of the optimality regions of the plans.) So instead of partitioning each polytope adjacent to the point independently, we can partition all of them simultaneously by carving out a single polytope around the point and subtracting it from each adjacent partition. Our next algorithm is an outcome of this observation.
3.3 The Cost Polytope Algorithm

The cost polytope algorithm works in the ℝ^{n+1} space, with n dimensions representing the n parameters and one dimension representing cost.

[Figure 2: Cost Polytope Algorithm: one-parameter example. (a) cost hyperplanes pl_1, ..., pl_8 and the cost polytope; (b) point pt_1 optimized, initial cost polytope = cost halfspace of plan pl_1; (c) point pt_2 optimized, cost halfspace of pl_2 added to the cost polytope; (d) point pt_3 optimized, cost halfspace of pl_3 being added to the cost polytope; (e) further vertices optimized; (f) final cost polytope.]
The cost function of each plan in the plan space can be represented by a hyperplane in ℝ^{n+1}. We work on these hyperplanes to construct a lower convex polytope that represents the optimal cost among all plans at each point in the parameter space. Each facet of this polytope corresponds to a plan in the parametric optimal set of plans (POSP), and one can obtain its optimality region by projecting the facet on the parameter space (ℝ^n).

We use a running example with one parameter, shown in Figure 2, throughout the section.
3.3.1 Parametric Optimal Cost Function

The parametric optimal cost function (POCF) over the parameter space is defined as follows. For a point v in the parameter space:

    POCF(v) = cost of a plan p that is optimal at v.

It follows that, for any plan p in the POSP, at any point v in its region of optimality, the value of POCF(v) = C_p(v), the cost of p at v. Thus within the region of optimality of a plan, the POCF follows the cost function of the plan.

Consider an example with one parameter, shown in Figure 2(a). The horizontal axis represents the parameter space and the vertical axis represents the cost. The line segment pt_1 pt_2 is the parameter space polytope. Let the plan space contain eight plans pl_1, pl_2, ..., pl_8 with cost functions as shown in Figure 2(a). We have POSP = {pl_1, pl_2, pl_4, pl_5}. Figure 2(f) shows the POCF, and the region of optimality of each plan in the POSP.
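As a toy illustration of the definition (hypothetical one-parameter cost functions, not the paper's example), the POCF is the pointwise minimum of the plans' cost functions:

    plans = {"pl1": (2.0, 1.0), "pl2": (0.5, 4.0)}   # cost = a*s + b

    def pocf(s):
        return min(a * s + b for a, b in plans.values())

    print(pocf(1.0), pocf(5.0))  # pl1 is cheaper at s = 1, pl2 at s = 5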
3.3.2 Cost Hyperplane, Cost Halfspace and Cost Polytope

Consider the ℝ^{n+1} space with n dimensions representing the n parameters and the (n+1)-th dimension representing the cost. Let the cost of a plan p be c_1 s_1 + c_2 s_2 + ... + c_n s_n + c_{n+1}. We can think of the cost function as a hyperplane in ℝ^{n+1} whose equation is given by

    s_{n+1} = c_1 s_1 + c_2 s_2 + ... + c_n s_n + c_{n+1}

where s_{n+1} denotes the cost of the plan. We call such a hyperplane a cost hyperplane. We assume that no plan has a degenerate cost hyperplane with infinite slope. (Such a cost function would be completely unrealistic, since it would divide the parameter space into two halves, with each point in one half having cost positive infinity and each point in the other half having cost negative infinity.) Figure 2(a) shows cost hyperplanes for the plans pl_1, pl_2, ..., pl_8 in the parameter range pt_1 pt_2.

We define the lower halfspace (extending to cost = s_{n+1} = −∞) of the cost hyperplane as:

    s_{n+1} ≤ c_1 s_1 + c_2 s_2 + ... + c_n s_n + c_{n+1}

We call such a halfspace a cost halfspace.

We represent each plan p in the plan space by its cost hyperplane in ℝ^{n+1} space. The cost polytope is defined as the lower convex polytope obtained by intersection of the cost halfspaces of all the plans in the plan space.

Figure 2(f) shows the cost polytope for the cost hyperplanes defined in Figure 2(a). We can see that its boundary is the POCF for the plans pl_1, pl_2, ..., pl_8 in the parameter range pt_1 pt_2.
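A small sketch of how a cost halfspace can be encoded (our illustration; the coefficients are hypothetical): halfspace-intersection routines commonly take inequalities in the form A·x + b ≤ 0, and the cost halfspace rearranges directly into that form.

    import numpy as np

    def cost_halfspace(c):
        """c = (c_1, ..., c_n, c_{n+1}) for cost = c_1*s_1 + ... + c_n*s_n + c_{n+1}.
        The cost halfspace  s_{n+1} <= c_1*s_1 + ... + c_n*s_n + c_{n+1}
        becomes  (-c_1, ..., -c_n, 1) . (s_1, ..., s_n, s_{n+1}) - c_{n+1} <= 0."""
        c = np.asarray(c, dtype=float)
        A = np.append(-c[:-1], 1.0)   # normal vector in R^{n+1}
        b = -c[-1]                    # offset
        return A, b

    A, b = cost_halfspace([3.0, 2.0, 5.0])   # plan cost 3*s1 + 2*s2 + 5
    print(A, b)                              # -> [-3. -2.  1.] -5.0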

Theorem 3.1 The boundary of the cost polytope de-
fines the POCF.
¾
For the proof see [HS02]. Note that the cost hy-
perplanes corresponding to the plans in the POSP are
the bounding hyperplanes of the cost polytope and the
rest of the hyperplanes are redundant. Thus, the
cost hyperplanes corresponding to the plans not in the
POSP cannot form any facet of the cost polytope.
5
Whereas, the cost hyperplane corresponding to each
plan in the POSP forms one facet of the cost poly-
tope, and the projection of the facet on the parameter
space (the hyperplane s
n+1
= 0; i.e. cost = 0) gives
the region of optimality for the plan.
3.3.3 Cost Polytope Construction

We discuss the cost polytope construction algorithm in this section. A naive algorithm would be to intersect all the halfspaces that are in the input set. In the case of cost polytopes, enumerating all the halfspaces amounts to enumerating all the plans in the plan space and getting their cost functions. The plan space can be very large, and only a handful of plans constitute the POSP [Gan00]. Such a naive algorithm would be prohibitively expensive. But we have an additional tool: given a point v in the parameter space, we can use the conventional optimizer to obtain a cost hyperplane that bounds or touches the cost polytope at the point whose projection is v. We use this property to avoid enumerating all the cost hyperplanes.

Our algorithm uses an online polytope construction algorithm, such as that in [Mul94], as a subroutine. A polytope construction algorithm is given a set of halfspaces and the algorithm intersects the halfspaces to construct the desired polytope. In the case of an online algorithm, the halfspaces are given one at a time, and at each stage the algorithm maintains an intermediate polytope.
Figure 3 shows pseudocode for the cost polytope algorithm. We first optimize any one vertex, say v, of the parameter space polytope to get an optimal plan at it, from which we derive the corresponding cost halfspace in ℝ^{n+1}. We transfer the equations of all bounding hyperplanes of the parameter space polytope to the ℝ^{n+1} space and intersect them with the cost halfspace obtained above, to get the initial cost polytope.

We then put all vertices, except v, of the initial cost polytope in a queue. We pick one vertex at a time from this queue and optimize it, i.e. invoke the conventional optimizer on the parameter coordinate values of the vertex. Consider an intermediate cost polytope and one of its vertices, say v.
    PSpacePTope = parameter space polytope
        /* see Section 3.1; the polytope is in ℝ^n */
    PSPTHalfspaces = halfspaces defining PSpacePTope in ℝ^n
    Let v = (v_1, v_2, ..., v_n) be any vertex of PSpacePTope
    p = ConventionalOptimizer(v)
        /* p is one of the optimal plans at v */
    VerticesOptimized = {v}
    v_{n+1} = cost of p at v
    Let v' = (v_1, v_2, ..., v_n, v_{n+1})
    Let hs_p be the cost halfspace of plan p in ℝ^{n+1}
    CostPolytope = intersection of all PSPTHalfspaces and hs_p in ℝ^{n+1}
        /* initial cost polytope */
    Queue = {vertices of CostPolytope} \ {v'}
    While Queue ≠ ∅ do
        v' = Queue.RemoveFirstEntry()
        Let the coordinates of v' be (v_1, v_2, ..., v_n, v_{n+1})
        Let v = (v_1, v_2, ..., v_n)
            /* projection of v' on the parameter space */
        p = ConventionalOptimizer(v)
            /* p is one of the optimal plans at v */
        VerticesOptimized = VerticesOptimized ∪ {v}
        Let hs_p be the cost halfspace of plan p
        If (cost of p at v) < v_{n+1}
            /* v' is in conflict with hs_p */
            CostPolytope = CostPolytope ∩ hs_p
            Remove from Queue vertices no longer in CostPolytope
            For each new vertex w' = (w_1, ..., w_n, w_{n+1}) added to CostPolytope
                If w = (w_1, ..., w_n) ∉ VerticesOptimized
                    add w' to Queue

Figure 3: The Cost Polytope Algorithm
We optimize vertex v to get an optimal plan p (at vertex v) and its cost hyperplane. Note that, as plan p is optimal at vertex v, the cost hyperplane of plan p must either touch or intersect the cost polytope at vertex v. In the latter case, we intersect the cost hyperplane with the current cost polytope to get a new cost polytope. The intersection operation may delete some of the vertices from the polytope and may add some new vertices. If a vertex which is deleted from the polytope is present in the queue, the vertex is removed from the queue. All the new vertices of the polytope are added to the queue.

When the queue becomes empty, we terminate the algorithm. When this condition is reached, the cost hyperplane of an optimal plan of each vertex is either a facet of the cost polytope or is touching the cost polytope. The plans corresponding to the facets of the cost polytope form the POSP. The cost hyperplanes for all other plans are redundant.
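To make the control flow of Figure 3 concrete, here is a self-contained Python sketch of the algorithm specialized to one parameter, where the cost polytope is the region under the lower envelope of the cost lines and its vertices are the envelope breakpoints. This is our illustration, not the paper's implementation: the optimize callback, the plan catalog and the tolerances are hypothetical stand-ins, and a real implementation would use an online polytope construction algorithm as described above.

    from itertools import combinations

    def envelope(plans, s):
        """Current POCF estimate: minimum cost over the plans found so far."""
        return min(a * s + b for a, b in plans.values())

    def vertices(plans, lo, hi):
        """Vertices of the current cost polytope, projected on the
        parameter axis: interval ends plus envelope breakpoints."""
        xs = {lo, hi}
        for (a1, b1), (a2, b2) in combinations(plans.values(), 2):
            if a1 != a2:
                x = (b2 - b1) / (a1 - a2)
                # Keep only intersections lying on the envelope itself.
                if lo < x < hi and abs(a1 * x + b1 - envelope(plans, x)) < 1e-9:
                    xs.add(x)
        return xs

    def cost_polytope_1d(optimize, lo, hi):
        """optimize(s) -> (plan_id, (a, b)); returns {plan_id: (a, b)}
        whose lower envelope is the POCF over [lo, hi]."""
        p, f = optimize(lo)
        plans = {p: f}                  # initial cost polytope: one halfspace
        optimized = {lo}
        queue = [hi]
        while queue:
            s = queue.pop()
            optimized.add(s)
            p, (a, b) = optimize(s)
            if a * s + b < envelope(plans, s) - 1e-9:   # vertex in conflict
                plans[p] = (a, b)       # intersect the new cost halfspace
                # Re-enumerate vertices; queue those not yet optimized.
                queue = [v for v in vertices(plans, lo, hi)
                         if v not in optimized]
        return plans                    # the POSP with its cost functions

    # Toy conventional optimizer over three hypothetical plans:
    catalog = {"pl1": (1.0, 0.0), "pl2": (0.5, 2.0), "pl3": (-0.2, 3.0)}
    opt = lambda s: min(catalog.items(), key=lambda kv: kv[1][0] * s + kv[1][1])
    print(cost_polytope_1d(opt, 0.0, 10.0))
    # -> {'pl1': (1.0, 0.0), 'pl3': (-0.2, 3.0)}; pl2 is never optimal

Note how the sketch mirrors the pseudocode: a vertex whose optimal plan is strictly cheaper than the current polytope at that vertex is "in conflict", its cost halfspace is intersected in, deleted vertices drop out of the queue, and newly created vertices that have not been optimized are queued.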
An online algorithm for polytope construction is

References

[CG94] R. L. Cole and G. Graefe. Optimization of dynamic query evaluation plans. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 1994.

[GM93] G. Graefe and W. J. McKenna. The Volcano optimizer generator: extensibility and efficient search. In Proceedings of the International Conference on Data Engineering (ICDE), 1993.

[Mul94] K. Mulmuley. Computational Geometry: An Introduction Through Randomized Algorithms. Prentice-Hall, 1994.

[SAC+79] P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 1979.

G. M. Ziegler. Lectures on Polytopes. Springer-Verlag, 1995.