Parametric Query Optimization for Linear and Piecewise Linear Cost Functions

Arvind Hulgeri    S. Sudarshan
Indian Institute of Technology, Bombay
{aru, sudarsha}@cse.iitb.ac.in

Proceedings of the 28th VLDB Conference, Hong Kong, China, 2002
Abstract

The cost of a query plan depends on many parameters, such as predicate selectivities and available memory, whose values may not be known at optimization time. Parametric query optimization (PQO) optimizes a query into a number of candidate plans, each optimal for some region of the parameter space.

We first propose a solution for the PQO problem for the case when the cost functions are linear in the given parameters. This solution is minimally intrusive in the sense that an existing query optimizer can be used with minor modifications: the solution invokes the conventional query optimizer multiple times, with different parameter values.

We then propose a solution for the PQO problem for the case when the cost functions are piecewise-linear in the given parameters. The solution is based on modification of an existing query optimizer. This solution is quite general, since arbitrary cost functions can be approximated to piecewise linear form. Both the solutions work for an arbitrary number of parameters.
1 Introduction

The cost of a query plan depends on various database and system parameters. The database parameters include selectivities of the predicates and sizes of the relations. The system parameters include available memory, disk bandwidth and latency. The exact values of
these parameters may not be known at compile time. For example, in the case of embedded SQL queries containing unbound variables, the values of the variables are known only at run time. In general, the available memory is not known until runtime. Optimizing a query into a single plan may result in a substantially sub-optimal plan if the actual values are different from those assumed at optimization time [GW89]. To overcome this problem, parametric query optimization (PQO) optimizes a query into a number of candidate plans, each optimal for some region of the parameter space [CG94, INSS97, INSS92, GK94, Gan98]. At run time, when the actual parameter values are known, the appropriate plan can be chosen.
The contributions of this paper lie in providing two novel solutions for the parametric query optimization problem:

- We provide a novel parametric query optimization algorithm for the case when the plan cost functions are linear in the parameters. The algorithm works for an arbitrary number of parameters and is minimally intrusive in the sense that it does not modify the conventional query optimizer, and merely uses it as a subroutine (invoking it with different parameter values). To the best of our knowledge, no exact solution published so far works for an arbitrary number of parameters; however, there is a related work [Gan01], currently unpublished, that handles an arbitrary number of parameters; we describe the connections in Section 6. Our solution is simple and efficient, unlike earlier solutions to the PQO problem.

- In general, the cost function of an operation may be non-linear and discontinuous in the parameters involved. The cost function of a plan, which is the sum total of the cost functions of the operations involved, will then also be non-linear and discontinuous. It is, in general, difficult and costly to deal with such nonlinear functions, and this is particularly true when the functions involve many parameters. However, nonlinear cost functions can be approximated by piecewise linear cost functions.

  We propose an approach for parametric query optimization with piecewise linear cost functions, based on extending existing optimization algorithms to use cost functions in place of costs. We show how to extend the System-R query optimization algorithm [SAC+79] to perform parametric query optimization with piecewise linear cost functions. We have also extended the Volcano query optimization algorithm [GM93] in a similar fashion. The solution works for an arbitrary number of parameters.
The rest of the paper is organized as follows. Section 2 formally defines the parametric query optimization problem and provides background material on polytopes. Section 3 describes non-intrusive algorithms for PQO with linear cost functions. Section 4 presents definitions related to piecewise linear cost functions. Section 5 describes (intrusive) algorithms for PQO with piecewise linear cost functions. Related work is described in Section 6. We conclude the paper in Section 7.
2 Definitions

In this section we formally define the parametric query optimization problem and provide some background material on polytopes.
2.1 Problem Definition

The parametric query optimization (PQO) problem is defined as follows [Gan98]: Let s_1, s_2, ..., s_n denote n parameters, where each s_i quantifies some cost parameter. Let the cost of a plan p be a function of these n parameters and let it be denoted by C_p(s_1, s_2, ..., s_n). For every legal value of the parameters, there is some plan that is optimal for that value. Given a query and n parameters, the maximum parametric set of plans (MPSP) is the set of plans, each member of which is optimal for some point in the n-dimensional parameter space. The MPSP may be defined as:

    MPSP = {p | p is optimal for some point in the parameter space}

For every legal value of the parameters there is a plan in the MPSP that is optimal for that value and vice-versa. The region of optimality for a plan p is denoted by r(p) and is the set defined as

    r(p) = {(s_1, s_2, ..., s_n) | p is optimal at (s_1, s_2, ..., s_n)}

A parametric optimal set of plans (POSP) is a minimal subset of the MPSP that includes at least one optimal plan for each point in the parameter space. The parametric query optimization (PQO) problem is to find a POSP and the region of optimality for each plan in the POSP.
[Figure 1: (a) a polytope and (b) a lower convex polytope in 2 dimensions]
2.2 Polytopes

In the proposed solutions, we need to represent and manipulate parameter space partitions. For parametric query optimization with linear cost functions, the regions of optimality are convex; if the parameter space of interest is a convex polytope, the regions of optimality are also convex polytopes. In this section we define polytopes and describe a special type of polytope, the lower convex polytope.

A convex polytope in ℝ^d is a nonempty region that can be obtained by intersecting a finite set of closed halfspaces. Each halfspace is defined as the solution set of a linear inequality of the form a_1 x_1 + a_2 x_2 + ... + a_d x_d ≤ a_0, where each a_j is a constant, the x_j's denote the coordinates in ℝ^d, and a_1, a_2, ..., a_d are not all zero. The boundary of this halfspace is the hyperplane defined by a_1 x_1 + a_2 x_2 + ... + a_d x_d = a_0. We denote the bounding hyperplane of a halfspace M_i by ∂M_i.

Let P = ∩_i M_i be any convex polytope in ℝ^d, where each M_i is a halfspace. A halfspace M_i is called redundant if it can be thrown away without affecting P; this means that the intersection of the remaining halfspaces is also P. Otherwise, the halfspace is called non-redundant. The hyperplanes bounding the non-redundant halfspaces are said to be the bounding hyperplanes of P. A facet of P is defined to be the intersection of P with one of its bounding hyperplanes. Each facet of P is a (d-1)-dimensional convex polytope. In general, an i-face of P is the (non-empty) intersection of P with d-i of its bounding hyperplanes; a facet is thus a (d-1)-face. For example, in three dimensions, a side (facet) of the polytope is a 2-face, an edge of the polytope is a 1-face, and a vertex is a 0-face.

Figure 1(a) shows a polygon abcdef in ℝ^2 (a polytope in ℝ^2 is a polygon). It is defined by the halfspaces h_1, h_2, ..., h_7. On which side of the bounding hyperplane the corresponding halfspace lies is shown by an arrow. Note that the halfspace h_7 is redundant.

Let the set of halfspaces defining P be M. Lower convex polytopes are a special class of convex polytopes where all halfspaces in M extend to infinity in the negative x_d direction. Then each element in M can be viewed as a hyperplane that implicitly stands for the halfspace bounded by it and extending in the negative x_d direction. We say that P is the lower convex polytope formed by such hyperplanes in M. Figure 1(b) shows a lower convex polygon.
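In one dimension of parameters, a lower convex polytope is just the lower envelope of a set of lines. The following Python sketch (ours, not from the paper; the inputs lines, lo and hi are hypothetical) computes such an envelope by evaluating every line at midpoints between candidate breakpoints:

    from itertools import combinations

    def lower_envelope(lines, lo, hi):
        """lines: list of (a, b) pairs, each the line y = a*x + b.
        Returns [(start, line_index)] segments of the lower envelope
        over [lo, hi]."""
        # Candidate breakpoints: interval ends plus pairwise intersections.
        xs = {lo, hi}
        for (a1, b1), (a2, b2) in combinations(lines, 2):
            if a1 != a2:
                x = (b2 - b1) / (a1 - a2)
                if lo < x < hi:
                    xs.add(x)
        xs = sorted(xs)
        env = []
        for left, right in zip(xs, xs[1:]):
            mid = (left + right) / 2.0       # sample strictly inside segment
            best = min(range(len(lines)),
                       key=lambda i: lines[i][0] * mid + lines[i][1])
            if not env or env[-1][1] != best:
                env.append((left, best))
        return env

    print(lower_envelope([(1.0, 0.0), (0.5, 2.0), (-0.2, 3.0)], 0.0, 10.0))
    # -> [(0.0, 0), (2.5, 2)]: line 0 is lowest until x = 2.5, then line 2

Each segment of the envelope is a facet of the lower convex polytope; this one-dimensional picture is exactly the shape the cost polytope of Section 3.3 takes for a single parameter.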
3 Parametric Query Optimization for Linear Cost Functions

In this section we propose minimally intrusive solutions for linear cost functions. First we review some basic properties of linear cost functions; we give a brief outline of a naive recursive decomposition algorithm and then we present our main algorithm, the cost polytope algorithm.
Conventional query optimizers return an optimal plan along with its cost. For parametric query optimization, the cost of a plan is a function of the parameters, and the cost function of a plan is required to compare it with other plans. We can extend the statistics/cost-estimation component of the optimizer to make it return the cost function of a given plan; one way to do so is to do conventional cost estimation on the given plan at n+1 non-degenerate points (i.e., points not contained in a common hyperplane) in the parameter space, where n is the number of parameters, and thereby infer its cost function. The optimizer itself is not modified in any way, and continues to use the original statistics/cost-estimation code.
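A minimal sketch of this inference step (our illustration, not the paper's code; cost_at is a hypothetical callback into the optimizer's cost-estimation component): costing the plan at n+1 non-degenerate points yields a linear system whose solution is the coefficient vector of the plan's cost function.

    import numpy as np

    def infer_cost_function(cost_at, points):
        """cost_at(p): conventional cost estimate for the plan at point p.
        points: (n+1) x n array of non-degenerate parameter points.
        Returns (c_1, ..., c_n, c_{n+1}) of the plan's linear cost
        function c_1*s_1 + ... + c_n*s_n + c_{n+1}."""
        points = np.asarray(points, dtype=float)
        # Append a column of ones for the constant term c_{n+1}.
        A = np.hstack([points, np.ones((points.shape[0], 1))])
        b = np.array([cost_at(p) for p in points])
        return np.linalg.solve(A, b)  # singular iff the points are degenerate

    # Example with a known linear cost 3*s1 + 2*s2 + 5 and n = 2:
    coeffs = infer_cost_function(lambda p: 3 * p[0] + 2 * p[1] + 5,
                                 [(0, 0), (1, 0), (0, 1)])
    print(coeffs)  # -> [3. 2. 5.]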
In general we are not interested in the whole parameter space ℝ^n, as only a part of it would constitute legal combinations of the parameter values. We assume that the parameter space of interest is a closed convex polytope, which we call the parameter space polytope, and that it is provided to the optimizer. Typically, the parameter space polytope is a hyper-rectangle defined by a range of legal values specified for each parameter.
3.1 Properties of Linear Cost Functions

We state the following properties regarding linear cost functions from [Gan98]:

- If two points in the parameter space have the same optimal plan, then the plan is optimal along the line segment connecting the two points.

- Each plan in a POSP has only one region of optimality and the region is a convex polytope.

- If all the vertices of a polytope in the parameter space have the same optimal plan, then the plan is optimal within that polytope.
Thus the partitioning of the parameter space is convex and the solution will divide the parameter space into convex polytopes.

Note that for linear cost functions, the decomposition of the parameter space induced by any POSP is the same, and the POSP is unique if no two plans have the same cost function. Details may be found in the full version of the paper [HS02]. Without loss of generality, we assume that the POSP is unique.
3.2 The Recursive Decomposition Algorithm

This solution is based on the observation that if all the vertices of a polytope in the parameter space have the same optimal plan, then the plan is optimal within that polytope. We recursively decompose the parameter space into convex polytopes.

We find the optimal plans at the vertices of each polytope, starting with the parameter space polytope, using a conventional query optimizer. If two of the vertices of a polytope have two different optimal plans (or more precisely, optimal plans with different cost functions), then we partition the polytope into two polytopes: the dividing hyperplane is derived by equating the cost functions of the plans. As a result, one plan is better in one of the polytopes, and the other plan is better in the other. We then recursively apply the above test to each of the two polytopes. A polytope region is not decomposed further when all its vertices have the same optimal plan. (We can devise an approximate version of the algorithm, which does not partition the polytope if the cost of the optimal plan at one vertex is within a small percentage of the cost of the optimal plan at each of the remaining vertices.) The detailed algorithm may be found in [HS02].
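To make the recursion concrete, here is a Python sketch of the algorithm specialized to one parameter (our illustration under stated assumptions, not the paper's code): optimize(s) is a hypothetical wrapper around the conventional optimizer returning (plan_id, (a, b)) with the plan's cost a*s + b, and the dividing point is found by equating the two endpoint cost functions.

    def decompose(optimize, lo, hi):
        """Recursively partition [lo, hi]; returns [(lo, hi, plan_id)]."""
        p_lo, f_lo = optimize(lo)
        p_hi, f_hi = optimize(hi)
        if f_lo == f_hi:                      # same cost function: one region
            return [(lo, hi, p_lo)]
        (a1, b1), (a2, b2) = f_lo, f_hi
        if a1 == a2:                          # parallel lines: lower one wins
            return [(lo, hi, p_lo if b1 <= b2 else p_hi)]
        s = (b2 - b1) / (a1 - a2)             # equate a1*s + b1 = a2*s + b2
        if not (lo < s < hi):                 # crossing outside the interval
            mid = (lo + hi) / 2.0
            return [(lo, hi,
                     p_lo if a1 * mid + b1 <= a2 * mid + b2 else p_hi)]
        return decompose(optimize, lo, s) + decompose(optimize, s, hi)

    # Toy optimizer over two hypothetical plans:
    plans = {"pl1": (1.0, 0.0), "pl2": (-0.2, 3.0)}
    opt = lambda s: min(plans.items(), key=lambda kv: kv[1][0] * s + kv[1][1])
    print(decompose(opt, 0.0, 10.0))
    # -> [(0.0, 2.5, 'pl1'), (2.5, 10.0, 'pl2')]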
This solution has two shortcomings: it may form more than one region for a plan and may need to merge them in a post-pass; and the number of calls made to the conventional optimizer may be more than necessary.

In fact, we can combine the decompose and merge phases by noticing that the optimality region for an optimal plan at a point may surround the point. (This may not be the case, though, if more than one plan is optimal at the point; in that case, the point lies on the boundary of the optimality regions of the plans.) So instead of partitioning each polytope adjacent to the point independently, we can partition all of them simultaneously by carving out a single polytope around the point and subtracting it from each adjacent partition. Our next algorithm is an outcome of this observation.
3.3 The Cost Polytope Algorithm

The cost polytope algorithm works in the ℝ^{n+1} space, with n dimensions representing the n parameters and one dimension representing cost.

[Figure 2: Cost Polytope Algorithm: one-parameter example. (a) cost hyperplanes pl_1, ..., pl_8 and the cost polytope; (b) point pt_1 optimized, initial cost polytope = cost halfspace of plan pl_1; (c) point pt_2 optimized, cost halfspace of pl_2 added to the cost polytope; (d) point pt_3 optimized, cost halfspace of pl_3 being added to the cost polytope; (e) further vertices optimized; (f) final cost polytope.]
The cost function of each plan in the plan space can be represented by a hyperplane in ℝ^{n+1}. We work on these hyperplanes to construct a lower convex polytope that represents the optimal cost among all plans at each point in the parameter space. Each facet of this polytope corresponds to a plan in the parametric optimal set of plans (POSP), and one can obtain its optimality region by projecting the facet on the parameter space (ℝ^n).

We use a running example with one parameter, shown in Figure 2, throughout the section.
3.3.1 Parametric Optimal Cost Function

The parametric optimal cost function (POCF) over the parameter space is defined as follows. For a point v in the parameter space:

    POCF(v) = cost of a plan p that is optimal at v.

It follows that, for any plan p in the POSP, at any point v in its region of optimality, the value of POCF(v) = C_p(v), the cost of p at v. Thus within the region of optimality of a plan, the POCF follows the cost function of the plan.

Consider an example with one parameter, shown in Figure 2(a). The horizontal axis represents the parameter space and the vertical axis represents the cost. The line segment pt_1 pt_2 is the parameter space polytope. Let the plan space contain eight plans pl_1, pl_2, ..., pl_8 with cost functions as shown in Figure 2(a). We have POSP = {pl_1, pl_2, pl_4, pl_5}. Figure 2(f) shows the POCF, and the region of optimality of each plan in the POSP.
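As a toy illustration of the definition (hypothetical one-parameter cost functions, not the paper's example), the POCF is the pointwise minimum of the plans' cost functions:

    plans = {"pl1": (2.0, 1.0), "pl2": (0.5, 4.0)}   # cost = a*s + b

    def pocf(s):
        return min(a * s + b for a, b in plans.values())

    print(pocf(1.0), pocf(5.0))  # pl1 is cheaper at s = 1, pl2 at s = 5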
3.3.2 Cost Hyperplane, Cost Halfspace and Cost Polytope

Consider the ℝ^{n+1} space with n dimensions representing the n parameters and the (n+1)-th dimension representing the cost. Let the cost of a plan p be c_1 s_1 + c_2 s_2 + ... + c_n s_n + c_{n+1}. We can think of the cost function as a hyperplane in ℝ^{n+1} whose equation is given by

    s_{n+1} = c_1 s_1 + c_2 s_2 + ... + c_n s_n + c_{n+1}

where s_{n+1} denotes the cost of the plan. We call such a hyperplane a cost hyperplane. We assume that no plan has a degenerate cost hyperplane with infinite slope. (Such a cost function would be completely unrealistic, since it would divide the parameter space into two halves, with each point in one half having cost positive infinity and each point in the other half having cost negative infinity.) Figure 2(a) shows cost hyperplanes for the plans pl_1, pl_2, ..., pl_8 in the parameter range pt_1 pt_2.

We define the lower halfspace (extending to cost = s_{n+1} = −∞) of the cost hyperplane as:

    s_{n+1} ≤ c_1 s_1 + c_2 s_2 + ... + c_n s_n + c_{n+1}

We call such a halfspace a cost halfspace.

We represent each plan p in the plan space by its cost hyperplane in ℝ^{n+1} space. The cost polytope is defined as the lower convex polytope obtained by intersection of the cost halfspaces of all the plans in the plan space.

Figure 2(f) shows the cost polytope for the cost hyperplanes defined in Figure 2(a). We can see that its boundary is the POCF for the plans pl_1, pl_2, ..., pl_8 in the parameter range pt_1 pt_2.
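A small sketch of how a cost halfspace can be encoded (our illustration; the coefficients are hypothetical): halfspace-intersection routines commonly take inequalities in the form A·x + b ≤ 0, and the cost halfspace rearranges directly into that form.

    import numpy as np

    def cost_halfspace(c):
        """c = (c_1, ..., c_n, c_{n+1}) for cost = c_1*s_1 + ... + c_n*s_n + c_{n+1}.
        The cost halfspace  s_{n+1} <= c_1*s_1 + ... + c_n*s_n + c_{n+1}
        becomes  (-c_1, ..., -c_n, 1) . (s_1, ..., s_n, s_{n+1}) - c_{n+1} <= 0."""
        c = np.asarray(c, dtype=float)
        A = np.append(-c[:-1], 1.0)   # normal vector in R^{n+1}
        b = -c[-1]                    # offset
        return A, b

    A, b = cost_halfspace([3.0, 2.0, 5.0])   # plan cost 3*s1 + 2*s2 + 5
    print(A, b)                              # -> [-3. -2.  1.] -5.0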

Theorem 3.1 The boundary of the cost polytope de-
fines the POCF.
¾
For the proof see [HS02]. Note that the cost hy-
perplanes corresponding to the plans in the POSP are
the bounding hyperplanes of the cost polytope and the
rest of the hyperplanes are redundant. Thus, the
cost hyperplanes corresponding to the plans not in the
POSP cannot form any facet of the cost polytope.
5
Whereas, the cost hyperplane corresponding to each
plan in the POSP forms one facet of the cost poly-
tope, and the projection of the facet on the parameter
space (the hyperplane s
n+1
= 0; i.e. cost = 0) gives
the region of optimality for the plan.
3.3.3 Cost Polytope Construction

We discuss the cost polytope construction algorithm in this section. A naive algorithm would be to intersect all the halfspaces that are in the input set. In the case of cost polytopes, enumerating all the halfspaces amounts to enumerating all the plans in the plan space and getting their cost functions. The plan space can be very large, and only a handful of plans constitute the POSP [Gan00]. Such a naive algorithm would be prohibitively expensive. But we have an additional tool: given a point v in the parameter space, we can use the conventional optimizer to obtain a cost hyperplane that bounds or touches the cost polytope at the point whose projection is v. We use this property to avoid enumerating all the cost hyperplanes.

Our algorithm uses an online polytope construction algorithm, such as that in [Mul94], as a subroutine. A polytope construction algorithm is given a set of halfspaces and the algorithm intersects the halfspaces to construct the desired polytope. In the case of an online algorithm, the halfspaces are given one at a time, and at each stage the algorithm maintains an intermediate polytope.
Figure 3 shows pseudocode for the cost polytope algorithm. We first optimize any one vertex, say v, of the parameter space polytope to get an optimal plan at it, from which we derive the corresponding cost halfspace in ℝ^{n+1}. We transfer the equations of all bounding hyperplanes of the parameter space polytope to the ℝ^{n+1} space and intersect them with the cost halfspace obtained above, to get the initial cost polytope.

We then put all vertices, except v, of the initial cost polytope in a queue. We pick one vertex at a time from this queue and optimize it, i.e. invoke the conventional optimizer on the parameter coordinate values of the vertex. Consider an intermediate cost polytope and one of its vertices, say v.
    PSpacePTope = parameter space polytope
        /* see Section 3.1; the polytope is in ℝ^n */
    PSPTHalfspaces = halfspaces defining PSpacePTope in ℝ^n
    Let v = (v_1, v_2, ..., v_n) be any vertex of PSpacePTope
    p = ConventionalOptimizer(v)
        /* p is one of the optimal plans at v */
    VerticesOptimized = {v}
    v_{n+1} = cost of p at v
    Let v' = (v_1, v_2, ..., v_n, v_{n+1})
    Let hs_p be the cost halfspace of plan p in ℝ^{n+1}
    CostPolytope = intersection of all PSPTHalfspaces and hs_p in ℝ^{n+1}
        /* initial cost polytope */
    Queue = {vertices of CostPolytope} \ {v'}
    While Queue ≠ ∅ do
        v' = Queue.RemoveFirstEntry()
        Let the coordinates of v' be (v_1, v_2, ..., v_n, v_{n+1})
        Let v = (v_1, v_2, ..., v_n)
            /* projection of v' on the parameter space */
        p = ConventionalOptimizer(v)
            /* p is one of the optimal plans at v */
        VerticesOptimized = VerticesOptimized ∪ {v}
        Let hs_p be the cost halfspace of plan p
        If (cost of p at v) < v_{n+1}
            /* v' is in conflict with hs_p */
            CostPolytope = CostPolytope ∩ hs_p
            Remove from Queue vertices no longer in CostPolytope
            For each new vertex w' = (w_1, ..., w_n, w_{n+1}) added to CostPolytope
                If w = (w_1, ..., w_n) ∉ VerticesOptimized
                    add w' to Queue

Figure 3: The Cost Polytope Algorithm
We optimize vertex v to get an optimal plan p (at vertex v) and its cost hyperplane. Note that, as plan p is optimal at vertex v, the cost hyperplane of plan p must either touch or intersect the cost polytope at vertex v. In the latter case, we intersect the cost hyperplane with the current cost polytope to get a new cost polytope. The intersection operation may delete some of the vertices from the polytope and may add some new vertices. If a vertex which is deleted from the polytope is present in the queue, the vertex is removed from the queue. All the new vertices of the polytope are added to the queue.

When the queue becomes empty, we terminate the algorithm. When this condition is reached, the cost hyperplane of an optimal plan of each vertex is either a facet of the cost polytope or is touching the cost polytope. The plans corresponding to the facets of the cost polytope form the POSP. The cost hyperplanes for all other plans are redundant.
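To make the control flow of Figure 3 concrete, here is a self-contained Python sketch of the algorithm specialized to one parameter, where the cost polytope is the region under the lower envelope of the cost lines and its vertices are the envelope breakpoints. This is our illustration, not the paper's implementation: the optimize callback, the plan catalog and the tolerances are hypothetical stand-ins, and a real implementation would use an online polytope construction algorithm as described above.

    from itertools import combinations

    def envelope(plans, s):
        """Current POCF estimate: minimum cost over the plans found so far."""
        return min(a * s + b for a, b in plans.values())

    def vertices(plans, lo, hi):
        """Vertices of the current cost polytope, projected on the
        parameter axis: interval ends plus envelope breakpoints."""
        xs = {lo, hi}
        for (a1, b1), (a2, b2) in combinations(plans.values(), 2):
            if a1 != a2:
                x = (b2 - b1) / (a1 - a2)
                # Keep only intersections lying on the envelope itself.
                if lo < x < hi and abs(a1 * x + b1 - envelope(plans, x)) < 1e-9:
                    xs.add(x)
        return xs

    def cost_polytope_1d(optimize, lo, hi):
        """optimize(s) -> (plan_id, (a, b)); returns {plan_id: (a, b)}
        whose lower envelope is the POCF over [lo, hi]."""
        p, f = optimize(lo)
        plans = {p: f}                  # initial cost polytope: one halfspace
        optimized = {lo}
        queue = [hi]
        while queue:
            s = queue.pop()
            optimized.add(s)
            p, (a, b) = optimize(s)
            if a * s + b < envelope(plans, s) - 1e-9:   # vertex in conflict
                plans[p] = (a, b)       # intersect the new cost halfspace
                # Re-enumerate vertices; queue those not yet optimized.
                queue = [v for v in vertices(plans, lo, hi)
                         if v not in optimized]
        return plans                    # the POSP with its cost functions

    # Toy conventional optimizer over three hypothetical plans:
    catalog = {"pl1": (1.0, 0.0), "pl2": (0.5, 2.0), "pl3": (-0.2, 3.0)}
    opt = lambda s: min(catalog.items(), key=lambda kv: kv[1][0] * s + kv[1][1])
    print(cost_polytope_1d(opt, 0.0, 10.0))
    # -> {'pl1': (1.0, 0.0), 'pl3': (-0.2, 3.0)}; pl2 is never optimal

Note how the sketch mirrors the pseudocode: a vertex whose optimal plan is strictly cheaper than the current polytope at that vertex is "in conflict", its cost halfspace is intersected in, deleted vertices drop out of the queue, and newly created vertices that have not been optimized are queued.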
An online algorithm for polytope construction is

References

[CG94] R. L. Cole and G. Graefe. Optimization of dynamic query evaluation plans. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 1994.

[GM93] G. Graefe and W. J. McKenna. The Volcano optimizer generator: extensibility and efficient search. In Proceedings of the International Conference on Data Engineering (ICDE), 1993.

[Mul94] K. Mulmuley. Computational Geometry: An Introduction Through Randomized Algorithms. Prentice-Hall, 1994.

[SAC+79] P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 1979.

G. M. Ziegler. Lectures on Polytopes. Springer-Verlag, 1995.