Files
secondo/Optimizer/costs2014.pl
2026-01-23 17:03:45 +08:00

659 lines
21 KiB
Prolog

/*
----
This file is part of SECONDO.
Copyright (C) 2004, University in Hagen, Department of Computer Science,
Database Systems for New Applications.
SECONDO is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
SECONDO is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with SECONDO; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
----
//paragraph [10] title: [{\Large \bf ] [}]
//characters [1] formula: [$] [$]
//[toc] [\tableofcontents]
//[newpage] [\newpage]
[10] Memory-Aware Cost Estimation
[toc]
1 The Costs of Terms
---- cost(+Term, +Sel, +Pred, +Res, +Mem, -Card, -NAttrs,
-TSize, -Cost)
----
The cost of an executable ~Term~ representing a predicate ~Pred~ with
selectivity ~Sel~ is ~Cost~ and the cardinality of the result (number of tuples) is ~Card~. The resulting tuples have ~NAttrs~ attributes and tuple size terms ~TSize~ = sizeTerm(MemoryFix, DiskCore, DiskLOB). The result will belong to node ~Res~. If the root operator of the ~Term~ uses a memory buffer, the amount of available memory is ~Mem~ (in MB).
This is evaluated recursively descending into the term. When the operator
realizing the predicate (e.g. ~filter~) is encountered, the selectivity ~Sel~ is
used to determine the size of the result. Usually only a single operator of this kind occurs within the term. However, with the filter and refine technique (used e.g. for spatial join), selectivity needs to be split into bounding-box selectivity and remaining selectivity.
1.1 Arguments
*/
cost(rel(Rel, _), _, _, _, _, Card, NAttrs, TSize, 0) :-
card(Rel, Card),
tupleSizeSplit(Rel, TSize),
getRelAttrList(Rel, OrigAttrs, _),
length(OrigAttrs, NAttrs).
cost(res(N), _, _, _, _, Card, NAttrs, TSize, 0) :-
resultSize(N, Card),
nodeTupleSize(N, TSize),
nodeNAttrs(N, NAttrs).
/*
1.2 feed, feedproject
Cost constant definitions at the end of the file.
*/
% feedC(0.0011, 0.0004). % PerTuple, PerAttr [ms]
% original constants with machine factor 3.35 multiplied by
% 1.33 / 3.35 = 0.4 to adapt to currrent machine
cost(feed(X), Sel, Pred, Res, Mem, Card, NAttrs, TSize, Cost) :-
cost(X, Sel, Pred, Res, Mem, Card, NAttrs, TSize, Cost1),
feedC(PerTuple, PerAttr),
Cost is Cost1 + Card * (PerTuple + PerAttr * NAttrs).
% feedprojectC(0.0033, 0.0003, 0.000006).
% PerTuple, PerAttr, PerByte [ms]
% constants adapted to this machine, keeping relative
% sizes the same
cost(feedproject(rel(Rel, _), Project), _, _, _, _,
Card, NAttrs, TSize, Cost) :-
cost(rel(Rel, _), _, _, _, _, Card, _, _, Cost1),
getRelAttrList(Rel, OrigAttrs, _),
attributes(Project, AList),
projectAttrList(OrigAttrs, AList, _, TSize1),
length(AList, NAttrs),
% add memory required for embedding attributes in tuples
TSize1 = sizeTerm(MemoryFix, S2, S3),
TS1 is MemoryFix + 136,
( NAttrs > 10 -> TS is TS1 + 8 * NAttrs
; TS is TS1
),
TSize = sizeTerm(TS,S2,S3),
feedprojectC(PerTuple, PerAttr, PerByte),
Cost is Cost1 + Card *
(PerTuple + PerAttr * NAttrs + PerByte * MemoryFix).
/*
1.3 filter
Filter evaluates a predicate on each tuple of its input stream, hence the cost is linear in the size of the input stream. On top of the basic cost for a cheap predicate, there may be the cost for evaluating an expensive predicate.
Derivation of constants: We use database nrw, constructed with script nrwImportShape.SEC from secondo/bin/Scripts. Relation Roads has schema
---- Roads(Osm_id: int, Name: string, Ref: string, Type: string,
Oneway: int, Bridge: int, Maxspeed: int, GeoData: Line
----
The total number of tuples is 735683
memtuplesize: 382.65
roottuplesize: 192
exttuplesize: 293.71
tuplesize: 1318.45
The sizes determined in the optimizer are:
----
Relation Roads (Auxiliary objects:)
AttributeName Type MemoryFix DiskCore DiskLOB
GeoData line 160 212.992 1124.74
Maxspeed int 16 5.0 0
Bridge int 16 5.0 0
Oneway int 16 5.0 0
Type string 64 22.0 0
Ref string 64 7.44 0
Name string 64 29.136 0
Osm_id int 16 5.0 0
Indices:
Ordering: []
Cardinality: 735683
Avg.TSize: 1416.3 = sizeTerm(416, 291.568, 1124.74)
----
Evaluating a cheap predicate:
---- sql select count(*) from roads where maxspeed > 50.
Predicate Cost : (1e-06/0.101117) ms
Selectivity : 0.0111699
----
This query was executed as
---- query Roads feedproject[Maxspeed]
filter[(.Maxspeed > 50)]{0.0111587, 0.101117} count
----
within 3.57 seconds, returning the number 8404. Here the differential cost (obtained by running the above query without the filter operator) of the filter operator is 3.57 - 1.87 = 1.7 seconds, which is 1.7 / 735683 = 0.0023107 ms.
We construct a rather large line object by putting together all pieces of the Rhein river:
---- let rhein = Waterways feed filter[(.Name contains "Rhein")]
project[GeoData] transformstream collect_line[TRUE]
----
We then assume this is a relatively expensive predicate:
---- select count(*) from Roads where GeoData intersects rhein
Selectivity query : query Roads_sample_s feed {1}
filter[(bbox(.GeoData) bboxintersects bbox(rhein))] {2}
filter[(.GeoData intersects rhein)] timeout[10.0] count
Elapsed Time: 2325 ms
Predicate Cost : (5.00463e-07/1.36684) ms
Selectivity : 0.000587889
BBox-Selectivity: 0.462669
----
The query was executed as
---- query Roads feedproject[GeoData]
filter[(.GeoData intersects rhein)]{0.000587302, 1.36684} count
----
within 924 seconds, returning 189. Here the differential cost of the filter operator is 924 - 2.3 seconds.
We interpret these results as follows. The cheap predicate has a predicate cost of 0.101117 ms. As the sample relation has about 1700 tuples, this corresponds to a query time for the sample query of 170 ms. However we have a basic time for processing a query of about that magnitude, so that the real predicate evaluation time is much smaller. This is why the predicate can be evaluated on the complete relation in 3.57 seconds.
On the other hand, the expensive predicate has a cost of 1.36684 ms, corresponding to dividing the total time for the sample query of 2325 seconds by 1700. In this case the predicate cost is close to the real one which leads to an estimate of 735683 * 1.36684 ms = 1005 seconds, a bit higher than the observed time of 924 seconds.
We will proceed as follows. We set a threshold value of 0.2 ms, corresponding to processing 2000 tuples, the standard sample size, within 400 ms. Any value below this threshold is considered to be a cheap predicate with real costs 0 (more precisely, its cost is included in the tuple constant). A predicate with a higher cost is considered to be an expensive predicate, and its cost is added to the standard tuple cost.
*/
% filterC(0.00111). % PerTuple [ms]
cost(filter(X, _), Sel, Pred, Res, Mem, Card,
NAttrs, TSize, Cost) :-
not(prefilter(X)), % this is the standard case
!,
cost(X, 1.0, Pred, Res, Mem, Card1, NAttrs, TSize, Cost1),
getPET(Pred, _, ExpPET), % fetch stored predicate evaluation time
filterC(PerTuple),
( (ExpPET =< 0.2) -> (PET is 0.0); PET is ExpPET),
Card is Card1 * Sel,
Cost is Cost1 + Card1 * (PerTuple + PET).
cost(filter(X, _), Sel, Pred, Res, Mem, Card,
NAttrs, TSize, Cost) :-
prefilter(X), % filter and refinement case,
simplePred(Pred, P), % hence lookup bbox selectivity
databaseName(DB),
storedBBoxSel(DB, P, Sel2),
write('passed selectivity is '), write(Sel), nl,
write('filter step called with selectivity '), write(Sel2), nl,
RefSel is Sel / Sel2,
write('hence refinement selectivity is '), write(RefSel), nl,
cost(X, Sel2, Pred, Res, Mem, Card1, NAttrs, TSize, Cost1),
write('returned filter step cardinality is '), write(Card1), nl,
write('filter step cost is '), write(Cost1), nl,
getPET(Pred, _, ExpPET), % fetch stored predicate evaluation time
filterC(PerTuple),
( (ExpPET =< 0.2) -> (PET is 0.0); PET is ExpPET),
Card is Card1 * (Sel / Sel2),
write('PET is '), write(PET), nl,
write('returned total cardinality is '), write(Card), nl,
Cost is Cost1 + Card1 * (PerTuple + PET),
RefCost is Cost - Cost1,
write('refinement step cost is '), write(RefCost), nl.
/*
1.4 rename
Cost is linear in the number of tuples, does not depend on the number of attributes or tuple size as just one pointer for the tuple is passed to the next operator.
The query (database nrw again)
---- query Roads feed count
----
yields 13.22, 3.22, 3.24, 3.20 seconds; average of last three is 3.22.
The query
---- query Roads feed {r} count
----
has running times 3.65, 3.58, 3.23 seconds, average 3.49. Hence the cost of rename per tuple is (3.49 - 3.22) / 735683 = 0.000000367 seconds = 0.000367 ms.
*/
% renameC(0.000367). % PerTuple
cost(rename(X, _), Sel, Pred, Res, Mem, Card, NAttrs, TSize, Cost) :-
cost(X, Sel, Pred, Res, Mem, Card, NAttrs, TSize, Cost1),
renameC(PerTuple),
Cost is Cost1 + Card * PerTuple.
/*
1.5 itHashJoin
Iterative HashJoin reads the tuples from its first input stream into a hash table in memory. There are two cases:
1 The tuples from the first stream fit completely into memory. Then the tuples from the second stream are read and each of them probes against the hash table to find join partners.
2 Only an initial part of the tuples from the first stream fits into memory. Then the second stream is read, probing against the hashtable as in case 1. However, additionally the tuples from the second stream are written into a tuple file (that is, to disk).
When this is finished, the hash table is emptied and further tuples from the first input stream are read into it, called the second partition. The tuples from the second argument are now read from the tuple file and probed against the second partition.
This is repeated until all partitions have been processed.
For the cost formula we use the following notations:
* $Mem$: the available memory, in MB
* $Card_i$: the cardinalities of the first / second input stream
* $NAttrs_i$: the numbers of attributes of the first / second argument stream
* $Size_i$: the tuple sizes in memory of the first / second input stream. Tuple sizes are taken from the ~MemoryFix~ component of the ~sizeTerm~.
* $Card$: the cardinality of the output.
* $NAttrs$: the number of attributes of the output
* $P$: the number of partitions where $P = \lceil(Card_1 \cdot Size_1) / (Mem * 1024 * 1024)\rceil$.
* $u, v, w, x, y$: constants with the following meanings (denoted as $uHashJoin$ etc. in a global table of constants):
* $u$ - putting one tuple from stream 1 into the hash table
* $v$ - time for processing one tuple of stream 2, if $P = 1$
* $w$ - time for writing one byte (of stream 2) into a tuple file
* $x$ - time for reading one byte from the tuple file
* $y$ - time for processing one attribute in a result tuple
The cost for iterative hash join then is the following:
\[
\begin{array}{lll}
C_{itHashJoin} & = &
\left\{ \begin{array}{ll}
Card_1 \cdot u + Card_2 \cdot v & \mbox{if $P$ = 1} \\
Card_1 \cdot u + Card_2 \cdot Size_2 \cdot w & \mbox{otherwise} \\
+ (P - 1) \cdot Card_2 \cdot Size_2 \cdot x & \\
\end{array}
\right. \\
& & + Card \cdot y \cdot NAttrs
\end{array}
\]
These costs are reflected in the following cost function.
*/
% attrC(0.00016). % Cost per attribute in result tuple. in ms.
% itHashJoinC(0.003, 0.49, 0.00016, 0.000047) % see above
cost(itHashJoin(Arg1, Arg2, _, _), Sel, Pred, Res, Mem,
Card, NAttrs, TSize, Cost) :-
cost(Arg1, 1, Pred, Res, Mem, Card1, _, sizeTerm(Size1, _, _), Cost1),
cost(Arg2, 1, Pred, Res, Mem, Card2, _, sizeTerm(Size2, _, _), Cost2),
Card is Card1 * Card2 * Sel,
nodeNAttrs(Res, NAttrs),
nodeTupleSize(Res, TSize),
P is 1 + floor((Card1 * Size1) / (Mem * 1024 * 1024)), % # of partitions
itHashJoinC(U, V, W, X),
( P = 1
-> C1 is Card1 * U + Card2 * V
; C1 is Card1 * U
+ Card2 * Size2 * W
+ (P - 1) * Card2 * Size2 * X
),
attrC(PerAttr), % Cost depending on result size,
C2 is Card * NAttrs * PerAttr, % the same for all joins
Cost is Cost1 + Cost2 + C1 + C2.
/*
1.5 itSpatialJoin
The task is to join tuples from two input streams that have overlapping bounding boxes for some attribute.
This operator works in the same way as ~itHashJoin~. The only difference is that instead of a hash table, a main memory R-tree is used. Probes search with a rectangle on this R-tree. We can therefore use the same cost function, only the constants will be different.
\[
\begin{array}{lll}
C_{itSpatialJoin} & = &
\left\{ \begin{array}{ll}
Card_1 \cdot u + Card_2 \cdot v & \mbox{if $P$ = 1} \\
Card_1 \cdot u + Card_2 \cdot Size_2 \cdot w & \mbox{otherwise} \\
+ (P - 1) \cdot Card_2 \cdot Size_2 \cdot x & \\
\end{array}
\right. \\
& & + Card \cdot y \cdot NAttrs
\end{array}
\]
*/
% itSpatialJoinC(0.059, 0.125, 0.006, 0.001) % see above
cost(itSpatialJoin(Arg1, Arg2, _, _), Sel, Pred, Res, Mem,
Card, NAttrs, TSize, Cost) :-
cost(Arg1, 1, Pred, Res, Mem, Card1, _, sizeTerm(Size1, _, _), Cost1),
cost(Arg2, 1, Pred, Res, Mem, Card2, _, sizeTerm(Size2, _, _), Cost2),
Card is Card1 * Card2 * Sel,
nodeNAttrs(Res, NAttrs),
nodeTupleSize(Res, TSize),
P is 1 + floor((Card1 * Size1) / (Mem * 1024 * 1024)), % # of partitions
itSpatialJoinC(U, V, W, X),
( P = 1
-> C1 is Card1 * U + Card2 * V
; C1 is Card1 * U
+ Card2 * Size2 * W
+ (P - 1) * Card2 * Size2 * X
),
attrC(PerAttr), % Cost depending on result size,
C2 is Card * NAttrs * PerAttr, % the same for all joins
Cost is Cost1 + Cost2 + C1 + C2.
/*
1.7 symmjoin
Symmjoin is the fallback method for join as it is able to process an arbitrary join predicate. It is a symmetric version of a nested-loop join, considering all pairs of tuples from the two argument streams and evaluating the predicate for them. It works roughly as follows:
1 For the two input streams, initialize two buffers ~A~ and ~B~ as empty.
2 While not both input streams are exhausted, do:
1 If stream 1 is not empty, take the first tuple $t_1$ from stream 1 and put it into buffer ~A~. For each tuple $t_2$ in buffer ~B~ evaluate the predicate on the pair $(t_1, t_2)$.
2 If stream 2 is not empty, take the first tuple $t_2$ from stream 2 and put it into buffer ~B~. For each tuple $t_1$ in buffer ~A~ evaluate the predicate on the pair $(t_1, t_2)$.
Hence the procedure alternates between the two input streams and therefore is called symmetric. One advantage is that there is no blocking time as for standard nested loop which first loads one entire argument into the buffer. Instead, result tuples can be reported immediately.
Obviously the cost is quadratic, or more precisely, the product of the argument sizes. When the available memory is exceeded, there is some change in cost as tuples need to be written to disk. However, we do not need to model this in the cost function, as the quadratic behaviour dominates the cost. Except in rare cases, ~symmjoin~ will only be used if no other join technique is available, because it is so expensive.
Similar as for filter, we need to distinguish between cheap and expensive predicates. Again we introduce a threshold cost and add the cost for evaluating the predicate only if the measured cost exceeds the threshold.
Let ~ExpPET~ denote the predicate evaluation time measured in the selectivity query and ~th~ the threshold that determines whether we have an expensive predicate. The constant ~u~ denotes the cost for processing one pair of tuples with a cheap predicate.
The cost function is:
\[
\begin{array}{lll}
PET & = &
\left\{ \begin{array}{ll}
ExpPET & \mbox{if $ExpPET > th$} \\
0 & \mbox{otherwise} \\
\end{array}
\right. \\
%
& & \\
%
C_{symmjoin} & = &
Card_1 * Card_2 * (u + PET) \\
& & + Card \cdot y \cdot NAttrs
\end{array}
\]
1.7.1 Determining Constants
We use database ~opt~.
---- let plzEven = plz feed filter[(.PLZ mod 2) = 0] consume
let plzOdd = plz feed filter[(.PLZ mod 2) = 1] consume
----
The two relations ~plzEven~ and ~plzOdd~ have cardinalities 19773 and 21494, respectively. The query
---- query plzEven feed {p1} plzOdd feed {p2}
itHashJoin[PLZ_p1, PLZ_p2] count
----
has result 0 and runs in 0.12 seconds.
---- query plzEven feed head[A] {p1}
plzOdd feed head[B] {p2}
symmjoin[.PLZ_p1 = ..PLZ_p2] {0.00000001, 0.01} count
----
for all parameters A and B has result 0 and has running times:
B = 1000:
* A = 1000: 3.04 sec
* A = 2000: 6.05 sec
* A = 3000: 9.07 sec
* A = 4000: 12.05 sec
B = 2000:
* A = 1000: 6.05 sec
* A = 2000: 12.07 sec
* A = 3000: 18.12 sec
* A = 4000: 24.19 sec
One can observe the product cost very well. Clearly the cost is 3 secs per million pairs processed, hence the constant ~u~ is $u = 3000 / 1000000$ = 0.003 ms.
To determine the cost for an expensive predicate, we use the database berlintest.
---- query Trains feed {t1} Trains feed {t2}
symmjoin[.Id_t1 = ..Id_t2] count
----
The result is 562, the number of trains, and the running times are 0.99, 0.96, 0.97 secs. The prediction using the constant above is 562 * 562 * 0.003 = 947.532 ms which is close.
---- query Trains feed head[200] {t1} Trains feed head[B] {t2}
symmjoin[sometimes(distance(.Trip_t1, ..Trip_t2) < 50.0)] count
----
* B = 50: 10.67 sec
* B = 100: 16.08
* B = 150: 23.58
* B = 200: 36.93
The cost per predicate evaluation is about 8 secs per 10000 pairs, that is, 0.8 ms.
For this predicate, the optimizer determines in the sample query a cost of 0.76 ms. For a cheap predicate like comparing two numbers it is 0.0035. We use a threshold of 0.01 ms to distinguish an expensive predicate.
*/
% symmjoinC(0.003). % Cost per tuple pair for cheap preds in ms
cost(symmjoin(Arg1, Arg2, _), Sel, Pred, Res, Mem,
Card, NAttrs, TSize, Cost) :-
cost(Arg1, 1, Pred, Res, Mem, Card1, _, _, Cost1),
cost(Arg2, 1, Pred, Res, Mem, Card2, _, _, Cost2),
Card is Card1 * Card2 * Sel,
nodeNAttrs(Res, NAttrs),
nodeTupleSize(Res, TSize),
symmjoinC(U),
getPET(Pred, _, ExpPET), % fetch stored predicate evaluation time
( (ExpPET =< 0.01) -> (PET is 0.0); PET is ExpPET),
C1 is Card1 * Card2 * (U + PET),
attrC(PerAttr), % Cost dep. on result size,
C2 is Card * NAttrs * PerAttr, % the same for all joins
Cost is Cost1 + Cost2 + C1 + C2.
/*
We treat ~symmouterjoin~ in the same way as ~symmjoin~.
*/
cost(symmouterjoin(Arg1, Arg2, _), Sel, Pred, Res, Mem,
Card, NAttrs, TSize, Cost) :-
cost(Arg1, 1, Pred, Res, Mem, Card1, _, _, Cost1),
cost(Arg2, 1, Pred, Res, Mem, Card2, _, _, Cost2),
Card is Card1 * Card2 * Sel,
nodeNAttrs(Res, NAttrs),
nodeTupleSize(Res, TSize),
symmjoinC(U),
getPET(Pred, _, ExpPET), % fetch stored predicate evaluation time
( (ExpPET =< 0.01) -> (PET is 0.0); PET is ExpPET),
C1 is Card1 * Card2 * (U + PET),
attrC(PerAttr), % Cost dep. on result size,
C2 is Card * NAttrs * PerAttr, % the same for all joins
Cost is Cost1 + Cost2 + C1 + C2.
/*
2 Cost Constants
2.1 Defined here
*/
feedC(0.0011, 0.0004). % PerTuple, PerAttr [ms]
% original constants with machine factor 3.35 multiplied by
% 1.33 / 3.35 = 0.4 to adapt to currrent machine
feedprojectC(0.0033, 0.0003, 0.000006).
% PerTuple, PerAttr, PerByte [ms]
% constants adapted to this machine, keeping relative sizes
% the same
filterC(0.00111). % PerTuple [ms]
renameC(0.000367). % PerTuple
itHashJoinC(0.003, 0.49, 0.00016, 0.000047).
% u, v, w, x constants for itHashJoin, see above
itSpatialJoinC(0.059, 0.125, 0.006, 0.001).
symmjoinC(0.003).
attrC(0.00016). % PerAttr [ms], cost per attribute in result tuple
/*
2.2 Looking up constants determined in kernel system
These are retrieved from the kernel system and the stored so that each constant is retrieved only once.
*/
/* does not yet work
:- dynamic costConstant/3.
costConstant(Algebra, ConstName, Const) :-
costConst(Algebra, ConstName, Const).
costConstant(Algebra, ConstName, Const) :-
secondo('query
*/
/*
3 Auxiliary Predicates
---- prefilter(+X)
----
is fulfilled if ~X~ is an operation like loopjoin or spatialjoin so that the expression ~filter(X)~ represents a filter and refinement strategy.
*/
prefilter(itSpatialJoin(_, _, _, _)).
prefilter(spatialjoin(_, _, _, _)).
prefilter(loopjoin(_, _)).