secondo/Tools/Generators/TPC-H/postgres/Postgres-Introduction.txt

/*
----
This file is part of SECONDO.

Copyright (C) 2004, University in Hagen, Department of Computer Science,
Database Systems for New Applications.

SECONDO is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

SECONDO is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with SECONDO; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
----

//paragraph [1]  Title:         [{\Large \bf \begin{center}] [\end{center}}]
//paragraph [2]  Center:        [{\begin{center}] [\end{center}}]
//paragraph [10] Footnote:      [{\footnote{] [}}]
//paragraph [44] table4columns: [\begin{quote}\begin{tabular}{llll}]    [\end{tabular}\end{quote}]

//characters    [20]    verbatim:   [\verb@]    [@]
//characters    [21]    formula:    [$]         [$]
//characters    [22]    capital:    [\textsc{]  [}]
//characters    [23]    teletype:   [\texttt{]  [}]

//[--------]    [\hline]
//[TOC]         [\tableofcontents]
//[p]           [\par]
//[@]           [\@]
//[LISTING-SH]  [\lstsetSH]

[1] A quick introduction to the PostgreSQL DBMS


[2]        Database Systems for new Applications           [p] 
                  University of Hagen                      [p] 
	http://www.informatik.fernuni-hagen.de/secondo     [p]


Author: M. Spiekermann, Last Changes: 2007-02-13

[TOC]

1 Introduction

PostgreSQL is a popular open source DBMS and the successor of INGRES and
POSTGRES. Sometimes it is interesting to compare it with Secondo. Hence we
give a short overview of how to install it on a Linux system, how to create
databases, and how to create objects and populate them with data. This is only
a rough introduction; for further details consult the Postgres documentation,
which is available as HTML files below /usr/share/doc/packages/postgresql/html.


2 Installation on Linux

Start the package manager (on SuSE Linux it is called YaST) and select all
packages whose names start with postgres.

3 Environment Setup

Before you can create a database you need to define and initialize a so-called
data storage area or database cluster. The location of this directory should be
defined in the environment variable "PGDATA"[20]. The directory must be
readable and writable only by the Linux user who is the database administrator.

In order to set up the storage area run the following commands:

[LISTING-SH]

*/
  export PGDATA=/data/postgres-databases
  mkdir $PGDATA
  chmod go-rwx $PGDATA
  initdb -D$PGDATA

/* 

Afterwards the directory "$PGDATA" contains about 26 MB of data. "PGDATA"
should be defined in the shell's startup script (".bashrc"); otherwise you
have to define it in every new shell. Now we can start
the database server process, which is called "postmaster".

*/
  postmaster [-D$PGDATA]
/*

It will print messages to the standard output.

4 Creating Databases

The utility "createdb" can be used to create a database, e.g. the
command

*/
  createdb tpch
/*

will create a database called "tpch", which adds 31 MB to the storage area. The
text-based database client is called "psql". Client-internal commands start with
a "\" symbol; for example, "\?" lists all client-internal commands and "\q"
quits the session. The command

*/
  psql -dtpch 
/*

establishes a connection to the "tpch" database. The command prompt now
includes the name of the database in use. Some useful client commands are:

*/
  tpch# \dt       % display tables
  tpch# \di       % display indexes
  tpch# \q        % disconnect and exit
  tpch# \i <file> % run query from file
  tpch# \s <file> % save the cmd history 
  tpch# \h select % explain the syntax of the select statement
/*  

5 Creating Objects

Once you are connected to a database, the "create table" command can be used to
define a relation.

*/
create table customer (
  C_CUSTKEY     int4,
  C_NAME        varchar(25),
  C_ADDRESS     varchar(40),
  C_NATIONKEY   int4,
  C_PHONE       char(15),
  C_ACCTBAL     float4,
  C_MKTSEGMENT  char(10),
  C_COMMENT     varchar(117)
);
/*

Afterwards you can populate it with tuples by importing a text file. Each line
is interpreted as a tuple, and a field separator can be specified which marks
the end of an attribute value. The import is done by a special client command, e.g.

*/
  \copy customer FROM 's05pp/customer.tbl.pg' WITH DELIMITER AS '|';
/*

reads the tuple data from the file "s05pp/customer.tbl.pg". An index can be
created by

*/
  create index customer_c_custkey on customer(c_custkey);
/*

Another kind of object is the sequence. The commands below create a sequence
and fetch its next value:

*/
  create sequence serial start 1;
  select nextval('serial');         -- returns 1 on the first call, then 2, 3, ...
/*

Sometimes it is necessary to store query results as new relations. This can be
done by the "create table <ident> as" command. Moreover, new attribute values can
be computed from the existing tuple values by writing expressions over the
available functions and operators, e.g.

*/
  create table customer_s100
    as select C_CUSTKEY, C_NAME, nextval('serial') % 100 as C_NUM
    from customer; 
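/*

Note that each call of "nextval" advances the sequence, so the statement above
leaves "serial" at a value that depends on the number of tuples in "customer".
A minimal sketch of how the sequence could be inspected and reset afterwards;
"currval" and "setval" are standard PostgreSQL sequence functions, and the
restart value 1 is only an assumption for illustration:

*/
  select currval('serial');          -- last value handed out in this session
  select setval('serial', 1, false); -- the next nextval('serial') returns 1 again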
/*


6 Investigating Query Plans

If a query is prefixed with "explain" or "explain analyze", the chosen query plan
is printed. The first variant only shows estimated costs and tuple cardinalities;
the second variant also runs the query and shows the actual run times and
cardinalities next to the estimates.

*/
  explain <query>
  explain analyze <query>
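/*

As an illustration, the following sketch applies both variants to the
"customer" relation defined above. The selection predicate is an arbitrary
assumption; any query works.

*/
  explain select count(*) from customer where c_nationkey = 7;
  explain analyze select count(*) from customer where c_nationkey = 7;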
/*

7 Maintenance

The query planner needs accurate statistics about the data. It uses samples
of the data to estimate the frequency distribution of a table attribute's
values. These estimates are updated by the command "analyze", which
collects statistics about the contents of tables in the database and
stores the results in the system table "pg_statistic".

In normal PostgreSQL operation, tuples that are deleted or obsoleted by an
update are not physically removed from their table; they remain present until
the command "vacuum" is run, which reclaims the storage occupied by such
tuples. Hence the administrator should run

*/
  vacuum analyze
/*

after substantial updates.
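
Both commands also accept a table name, which is cheaper than processing the
whole database. The sketch below assumes the "customer" relation from above and
inspects the collected statistics through the system view "pg_stats", a
readable presentation of "pg_statistic":

*/
  vacuum analyze customer;
  select attname, null_frac, n_distinct
    from pg_stats
    where tablename = 'customer';
/*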

8 Tuning

Using the "set" command, the administrator can change various runtime parameters.
This is useful to force or to disable certain evaluation methods for
relational algebra expressions. For example, the statement below disables the use
of index scans.

*/
set enable_indexscan = off;
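/*

A parameter changed by "set" stays in effect for the rest of the session. A
minimal sketch of a typical experiment, assuming the "customer" relation and
its index from above: compare the plans with and without index scans and
restore the default afterwards. The key value 42 is arbitrary.

*/
show enable_indexscan;
explain select * from customer where c_custkey = 42;
set enable_indexscan = off;
explain select * from customer where c_custkey = 42;
reset enable_indexscan;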
/*
8.1 Adjusting cost factors 

SQL statements can be translated into different execution plans which compute
the same result. The planner (or optimizer) module uses data statistics, cost functions
and some basic cost factors to rate such plans. The optimization algorithm systematically
produces subplans and prunes inefficient ones. The result of this process is the best
plan with respect to the estimated costs, which need not be the best plan in
practice. Sources of error are

(1) Imprecise statistics
(2) Imprecise cost functions
(3) Imprecise cost factors

Some important cost factors are:

*/
show cpu_tuple_cost;
show cpu_operator_cost;
/*
Both are floating point values which express the cost of processing one tuple or
one operator relative to the cost of fetching one page sequentially. Suitable
values can be determined by running some queries.

First you need to create relations $R_1, R_2$ with the same number of tuples but
different tuple sizes, so that they occupy different numbers of pages $P_1$ and
$P_2$. The time difference for scanning those relations can then be used to
compute the time $T_{pc}$ for a page fetch. Moreover, the relations should be
bigger than the main memory, so that the pages are really read from disk. Hence we have
$|t_{q1} - t_{q2}| = T_{pc} |P_1 - P_2|$ where $t_{qi}$ is the run time of a
query which scans relation $R_i$.
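
A possible way to set up such an experiment is sketched below. The relation
names "r1" and "r2", the attribute names and the sizes are assumptions chosen
for illustration only; for real measurements the number of tuples must be large
enough that the relations exceed the main memory. The page counts $P_i$ can be
read from the system table "pg_class" after running "analyze", and the psql
command "\timing" reports the run time of each statement.

*/
  -- same number of tuples, different tuple sizes
  create table r1 as
    select i as id, repeat('x', 50) as pad from generate_series(1, 1000000) as i;
  create table r2 as
    select i as id, repeat('x', 200) as pad from generate_series(1, 1000000) as i;
  analyze r1;
  analyze r2;
  -- relpages holds the number of pages, reltuples the number of tuples
  select relname, relpages, reltuples from pg_class where relname in ('r1', 'r2');
  \timing
  select count(*) from r1;
  select count(*) from r2;
/*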

Afterwards one can measure the time for processing a tuple, $T_{tc}$, by constructing
relations with the same number of pages but a different number of tuples. Again
the run time difference of a scan can be used to compute the processing
overhead for a single tuple.

Finally, queries applying a different number of operators are used to compute the
time needed for a single operator, $T_{oc}$.
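
To make the procedure concrete, here is a worked example with purely
hypothetical measurements. Suppose scanning $R_1$ with $P_1 = 10000$ pages takes
$t_{q1} = 2.0$ s and scanning $R_2$ with $P_2 = 20000$ pages takes $t_{q2} = 3.0$ s.
Then

$T_{pc} = |t_{q1} - t_{q2}| / |P_1 - P_2| = 1.0 / 10000 = 10^{-4}$ s.

If two relations with the same number of pages but $N_3 = 500000$ and
$N_4 = 1000000$ tuples need $t_{q3} = 2.0$ s and $t_{q4} = 2.5$ s, then

$T_{tc} = |t_{q3} - t_{q4}| / |N_3 - N_4| = 0.5 / 500000 = 10^{-6}$ s.

The cost factor is the ratio $T_{tc} / T_{pc} = 0.01$, which would be the value
to assign to "cpu_tuple_cost". The symbols $N_3, N_4, t_{q3}, t_{q4}$ are
introduced here only for this example.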


9 Understanding the Postgres Planner

Below there are three similar queries which result in different plans.

*/

Q1: explain select count(*) from m1, m2 where m1.a = m2.a and m1.a = 1;
 Aggregate  (cost=22128.85..22128.85 rows=1 width=0)
   ->  Nested Loop  (cost=8543.55..22119.35 rows=949638 width=0)
         ->  Seq Scan on m2  (cost=0.00..8542.72 rows=978 width=4)
               Filter: (1 = a)
         ->  Materialize  (cost=8543.55..8546.12 rows=971 width=4)
               ->  Seq Scan on m1  (cost=0.00..8543.29 rows=971 width=4)
                     Filter: (a = 1)

Q2: explain select count(*) from m1, m2 where m1.a = m2.a and m2.a < 10;
 Aggregate  (cost=99334.22..99334.23 rows=1 width=0)
   ->  Merge Join  (cost=53549.54..99163.39 rows=17083708 width=0)
         Merge Cond: ("outer".a = "inner".a)
         ->  Sort  (cost=8547.57..8547.74 rows=17246 width=4)
               Sort Key: m2.a
               ->  Seq Scan on m2  (cost=0.00..8542.72 rows=17246 width=4)
                     Filter: (a < 10)
         ->  Sort  (cost=45001.97..45011.97 rows=1000110 width=4)
               Sort Key: m1.a
               ->  Seq Scan on m1  (cost=0.00..8533.29 rows=1000110 width=4)

Q3: explain select count(*) from m1, m2 where m1.a = m2.a and m1.a < 10;
 Aggregate  (cost=80644.17..80644.17 rows=1 width=0)
   ->  Hash Join  (cost=8543.39..80543.07 rows=10109754 width=0)
         Hash Cond: ("outer".a = "inner".a)
         ->  Seq Scan on m2  (cost=0.00..8532.72 rows=999894 width=4)
         ->  Hash  (cost=8543.29..8543.29 rows=10208 width=4)
               ->  Seq Scan on m1  (cost=0.00..8543.29 rows=10208 width=4)
                     Filter: (a < 10)

/*
Note that in Q1 the planner rewrites the query and adds the additional predicate m2.a = 1.
This is possible since an equi-join only produces matches for equal values.
Moreover, it seems that hash joins and merge joins are ruled out for this query: they are never chosen, even with
the configuration option "enable_nestloop = off", which raises the total cost to about 100,000,000.
 
Surprisingly, this technique is not applied to queries "Q2" and "Q3" even
though it could reduce costs. Moreover, one can observe that the estimates for
"m1.a < 10" and "m2.a < 10" differ widely although relation "m2"
is a copy of "m1". After each command which updates the statistics samples, e.g.
"analyze m1", the estimate changes. Note: the level of detail of the statistics
about data distributions can be configured on a per-column basis or globally by
the parameter "default_statistics_target".

Adding a redundant (totally correlated) predicate "m2.b = 2" misguides the planner: it
chooses a very expensive plan based on the estimate that the scan on "m2" will return only
1 tuple (actually 1000 tuples). This leads to a nested loop join without materialization
of the inner input, hence "m1" will be scanned 1000 times. This is a good demonstration
of the need for robust query optimization as claimed in [xxx].
 
*/
Q4: Q1 and m2.b = 2
Aggregate  (cost=17099.17..17099.17 rows=1 width=0)
   ->  Nested Loop  (cost=0.00..17099.16 rows=971 width=0)
         ->  Seq Scan on m2  (cost=0.00..8553.29 rows=1 width=4)
               Filter: ((b = 2) AND (1 = a))
         ->  Seq Scan on m1  (cost=0.00..8543.29 rows=971 width=4)
               Filter: (a = 1)
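/*

The definitions of "m1" and "m2" are not part of this document. The sketch
below shows one way relations with similar properties could be created; the
attribute domains are assumptions derived from the plans above (about 1,000,000
tuples, about 1000 matches for "a = 1", and "b" completely determined by "a").
It also raises the per-column statistics target for attribute "a"; the value
100 is arbitrary, and the new target only takes effect at the next "analyze".

*/
  create table m1 as
    select i % 1000 as a, (i % 1000) + 1 as b
    from generate_series(1, 1000000) as i;
  create table m2 as select * from m1;   -- "m2" is a copy of "m1"
  -- optional: collect more detailed statistics for attribute a
  alter table m1 alter column a set statistics 100;
  alter table m2 alter column a set statistics 100;
  analyze m1;
  analyze m2;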