/*
----
This file is part of SECONDO.

Copyright (C) 2004, University in Hagen, Department of Computer Science,
Database Systems for New Applications.

SECONDO is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

SECONDO is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with SECONDO; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
----

//paragraph [1] Title: [{\Large \bf \begin{center}] [\end{center}}]
//paragraph [2] Center: [{\begin{center}] [\end{center}}]
//paragraph [10] Footnote: [{\footnote{] [}}]
//paragraph [44] table4columns: [\begin{quote}\begin{tabular}{llll}] [\end{tabular}\end{quote}]

//characters [20] verbatim: [\verb@] [@]
//characters [21] formula: [$] [$]
//characters [22] capital: [\textsc{] [}]
//characters [23] teletype: [\texttt{] [}]

//[--------] [\hline]
//[TOC] [\tableofcontents]
//[p] [\par]
//[@] [\@]
//[LISTING-SH] [\lstsetSH]

[1] A quick introduction into the PostgreSQL DBMS

[2] Database Systems for new Applications [p]
University of Hagen [p]
http://www.informatik.fernuni-hagen.de/secondo [p]
Author: M. Spiekermann, Last Changes: 2007-02-13

[TOC]

1 Introduction

PostgreSQL is a popular open source DBMS; it is the successor of the
research prototypes INGRES and POSTGRES. Sometimes it may be interesting
to compare it with Secondo. Hence we give a short overview of how to
install it on a Linux system, how to create databases, and how to create
objects and populate them with data. However, this is just a rough
introduction; for further details consult the PostgreSQL documentation,
which is available as HTML files below
/usr/share/doc/packages/postgresql/html.

2 Installation on Linux

Start the package manager (on SuSE Linux it is called YaST) and select
all packages whose names start with postgres.

3 Environment Setup

Before you can create a database you need to define and initialize a
so-called data storage area or database cluster. The location of this
directory should be defined in the environment variable "PGDATA"[20].
The directory must be readable and writable only by the Linux user who
acts as the database administrator. In order to set up the storage area
run the following commands:

[LISTING-SH]

*/
export PGDATA=/data/postgres-databases
mkdir $PGDATA
chmod go-rwx $PGDATA
initdb -D $PGDATA
/*
Afterwards the directory "$PGDATA" contains about 26 MB of data. The
definition of "PGDATA" should be done in the shell's startup script
(".bashrc"); otherwise you have to define it in every new shell.

Now we can start up the database server process, which is called
"postmaster":

*/
postmaster [-D $PGDATA]
/*
It will print its messages to the standard output.

4 Creating Databases

The utility "createdb" can be used to create a database, e.g. the command

*/
createdb tpch
/*
will create a database called "tpch", which adds 31 MB to the storage
area. The text-based database client is called "psql". Client-internal
commands start with a "\" symbol; for example, "\?" will list all
client-internal commands and "\q" will quit the session. The command

*/
psql -d tpch
/*
establishes a connection to the "tpch" database.
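If the database server does not run on the local machine or under the
current user account, the connection parameters can be passed explicitly.
In the sketch below, the host name "somehost" and the user name
"someuser" are placeholders only:

*/
psql -h somehost -p 5432 -U someuser -d tpch
/*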
The command prompt now includes the name of the used database:

*/
tpch=# \dt        % display tables
tpch=# \di        % display indexes
tpch=# \i <file>  % run queries from a file
tpch=# \s <file>  % save the command history
tpch=# \h select  % explain the syntax of the select statement
tpch=# \q         % disconnect and exit
/*

5 Creating Objects

If you are connected to a database, the create command can be used to
define a relation:

*/
create table customer (
  C_CUSTKEY    int4,
  C_NAME       varchar(25),
  C_ADDRESS    varchar(40),
  C_NATIONKEY  int4,
  C_PHONE      char(15),
  C_ACCTBAL    float4,
  C_MKTSEGMENT char(10),
  C_COMMENT    varchar(117)
);
/*
Afterwards you can populate it with tuples by importing a text file. Each
line will be interpreted as one tuple; a field separator can be specified
which marks the end of an attribute value. Importing is done by a special
client command, e.g.

*/
\copy customer FROM 's05pp/customer.tbl.pg' WITH DELIMITER AS '|'
/*
reads the tuple data from the file "s05pp/customer.tbl.pg". An index can
be created by

*/
create index customer_c_custkey on customer(c_custkey);
/*
Another kind of object is the sequence, e.g. the commands

*/
create sequence serial start 1;
select nextval('serial');  -- returns 1; further calls return 2, 3, ...
/*
Sometimes it is necessary to store query results as new relations. This
can be done by the "create table as" command. Moreover, new attribute
values can be computed from the existing tuple values by simply writing
expressions over the available functions and operations, e.g.

*/
create table customer_s100 as
  select C_CUSTKEY, C_NAME, nextval('serial') % 100 as C_NUM
  from customer;
/*

6 Investigating Query Plans

If a query is introduced by "explain" or "explain analyze", the chosen
query plan will be printed. The second variant additionally runs the
query and displays the estimated costs and tuple cardinalities together
with the actual runtimes.

*/
explain <query>
explain analyze <query>
/*

7 Maintenance

The query planner needs accurate statistics about the data. It uses
samples of the data to estimate the frequency distribution of a table
attribute's values. The internal estimates are updated by the command
"analyze"; it collects statistics about the contents of the tables in the
database and stores the results in the system table "pg_statistic". In
normal PostgreSQL operation, tuples that are deleted or obsoleted by an
update are not physically removed from their table; they remain present
until the command "vacuum" is called. This command reclaims the storage
occupied by deleted tuples. Hence the administrator should run

*/
vacuum analyze
/*
after substantial updates.

8 Tuning

By using the set command the administrator can change various runtime
parameters. This can be useful to force or to disable certain evaluation
methods for relational algebra expressions. For example, the statement
below disables the use of index scans:

*/
set enable_indexscan = off;
/*

8.1 Adjusting cost factors

SQL statements can be translated into different execution plans which
compute the same result. The planner (or optimizer) module uses data
statistics, cost functions, and some basic cost factors to rate such
plans. The optimization algorithm systematically produces subplans and
prunes inefficient solutions. The result of this process might be the
best available plan. However, possible sources of error are

(1) imprecise statistics

(2) imprecise cost functions

(3) imprecise cost factors

Some important cost factors are:

*/
cpu_tuple_cost;
cpu_operator_cost;
/*
These are expressed as float values which define the ratio of the time
they need compared with a sequential access of a memory page. The cost
factors can be determined by running some queries.
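The current values of these parameters can be inspected and changed at
run time with the "show" and "set" commands, e.g.

*/
show cpu_tuple_cost;
set cpu_tuple_cost = 0.01;
/*
Determining realistic values for a given machine works roughly as
follows.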
First you need to create relations $R_1, R_2$ with different tuple sizes
but the same number of tuples; hence they occupy a different number of
pages $P_1, P_2$. The time difference for scanning those relations can
then be used to compute the time for a page fetch. Moreover, the size of
the relations should be bigger than the main memory, so that the scans
are not answered from the buffer cache. Hence we have

$|t_{q1} - t_{q2}| = T_{pc} \cdot |P_1 - P_2|$

where $t_{qi}$ is the runtime of a query which scans relation $R_i$ and
$T_{pc}$ is the cost of a single page fetch. Afterwards one can measure
the time for processing a tuple, $T_{tc}$, by constructing relations with
the same number of pages but a different number of tuples. Again the
runtime difference of a scan can be utilized to compute the processing
overhead for a single tuple. Finally, queries applying a different number
of operators are used to compute the time needed for a single operator,
$T_{oc}$.

9 Understanding the Postgres Planner

Below there are three similar queries which result in different plans.

*/
Q1: explain select count(*) from m1, m2 where m1.a = m2.a and m1.a = 1;

Aggregate  (cost=22128.85..22128.85 rows=1 width=0)
  ->  Nested Loop  (cost=8543.55..22119.35 rows=949638 width=0)
        ->  Seq Scan on m2  (cost=0.00..8542.72 rows=978 width=4)
              Filter: (1 = a)
        ->  Materialize  (cost=8543.55..8546.12 rows=971 width=4)
              ->  Seq Scan on m1  (cost=0.00..8543.29 rows=971 width=4)
                    Filter: (a = 1)

Q2: explain select count(*) from m1, m2 where m1.a = m2.a and m2.a < 10;

Aggregate  (cost=99334.22..99334.23 rows=1 width=0)
  ->  Merge Join  (cost=53549.54..99163.39 rows=17083708 width=0)
        Merge Cond: ("outer".a = "inner".a)
        ->  Sort  (cost=8547.57..8547.74 rows=17246 width=4)
              Sort Key: m2.a
              ->  Seq Scan on m2  (cost=0.00..8542.72 rows=17246 width=4)
                    Filter: (a < 10)
        ->  Sort  (cost=45001.97..45011.97 rows=1000110 width=4)
              Sort Key: m1.a
              ->  Seq Scan on m1  (cost=0.00..8533.29 rows=1000110 width=4)

Q3: explain select count(*) from m1, m2 where m1.a = m2.a and m1.a < 10;

Aggregate  (cost=80644.17..80644.17 rows=1 width=0)
  ->  Hash Join  (cost=8543.39..80543.07 rows=10109754 width=0)
        Hash Cond: ("outer".a = "inner".a)
        ->  Seq Scan on m2  (cost=0.00..8532.72 rows=999894 width=4)
        ->  Hash  (cost=8543.29..8543.29 rows=10208 width=4)
              ->  Seq Scan on m1  (cost=0.00..8543.29 rows=10208 width=4)
                    Filter: (a < 10)
/*
Note that in Q1 the planner rewrites the query and adds the additional
predicate m2.a = 1, visible as the filter "1 = a" on m2. This is possible
since an equi-join needs equal values on both sides to produce matches.
Moreover, it seems that hash joins and merge joins are prevented here,
since they are never chosen, even with the configuration option
"enable_nestloop = off", which raises the total costs up to 100,000,000.
Surprisingly, this rewriting technique is not applied to queries "Q2" and
"Q3", even though it could reduce the costs.

Moreover, one can observe that the estimates for "m1.a < 10" and
"m2.a < 10" vary over a wide range (10208 vs. 17246 rows) despite the
fact that relation "m2" is a copy of "m1". After each command which
updates the statistics samples, e.g. "analyze m1", the estimate changes.
Note: the statistics about data distributions can be configured on a
per-column basis or globally by the parameter
"default_statistics_target".

Adding a redundant (totally correlated) predicate "m2.b = 2" misguides
the planner: it chooses a very expensive plan based on the estimate that
the scan on "m2" will return only 1 tuple (actually it returns 1000
tuples). This leads to a nested loop join without materialization of the
intermediate result; hence m1 will be scanned 1000 times. This is a good
demonstration of the need for robust query optimization as claimed in
[xxx].
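Written out, query Q4 below corresponds to Q1 extended by the redundant
predicate; this reading follows the filter conditions shown in its plan:

*/
explain select count(*) from m1, m2
where m1.a = m2.a and m1.a = 1 and m2.b = 2;
/*
The resulting plan is: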
*/
Q4: Q1 and m2.b = 2

Aggregate  (cost=17099.17..17099.17 rows=1 width=0)
  ->  Nested Loop  (cost=0.00..17099.16 rows=971 width=0)
        ->  Seq Scan on m2  (cost=0.00..8553.29 rows=1 width=4)
              Filter: ((b = 2) AND (1 = a))
        ->  Seq Scan on m1  (cost=0.00..8543.29 rows=971 width=4)
              Filter: (a = 1)
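/*
For readers who want to reproduce these experiments, the relations can be
rebuilt along the following lines. This is only a sketch: the original
table sizes and value distributions are not documented here, so the
numbers below are assumptions.

*/
-- assumed setup: one million tuples, a roughly uniform over 0..1000,
-- b constant and therefore totally correlated with any predicate on a
create table m1 (a int4, b int4);
insert into m1
  select (random() * 1000)::int4, 2
  from generate_series(1, 1000000);

-- m2 is a copy of m1
create table m2 as select * from m1;

-- refresh the planner statistics
analyze m1;
analyze m2;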