PostgreSQL

From AdminWiki

Revision as of 15:05, 19 July 2007 by Robe (Talk | contribs)
Jump to: navigation, search

Contents

General

Setup

Storage

As for every tranactional database, disk I/O is the main limiting factor for PostgresSQL. If you plan to deploy a high-usage database please take some precautions on the used storage. Use either a RAID 1 or 10 and choose a FS that does fast block-IO (ext2).

Accounts and permissions

When starting with a fresh database, you've to create some users first before you can start using it.

Basic steps:

su - postgresql
createuser -P <user>
createdb -O <user> <db>


Also be sure to tune the pg_hba.conf in the data-directory to your needs. To emulate MySQLs default "every user needs to authenticate from everywhere"-semantics use the following config:

local   all         all                               md5
host    all         all         127.0.0.1/32          md5
host    all         all         ::1/128               md5 

Performance and Tuning

Sequential Scans

If you're getting sequential scans despite having indexes and have run a vacuum analyze recently you might need to increase the default_statistics_target.

max_fsm_*

Maintenance

Vacuuming

What?

Since PostgresSQL is a transactional database, old rows don't get actually removed/replaced when you update/delete them (since they might be still needed in older/long running transactions). To actually free them you need to issue a vacuum.

A normal vacuum will only mark deprecated rows for reuse, to actually reclaim diskspace (e.g. when having deleted large amounts of data) you need to issue a full vacuum. Please note that it might be faster to backup the data you want to keep and truncate the table if you plan to remove large portions of a table.

autovacuum

TODO

Transaction ID Wraparound

A regularly run vacuum should prevent Transaction ID Wraparound in all cases.

You can check your TXID counters with the following query:

SELECT datname, age(datfrozenxid) FROM pg_database;

If the age is noticeably higher than 1 Billion after a recent vacuum something is probably not working as intended.

Fixing broken databases

set zero_damaged_pages to on; vacuum; pray;

Table Size

If you want to know which table in your database claims the most diskspace, here is a query that returns the size of the tables from the current database.

A block is 8192 bytes therefore I calculate the kb size in an extra column.
UPDATE: I calculate the size to bytes, so I can use the "pg_size_pretty" function to convert it to a human readable form (KB, MB, GB, ...)
relfilenode column holds the file name for this table / data. You can find it in the data directory from postgres (main/).
The relkind column holds the type of the data and reltuples the count of rows in this table.

 SELECT relname, pg_size_pretty(relpages * 8 *1024) as size, relfilenode, relkind, reltuples, relpages FROM pg_class ORDER BY relpages DESC;

relnames starting with pg_toast are TOAST-storage for large tables. Compare the appended number with the relfilenodes to get the associated table.

If you have a lot of huge data, thre will be a lot of toast tables. With this query you get an additional field that shows the toast table name if this table has one, and the original table for each toast table:

 SELECT relname, pg_size_pretty(relpages * 8 *1024) AS size, CASE WHEN relkind = 't' THEN  (SELECT pgd.relname FROM pg_class pgd WHERE pgd.relfilenode = SUBSTRING(pg.relname FROM 10)) ELSE (SELECT pgc.relname FROM pg_class pgc WHERE pg.reltoastrelid = pgc.relfilenode) END AS refrelname, relfilenode, relkind, reltuples, relpages FROM pg_class pg ORDER BY relpages DESC;

example output:

                 relname                 | relfilenode | relkind |  reltuples  | relpages | relpages_kb
-----------------------------------------+-------------+---------+-------------+----------+-------------
 pg_toast_16496                          |       16499 | t       | 6.74842e+06 |  1684158 |    13473264
 eintrag                                 |       16510 | r       | 3.97601e+06 |   271484 |     2171872
 admin_log                               |       16496 | r       | 9.49351e+06 |   248608 |     1988864
 history                                 |       16654 | r       | 1.98684e+07 |   204714 |     1637712
 ctimes                                  |       16695 | r       | 1.36451e+07 |   189826 |     1518608

More about admin functions regarding databasize and table size can be found in the System Admin Functions under the Table 9-48.

Cancelling queries

To gracefully cancel queries, one can use pg_cancel_backend(). This should be safer than SIGTERMing the according process.

Accessing the Database

psql and other CLI programs

psql password prompts

psql has, unlike the mysql client, no option to supply a password on the command line (which would be insecure). There are two solutions for this problem:

  • environment variables:
 export PGPASSWORD=password
 export PGUSER=username
 export PGHOST=host
  • a ~/.pgpass file

pgpass Documentation

Format:

hostname:port:database:username:password
Personal tools