PostgreSQL

From AdminWiki

Revision as of 02:18, 29 March 2007 by Gullevek (Talk | contribs)
Jump to: navigation, search

Contents

General

Setup

Storage

As for every tranactional database, disk I/O is the main limiting factor for PostgresSQL. If you plan to deploy a high-usage database please take some precautions on the used storage. Use either a RAID 1 or 10 and choose a FS that does fast block-IO (ext2).

Accounts and permissions

When starting with a fresh database, you've to create some users first before you can start using it.

Basic steps:

su - postgresql
createuser -P <user>
createdb -O <user> <db>


Also be sure to tune the pg_hba.conf in the data-directory to your needs. To emulate MySQLs default "every user needs to authenticate from everywhere"-semantics use the following config:

local   all         all                               md5
host    all         all         127.0.0.1/32          md5
host    all         all         ::1/128               md5 

Performance and Tuning

Sequential Scans

If you're getting sequential scans despite having indexes and have run a vacuum analyze recently you might need to increase the default_statistics_target.

max_fsm_*

Maintenance

Vacuuming

What?

Since PostgresSQL is a transactional database, old rows don't get actually removed/replaced when you update/delete them (since they might be still needed in older/long running transactions). To actually free them you need to issue a vacuum.

A normal vacuum will only mark deprecated rows for reuse, to actually reclaim diskspace (e.g. when having deleted large amounts of data) you need to issue a full vacuum. Please note that it might be faster to backup the data you want to keep and truncate the table if you plan to remove large portions of a table.

autovacuum

TODO

Transaction ID Wraparound

A regularly run vacuum should prevent Transaction ID Wraparound in all cases.

You can check your TXID counters with the following query:

SELECT datname, age(datfrozenxid) FROM pg_database;

If the age is noticeably higher than 1 Billion after a recent vacuum something is probably not working as intended.

Fixing broken databases

set zero_damaged_pages to on; vacuum; pray;

Table Size

If you want to know which table in your database claims the most diskspace, here is a query that returns the size of the tables from the current database.

A block is 8192 bytes therefore I calculate the kb size in an extra column.
UPDATE: I calculate the size to bytes, so I can use the "pg_size_pretty" function to convert it to a human readable form (KB, MB, GB, ...)
relfilenode column holds the file name for this table / data. You can find it in the data directory from postgres (main/).
The relkind column holds the type of the data and reltuples the count of rows in this table.

 SELECT relname, pg_size_pretty(relpages * 8 *1024) as size, relfilenode, relkind, reltuples, relpages FROM pg_class ORDER BY relpages DESC;

relnames starting with pg_toast are TOAST-storage for large tables. Compare the appended number with the relfilenodes to get the associated table.

example output:

                 relname                 | relfilenode | relkind |  reltuples  | relpages | relpages_kb
-----------------------------------------+-------------+---------+-------------+----------+-------------
 pg_toast_16496                          |       16499 | t       | 6.74842e+06 |  1684158 |    13473264
 eintrag                                 |       16510 | r       | 3.97601e+06 |   271484 |     2171872
 admin_log                               |       16496 | r       | 9.49351e+06 |   248608 |     1988864
 history                                 |       16654 | r       | 1.98684e+07 |   204714 |     1637712
 ctimes                                  |       16695 | r       | 1.36451e+07 |   189826 |     1518608

More about admin functions regarding databasize and table size can be found in the System Admin Functions under the Table 9-48.

Cancelling queries

To gracefully cancel queries, one can use pg_cancel_backend(). This should be safer than SIGTERMing the according process.

Accessing the Database

psql and other CLI programs

psql password prompts

psql has, unlike the mysql client, no option to supply a password on the command line (which would be insecure). There are two solutions for this problem:

  • environment variables:
 export PGPASSWORD=password
 export PGUSER=username
 export PGHOST=host
  • a ~/.pgpass file

pgpass Documentation

Format:

hostname:port:database:username:password
Personal tools