As with every transactional database, disk I/O is the main limiting factor for PostgreSQL. If you plan to deploy a heavily used database, take some precautions with the storage: use RAID 1 or RAID 10 and choose a file system that does fast block I/O (ext2).
Accounts and permissions
When starting with a fresh database, you have to create some users before you can start using it.
su - postgres
createuser -P <user>
createdb -O <user> <db>
Also be sure to tune the pg_hba.conf in the data directory to your needs. To emulate MySQL's default "every user needs to authenticate from everywhere" semantics, use the following config:
local all all md5
host all all 127.0.0.1/32 md5
host all all ::1/128 md5
Performance and Tuning
SELECT * FROM pg_stat_activity;
This will show you all currently running transactions and queries. For example, look out for long-running queries if things go wrong.
SELECT * FROM pg_locks;
This will show you a list of currently locked resources.
If you're getting sequential scans despite having indexes, and you have run a vacuum analyze recently, you might need to increase default_statistics_target.
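Instead of raising default_statistics_target globally, the target can also be raised for a single column, followed by a fresh ANALYZE. The following is a minimal sketch, assuming psql can connect without further options. The helper name raise_stats_target and the names in the example call are made up for illustration, and 500 is an arbitrary value, not a recommendation.

```shell
# Hypothetical helper, not part of PostgreSQL.
# Usage: raise_stats_target <db> <table> <column> <target>
# Identifiers are interpolated as-is, so only pass trusted values.
raise_stats_target() {
    psql -X -d "$1" <<SQL
ALTER TABLE $2 ALTER COLUMN $3 SET STATISTICS $4;
ANALYZE $2;
SQL
}

# Example call (placeholder names):
# raise_stats_target mydb orders customer_id 500
```

The ANALYZE is needed because the new target only takes effect once the statistics are gathered again.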
Since PostgreSQL is a transactional database, old rows are not actually removed or replaced when you update or delete them (they might still be needed by older, long-running transactions). To actually free them you need to issue a vacuum.
A normal vacuum only marks dead rows for reuse. To actually reclaim disk space (e.g. after deleting large amounts of data) you need to issue a full vacuum. Please note that if you plan to remove large portions of a table, it might be faster to back up the data you want to keep and truncate the table.
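Run from the shell, the two variants could look like this. This is only a sketch: the helper names and the database and table arguments are placeholders, and psql is assumed to connect without extra options. Keep in mind that a full vacuum takes an exclusive lock on the table while it runs.

```shell
# Hypothetical helpers; usage: <helper> <db> <table>

# Plain vacuum: marks dead rows as reusable and refreshes planner statistics.
vacuum_table() {
    psql -X -d "$1" -c "VACUUM VERBOSE ANALYZE $2;"
}

# Full vacuum: compacts the table and returns the freed space to the
# operating system. Blocks all other access to the table meanwhile.
vacuum_table_full() {
    psql -X -d "$1" -c "VACUUM FULL VERBOSE $2;"
}

# Example calls (placeholder names):
# vacuum_table mydb orders
# vacuum_table_full mydb orders
```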
Check status of autovacuums in a given Database:
select relname,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze from pg_stat_user_tables;
Transaction ID Wraparound
A regularly run plain or "lazy" vacuum should prevent Transaction ID Wraparound in all cases; a full vacuum is not required.
You can check your TXID counters with the following query:
SELECT datname, age(datfrozenxid) FROM pg_database;
If the age is noticeably higher than 1 billion (2^30, or 1073741824 to be exact) after a recent vacuum, something is probably not working as intended; if it is anywhere near 2 billion, you should take immediate action to prevent problems.
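If a database shows a high age, the same kind of check can be done per table to see which relations hold the counter back. This is a sketch, assuming psql connects without further options. The helper name and the limit of ten rows are arbitrary choices.

```shell
# Hypothetical helper; usage: oldest_tables <db>
# Lists the ten ordinary tables with the oldest relfrozenxid.
oldest_tables() {
    psql -X -d "$1" -c "
        SELECT relname, age(relfrozenxid) AS xid_age
        FROM pg_class
        WHERE relkind = 'r'
        ORDER BY age(relfrozenxid) DESC
        LIMIT 10;"
}

# Example call (placeholder database name):
# oldest_tables mydb
```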
Fixing broken databases
set zero_damaged_pages to on; vacuum; pray;
If you want to know which table in your database claims the most disk space, here is a query that returns the size of the tables in the current database.
Note: These examples assume that you use the default page size of 8 kB.
The relfilenode column holds the name of the file that stores the table's data. You can find it in the PostgreSQL data directory (main/).
The relkind column holds the type of the relation, and reltuples holds the number of rows in the table.
SELECT relname,
       pg_size_pretty(relpages::bigint * 8 * 1024) AS size,
       relkind,
       reltuples::bigint AS rows,
       relpages,
       relfilenode
FROM pg_class
ORDER BY relpages DESC;
relnames starting with pg_toast are TOAST-storage for large tables. Compare the appended number with the relfilenodes to get the associated table.
If you have a lot of large data, there will be a lot of TOAST tables. The following query returns an additional field that shows the TOAST table name if the table has one, and the original table for each TOAST table:
This works for versions before PostgreSQL 9.0; see the modified query below for 9.0.
SELECT relname,
       pg_size_pretty(relpages::bigint * 8 * 1024) AS size,
       CASE
           WHEN relkind = 't' THEN
               (SELECT pgd.relname FROM pg_class pgd
                WHERE pgd.relfilenode::text = SUBSTRING(pg.relname FROM 10))
           ELSE
               (SELECT pgc.relname FROM pg_class pgc
                WHERE pg.reltoastrelid = pgc.relfilenode)
       END AS refrelname,
       relfilenode,
       relkind,
       reltuples::bigint,
       relpages
FROM pg_class pg
ORDER BY relpages DESC;
 relname        | relfilenode | relkind | reltuples   | relpages | relpages_kb
----------------+-------------+---------+-------------+----------+-------------
 pg_toast_16496 |       16499 | t       | 6.74842e+06 |  1684158 |    13473264
 eintrag        |       16510 | r       | 3.97601e+06 |   271484 |     2171872
 admin_log      |       16496 | r       | 9.49351e+06 |   248608 |     1988864
 history        |       16654 | r       | 1.98684e+07 |   204714 |     1637712
 ctimes         |       16695 | r       | 1.36451e+07 |   189826 |     1518608
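The relpages_kb column in this output is simply relpages multiplied by the page size in kB. Here is a small sketch of that arithmetic, assuming the default 8 kB pages. The function name is made up.

```shell
# Hypothetical helper: convert a relpages value to kB.
# The optional second argument is the page size in kB (default 8).
pages_to_kb() {
    echo $(( $1 * ${2:-8} ))
}

pages_to_kb 1684158   # the pg_toast_16496 row, prints 13473264
```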
For PostgreSQL 9.0 the query below will work; it also adds some additional info for TOAST indexes:
SELECT pgn.nspname,
       relname,
       pg_size_pretty(relpages::bigint * 8 * 1024) AS size,
       CASE
           WHEN relkind = 't' THEN
               (SELECT pgd.relname FROM pg_class pgd
                WHERE pgd.reltoastrelid = pg.oid)
           WHEN nspname = 'pg_toast' AND relkind = 'i' THEN
               (SELECT pgt.relname FROM pg_class pgt
                WHERE SUBSTRING(pgt.relname FROM 10) = REPLACE(SUBSTRING(pg.relname FROM 10), '_index', ''))
           ELSE
               (SELECT pgc.relname FROM pg_class pgc
                WHERE pg.reltoastrelid = pgc.oid)
       END::varchar AS refrelname,
       CASE
           WHEN nspname = 'pg_toast' AND relkind = 'i' THEN
               (SELECT pgts.relname FROM pg_class pgts
                WHERE pgts.reltoastrelid = (SELECT pgt.oid FROM pg_class pgt
                                            WHERE SUBSTRING(pgt.relname FROM 10) = REPLACE(SUBSTRING(pg.relname FROM 10), '_index', '')))
       END AS relidxrefrelname,
       relfilenode,
       relkind,
       reltuples::bigint,
       relpages
FROM pg_class pg, pg_namespace pgn
WHERE pg.relnamespace = pgn.oid
  AND pgn.nspname NOT IN ('information_schema', 'pg_catalog')
ORDER BY relpages DESC;
 nspname  | relname                   | size    | refrelname          | relidxrefrelname    | relfilenode | relkind | reltuples | relpages
----------+---------------------------+---------+---------------------+---------------------+-------------+---------+-----------+----------
 pg_toast | pg_toast_12633551         | 12 GB   | error_email_collect |                     |    16113687 | t       |   6506013 |  1532166
 public   | log_email_sent            | 4394 MB | pg_toast_12633624   |                     |    16112041 | r       |  28645328 |   562416
 public   | mail_log                  | 3260 MB | pg_toast_12633661   |                     |    16113649 | r       |  13556149 |   417328
 public   | error_email_collect       | 1003 MB | pg_toast_12633551   |                     |    16113684 | r       |   1161103 |   128345
 public   | log_hash_reference        | 789 MB  | pg_toast_12633642   |                     |    18241270 | r       |   7046247 |   100960
 public   | email                     | 504 MB  | pg_toast_12633482   |                     |    16111750 | r       |   3673276 |    64568
 public   | log_email_sent_log_id_idx | 492 MB  |                     |                     |    16112048 | i       |  28684712 |    62947
 public   | log_email_sent_pkey       | 492 MB  |                     |                     |    16112047 | i       |  28684712 |    62947
 public   | idx_mail_log_date_created | 291 MB  |                     |                     |    16113677 | i       |  13563234 |    37192
 public   | mail_log_pkey             | 233 MB  |                     |                     |    16113655 | i       |  13563234 |    29765
 pg_toast | pg_toast_12633551_index   | 139 MB  | pg_toast_12633551   | error_email_collect |    16113689 | i       |   6506013 |    17840
For a TOAST index, relidxrefrelname is the original table on which the TOAST table is based.
More about the admin functions for database and table sizes can be found in the System Admin Functions documentation under Table 9-48.
To gracefully cancel queries, you can use pg_cancel_backend(). This should be safer than sending SIGTERM to the corresponding backend process.
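From the shell this could look as follows. This is a sketch, assuming psql connects without further options. The helper name is made up, and the process ID has to be taken from pg_stat_activity (column pid, or procpid before PostgreSQL 9.2).

```shell
# Hypothetical helper; usage: cancel_query <db> <backend-pid>
# pg_cancel_backend() returns true if the cancel request was sent.
cancel_query() {
    psql -X -d "$1" -c "SELECT pg_cancel_backend($2);"
}

# Example call (placeholder values):
# cancel_query mydb 12345
```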
Accessing the Database
psql and other CLI programs
psql password prompts
psql has, unlike the mysql client, no option to supply a password on the command line (which would be insecure). There are two solutions for this problem:
- environment variables:
export PGPASSWORD=password
export PGUSER=username
export PGHOST=host
- a ~/.pgpass file
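The password file holds one entry per line in the form hostname:port:database:username:password, and it is ignored unless only its owner can read it. Below is a minimal sketch. It writes to a scratch file and points PGPASSFILE at it rather than touching a real ~/.pgpass, and all connection values are placeholders.

```shell
# Sketch: build a password file in a scratch location.
# Host, port, database, user and password below are placeholders.
pgpass_file="$(mktemp)"

# One entry per line: hostname:port:database:username:password
# A '*' in any of the first four fields matches anything.
printf '%s\n' 'db.example.org:5432:mydb:myuser:secret' > "$pgpass_file"

# The file is ignored if group or others have any access to it.
chmod 0600 "$pgpass_file"

# Point psql and other libpq clients at the file (default: ~/.pgpass).
export PGPASSFILE="$pgpass_file"
```

For permanent use, the same line would go into ~/.pgpass with the same permissions, in which case PGPASSFILE does not need to be set.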