As for every tranactional database, disk I/O is the main limiting factor for PostgresSQL. If you plan to deploy a high-usage database please take some precautions on the used storage. Use either a RAID 1 or 10 and choose a FS that does fast block-IO (ext2).
Accounts and permissions
When starting with a fresh database, you've to create some users first before you can start using it.
su - postgresql createuser -P <user> createdb -O <user> <db>
Also be sure to tune the pg_hba.conf in the data-directory to your needs. To emulate MySQLs default "every user needs to authenticate from everywhere"-semantics use the following config:
local all all md5 host all all 127.0.0.1/32 md5 host all all ::1/128 md5
Performance and Tuning
If you're getting sequential scans despite having indexes and have run a vacuum analyze recently you might need to increase the default_statistics_target.
Since PostgresSQL is a transactional database, old rows don't get actually removed/replaced when you update/delete them (since they might be still needed in older/long running transactions). To actually free them you need to issue a vacuum.
A normal vacuum will only mark deprecated rows for reuse, to actually reclaim diskspace (e.g. when having deleted large amounts of data) you need to issue a full vacuum. Please note that it might be faster to backup the data you want to keep and truncate the table if you plan to remove large portions of a table.
Transaction ID Wraparound
A regularly run vacuum should prevent Transaction ID Wraparound in all cases.
You can check your TXID counters with the following query:
SELECT datname, age(datfrozenxid) FROM pg_database;
If the age is noticeably higher than 1 Billion after a recent vacuum something is probably not working as intended.
Fixing broken databases
set zero_damaged_pages to on; vacuum; pray;
If you want to know which table in your database claims the most diskspace, here is a query that returns the size of the tables from the current database.
A block is 8192 bytes therefore I calculate the kb size in an extra column. relfilenode column holds the file name for this table / data. You can find it in the data directory from postgres (main/). The relkind column holds the type of the data and reltuples the count of rows in this table.
SELECT relname, relfilenode, relkind, reltuples, relpages, (relpages * 8) as relpages_kb FROM pg_class ORDER BY relpages DESC ;
relnames starting with pg_toast are TOAST-storage for large tables. Compare the appended number with the relfilenodes to get the associated table.
relname | relfilenode | relkind | reltuples | relpages | relpages_kb -----------------------------------------+-------------+---------+-------------+----------+------------- pg_toast_16496 | 16499 | t | 6.74842e+06 | 1684158 | 13473264 eintrag | 16510 | r | 3.97601e+06 | 271484 | 2171872 admin_log | 16496 | r | 9.49351e+06 | 248608 | 1988864 history | 16654 | r | 1.98684e+07 | 204714 | 1637712 ctimes | 16695 | r | 1.36451e+07 | 189826 | 1518608
To gracefully cancel queries, one can use pg_cancel_backend(). This should be safer than SIGTERMing the according process.
Accessing the Database
psql and other CLI programs
psql password prompts
psql has, unlike the mysql client, no option to supply a password on the command line (which would be insecure). There are two solutions for this problem:
- environment variables:
export PGPASSWORD=password export PGUSER=username export PGHOST=host
- a ~/.pgpass file