This is an old revision of the document!

PostgreSQL - Basi di configurazione

eseguire /usr/local/sbin/postgresql_shmall_shmmax.sh » /etc/sysctl.conf
installare pgtune per un tuning grezzo di PostgreSQL sulla macchina
eseguire pgtune -i /etc/postgresql/9.1/main/postgresql.conf | grep pgtune » /etc/postgresql/9.1/main/postgresql.conf
abilitare i log su stderr seguendo la documentazione di pgbadger
NOTA LOG: attualmente su befair2 i log sono in /var/lib/postgresql/9.1/main/pg_log/ vediamo come si comporta postgresql con rotazione e compressione

Performance

Performance Tuning: http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
Ottimo articolo segnalato da AndreaF http://www.revsys.com/writings/postgresql-performance.html
Index Maintenance https://wiki.postgresql.org/wiki/Index_Maintenance

Query utili su indici e loro utilizzo

Indici presenti

  [0] SELECT * FROM pg_indexes;

Ricerca indici doppi

Query “grezza” per noi

acs_service=# SELECT tablename,indexname,indexdef from pg_indexes WHERE schemaname='public';

Query appropriata da http://www.xzilla.net/blog/2008/Jul/Index-pruning-techniques.html

[1] select indrelid::regclass, array_accum(indexrelid::regclass) from pg_index group by indrelid, indkey having count(*) > 1;

Nota: richiede di definire l'aggregato array_accum

Utilizzo degli indici

  
  SELECT relid,schemaname,relname,idx_scan,idx_tup_read,idx_tup_fetch FROM pg_stat_all_indexes ORDER BY idx_scan desc;

Filtra solo quelli utilizzati almeno una volta

  
  SELECT * from pg_stat_all_indexes WHERE relname='table' AND idx_scan > 0 ;

Filtra solo quelli che sono stati efficaci almeno una volta

  SELECT * from pg_stat_all_indexes WHERE schemaname='public' AND (idx_tup_read > 0 OR idx_tup_fetch > 0);

Query più appropriata, eseguita su tutti gli schemi (da http://www.xzilla.net/blog/2008/Jul/Index-pruning-techniques.html)

  [2] select indexrelid::regclass as index, relid::regclass as table from pg_stat_user_indexes JOIN pg_index USING (indexrelid) where idx_scan = 0 and indisunique is false;

Ho modificato la query per trovare gli indici che sono stati efficaci almeno una volta:

  [3] select indexrelid::regclass as index, relid::regclass as table from pg_stat_user_indexes JOIN pg_index USING (indexrelid) where (idx_tup_read > 0 OR idx_tup_fetch > 0) and indisunique is false;

Invece questa la query per indici usati senza essere efficaci

  [4] select indexrelid::regclass as index, relid::regclass as table from pg_stat_user_indexes JOIN pg_index USING (indexrelid) where (idx_scan >0 and (idx_tup_read = 0 OR idx_tup_fetch = 0)) and indisunique is false;

Query per tabelle

Tabelle vuote (stima. Per una risposta esaustiva vedere http://stackoverflow.com/questions/2596670/how-do-you-find-the-row-count-for-all-your-tables-in-postgres)

  [5] SELECT schemaname,relname,n_live_tup FROM pg_stat_user_tables WHERE n_live_tup=0;

Esecuzione query al ...

[0] = … indici totali
[1] = … indici doppi
[2] = … indici non usati
[3] = … indici che sono stati efficaci
[4] = … indici usati ma non efficaci
[5] = … tabelle vuote

CREAZIONE E USO INDICI

postgres# \d+ acs_service.connection_session;
   Column   |           Type           |                                  Modifiers                                  | Storage |                                                   Description                                                    
------------+--------------------------+-----------------------------------------------------------------------------+---------+------------------------------------------------------------------------------------------------------------------
 id         | integer                  | not null default nextval('acs_service.connection_session_id_seq'::regclass) | plain   | 
 start_time | timestamp with time zone | not null                                                                    | plain   | Moment in which the VPN session started. Obtained from $time_unix OpenVPN's envvar.
 duration   | interval                 |                                                                             | plain   | Duration of the connection, value is NULL until connection drops. Obtained from $time_duration OpenVPN's envvar.
 id_chamber | integer                  | not null                                                                    | plain   | Id of the Chamber related to the connection.
Indexes:
  "pk_id" PRIMARY KEY, btree (id)
  "connection_session_duration_idx" btree (duration)
  "connection_session_id_chamber_idx" btree (id_chamber)
Foreign-key constraints:
  "fk_chamber" FOREIGN KEY (id_chamber) REFERENCES acs_service.chamber(id)
Has OIDs: no

postgres# REINDEX TABLE "acs_service"."connection_session";
REINDEX
postgres# EXPLAIN ANALYZE SELECT bcs.id, bcs.duration FROM acs_service.connection_session bcs WHERE bcs.duration IS NULL;
                                                              QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on connection_session bcs  (cost=4.82..234.49 rows=70 width=20) (actual time=0.117..0.180 rows=67 loops=1)
 Recheck Cond: (duration IS NULL)
 ->  Bitmap Index Scan on connection_session_duration_idx  (cost=0.00..4.80 rows=70 width=0) (actual time=0.106..0.106 rows=67 loops=1)
       Index Cond: (duration IS NULL)
Total runtime: 0.208 ms (5 rows)

postgres=# EXPLAIN ANALYZE SELECT bcs.id, bcs.duration FROM acs_service.connection_session bcs WHERE bcs.duration IS NOT NULL;
                                                        QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on connection_session bcs  (cost=0.00..2829.78 rows=149208 width=20) (actual time=0.008..36.858 rows=149211 loops=1)
 Filter: (duration IS NOT NULL)
Total runtime: 46.523 ms (3 rows)

Se creo indice sull'espressione "duration IS NOT NULL" cosa succede?

Niente …. :(

  postgres=# CREATE INDEX ON "acs_service".connection_session ((duration IS NOT NULL));
  CREATE INDEX
  postgres=# REINDEX TABLE "acs_service"."connection_session";
  REINDEX
  postgres=# EXPLAIN ANALYZE SELECT bcs.id, bcs.duration FROM acs_service.connection_session bcs WHERE bcs.duration IS NOT NULL;
                                                            QUERY PLAN                                                       
  -------------------------------------------------------------------------------------------------------------------------------
   Seq Scan on connection_session bcs  (cost=0.00..2829.90 rows=149220 width=20) (actual time=0.011..38.323 rows=149226 loops=1)
     Filter: (duration IS NOT NULL)
   Total runtime: 48.491 ms
  (3 rows)

  postgres=# EXPLAIN ANALYZE SELECT bcs.id, bcs.duration FROM acs_service.connection_session bcs WHERE bcs.duration IS NULL;         
                                                                  QUERY PLAN                                                            
  ------------------------------------------------------------------------------------------------------------------------------------------
   Bitmap Heap Scan on connection_session bcs  (cost=4.82..234.49 rows=70 width=20) (actual time=0.061..0.119 rows=65 loops=1)
     Recheck Cond: (duration IS NULL)
     ->  Bitmap Index Scan on connection_session_duration_idx  (cost=0.00..4.80 rows=70 width=0) (actual time=0.050..0.050 rows=66 loops=1)
           Index Cond: (duration IS NULL)
   Total runtime: 0.148 ms
  (5 rows)

Occupazione disco

Database

Lo restituisce pure in formato carino :)

# select (pg_database_size(current_database())/(1024*1024))::text || ' MB' as db_size;

Tabelle

Restituisce un elenco ordinato in modo discendente di:

schema (aka “directory”)
tabella
occupazione su disco
occupazione totale compresi indici
numero stimato di righe

  SELECT table_schema, table_name, 
    (size/1024)::varchar || ' kB' as size,
    (total_size/1024)::varchar || ' kB' as total_size,
    (CASE WHEN pg_class.reltuples > 0 THEN
        basic_infos.size/(8192*relpages*reltuples)
        ELSE 0
        END
    )::bigint AS estimated_row_count
    FROM (
      SELECT table_schema, table_name, 
          pg_relation_size('"' || table_schema || '"."' || table_name || '"') as size,
          pg_total_relation_size('"' || table_schema || '"."' || table_name || '"') as total_size
          FROM information_schema.tables 
          WHERE table_schema NOT IN ('information_schema', 'pg_catalog') 
      ORDER BY size DESC, total_size DESC
  ) AS basic_infos 
      JOIN pg_class
      ON ('"' || table_schema || '"."' || table_name || '"')::regclass = pg_class.oid

Questa query funziona su PostgreSQL 8.4. E credo valga anche per PostgreSQL >= 9.0. Potrebbe essere interessante pubblicarla da qualche parte.

Informazioni varie...

Query comode x debug

Processi attivi: select * from pg_stat_activity;
Utilizzo delle tabelle: SELECT * FROM pg_stat_all_tables;
Lock da https://wiki.postgresql.org/wiki/Lock_Monitoring (ho aggiunto il WHERE nel mio caso)

SELECT a.datname,
       c.relname,
       l.transactionid,
       l.mode,
       l.granted,
       a.usename,
       a.current_query, 
       a.query_start,
       age(now(), a.query_start) AS "age", 
       a.procpid 
  FROM  pg_stat_activity a
   JOIN pg_locks         l ON l.pid = a.procpid
   JOIN pg_class         c ON c.oid = l.relation
  WHERE mode='RowExclusiveLock' ORDER BY a.query_start;

Avvio manuale di postgres

se service postgresql start offre poca flessibilità usare

pg_ctlcluster 8.4 main start -- -w

o ancora più a basso livello, da utente postgres eseguire:

/usr/lib/postgresql/8.4/bin/pg_ctl start -D /var/lib/postgresql/8.4/main -w -o '-c config_file="/etc/postgresql/8.4/main/postgresql.conf"'

Il nostro punto della strada

Table of Contents