PostgreSQL database profiling



For the last few days I've been trying to performance-tune one of our applications. Since it is database-intensive, it was only natural to try to optimize the database queries.

To enable query logging, the PostgreSQL configuration needs to be updated as described below. Note that the location of the configuration file varies between distributions; the examples here are for CentOS 7.
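If you are not sure where the configuration file lives, you can ask the running server itself. A minimal sketch (the path in the comment is the CentOS 7 package default and may differ on your installation):

```shell
# Ask PostgreSQL where its configuration file lives
sudo -u postgres psql -c "SHOW config_file;"

# On CentOS 7 the default is usually:
#   /var/lib/pgsql/data/postgresql.conf
```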

In the postgresql.conf file, the following lines in the "ERROR REPORTING AND LOGGING" section need to be modified.

log_min_duration_statement = 0 
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h ' 

This enables logging of all statements, regardless of their duration. Additionally, the log line format is adjusted so that automatic analysis with pgBadger becomes possible.
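With this prefix, a logged statement looks roughly like the following (timestamp, PID, user, and query values are purely illustrative):

```
2019-03-01 12:34:56 CET [4242]: [7-1] user=appuser,db=appdb,app=myapp,client=192.168.1.10 LOG:  duration: 123.456 ms  statement: SELECT * FROM orders WHERE id = 1;
```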

log_checkpoints = on 
log_connections = on 
log_disconnections = on 
log_hostname = on 
log_lock_waits = on

These settings enable logging of additional event types: checkpoints, connections and disconnections, client host names, and lock waits.

log_statement = 'none'

When using pgBadger, this setting can remain at 'none' (the default): log_min_duration_statement = 0 already logs every statement with its duration, and pgBadger does not parse the output of log_statement. Next, we need to restart the database server. The example below is for CentOS 7.

systemctl restart postgresql
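After the restart, it is worth verifying that the settings took effect. A quick check, assuming you can connect as the postgres superuser:

```shell
# Confirm the logging settings are active on the running server
sudo -u postgres psql -c "SHOW log_min_duration_statement;"   # should report 0
sudo -u postgres psql -c "SHOW log_line_prefix;"
```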

From this point on, data is being gathered. Exercise your application, especially the slow-running processes. Every statement is logged along with its duration. I suggest working with the application until a representative amount of data has been gathered; I normally let the log file grow to at least 2 GB.
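To keep an eye on how much log data has accumulated, you can check the log directory. The path below is the CentOS 7 default for older PostgreSQL packages and may differ on your system:

```shell
# Watch the log files grow; stop testing once the volume is representative
ls -lh /var/lib/pgsql/data/pg_log/
du -sh /var/lib/pgsql/data/pg_log/
```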

Now it's time for pgBadger, a really good tool for analyzing PostgreSQL logs. When done, it will produce an HTML file containing all the vital information about your database accesses: the longest-running queries, the most frequent ones, and much more.
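A minimal invocation might look like this (the log path and output file name are assumptions; adjust them to your setup):

```shell
# Parse the PostgreSQL log files and generate an HTML report
pgbadger /var/lib/pgsql/data/pg_log/postgresql-*.log -o report.html
```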

Please be patient, as parsing huge log files may take some time. It might be a good time for a coffee or something :-)
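If the wait gets too long, pgBadger can parse logs in parallel; the -j option sets the number of jobs. A sketch with the same assumed paths as above:

```shell
# Use 4 parallel jobs to speed up parsing of large log files
pgbadger -j 4 /var/lib/pgsql/data/pg_log/postgresql-*.log -o report.html
```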

When it's done, you can easily view the resulting file in your preferred web browser. I hope the coffee has kicked in by now and that you'll have a successful query analysis.