Similarly to a normal PostgreSQL child process, a custom background worker should be running using a loop that can be interrupted with signals to update a given status or simply exit the process. Two types of signals are handled by custom background workers in the PostgreSQL architecture: SIGHUP and SIGTERM.

When such signals are received by a bgworker, you should be aware that processing similar to what is done for a normal backend process might be needed. For example, if your custom bgworker uses some GUC parameters, the process needs to take proper action to reload configuration parameters. Even if there are already good examples in the PostgreSQL source code for bgworkers, I noticed that there are no examples focusing only on certain fundamentals of bgworker. As someone who likes simple things, I think that this is essential to get the essence of what this feature can do piece by piece.

So here is in this post an example of bgworker that I wrote for beginners in order to understand only the fundamentals of signal handling. First you need to know that it is necessary to store a static variable to store the status of representing if signal has been activated or not in a way similar to that:
static bool got_sigterm = false;
static bool got_sighup = false;

You also need dedicated functions to set those flags to true if a signal is received by the worker.
static void
hello_sigterm(SIGNAL_ARGS)
{
 got_sigterm = true;
}
 
static void
hello_sighup(SIGNAL_ARGS)
{
 got_sighup = true;
}

A dedicated function that is used as a main loop for processing is essential as well.
static void
hello_main(void *main_arg)
{
 /* We're now ready to receive signals */
 BackgroundWorkerUnblockSignals();
 while (true)
 {
  /* Process signals */
  if (got_sighup)
  {
   got_sighup = false;
   ereport(LOG, (errmsg("hello signal: processed SIGHUP")));
  }
 
  if (got_sigterm)
  {
   /* Simply exit */
   ereport(LOG, (errmsg("hello signal: processed SIGTERM")));
   proc_exit(0);
  }
 }
 proc_exit(0)
}

Note the call to BackgroundWorkerUnblockSignals. This is extremely important in order to allow the reception of signals by the background worker.

Once you have this basic infrastructure in place, you need to register this worker process correctly with something like that:
void
_PG_init(void)
{
 BackgroundWorker worker;
 worker.bgw_flags = 0;
 worker.bgw_start_time = BgWorkerStart_PostmasterStart;
 worker.bgw_main = hello_main;
 worker.bgw_sigterm = hello_sigterm;
 worker.bgw_sighup = hello_sighup;
 worker.bgw_name = "hello_signal";
 /* Wait 10 seconds for restart before crash */
 worker.bgw_restart_time = 10;
 worker.bgw_main_arg = NULL;
 RegisterBackgroundWorker(&worker);
}

Also don’t forget that you need a header similar to that to have this code working properly.
/* Some general headers for custom bgworker facility */
#include "postgres.h"
#include "fmgr.h"
#include "postmaster/bgworker.h"
#include "storage/ipc.h"
 
/* Allow load of this module in shared libs */
PG_MODULE_MAGIC;
 
/* Entry point of library loading */
void _PG_init(void);

Also, be sure that when you create a custom background worker, signal handling is similar to what is done for normal backend process. For example, the configuration file reload should be processed if SIGHUP is received. You can do that properly by calling ProcessConfigFile in a manner similar to that in the main loop.
if (got_sighup)
{
 /* Process config file */
 ProcessConfigFile(PGC_SIGHUP);
 got_sighup = false;
 ereport(LOG, (errmsg("hello signal: processed SIGHUP")));
}

In order to bring more fluidity to you custom worker and not have it use all the CPU of your server by running continuously, don’t forget to define a latch to control some sleep period of your worker. It can be defined with that:
/* The latch used for this worker to manage sleep correctly */
static Latch signalLatch;

Then when entering in the main loop process, initialize the latch with that:
InitializeLatchSupport();
InitLatch(&signalLatch);

Finally you need to set up your main loop to use the latch
while (true)
{
 int rc;
 
 /* Wait 1s */
 rc = WaitLatch(&signalLatch,
   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
   1000L);
 ResetLatch(&signalLatch);
 
 /* Emergency bailout if postmaster has died */
 if (rc & WL_POSTMASTER_DEATH)
  proc_exit(1);
 
 [... code for signal handling ...]
}

You can also set up the latch such as the sleep will stop immediately if a signal is received. Simply add the following call in hello_sighup and hello_sigterm to do that.
SetLatch(&signalLatch);

This code can be found on Github as repository pg_workers. I created it to group all the bgworker examples I wrote using the facility of PostgreSQL 9.3 and above. You can find the example presented in this post in the folder hello_signal.

Setting up logging for a PostgreSQL server using syslog on a Linux machine is intuitive especially with logging systems like syslog-ng, you just need to put the correct parameters at the right place.

First, you need to setup the system side, by adding the following settings in /etc/syslog-nd/syslog-nd.conf (or similar, don’t hesitate to customize that with your own paths).
destination postgres { file("/var/log/pgsql"); };
filter f_postgres { facility(local0); };
log { source(src); filter(f_postgres); destination(postgres); };

This will send all the logs of postgresql server to /var/log/pgsql. Be sure to combine that with some solution rotating log files to avoid a single file becoming too large… And reload syslog-ng with a command similar to that (varies depending on distribution used, here Archlinux).
systemctl reload syslog-ng

Then, you need to add those settings in postgresql.conf.
log_destination = 'syslog' # Can specify multiple destinations
syslog_facility='LOCAL0'
syslog_ident='postgres'

Based on the documentation, syslog_facility can be set from LOCAL0 to LOCAL7.
Don’t forget that you can also specify multiple log destinations. For example when using stderr and syslog at the same time, simply do that:
log_destination = 'stderr,syslog'

Finally, reload the parameters of server and you are done.
pg_ctl reload -D $PGDATA
Note that restarting the server is not necessary.

Based on my previous experience using custom background workers (new feature of PostgreSQL 9.3), here is more detailed example of bgworker doing a simple “Hello World” in the server logs.

Well, the example provided in this post does a little bit more than logging a simple “Hello World”, as logging is controlled by a loop running at a given interval of time. Also, the process can be immediately stopped even during its sleep with SIGTERM. So here is in more details how this is done.

Header files

/* Minimum set of headers */
#include "postgres.h"
#include "postmaster/bgworker.h"
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/proc.h"
#include "fmgr.h"

This is the minimal set of header files that needs to be provided in order to have the bgworker working correctly. Note that in the case of this example, the sleep time is controlled by a Latch on the process of the background worker, explaining why latch.h and proc.h are included.

Declarations

/* Essential for shared libs! */
PG_MODULE_MAGIC;
 
/* Entry point of library loading */
void _PG_init(void);
 
/* Signal handling */
static bool got_sigterm = false;

PG_MODULE_MAGIC is absolutely necessary for libraries loaded via shared_preload_libraries or server will fail with a FATAL error. Then the only function needed in library is _PG_init, entry point to register the bgworker using the dedicated APIs. Finally a static boolean is used as a flag for SIGTERM activation.

Initialization

void
_PG_init(void)
{
 BackgroundWorker worker;
 
 /* Register the worker processes */
 worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
 worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
 worker.bgw_main = hello_main;
 worker.bgw_sighup = NULL;
 worker.bgw_sigterm = hello_sigterm;
 worker.bgw_name = "hello world";
 worker.bgw_restart_time = BGW_NEVER_RESTART;
 worker.bgw_main_arg = NULL;
 RegisterBackgroundWorker(&worker);
}

This portion of code is used to register the new worker. In this example bgw_start_time is set to start only once the system has reached a stable read-write state. The process is also allowed access to shared memory with the flag BGWORKER_SHMEM_ACCESS (this is used for MyProc, as this is statically included in procarray.c). Finally there are definitiosn for the functions used first as a main loop for logging of “Hello World” and for the function to kick when there is a SIGTERM on background worker. The process is requested not to restart in case of a crash.

Main loop

static void
hello_main(void *main_arg)
{
 /* We're now ready to receive signals */
 BackgroundWorkerUnblockSignals();
 while (!got_sigterm)
 {
  int rc;
  /* Wait 10s */
  rc = WaitLatch(&MyProc->procLatch,
    WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
    10000L);
  ResetLatch(&MyProc->procLatch);
  elog(LOG, "Hello World!"); /* Say Hello to the world */
 }
 proc_exit(0);
}

This is the main loop used for the background process. As long as SIGTERM is not received, the process will continue to loop and log “Hello World” every 10s, time interval being symbolized by 10000L. Note that WaitLatch can be awaken with different events: timeout of the time interval provided, Latch being set or postmaster death.

SIGTERM handler

static void
hello_sigterm(SIGNAL_ARGS)
{
 int save_errno = errno;
 got_sigterm = true;
 if (MyProc)
  SetLatch(&MyProc->procLatch);
 errno = save_errno;
}

When SIGTERM is used on the process, the sleep time previously invoked with WaitLatch is stopped with SetLatch immediately. This is controlled by flag WL_LATCH_SET that awakes the Latch when set properly.

Makefile

MODULES = hello_world
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

This is a simple Makefile. Be sure to name the file containing the code given in this post as hello_world.c. The library generated will be called hello_world.so.

Once this code is run, you will be able to see the process running with a simple ps command.
$ ps x | grep "hello"
13327 ?? Ss 0:00.00 postgres: bgworker: hello world

Be sure to set this variable in postgresql.conf to load the worker correctly.
shared_preload_libraries = 'hello_world'

pg_reorg is a postgresql module developped and maintained by NTT that allows to redistribute a table without taking locks on it.
The code is hosted by pg_foundry here.
However, pgfoundry uses CVS :( , so I am also maintaining a fork in github in sync with pgfoundry here.

What pg_reorg can do for you is to reorganize a whole table in the same fashion way as a CLUSTER or a VACUUM FULL, while allowing write operations on the table being reorganized at the same time. No locks are needed.

Once you have downloaded the code, you just need to install it on your server.
cd $CODE_FOLDER
make install

Then install the EXTENSION module (for version upper than 9.1) after connecting to the postgres server.
CREATE EXTENSION pg_reorg;

Then, it is possible to perform several types of operations.
CLUSTER reorganization on the table $TABLE.
pg_reorg --dbname $DATABASE -t $TABLE
VACUUM FULL reorganization on the table $TABLE.
pg_reorg --dbname $DATABASE -t $TABLE -n
Reorganization of an entire database.
pg_reorg --dbname $DATABASE

The main limitation of this utility is that table being redistributed needs to have a primary key or a non-null unique key.

Then, a little bit more about the technique it uses to reorganize the table.
Basically, a temporary copy of the table to be redistributed is created using a CREATE TABLE AS query. The CTAS query definition is changed depending on the distribution user wants. For example, if user wants a redistribution using a different column (option -o), the CTAS is completed with an ORDER BY clause on the wanted column. The indexes of the new table depend on what the user wants.

Then the following operations are done.

  • creation of triggers to register all the DMLs that occur on the former table to an intermediate log table
  • creation of indexes on the temporary table based on what the user wants (new column index, VACUUM FULL…)
  • Apply the logs registered during the index creation and wait for old transactions to finish
  • Swap the names between the freshly-created table and old table
  • Drop the useless objects: the old table, the old triggers and remaining objects

This functionality is particularly handy when you wish to reorganize a huge table. Performing a VACUUM/CLUSTER on it might take time, and your application might need this table to be accessible in write for a maximum amount of time. So pretty useful, uh?

pgbadger is a recent Postgres module presented during the lightning talks of PGCon 2012 by its original author Gilles Darold. It is thought as an alternative to PgFouine, able to replace it thanks to its flexibility, extensibility and performance.

So, like PGFouine, pgbadger is a Postgres log file analyzer, meaning that it is able to deparse, analyze the data of your log files and then provide statistics about your database: from overall data (query list, number of transactions), error counts (most frequent events) to more performance details like the queries that took the longest run time.
This means that basically you use in input log files from PostgreSQL, and you get in output an html or txt page that gives you all the statistics you want. So you do not need to sneak anymore in your log files to analyze what is happening anymore. Simply launch pgbadger, wait for parsing (which is pretty fast btw), and see.

Before describing more in details the functionalities of pgbadger, why is it an alternative to pgfouine?

  1. pgbadger is written in perl, pgfouine is written in php. So with pgbadger you do not need to install extra packages, Postgres core using natively perl. And well, perl is more performant than php. And php… is php…
  2. Developped by the community, for the community
  3. Latest release of pgfouine is from 2010, it doesn’t look to be that much maintained. pgbadger is a new project, young and dynamic, and more and more people are gathering to develop it.
  4. It is developped by cool guys, OK pgfouine also… That is maybe not a real argument…

Now, let’s put our hands on the beast. There are several ways to get this module.
First, you can fetch the code of the project directly from Github with those commands:
git clone https://github.com/dalibo/pgbadger.git
Also, you can download the tarball from here.
Then install it with those commands.
tar xzf pgbadger-1.x.tar.gz
cd pgbadger-1.x/
perl Makefile.PL
make && sudo make install

This will install pgbadger in /usr/bin/local and some man pages. You can refer to the README of the project for more details.

As told before, pgbadger is a log file deparser, so you need to set up the output of your log files correctly to allow pgbadger to look at your database server information. The more logging parameters you activate, the more information you will be able to get from your log files. Here are the settings I used for this post and the test below.
logging_collector = on
log_min_messages = debug1
log_min_error_statement = debug1
log_min_duration_statement = 0
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0

You will need to customize those options depending on the information you want, just do not forget that setting up the root correctly is the key for success.
At the state of pgbadger 2.0, log_statement and log_duration cannot be activated, so take care not to use them.

For the purpose of this post and in order to produce a couple of log files, I ran a 5-minute pgbench test on a fresh Postgres server.
So the logs I obtained are not at all production-like, but enough to show what pgbadger can do.

By the way, pgbadger has several options making it pretty flexible, among them you have the possibility to specify multiple log files at the same time, specify an interval of time for the analysis, and far more. It is also possible to choose output format of result. Among the possible use cases I got on top of my head:

  • Creation of a text file report, and send it to a mailing list automatically
  • Creation of an html file, and upload it automatically to a web server
  • Base log analysis on a cron with a certain time interval
  • etc.

pgbadger has few dependencies, so it makes it pretty flexible for your environments. Once again the README of the project gives more examples of use, so do not hesitate to refer to it.

So, just after my short pgbench run, I got a set of log files ready for analysis. Now it is time to parse them.
Note: the pgbench test has been done with default values without thinking, so don’t worry about the bad performance results :)
$ ./pgbadger ~/pgsql/master/pg_log/postgresql-2012-08-30_132*
[========================>] Parsed 10485768 bytes of 10485768 (100.00%)
[========================>] Parsed 10485828 bytes of 10485828 (100.00%)
[========================>] Parsed 10485851 bytes of 10485851 (100.00%)
[========================>] Parsed 10485848 bytes of 10485848 (100.00%)
[========================>] Parsed 10485839 bytes of 10485839 (100.00%)
[========================>] Parsed 982536 bytes of 982536 (100.00%)

In result, I got a file called out.html (default, but customizable) showing a bunch of data, analyzing things automatically.
The most interesting part is perhaps the performance analysis, showing you a list of the less performant queries, so this will allow you to tune your database based on the log data obtained.

Here are some pictures showing portions of the output results you will get.

pgbadger example 1

pgbadger example 1


pgbadger example 2

pgbadger example 2

I forgot to tell something: this utility is 100% compatible with Postgres-XC, the only thing you need to do is just to launch pgbadger for each node of your cluster.

So, pgbadger is light, fast and is waiting for your love. It is one of those utilities that you can use not only for production database systems, but for extra things like benchmark or performance analysis. Its installation is easy, will not heavy your system with packages you might not want, so go ahead and use it.

©2010-2013 Michael Paquier All content is ©Copyright of Otacoo.com 2010-2013. Privacy Policy - Terms of Use