On top of bringing the latest PostgreSQL version at hand inside a virtual machine, vFabric Postgres 9.2.4 includes some new scripts to facilitate the integration of replication features of community version of PostgreSQL.

Here is a list of the new functionalities:

  • Separate virtual disk for archives with a default size of 2G mounted on /var/vmware/vpostgres/9.2-archive.
  • New default parameters in postgresql.conf to support replication by default (wal_level, max_wal_senders, archive_command, archive_mode, etc.).
  • New scripts for the management of replication between vFabric Postgres nodes: slave creation, node promotion, replication monitoring. Those scripts are located in folder /opt/vmware/vpostgres/current/share.

The replication configuration that can be achieved thanks to those scripts is really simple and convenient for most of real-world applications.

  • All the slaves use streaming replication to catch up with master in asynchronous mode
  • If a slave is disconnected for a too long time from the cluster and it cannot get necessary WAL files from archives of master, a new base backup is needed. This helps in maintaining the archives at a low size level (size by the way customizable by changing the disk size of archives in the virtual machine settings) by keeping in memory only the WAL files that are needed by slaves to catch up with a master
  • Recovery default is the latest timeline. This is to ensure that a slave trying to reconnect to a node freshly promoted will catch up with the latest changes that occurred in cluster
  • Slave/slave connections are possible with cascading replication

Here are more details about the scripts implemented. They are aimed to be used only with vFabric Postgres nodes whose version is higher than 9.2.4.

run_as_replica

Transform the existing vFabric Postgres server into a slave by getting it in sync with a given root node (either slave or master), or reconnect to an existing node in the cluster. When creating a new slave on a freshly-installed vFabric Postgres 9.2.4, you need to use options -b to take a new base backup and -W to specify password to connect to remote node, recommended user being “postgres”.

A single command based on this script is enough to set up a read-only slave becoming the replica of an existing root node in order to leverage read activity of an application with load balancing through multiple virtual machines of vFabric Postgres.
$ /opt/vmware/vpostgres/current/share/run_as_replica -h $IP_MASTER -b -W -U postgres
This script registers automatically SSH authorization key of slave node on root node for archive transfer between nodes so there is no need to worry in doing additional settings for pg_hba.conf, recovery.conf, postgresql.conf or SSH on root node side. Note that as default value max_wal_senders is set to 3, so you might want to increase this value on root node when connecting setting up more slave on a given root node.
This script can also be used to reconnect an existing slave to a new root node. In the case where slave node is not able to catch up with the root node because of missing WAL archive files or because the slave node was in advance compared to the root node in term of WAL replay, a new base backup is necessary.

A generated recovery.conf looks like that:
standby_mode= 'on'
primary_conninfo = 'host=''$IP_ADDRESS'' user=''postgres'' application_name=''localhost.localdom'' password=''$PASSWORD'''
recovery_target_timeline = 'latest'
restore_command = 'scp $IP_ADDRESS:/var/vmware/vpostgres/9.2-archive/%f %p'

$IP_ADDRESS is the IP used by slave node to connect to a root node (either slave or master). $PASSWORD is the password that has been specified when calling run_as_replica.

show_replication_status

This script can be used to monitor the WAL replay activity of slave nodes connected to the node where script is launched. The following query is used on the server.
with xl as (
select pg_last_xlog_receive_location() as log_receive_position,
pg_last_xlog_replay_location() as log_replay_position)
select xl.log_receive_position as log_receive_position,
xl.log_replay_position as log_replay_position,
pg_xlog_location_diff(xl.log_receive_position,
xl.log_replay_position) as replay_delta
from xl;

Compared to the default columns of pg_stat_replication (catalog table used to report activity of replicated nodes) a new column called replay_delta is used to monitor how a given slave is late compared to a master node when replaying WAL.
When monitoring replication activity, results similar to that are obtained:
postgres@localhost:~> /opt/vmware/vpostgres/current/share/show_replication_status
sync_priority | slave | sync_state | log_receive_position | log_replay_position | receive_delta | replay_delta
---------------+--------------------+------------+----------------------+---------------------+---------------+--------------
0 | localhost.localdom | async | 0/8000000 | 0/8000000 | 0 | 0
(1 row)

The lower the values of receive_delta and replay_delta are, the closer slave node is getting to the master in term of WAL replay (replication state).

promote_replica_to_primary

This script allows promoting a given slave to master using the default settings in recovery.conf.
$ /opt/vmware/vpostgres/current/share/promote_replica_to_primary
server promoting

Other slave nodes can reconnect to a new master using run_as_replica.

archive_command

This script is used as a command for archiving WAL files (set by archive_command in postgresql.conf, kicked by vFabric Postgres server each time a WAL file is ready to be archived, or archive_timeout is reached). This script includes some checks on the existence of WAL file to be archived and some checks on the size of archive disk to ensure that only required WAL files are maintained in archives based on the size of disk available for archives. WAL files are archived if root node is part of a cluster as primary or replica.

create_replication_user

This script can be used to create a new replication user on a master node. It generates a CREATE USER query using the password asked at prompt.

Having those scripts already in place inside the vFabric Postgres appliance and RPMs has several advantages, two of them being:

  • Having unique and consistent scripts used for the management of vFabric Postgres cluster without deploying any additional and home-made tools
  • Minimizing node replication/promotion operations, only a basic understanding of internal mechanics of database server is enough: failover, reconnection and slave/master structure

As a DBA and an operator, such things make the management of clusters and applications really easier in an ESX cluster of hundreds of vFabric Postgres servers for example.

It is planned in PostgreSQL 9.2 to have support for cascading replication, which is the ability to add slaves under other slaves. In 9.2, slave-slave connections are only asynchronous.

This feature has been introduced by this commit.
commit 5286105800c7d5902f98f32e11b209c471c0c69c
Author: Simon Riggs
Date: Tue Jul 19 03:40:03 2011 +0100
 
Cascading replication feature for streaming log-based replication.
Standby servers can now have WALSender processes, which can work with
either WALReceiver or archive_commands to pass data. Fully updated
docs, including new conceptual terms of sending server, upstream and
downstream servers. WALSenders terminated when promote to master.
 
Fujii Masao, review, rework and doc rewrite by Simon Riggs

So let’s try to use it with the following configuration.

Cascading replication graph

Data files of master, slaves and archive folder are located in $HOME/bin/postgres.
You need to checkout the master branch of PostgreSQL GIT to try this feature.

The settings are pretty similar to what you can find here.
However, you have to take care of the following settings in postgresql.conf:

  • Master
    wal_level = hot_standby
    archive_mode = on
    archive_command = 'cp -i %p $HOME/bin/postgres/archive/%f'
    max_wal_senders = 10
  • Slave 1
    hot_standby = on
    port = 15432
  • Slave 2
    hot_standby = on
    port = 15433
  • Slave 1-1
    hot_standby = on
    port = 25432
  • Slave 1-2
    hot_standby = on
    port = 25433

You can setup slave1, and slave2 as synchronously with master by using synchronous_standby_names in postgresql.conf of master.
This is set a little bit later after initializing the master backup.

You need to change recovery.conf of slave 1 and slave 2 with those parameters:
standby_mode = on
primary_conninfo = 'host=localhost port=5432 application_name=$SLAVE_NAME'
restore_command = 'cp -i $HOME/bin/postgres/archive/%f %p'

$SLAVE_NAME being slave1 for slave 1, and slave2 for slave 2.

Then as, slave 1-1 and 1-2 have to connect to slave 1, use the same values as above for standby_mode and restore_command, but setup primary_conninfo like this:
primary_conninfo = 'host=localhost port=15432 application_name=$SLAVE_NAME'
$SLAVE_NAME being slave11 for slave 1-1, and slave12 for slave 1-2.

Here is the dirty script used to start up the system. This has been written quickly, so sorry for the bad quality :) .
#!/bin/bash
#Master has port 5432
#Slave1 has port 15432
#Slave2 has port 15433
#Slave1 has port 25432
#Slave2 has port 25433
 
PSQL_FOLDER=$HOME/bin/postgres
PSQL_BIN=$PSQL_FOLDER/bin #Binary folder
PSQL_CONFIG=$PSQL_FOLDER/config #Folder containing all the configuration files
PSQL_MASTER=$PSQL_FOLDER/master #Master data folder
 
PSQL_SLAVE1=$PSQL_FOLDER/slave1 #Slave 1 data folder
PSQL_SLAVE2=$PSQL_FOLDER/slave2 #Slave 2 data folder
PSQL_SLAVE11=$PSQL_FOLDER/slave11 #Slave 11 data folder
PSQL_SLAVE12=$PSQL_FOLDER/slave12 #Slave 12 data folder
PSQL_ARCHIVE=$PSQL_FOLDER/archive #Archive folder
 
#clean up, take down violently all the processes
killall postgres
rm -rf $PSQL_MASTER $PSQL_SLAVE1 $PSQL_SLAVE2 $PSQL_SLAVE11 $PSQL_SLAVE12 $PSQL_ARCHIVE
mkdir $PSQL_MASTER $PSQL_SLAVE1 $PSQL_SLAVE2 $PSQL_SLAVE11 $PSQL_SLAVE12 $PSQL_ARCHIVE
sleep 1
 
#Initialize master
$PSQL_BIN/initdb -D $PSQL_MASTER
cp $PSQL_CONFIG/postgresql.conf.master $PSQL_MASTER/postgresql.conf
cp $PSQL_CONFIG/pg_hba.conf.master $PSQL_MASTER/pg_hba.conf
 
#Start master
$PSQL_BIN/postgres -D $PSQL_MASTER &
#Wait a little before server start up, let it finish initialization
echo "Master started"
sleep 2
 
#Initialize slave
#This is used to start the backup so as slave does not have to recover from
#scratch when being build. It definitely accelerates standby start up
$PSQL_BIN/psql postgres -c "select pg_start_backup('backup')"
cp -pr $PSQL_MASTER/* $PSQL_SLAVE1/
cp -pr $PSQL_MASTER/* $PSQL_SLAVE2/
cp -pr $PSQL_MASTER/* $PSQL_SLAVE11/
cp -pr $PSQL_MASTER/* $PSQL_SLAVE12/
$PSQL_BIN/psql postgres -c "select pg_stop_backup()"
echo "Backup performed"
sleep 1
 
#Setup synchronous commit mode on master
echo "synchronous_standby_names='slave1,slave2'" >> $PSQL_MASTER/postgresql.conf
$PSQL_BIN/pg_ctl reload -D $PSQL_MASTER
echo "Force master to synchronize mode with slave1 (priority 1) and slave 2 (priority 2)"
sleep 1
 
#Then finish by copying all the configuration files for slaves
cp $PSQL_CONFIG/postgresql.conf.slave1 $PSQL_SLAVE1/postgresql.conf
cp $PSQL_CONFIG/recovery.conf.slave1 $PSQL_SLAVE1/recovery.conf
cp $PSQL_CONFIG/postgresql.conf.slave2 $PSQL_SLAVE2/postgresql.conf
cp $PSQL_CONFIG/recovery.conf.slave2 $PSQL_SLAVE2/recovery.conf
cp $PSQL_CONFIG/postgresql.conf.slave11 $PSQL_SLAVE11/postgresql.conf
cp $PSQL_CONFIG/recovery.conf.slave11 $PSQL_SLAVE11/recovery.conf
cp $PSQL_CONFIG/postgresql.conf.slave12 $PSQL_SLAVE12/postgresql.conf
cp $PSQL_CONFIG/recovery.conf.slave12 $PSQL_SLAVE12/recovery.conf
 
#Delete unnecessary xlog files and postmaster pid files
rm $PSQL_SLAVE1/postmaster.pid
rm -r $PSQL_SLAVE1/pg_xlog/*
rm $PSQL_SLAVE2/postmaster.pid
rm -r $PSQL_SLAVE2/pg_xlog/*
rm $PSQL_SLAVE11/postmaster.pid
rm -r $PSQL_SLAVE11/pg_xlog/*
rm $PSQL_SLAVE12/postmaster.pid
rm -r $PSQL_SLAVE12/pg_xlog/*
 
#Start Slave 1
chmod 700 $PSQL_SLAVE1
$PSQL_BIN/postgres -D $PSQL_SLAVE1 &
echo "Slave 1 started"
 
#Start Slave 2
chmod 700 $PSQL_SLAVE2
$PSQL_BIN/postgres -D $PSQL_SLAVE2 &
echo "Slave 2 started"
 
#Start Slave 11
chmod 700 $PSQL_SLAVE11
$PSQL_BIN/postgres -D $PSQL_SLAVE11 &
echo "Slave 11 started"
 
#Start Slave 12
chmod 700 $PSQL_SLAVE12
$PSQL_BIN/postgres -D $PSQL_SLAVE12 &
echo "Slave 12 started"
 
exit 0

Configuration files and scripts can be found here.

OK now let’s check if it works well.
$ psql postgres
postgres=# select application_name,state,sync_priority,sync_state from pg_stat_replication;
application_name | state | sync_priority | sync_state
------------------+-----------+---------------+------------
slave1 | streaming | 1 | sync
slave2 | streaming | 2 | potential
(2 rows)

On master, slave1 has priority 1 for synchronization (synchronous_standby_nodes has been set up ‘slave1,slave2′). It looks to be correctly synchronized.

Then let’s do the same check from slave1.
$ psql -p 15432 postgres
postgres=# select application_name,state,sync_priority,sync_state from pg_stat_replication;
application_name | state | sync_priority | sync_state
------------------+-----------+---------------+------------
slave12 | streaming | 0 | async
slave11 | streaming | 0 | async
(2 rows)

slave11 and slave12 are correctly linked to slave1. Yippee.

Why not some additional check with some data…
$ #Connection from master
$ psql -p 5432 postgres
postgres=# create table aa (a int);
CREATE TABLE
postgres=# insert into aa values (1),(2);
INSERT 0 2
postgres=# select * from aa;
a
---
1
2
(2 rows)
postgres=# \q

Has the slave of a slave been updated? Connection to let’s say… slave12
$ psql -p 25433 postgres
postgres=# select * from aa;
a
---
1
2
(2 rows)

OK, that rocks. I let you imagine then how to use that as an HA solution ;) .

©2010-2013 Michael Paquier All content is ©Copyright of Otacoo.com 2010-2013. Privacy Policy - Terms of Use