With last week’s commit:
commit 2a406e56dea3e750e74cc38115e83e30217b4822
Author: Michael P
Date: Thu Dec 8 13:50:25 2011 +0900
 
Simplify node DDL grammar and supress slave management part
 
New grammar uses WITH clause of CREATE TABLE in this manner:
CREATE/ALTER NODE nodename WITH (
[ TYPE = ('coordinator' | 'datanode'),]
[ HOST = 'string',]
[ PORT = portnum,]
[ PRIMARY,]
[ PREFERRED ]);
This applies to CREATE/ALTER NODE.
Grammar simplification results in the deletion in related_to column
of pgxc_node catalog.
 
Documentation is updated in consequence.
 
This commit solves also an issue with variable names sharing same
format between GTM and XC nodes.

The grammar of node DDL (CREATE NODE, ALTER NODE, DROP NODE) has been simplified and made more consistent with PostgreSQL. Now those queries are adapted as follows:

  • Use of the same WITH clause as CREATE TABLE
  • No slave nodes taken into account anymore: now a slave node needs to have the same name as its master. This facilitates also failover.

In consequence, here is how to create a cluster with a freshly-started Coordinator called coord1 and 2 Datanodes called dn1 and dn2 using respectively ports 15433 and 15434.
postgres=# create node dn1 with (type = 'datanode', port = 15433, host = 'localhost');
CREATE NODE
postgres=# create node dn2 with (type = 'datanode', port = 15434, host = 'localhost');
CREATE NODE
postgres=# select pgxc_pool_reload();
pgxc_pool_reload
------------------
t
(1 row)
postgres=# select pgxc_pool_check();
pgxc_pool_check
-----------------
t
(1 row)
postgres=# select * from pgxc_node;
node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred
-----------+-----------+-----------+-----------+----------------+------------------
coord1 | C | 5432 | localhost | f | f
dn1 | D | 15433 | localhost | f | f
dn2 | D | 15434 | localhost | f | f
(3 rows)

In bonus to this article, you can find here an updated version of the script that can setup a cluster on a local machine with the following options:

  • -c to indicate the number of Coordinators
  • -n to indicate the number of Datanodes

When tuning a PostgreSQL server, one the major setting parameters is the one controlling the amount of shared memory allowed with shared_buffers.
PostgreSQL has a default shared_buffers value at 32MB, what is enough for small configurations but it is said that this parameter should be set at 25% of the system’s RAM. This allows your system to keep a good performance in parallel with the database server.
So in the case of a machine with 4GB of RAM, you should set shared_buffers at 1GB.

In the case of ubuntu servers, you may find the following error when starting a PostgreSQL instance.
FATAL: could not create shared memory segment: 無効な引数です
DETAIL: Failed system call was shmget(key=5432001, size=1122263040, 03600).
HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter. You can either reduce the request size or reconfigure the kernel with larger SHMMAX. To reduce the request size (currently 1122263040 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for.
The PostgreSQL documentation contains more information about shared memory configuration.

This means that Linux kernel cannot allow more shared memory than the kernel can.
In order to prevent that, customize the memory parameters of your machine kernel.
(for 1GB)
sysctl -w kernel.shmmax=1073741824
sysctl -w kernel.shmall=262144
(for 2GB)
sysctl -w kernel.shmmax=2147483648
sysctl -w kernel.shmall=524288

You need root rights to modify those parameters.

Using sysctl will not reinitialize those parameters at reboot. For a more permanent solution, add the following lines to /etc/sysctl.conf.
(for 1GB)
kernel.shmall = 262144
kernel.shmmax = 1073741824
(for 2GB)
kernel.shmall = 524288
kernel.shmmax = 2147483648

This post presents some basics when using foreign data wrappers with PostgreSQL for external files.
FOREIGN DATA WRAPPER is a part of SQL/MED (Management of external data with SQL) and its implementation has begun since Postgres 8.4. This mechanism is based on COPY FROM to import data files directly into your database.
Those tests have been done with 9.2 (development version).

First be sure that the contrib module file_fdw is correctly installed for your server.
cd /to/postgres/folder/contrib/file_fdw
make install

At the time of this post, PostgreSQL tar just contains a fdw library for external files (file_fdw). Some complementary work for PostgreSQL fdw will be done as a development for 9.2.

If you do not install that, you may get the following error when trying to create an extension.
CREATE EXTENSION file_fdw;
ERROR: could not open extension control file "/to/install/folder/share/extension/file_fdw.control": cannot find the following file

Let’s then take a try.
First create a simple text file that will be converted. This file has a CVS format
michael@boheme:~ $ cat ~/data/test.data
1,5,a
2,4,b
3,3,c
4,2,d
5,1,e

Then time to create the extension necessary for the fdw.
template1=# CREATE EXTENSION file_fdw;
CREATE EXTENSION

Then you need to create a *server* that will pinpoint to your file on your server.
template1=# CREATE SERVER test_server FOREIGN DATA WRAPPER file_fdw;
CREATE SERVER

As a last step, you only need to create a table referred in a foreign server
template1=# CREATE FOREIGN TABLE testdata (
id1 int,
id2 int,
text1 char(1)
) SERVER test_server
OPTIONS ( filename '/home/michael/data/test.data', format 'csv' );
CREATE FOREIGN TABLE

Finally try to look at your data:
template1=# select * from testdata;
id1 | id2 | text1
-----+-----+-------
1 | 5 | a
2 | 4 | b
3 | 3 | c
4 | 2 | d
5 | 1 | e
(5 rows)

And you’re done, congrats!

PostgreSQL 9.1 is out, you can download it source code from here or use latest GIT repository as I did like this:
mkdir postgres
git remote add postgres http://git.postgresql.org/git/postgresql.git
git fetch postgres
git branch --track master postgres/master
git checkout master

Asynchronous streaming replication is here since 9.0, but as I keep being always busy with development stuff of Postgres-XC, I have not taken time to play around with streaming replication and HOT Standby.
As now synchronous streaming replication is going to be released soon (beta 1 at the moment of this post), and that Postgres-XC will soon be merged with PostgreSQL 9.1, I tried to set up a synchronous streaming replication server. This functionality could be used to extend XC so as to make it a full HA solution based on Postgres.

So, let’s give it a try.

To be honest, I have been surprised by how easy it is to set it up. PostgreSQL developers have really done a good job on it. As a beta 1, it is not completely polished, but it looks close to completion. Here are the main steps you need to do to set up your server.
First download code, and compile it (written above).
./configure --prefix=$HOME/bin/postgres
make
make install

In my case my installation is located at $HOME/bin/postgres, and I use the following folder for all my settings. You can of course use the folder you prefer, even for slave, master or archive data folders.

To understand what has to be set where, let’s first have a look at the configuration parameters.
This is important to understand how to set your own servers, without depending on this post.
Settings are different for slave and master servers.

Here are the parameters you have to set for postgresql.conf file of master.
wal_level = hot_standby
archive_mode = on
archive_command = 'cp -i %p $HOME/bin/postgres/archive/%f'
max_wal_senders = 10

archive_command is the command used when launching pg_start_backup. This allows slave to restore master data not from scratch, really accelerating slave’s start up. In this case, directory for archives is $HOME/bin/postgres/archive.
max_wal_senders is the number of processes allowed to send WAL data, it cannot be 0, or master cannot send data to slave.
For the time being synchronous_standby_names is not set to avoid master hanging on a slave commit.
It is also necessary to set up your master to authorize connection from slave for replication purposes. In this case, you have to add those lines in pg_hba.conf:
host replication michael 127.0.0.1/32 trust
host replication michael ::1/128 trust

This setup is OK if slave and master are on local host.

In case you want to have a slave with the same configuration parameters as the master, you should copy copy the master’s configuration file. Then modify the following parameters to make it a slave. In postgresql.conf:
hot_standby = on
port = 5433

It is also necessary to add an additional configuration file called recovery.conf in slave’s data folder. You can find a sample of this file in share/ called recovery.conf.sample.
Rename it to recovery.conf and copy it to the slave’s data folder. Then modify the following parameters in it.
standby_mode = on
primary_conninfo = 'host=localhost port=5432 application_name=slave1'
restore_command = 'cp -i $HOME/bin/postgres/archive/%f %p'

primary_conninfo contains all the connection parameters to allow slave to connect to master, for streaming replication purposes. In this parameter, application_name is the name used to identify slave on master.
restore_command contains a shell command that is used to copy archive files. This helps in speeding up slave startup by not having to copy all the WAL from scratch. In this case restore command picks up archive files in the same place where it has been saved by master.

Now, let’s have a look at how to use your master/slave configuration.
Master port is 5432 (PostgreSQL default). Slave port is 5433.

This is the script I used to automatize the whole setup.

#!/bin/bash
#Master has port 5432
#Slave has port 5433
PSQL_FOLDER=$HOME/bin/postgres
PSQL_BIN=$PSQL_FOLDER/bin #Binary folder
PSQL_CONFIG=$PSQL_FOLDER/config #Folder containing all the configuration files
PSQL_MASTER=$PSQL_FOLDER/master #Master data folder
PSQL_SLAVE=$PSQL_FOLDER/slave #Slave data folder
PSQL_ARCHIVE=$PSQL_FOLDER/archive #Archive folder
 
#clean up, take down violently all the processes
killall postgres
rm -rf $PSQL_MASTER $PSQL_SLAVE $PSQL_ARCHIVE
mkdir $PSQL_MASTER $PSQL_SLAVE $PSQL_ARCHIVE
sleep 1
 
#Initialize master
$PSQL_BIN/initdb -D $PSQL_MASTER
cp $PSQL_CONFIG/postgresql.conf.master $PSQL_MASTER/postgresql.conf
cp $PSQL_CONFIG/pg_hba.conf.master $PSQL_MASTER/pg_hba.conf
#Start master
$PSQL_BIN/postgres -D $PSQL_MASTER &
 
#Wait a little before server start up, let it finish initialization
echo "Master started"
sleep 2
 
#Initialize slave
#This is used to start the backup so as slave does not have to recover from
#scratch when being build. It definitely accelerates standby start up
$PSQL_BIN/psql postgres -c "select pg_start_backup('backup')"
cp -pr $PSQL_MASTER/* $PSQL_SLAVE/
$PSQL_BIN/psql postgres -c "select pg_stop_backup()"
sleep 1
echo "Backup performed"
 
#Setup synchronous commit mode on master
echo "synchronous_standby_names='slave1'" >> $PSQL_MASTER/postgresql.conf
$PSQL_BIN/pg_ctl reload -D $PSQL_MASTER
echo "Force master to synchronize mode"
sleep 1
 
#Then finish by copying all the configuration files for slave
cp -r $PSQL_MASTER/* $PSQL_SLAVE
cp $PSQL_CONFIG/postgresql.conf.slave $PSQL_SLAVE/postgresql.conf
cp $PSQL_CONFIG/recovery.conf.slave $PSQL_SLAVE/recovery.conf
rm $PSQL_SLAVE/postmaster.pid
rm -r $PSQL_SLAVE/pg_xlog/*
 
#Start Slave
chmod 700 $PSQL_SLAVE
$PSQL_BIN/postgres -D $PSQL_SLAVE &
echo "Slave started"
exit 0

With this script, you can set up your own master/slave server based on streaming replication.

If you have finished wetting up your environment, let’s check if it is working as planned. Slave can just perform read operations (no DML or DDL), and each write operation performed or master has to be seen on slave.
Let’s first create a database.
michael@lucid-virtual:~/bin/postgres $ ./bin/createdb test
Then create a table on master and fill it with some data.
michael@lucid-virtual:~/bin/postgres $ ./bin/psql test
psql (9.1beta1)
Type "help" for help.
test=# create table aa (a int);
CREATE TABLE
test=# insert into aa values (1);
INSERT 0 1
test=# select * from aa;
a
---
1
(1 row)

And what happens on slave?
michael@lucid-virtual:~/bin/postgres $ ./bin/psql -p 5433 test
psql (9.1beta1)
Type "help" for help.
test=# select * from aa;
a
---
1
(1 row)
test=# insert into aa values (2);
ERROR: cannot execute INSERT in a read-only transaction

This works as expected (Oh, no configuration miss), slave has received all the data from master and cannot perform any write operations.

On master you can check if slave is synchronized with master correctly.
test=# select application_name,state,sync_priority,sync_state from pg_stat_replication;
application_name | state | sync_priority | sync_state
------------------+-----------+---------------+------------
slave1 | streaming | 1 | sync
(1 row)

The keyword sync means that master and slave have synchronized commits.
If it were not the case, this would be in async mode.

The main point of a slave is to take care of the database operations in case master crashes or becomes inoperative.
So let’s imagine master crashes with something like.
kill -9 `ps ux | grep "postgres/master" | cut -d " " -f 3`
Now only slave is running, but it cannot perform any write operation, so fallback can be done by:
echo "standby_mode = off" >> slave/recovery.conf
echo "port = 5432" >> slave/postgresql.conf
./bin/pg_ctl -D slave restart

Setting standby_mode to off makes the slave react as a new master.
After restarting, recovery.conf has its name changed to recovery.done to prevent to reenter to a new backup.
After that a new master is up, based on the old slave. You can connect to it as if it was a normal master, and perform normal operations on it.

If you want to set up your own servers, here are the Standby setup files (tar archive) used in this post. It contains the following files:

  • Configuration files (saved in $PSQL_CONFIG for the automatized process)
    • Master: postgresql.conf file
    • Master: pg_hba.conf file
    • Slave: postgresql.conf file
    • Slave: recovery,conf file
  • Scripts
    • Setup Master/Slave servers
    • Fallback to Slave
  • © 2012 Michael Paquier All content is ©Copyright of Otacoo.com 2010-2012. Privacy Policy - Terms of Use