Collation is a new functionality of PostgreSQL 9.1 that allows to specify the sort order by table column or for an operation.

In Ubuntu, the collation types supported by your system are listed in /var/lib/locales/supported.d.
In case you installed support for language French, you can have a look at all the languages supported in the file fr.
$ cat /var/lib/locales/supported.d/fr
fr_LU.UTF-8 UTF-8
fr_CA.UTF-8 UTF-8
fr_CH.UTF-8 UTF-8
fr_BE.UTF-8 UTF-8
fr_FR.UTF-8 UTF-8

So here, this system supports French from Luxembourg, Belgium, France, Canada and Switzerland.

Here is an easy example of a table suing collated columns. First a local collation based on French is created. Then table is created and filled with data.
postgres=# create collation fr_FR (locale = 'fr_FR.utf8');
CREATE COLLATION
postgres=# create table test(french_name text collate "fr_FR", eng_name text collate "en_US");
CREATE TABLE
postgres=# insert into test values ('lé', 'la');
INSERT 0 1
postgres=# insert into test values ('là', 'le');
INSERT 0 1
postgres=# insert into test values ('lè', 'li');
INSERT 0 1
postgres=# insert into test values ('lë', 'lo');
INSERT 0 1
postgres=# insert into test values ('lê', 'lu');
INSERT 0 1
postgres=# \d test
Table "public.test"
Column | Type | Modifiers
-------------+------+---------------
french_name | text | collate fr_FR
eng_name | text | collate en_US

So column 1 is collated in French, column 2 in English.

What happens in the case of a order by?
postgres=# select * from test order by 1;
french_name | eng_name
-------------+----------
là | le
lé | la
lè | li
lê | lu
lë | lo
(5 rows)

In this case the strings are classified in French.

postgres=# select * from test order by 2;
french_name | eng_name
-------------+----------
lé | la
là | le
lè | li
lë | lo
lê | lu
(5 rows)

Here the American English order is used.

However collation is very useful when doing string comparisons in different languages. Of course you cannot compare columns that have different collations.
postgres=# select * from test where french_name < eng_name;
ERROR: could not determine which collation to use for string comparison
HINT: Use the COLLATE clause to set the collation explicitly.

But you can enforce the order by the another collation in a kind of cast style.
postgres=# select * from test where french_name < (eng_name collate fr_FR);
french_name | eng_name
-------------+----------
là | le
lè | li
lë | lo
lê | lu
(4 rows)

You can do a lot of things with such features, just never forget that an ORDER BY always needs a collation or you will get an error.
postgres=# select * from test order by french_name || eng_name;
ERROR: collation mismatch between implicit collations "fr_FR" and "en_US"
LINE 1: select * from test order by french_name || eng_name;
postgres=# select * from test order by french_name || eng_name collate fr_FR;
french_name | eng_name
-------------+----------
là | le
lé | la
lè | li
lë | lo
lê | lu
(5 rows)

This study is made with PostgreSQL 9.1.1, released a couple of days before this post is written.
Unlogged tables are a new performance feature of PostgreSQL 9.1, created by Robert Hass. So, by guessing from this feature name, those tables are not logged in the database system :) . More precisely, those tables do not use at all WAL (Write ahead log) that insure a safe database crash.
Those tables are a good performance gain to contain data that do not especially need to survive from a crash, they are truncated automatically after a crash or unclean shutdown.
Unlogged tables are shared among sessions, and are not deleted when a session ends. Autovacuum runs on them.

So, in what cases could you use it.

  • On web applications, for session parameters
  • Data caching (Web page caching, why not?)
  • Application status, imagine that you add a on/off lock switch on your application that an admin could modify at will. This is not necessary at database server crash and could be reinitialized at a default value if necessary.
  • And many other things

In order to define an unlogged table, you need to use a new extension keyword called UNLOGGED (surprise!).
CREATE UNLOGGED TABLE aa (a int, b int);

This is a performance feature, so let’s see how much gain you could expect with pgbench.
Environment used is a 2.6GHz Dual core i5 with 4GB of memory.
PostgreSQL server has the following settings:

  • shared_buffers = 1GB
  • synchronous_commit = off
  • checkpoint_segments = 32
  • checkpoint_completion_target = 0.9

By default, pgbench is not able to use unlogged tables, so the code has been a bit modified to change all DDL definitions when tests are made on unlogged tables.
First, pgbench can be found in contrib directory. Once installed, you can initialize with pgbench with the following commands:
createdb benchtest
pgbench -i -s $SCALE_FACTOR benchtest

SCALE_FACTOR is used at 10 and 100 for this study. Roughly, it represents the number of tables. I do not advice using default value to avoid lock contention.

Then you can launch pgbench with commands like:
pgbench -c $CLIENT_NUM -T 300 benchtest
CLIENT_NUM is the number of clients connected to the database. Here we use successively 1, 24 and 48.
For each configuration, 5 tests of a duration of 5 minutes are made. The lowest and highest values are not taken into account, and the average based on the other values is calculated.

Here are the results found in TPS (transaction/second).

Clients Scale factor Normal tables Unlogged tables Gain (Unlogged – Perm)/avg(Unlogged, Perm)
1 10 561.63 632.55 11.87%
24 10 1419.30 1678.23 16.71%
48 10 1323.78 1555.40 16.08%
1 100 510.25 436.87 13.22%
24 100 1252.38 1493.44 17.55%
48 100 1260.09 1462.92 14.89%

So in short, in the environment tested unlogged tables have shown an increase of output by 13~17%.

When tuning a PostgreSQL server, one the major setting parameters is the one controlling the amount of shared memory allowed with shared_buffers.
PostgreSQL has a default shared_buffers value at 32MB, what is enough for small configurations but it is said that this parameter should be set at 25% of the system’s RAM. This allows your system to keep a good performance in parallel with the database server.
So in the case of a machine with 4GB of RAM, you should set shared_buffers at 1GB.

In the case of ubuntu servers, you may find the following error when starting a PostgreSQL instance.
FATAL: could not create shared memory segment: 無効な引数です
DETAIL: Failed system call was shmget(key=5432001, size=1122263040, 03600).
HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter. You can either reduce the request size or reconfigure the kernel with larger SHMMAX. To reduce the request size (currently 1122263040 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for.
The PostgreSQL documentation contains more information about shared memory configuration.

This means that Linux kernel cannot allow more shared memory than the kernel can.
In order to prevent that, customize the memory parameters of your machine kernel.
(for 1GB)
sysctl -w kernel.shmmax=1073741824
sysctl -w kernel.shmall=262144
(for 2GB)
sysctl -w kernel.shmmax=2147483648
sysctl -w kernel.shmall=524288

You need root rights to modify those parameters.

Using sysctl will not reinitialize those parameters at reboot. For a more permanent solution, add the following lines to /etc/sysctl.conf.
(for 1GB)
kernel.shmall = 262144
kernel.shmmax = 1073741824
(for 2GB)
kernel.shmall = 524288
kernel.shmmax = 2147483648

This post presents some basics when using foreign data wrappers with PostgreSQL for external files.
FOREIGN DATA WRAPPER is a part of SQL/MED (Management of external data with SQL) and its implementation has begun since Postgres 8.4. This mechanism is based on COPY FROM to import data files directly into your database.
Those tests have been done with 9.2 (development version).

First be sure that the contrib module file_fdw is correctly installed for your server.
cd /to/postgres/folder/contrib/file_fdw
make install

At the time of this post, PostgreSQL tar just contains a fdw library for external files (file_fdw). Some complementary work for PostgreSQL fdw will be done as a development for 9.2.

If you do not install that, you may get the following error when trying to create an extension.
CREATE EXTENSION file_fdw;
ERROR: could not open extension control file "/to/install/folder/share/extension/file_fdw.control": cannot find the following file

Let’s then take a try.
First create a simple text file that will be converted. This file has a CVS format
michael@boheme:~ $ cat ~/data/test.data
1,5,a
2,4,b
3,3,c
4,2,d
5,1,e

Then time to create the extension necessary for the fdw.
template1=# CREATE EXTENSION file_fdw;
CREATE EXTENSION

Then you need to create a *server* that will pinpoint to your file on your server.
template1=# CREATE SERVER test_server FOREIGN DATA WRAPPER file_fdw;
CREATE SERVER

As a last step, you only need to create a table referred in a foreign server
template1=# CREATE FOREIGN TABLE testdata (
id1 int,
id2 int,
text1 char(1)
) SERVER test_server
OPTIONS ( filename '/home/michael/data/test.data', format 'csv' );
CREATE FOREIGN TABLE

Finally try to look at your data:
template1=# select * from testdata;
id1 | id2 | text1
-----+-----+-------
1 | 5 | a
2 | 4 | b
3 | 3 | c
4 | 2 | d
5 | 1 | e
(5 rows)

And you’re done, congrats!

The post presents a proposal to implement an HA solution based on PostgreSQL streaming replication and Standby node structure. This solution is still in construction, so the final implementation design that will be chosen for Postgres-XC may slightly change.

Before reading this post and if you are not experienced with PostgreSQL 9.0/9.1 features, you should refer to the background about PostgreSQL Master/Slave fallback.

Synchronous streaming replication for Postgres-XC

This part specifies how synchronous streaming replication will be implemented in Postgres-XC. Even if functionalities in PostgreSQL 9.1 already implemented are pretty stable, some extensions related to node communication control have to be designed to have a real HA solution.

Some background

    Postgres-XC is a multi-master database cluster based on PostgreSQL.

  • It is made of a unique global transaction manager which feeds consistently transaction IDs and snapshots in the cluster to each node.
  • Nodes are made of 2 kinds of nodes: Coordinator and Datanode.
    • Coordinators are holding other node information. A coordinator is able to communicate with other Coordinators and other Datanodes through a connection pooler. This connection pooler saves all the connection parameters to nodes (host name, port number) to be able to distribute connection with a libpq string protocol depending on database name and user name. Coordinators also have all the catalog information, and primarily the distribution information of each table in the database. With this information, Coordinator is able to push down SQL queries to correct Datanodes and then merge results that are sent back to application according a new plan type called RemoteQuery. A coordinator does not hold table data, and all the data contained each Coordinator is the same. So one Coordinator is the clone of another one.
    • Datanodes react more or less like a PostgreSQL normal instance. They hold database data. What has been added is an interface to permit Datanodes to receive from Coordinators transaction IDs, snapshots, timestamp values instead of requesting them locally.

Limitations

  • Postgres-XC does not support yet tuple relocation from one node to another (impossible to update for instance column that holds the distribution key), so this design is limited to the case where the cluster has a fixed number of master nodes (for Datanodes and Coordinators (?)).
  • It is not thought here about trying to add or delete a Datanode on the fly. By that, it means that cluster configuration is not changed in a way that it modifies the node number and data distribution.
  • With those assumptions what remains is a cluster with a fixed size
  • This specification is based on PostgreSQL 9.1, but design is though to take into account as many replication features as possible.

Specifications

Here are the list of functionalities that will be added for the support of synchronous streaming replication. Most of them concern node management, master/slave identification and slave promotion system.

  • In no way Postgres-XC nodes are thought as being able to make system calls to be able to kick a slave initdb or something like this. NO WAY!
Catalog addition

A catalog called pgxc_nodes will be added with the following columns:

  • node type
  • node ID number
  • node host name
  • node port number
  • node immediate master ID
  • connection type: replication or not?

This table has as a primary key constraint on node number and node type.
“node immediate master ID” is the node ID that a slave is using to connect to a master or another slave (case of cascade replication, not implemented in PostgreSQL 9.1 though). This catalog is created on Coordinator.

Cluster startup
  • As a limitation, all the configuration files of postgres-XC coordinator nodes only contain master Coordinator numbers.
    With that, the initialization of the catalog table pgxc_nodes is made only with data of master nodes. In the case of a master node, “node immediate master ID” is filled with 0.
  • Once the cluster is up with a fixed number of nodes, the administrator has he possibility to update pgxc_nodes catalog with slaves already on that have already there configuration files set correctly to connect to the wanted node.
SQL interface
Adding a slave node after cluster start up

Following SQL is sent to Coordinators:

CREATE [COORDINATOR | DATANODE] SLAVE
IDENTIFIED BY ID $id_num
WITH CONNECTION ('$host1:$port1',...,'$hostN:$portN')
ON MASTER $master_number.
{ REPLICATION TO [SYNCHRONOUS | ASYNCHRONOUS] }

If only 1 host/port couple is specified, the same values are applied for all the coordinators. In case multiple host/port values are specified, they have to correspond to the number of Coordinators in the cluster. The following SQL is sent to all the coordinators. If replication option is not specified, slave is contacted to get the information.

Promoting a slave as a master

Following SQL is sent to Coordinators:

ALTER [COORDINATOR | DATANODE] SLAVE id PROMOTE TO MASTER
{WITH CONNECTION ( [:$port | $host: | $host:$port] )};

This will modify pgxc_nodes like this for a coordinator for example:

  • Former tuples:
    • Master: C, ID: 1, host:localhost, port:5432, master ID: 0
    • Slave: C, ID: 4, host:localhost, port:5433, master ID: 1
  • New tuples:
    • Former master: erased
    • Slaved promoted: C, ID: 1, host:localhost, port:5432, master ID: 0

The following restrictions apply at promotion

  • Before promoting the slave as a new master it is necessary to restart slave new parameters. Postgres-XC does not take responsabilities in kicking new nodes.
  • Promotion can be made on a synchronous slave only, this check is made on pgxc_nodes.
  • Before promoting, a check is made on slave to be sure that it has not been modified from synchronous to asynchronous mode when beginning the promotion. This check is done locally on pgxc_nodes.
  • When promoting, make a check on slave node to be sure that its standby mode is off. This has to be kicked from an external utility and not by XC itself.
Changing a slave status

Following SQL is sent to Coordinators:

ALTER [COORDINATOR | DATANODE] SLAVE $id REPLICATION TO [SYNCHRONOUS | ASYNCHRONOUS];
ALTER [COORDINATOR | DATANODE] SLAVE $id ID TO $new_id;

The following rules are applied:

  • Take an exclusive lock on pgxc_table to make other backends waiting on pgxc_nodes
  • The lock is taken externally with LOCK TABLE and sent to all the Coordinators first. Then the table is updated. Then lock is released on all the Coordinators from remote Coordinator
  • When changing replication mode, connect to slave node and check if mode has effectively been changed correctly by an external application kick.
Disabling a slave node from cluster

Following SQL is Sent to Datanodes:

DROP [COORDINATOR | DATANODE] SLAVE $id;

The following rules are applied:

  • Take an exclusive lock (SHARE ROW EXCLUSIVE MODE?) on pgxc_nodes and send this lock to all the Coordinators before performing the deletion from pgxc_nodes
  • Lock is released once deletion on all the nodes is completed
Pooler process modification

Connection pooling has to be modified with following guidelines:

  • At initialization phase, Pooler fills in the catalog table pgxc_nodes with initial value in global memory context with values found in postgresql.conf: pgxc_nodes has to be accessible from postmaster and child processes.
  • When a new slave is added with a new code ID, Pooler caches this new connection data on each Coordinator once pgxc_nodes has been updated on Coordinator associated with new node ID.
  • When a slave is dropped, Pooler information cached is updated also.
  • Pooler saves in shared memory information related to master nodes at cluster initialization.

Important:

Pooler only remains in charge in distributing connections. It does not have to know if connection is to a slave or a master. This is the reponsability of postgres child as it takes a row shared lock on pgxc_nodes when beginning a transaction on certain nodes.

Postmaster child process modification
  • A child returns an error to application in case it cannot read pgxc_nodes.
  • When a postmaster child determines the list of nodes for transaction, it needs to know if current transaction is read-only or write depending on SQL. Then node list is built from information in pgxc_nodes when requesting new connections.
  • When a child postmaster uses connection information of a slave/master when taking new connections, it takes a row shared lock on the associated tuples of pgxc_nodes where it took connections. This preserves catalog modification when running a transaction on those nodes.
    • If the transaction is read-only, connections to master/slave are both possible. Choice is made with round-robin.
    • If the transaction first requested read-only connections, but launches on the way a DML, new connections are requested from pooler to necessary masters.
    • If transaction was first write, then does read operations, keep going with connections to master.
©2010-2013 Michael Paquier All content is ©Copyright of Otacoo.com 2010-2013. Privacy Policy - Terms of Use