A replication and clustering event for PostgreSQL was held in Paris on February 2, 2012. I had the chance to be invited to give a presentation about Postgres-XC, and here is a summary of what happened there.
First a couple of words about the place, called “Le comptoir general”, which associations use to organize events. The place is decorated in an old-fashioned way that makes it look like, well, a kind of voodoo or sorcery den. Have a look at the pictures to get an idea of what it is like.
The conference, organized by Dalibo, a French company specialized in PostgreSQL consulting, was free to attend, with the number of seats limited to 70-80.
The presentation about Postgres-XC went well; you can find the presentation document here. Initially planned to last 45-50 minutes including questions, it finally kept me on stage for 1.5 hours. The audience was great: they asked a lot of pertinent questions and made me feel they understood what Postgres-XC is and what the project is aiming for as open-source software. Here is a summary of the main questions asked.
- Could Postgres-XC be used as an alternative to Oracle RAC? Yes, definitely.
- What does “transparent transaction management” mean? It is a way to manage transactions globally so that an application sees the database cluster as if it were a single instance. This transparent management uses an MVCC-based method in which transaction IDs and snapshots are fed consistently to all the cluster nodes.
- Is it possible to use normal Postgres instead of a Datanode? No, there are corner cases, like autovacuum, where a Datanode needs a global snapshot.
- Can we set up a Coordinator and a Datanode on the same server? You can set up as many nodes as you wish, of whatever types you wish; you just need to avoid port conflicts.
- Is it possible to change the table definition on a single node? No, you need to have a consistent table definition on all the nodes. You can however manage users and roles as in normal Postgres, and there is also a pooler protocol that separates connections by user and database to avoid inter-user conflicts on connections. So why not change the access to a column for one special user or group of users globally?
- Like RAC, if a read query fails because of a node failure, do you have a functionality to target another node that is still alive inside the same transaction? There is no such functionality yet, but once we have slave node management in the code, this is a functionality we want to add. So definitely yes.
- Can we participate in the project? Like PostgreSQL, Postgres-XC is community-based. All contributions, even minimal ones, are warmly welcome.
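To make the “transparent transaction management” answer above a little more concrete, here is a minimal Python sketch of snapshot-based visibility. It is an illustration only, not Postgres-XC code: the `GlobalTxnManager` class, its methods and the simplified visibility rule are hypothetical names and assumptions. The point is that, because every node receives transaction IDs and snapshots from the same source, every node gives the same answer about which rows a transaction may see.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    xmax: int           # first transaction ID not yet assigned at snapshot time
    running: frozenset  # IDs of transactions still in progress at snapshot time


class GlobalTxnManager:
    """Hypothetical single source of transaction IDs and snapshots for the cluster."""

    def __init__(self):
        self.next_xid = 1
        self.running = set()
        self.committed = set()

    def begin(self):
        xid = self.next_xid
        self.next_xid += 1
        self.running.add(xid)
        return xid

    def commit(self, xid):
        self.running.discard(xid)
        self.committed.add(xid)

    def snapshot(self):
        return Snapshot(xmax=self.next_xid, running=frozenset(self.running))


def is_visible(row_xid, snap, committed):
    """Simplified MVCC rule (an assumption for this sketch): a row is visible
    if its creating transaction committed and had already finished when the
    snapshot was taken."""
    return (
        row_xid in committed
        and row_xid < snap.xmax
        and row_xid not in snap.running
    )


gtm = GlobalTxnManager()
t1 = gtm.begin()       # writes a row on node A
t2 = gtm.begin()       # writes a row on node B, still running
gtm.commit(t1)
snap = gtm.snapshot()  # the same snapshot is handed to every node
gtm.commit(t2)         # commits only after the snapshot was taken

visible_on_a = is_visible(t1, snap, gtm.committed)  # row from t1
visible_on_b = is_visible(t2, snap, gtm.committed)  # row from t2
```

With the same snapshot on both nodes, the row written by `t1` is visible and the row written by `t2` is not, whichever node is asked.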
There were a couple of other presentations at the conference, about Londiste, Slony and streaming replication usage. However, what caught my attention was the presentation by Simon Riggs about the future of replication features in PostgreSQL. In order to make PostgreSQL multi-master, his idea is to implement a replication type called circular replication, a feature that already exists in MySQL and Oracle. What is circular replication? Have a look at the picture below.
As the name suggests, such a cluster configuration has its master database instances placed in a circle. The masters can be located at several places in the world. If an update occurs on one of the masters, it is sent circularly to the next master, and so on until the update circle completes. If a master fails, you simply skip it for the time being and move forward to the next node in the circle. In the diagram, masters are noted as M and slaves as S (asynchronous/synchronous standby nodes, cascades of nodes…). It takes a certain time for an update to become available on all the nodes, which is why this is a kind of asynchronous multi-master configuration.
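The propagation described above can be sketched in a few lines of Python. This is only a toy model under my assumptions, not how any real implementation works: the `Master` class, the `replicate` function and the site names are made up for illustration, and the sketch ignores conflict handling and how a failed master catches up once it comes back.

```python
class Master:
    """Hypothetical master node holding a simple key/value store."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}


def replicate(ring, origin, key, value):
    """Apply an update on ring[origin], then pass it to each following
    master in turn until the circle is back at the origin.
    Failed masters are skipped. Returns the names that applied the update."""
    applied = []
    n = len(ring)
    for step in range(n):
        node = ring[(origin + step) % n]
        if not node.alive:
            continue  # skip the failed master for the time being
        node.data[key] = value
        applied.append(node.name)
    return applied


ring = [Master("paris"), Master("tokyo"), Master("new_york"), Master("sydney")]
ring[2].alive = False  # simulate a failed master

# The update starts on "tokyo" and travels around the circle.
order = replicate(ring, origin=1, key="user:42", value="hello")
```

Here `order` ends up as `["tokyo", "sydney", "paris"]`: the failed master is skipped and does not receive the update until some recovery step, which the sketch leaves out.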
Circular replication clearly targets applications performing non-critical operations. It is hard to imagine running billing or ordering systems, which cannot tolerate any transaction loss, on this structure. However, it could be used for social networking and blogging networks, where a lot of reads occur and users do not really care if an update is delayed. Of course, some issues come up easily with such configurations. A simple one is what happens to sequences: the delay between masters makes it hard to use tables with serial columns or global frameworks. Still, it is great to see that the PostgreSQL community keeps moving forward and implementing more and more things that will make it an even more advanced database.
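The sequence problem mentioned above can be shown with a small Python sketch. Two masters that each hand out the next serial value on their own, before the other's inserts have arrived, produce the same ID. One common workaround, which was not part of the talk and is shown here only as an assumption, is to give each master an interleaved range: a different start value and an increment equal to the number of masters. The `LocalSequence` class is hypothetical.

```python
class LocalSequence:
    """Hypothetical per-master sequence, advanced without consulting other masters."""

    def __init__(self, start=1, increment=1):
        self.value = start
        self.increment = increment

    def nextval(self):
        current = self.value
        self.value += self.increment
        return current


# Naive setup: both masters start the same sequence at 1.
m1, m2 = LocalSequence(), LocalSequence()
naive_ids = [m1.nextval(), m2.nextval(), m1.nextval(), m2.nextval()]
has_collision = len(set(naive_ids)) < len(naive_ids)

# Interleaved setup for 2 masters: master 1 yields 1, 3, 5...; master 2 yields 2, 4, 6...
n_masters = 2
m1 = LocalSequence(start=1, increment=n_masters)
m2 = LocalSequence(start=2, increment=n_masters)
interleaved_ids = [m1.nextval(), m2.nextval(), m1.nextval(), m2.nextval()]
```

In the naive setup both masters generate 1 and then 2, so the IDs collide once the rows are exchanged. With interleaved ranges the values stay unique, but they no longer reflect a global insertion order, and adding a master means re-planning the ranges.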