A PostgreSQL conference took place on the 24th of February in Shinagawa, Tokyo, organized by JPUG (the Japanese PostgreSQL users’ group). You can go to this page, where all the presentation materials are available. Most of the presentations were in Japanese, but the following ones were in English (links to the materials are provided where possible):

  • How a large organisation moved its critical application toward PostgreSQL, by Philippe Beaudoin, special guest of the event.
  • An overview of PostgreSQL 9.2, by Robert Haas
  • Postgres-XC, toward 1.0, well if you are on this blog you might already know who did it and the content of this material

As a general summary of the event, I was really surprised by the number of participants: 250 people came from Tokyo and even farther away. This gave the bad impression that the organizers did not manage the event very well, because for each presentation the rooms were completely crowded and there were always people standing. Still, it is good to see that PostgreSQL has so much success in Japan.

I took part in this event both as translator for Philippe (French -> Japanese) and as a presenter of Postgres-XC so, to be honest, it was a pretty busy day. I don’t really know if my translation was good, but at least I got good feedback from the audience. As a first experience, it was a nice one.

So, a couple of words about the presentations I saw at the conference. As the official translator of the first keynote, I had some time to understand Philippe’s presentation, and I believe it is a really great example of a PostgreSQL success story. The migration project lasted 18 months, with a team of roughly 10 engineers. So when you do such a migration, what are the points you should really care about? Here is what I understood from this presentation:

  • Study in depth which modifications to the table structures are necessary to make the migration go without problems. Postgres supports a lot of types, but still, you never know.
  • Build a prototype to limit the risk when performing the migration.
  • Run extensive, long acceptance tests. PostgreSQL is robust and is famous for that, so you should worry more about the interface you put in place for the migration and the new interface between Postgres and the old frontend application.
  • Tests, tests, and tests… And more tests. It is essential to build confidence by overdoing tests.
  • Do not underestimate the impact of the migration on external tools: monitoring, batch applications or query modifications.

This was really a productive presentation.

Then there was Robert’s presentation about all the new features of 9.2. To be honest, this is going to be a performance release. Robert has worked a lot on improving performance on multi-core machines, and to illustrate this he showed a couple of graphs with pgbench results. Someone in the audience promised him access to a 64-core machine to run tests on more powerful hardware. There was nothing really surprising in this presentation: people following the hackers mailing list or the commits in Git are already up to date on the subject. However, here is a short list of the new 9.2 features presented by Robert:

  • Scalability performance
  • JSON type available in 9.2: basic support only, and there are still bugs in it, but it is nice to have
  • Index only scans
  • Cascading replication
  • Reduction of power consumption (nice for hosting services)

This post is getting long, but here is some feedback about the presentation I gave on Postgres-XC. I got the feeling that people are expecting a lot from the project (too much??). The audience was very enthusiastic about the technology presented and few people slept this time :) . This was a very general presentation showing the policy we try to respect for the 1.0 release. Here is a list of the questions I got. Most of them were about failure handling and HA, with nothing really on performance or features:

  • What to do if a 2PC finishes in an inconsistent state in the cluster, for example when a node fails during 2PC? You need to clean up the 2PC information: force a commit for transactions that are partially prepared and partially committed, abort the transactions that are partially prepared and partially aborted, and commit the transactions that are prepared everywhere. If you have transactions with a mix of abort/commit/prepare statuses in your cluster => use PITR and fall back.
  • A Datanode is a SPOF (single point of failure), how to fix that? You can use the streaming replication built into Postgres. The current code of XC is based on Postgres 9.1.
  • And for GTM? There is a GTM-Standby feature for this purpose.
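
To make the 2PC cleanup rule from the first answer more concrete, here is a minimal sketch in Python. It is only an illustration of the decision rule as I described it above, not code from Postgres-XC: the names `NodeState` and `resolve_2pc` are made up for this example, and it assumes you have already collected the state of one transaction on every node involved.

```python
from enum import Enum


class NodeState(Enum):
    """State of one transaction on one node (hypothetical names)."""
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


def resolve_2pc(states):
    """Return the cleanup action for one transaction.

    `states` is the list of states reported by the nodes involved.
    The returned strings are labels for this sketch only.
    """
    seen = set(states)
    if not seen:
        raise ValueError("no node reported a state for this transaction")
    # Committed on some nodes and aborted on others: the cluster is
    # inconsistent and cannot be repaired by finishing the 2PC.
    if NodeState.COMMITTED in seen and NodeState.ABORTED in seen:
        return "restore"  # use PITR and fall back
    # Already finished the same way everywhere: nothing to clean up.
    if len(seen) == 1 and NodeState.PREPARED not in seen:
        return "nothing"
    # Partially prepared, partially aborted: abort the remaining nodes.
    if NodeState.ABORTED in seen:
        return "abort"
    # Partially committed, or prepared everywhere: commit.
    return "commit"
```

For example, `resolve_2pc([NodeState.PREPARED, NodeState.COMMITTED])` gives `"commit"`, while a mix of committed and aborted nodes gives `"restore"`.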

That was indeed a nice event. A lot of people participated, and the organizers are thinking about doing it with more people next year (300~350 perhaps), as more and more people in Japan are orienting their business toward open source database solutions (take that, Or**le!), and PostgreSQL is the world’s most advanced open source database, no?

Edit: For those of you wondering about the rest of the conference, this post will be completed by a second one presenting 2 high-availability technologies designed in Japan. This report was too long for a single post.

The PostgreSQL conference took place on the 19th and 20th of May.
During those 2 days, I saw some good presentations that will (perhaps) help increase my own PostgreSQL database knowledge.
What I heard at this conference will certainly help me in my current work, and it could also give me ideas for future design tasks.

Luckily, I never had to choose between two presentations happening at the same time, even though the conference ran on 3 tracks. So let’s go through each presentation I had the chance to see, in chronological order. In this post I don’t give my impressions about everything I saw, only about the main points I consider important, to make your reading easier.

On the 19th I first saw a presentation about sharding for unlimited growth, given by Robert Treat. Sharding is a technique that lets large database systems (millions of operations, users) grow through horizontal scaling (scaling out: increasing the number of database servers rather than the local resources of one server). The idea behind it is to let a database grow without losing its scalability, and to resolve SPOF (single point of failure) problems within a database system. For this purpose a couple of solutions were proposed, based on data mapping or on the division of application data into various databases located on multiple nodes (for example, a website application may have its user data and forum data on separate nodes). All these ideas assume that the application takes care of the data mapping, so the database has nothing to do but deal with data. So I would say that sharding is a layer on top of the database, focused on optimizing the applications running on top of the database node(s).
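
To give an idea of what “the application takes care of the data mapping” can look like, here is a minimal sketch in Python. It is my own illustration, not something shown in the presentation: the node names, the `FUNCTIONAL_MAP` table and the `shard_for` function are all hypothetical. It combines the two approaches mentioned above: application data is first divided by kind (users, forum), then a key is mapped to one node with a stable hash.

```python
import hashlib

# Hypothetical layout: each kind of application data lives on its own
# group of nodes (division of application data by function).
FUNCTIONAL_MAP = {
    "users": ["db-users-1", "db-users-2"],
    "forum": ["db-forum-1"],
}


def shard_for(domain, key):
    """Return the node that holds `key` for the given kind of data."""
    nodes = FUNCTIONAL_MAP[domain]
    # A cryptographic digest gives the same result on every application
    # server, unlike Python's built-in hash() which is salted per process.
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]
```

The application calls `shard_for("users", user_id)` before opening its connection, so the database nodes themselves never need to know that they are part of a sharded setup. Note that a plain modulo mapping like this one moves most keys when a node is added; that is a limitation of the sketch.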

Then there was an interesting presentation about patch review by Stephen Frost. The goal of this presentation was to teach the audience about all the tools and formats used within and for PostgreSQL. It covered useful things such as whom you should contact when you want to send a patch, or when you want to help review one. This presentation also talked about the rules followed in PostgreSQL regarding code refactoring, code quality and code duplication. If a project has no such organization, it can certainly become a mess quickly, so I personally keep a good impression of it.

The first day was full of surprises: another 2 presentations caught my attention, one about foreign data wrappers and another by Tom Lane, “How to hack the planner”.
Being always focused on what I do for Postgres-XC, I am not very familiar with the functionalities introduced since 9.0. So it was a pleasure to find a presentation that introduced the foreign data wrapper functionality, together with some additional work a Japanese team is developing based on the feature of 9.0. A foreign data wrapper enables a Postgres server to interact with a remote database or remote data files and to show their content in a nice way in your PostgreSQL instance. For instance, you can display CSV files stored somewhere directly in a psql terminal. The presentation by Yotaro Nakayama also showed a couple of additional features for foreign data wrappers: the capacity to interact with other database systems and not only Postgres instances. His team has developed some extra features to create foreign tables that point to Oracle or MySQL instances. This consists more or less of taking into account the specificities of each database software and translating them the Postgres way. Fascinating. My impression was that the development was at a fairly advanced stage, but Nakayama’s team had no plans to publicly release the work done :( .

At the end of the day came a presentation about the PostgreSQL planner. It was certainly the most difficult presentation to follow, not only because of the level of understanding needed to follow what was discussed, but also because of the quantity of information covered. So in two words this presentation can be described as qualitative and quantitative. The planner is perhaps the hardest part of the PostgreSQL code in terms of complexity, so giving a presentation about it is even more complex. The presentation began with some general explanations about Postgres’ parser/rewriter/planner/executor, but after a couple of minutes came the main dish, and the audience became aware of how complicated the planner is, not because of its general way of working, but because of all the cases that have to be handled in the most general way possible in its implementation. However, some cases, such as the analysis of JOIN planning, made things easier to understand. Some general explanations about the key structures also came at the right time to shed light on the basics of planning. The part that caught my attention the most was the cost estimation of queries, and particularly the fact that a cost estimation can sometimes lead to a cost higher than expected (the case of LIMIT). To conclude, there are still areas of improvement in the planner, and Postgres is in need of people who could work on it.

On the second day, one presentation in particular caught my attention. PostgreSQL 9.1 introduces SSI, serializable snapshot isolation. One result in particular is amazing… Let me say more about that. In serializable transactions, you have to take care of cycles of transactions caused by their read/write conflicts. For example, imagine a transaction performing a read on a tuple being written (by DML: UPDATE, DELETE, INSERT) by another transaction. You need to check whether the transaction performing the write also performs a read on a tuple being modified by a third transaction… and so on, until you know that no transaction in the chain reads something that has been modified by the first transaction. If you have a read/write conflict cycle, you need to abort one transaction to break the cycle and save all the other transactions from a serialization anomaly. However, in order to check that, you have to go through all the transactions that could be part of the cycle, which is really resource consuming. The idea that caught my attention is that you do not need to check the whole cycle of transactions. You just need to check that the transaction you are on does not have both an in and an out read/write conflict at the same time. An in read/write conflict means that another concurrent transaction has read something that your transaction writes. An out read/write conflict means that your transaction has read something that another concurrent transaction writes. If your transaction has both an in and an out read/write conflict, you need to abort something on the cycle it may be part of. However, if such a check is made on each transaction, doesn’t it increase the number of transactions being aborted, as there could be transactions in a half-cycle that is never closed, which would not need to be aborted but would be aborted anyway to satisfy the SSI check?
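
Here is a minimal Python sketch of the local check described above, to show why it is so much cheaper than walking a whole cycle. It is a simplified illustration and not PostgreSQL’s implementation, which takes more conditions into account. The class `ConflictTracker` and its methods are names I made up, and the convention used is that each read/write conflict is an edge going from the reader to the writer.

```python
class ConflictTracker:
    """Tracks read/write conflicts between concurrent transactions."""

    def __init__(self):
        # For each transaction, the readers of what it writes (edges in)
        # and the writers of what it read (edges out).
        self.in_conflicts = {}
        self.out_conflicts = {}

    def record_rw_conflict(self, reader, writer):
        """Record that `reader` read something that `writer` modifies."""
        if reader == writer:
            return  # a transaction does not conflict with itself
        self.out_conflicts.setdefault(reader, set()).add(writer)
        self.in_conflicts.setdefault(writer, set()).add(reader)

    def is_pivot(self, txn):
        """True if `txn` has both an edge in and an edge out.

        This only looks at the transaction itself, never at the rest of
        the graph, so no cycle search is needed.
        """
        return bool(self.in_conflicts.get(txn)) and bool(self.out_conflicts.get(txn))
```

With the conflicts T1 → T2 and T2 → T3 recorded, `is_pivot("T2")` is true even though T3 never conflicts back with T1, so no cycle actually exists. This is exactly the kind of half-cycle my question is about: the check is cheap, but it can flag a transaction that was not strictly required to abort.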

©2010-2013 Michael Paquier. All content is ©Copyright of Otacoo.com 2010-2013.