A new set of APIs for foreign data wrappers has been added to allow writable operations on foreign sources. This feature has been committed by Tom Lane a couple of days ago.
commit 21734d2fb896e0ecdddd3251caa72a3576e2d415
Author: Tom Lane
Date: Sun Mar 10 14:14:53 2013 -0400
 
Support writable foreign tables.
 
This patch adds the core-system infrastructure needed to support updates
on foreign tables, and extends contrib/postgres_fdw to allow updates
against remote Postgres servers. There's still a great deal of room for
improvement in optimization of remote updates, but at least there's basic
functionality there now.
 
KaiGai Kohei, reviewed by Alexander Korotkov and Laurenz Albe, and rather
heavily revised by Tom Lane.

Based on the documentation, the implementation is still very basic as nothing is done with clause shippability. Just to give some notions about that: roughly a clause in a SELECT query (LIMIT, OFFSET, GROUP BY, HAVING, ORDER BY, etc.) is shippable if this clause can be entirely evaluated on remote server, making less processing happening on local server, and reducing the tuple selectivity. A direct consequence of clause shippability limitation is that UPDATE and DELETE queries can take quite a long time if they are run on many rows because query is run in two steps:

  • Scan remote table and fetch back to local server the tuples to be manipulated
  • Process UPDATE or DELETE based on the tuples fetched

INSERT does not need such scan as in this case new data is simply sent to the remote table, the tuple values being computed before sending the query (even for immutable functions). Not really performant but it is the safest approach. Postgres-XC has similar and more advanced features for foreign DDL planning and execution in its core (some of them implemented by me), have a look for example at this article I wrote a while ago.

It is possible to test writable foreign tables with postgres_fdw as it has been extended to support this new feature. So let’s give it a try with two postgres servers using ports 5432 and 5433. Server with port 5432 has postgres_fdw installed and will interact with the remote server running under port 5433. In order to get the basics of postgres_fdw, you can refer to this article written a couple of weeks ago.

Now, it is time to test the feature. First let’s create a table on remote server.
$ psql -p 5433 -c "CREATE TABLE aa_remote (a int, b int)" postgres
CREATE TABLE

Then it is necessary to create a foreign table on the local server.
postgres=# CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', port '5433', dbname 'postgres');
CREATE SERVER
postgres=# CREATE USER MAPPING FOR PUBLIC SERVER postgres_server OPTIONS (password '');
CREATE USER MAPPING
postgres=# CREATE FOREIGN TABLE aa_foreign (a int, b int) SERVER postgres_server OPTIONS (table_name 'aa_remote');
CREATE FOREIGN TABLE

Then let’s test the new feature by performing some DML operations on the foreign table from local server.
postgres=# INSERT into aa_foreign values (1,2);
INSERT 0 1
postgres=# select * from aa_foreign;
a | b
---+---
1 | 2
(1 row)
postgres=# update aa_foreign set b = 3;
UPDATE 1
postgres=# select * from aa_foreign;
a | b
---+---
1 | 3
(1 row)

Everything is going well on local side, and on remote side what happened?
$ psql -p 5433 -c 'SELECT * FROM aa_remote' --dbname postgres
a | b
---+---
1 | 3
(1 row)

So the data of the remote table has been correctly changed from local server.

Just before the tests, I explained that a scan is done for UPDATE and DELETE before actually running the DML, you can get more details about that with EXPLAIN.
postgres=# explain verbose update aa_foreign set b = 3;
QUERY PLAN
-----------------------------------------------------------------------------------
Update on public.aa_foreign (cost=100.00..182.27 rows=2409 width=10)
 Remote SQL: UPDATE public.aa_remote SET b = $2 WHERE ctid = $1
 -> Foreign Scan on public.aa_foreign (cost=100.00..182.27 rows=2409 width=10)
  Output: a, 3, ctid
  Remote SQL: SELECT a, NULL, ctid FROM public.aa_remote FOR UPDATE
(5 rows)

In the case of postgres_fdw, selectivity of tuple is done with ctid of tuple, which ensures tuple uniqueness. Note that if you implement your own foreign data wrapper, you might need to use columns having primary keys for selectivity of tuples.

There are also a couple of things to be aware of when using this feature.

  • There are risk of data incompatibility for data formatted with GUC parameters. This has been mentionned in the community but try for example to manipulate servers with different settings of datesyle…
  • Transactions are open on remote server using repeatable read.
  • UPDATE and DELETE can be costly if scan is done with a good-old-fashioned sequential scan, but well that’s a known thing
  • Things I forgot…

Postgres-XC, write-scalable multi-master symetric cluster based on PostgreSQL, version XC 1.0beta1 has been released.
This beta version is based on PostgreSQL 9.1beta2, and all the fixes of PostgreSQL 9.1 stable branch will be backported in Postgres-XC 1.0 stabilized version.
It is under PostgreSQL license.

You can download the binary directly from here.
This tarball includes all the HTML and man documentation.

A PDF file containing all the references is also available here: Reference PDF.

Compared to version 0.9.7, the following features have been added:

  • Fast query shipping (FQS), quick identification of expressions in a query that can be pushed down to remote nodes
  • SERIAL types
  • TABLESPACE
  • Utility to clean up 2PC transactions in cluster (pgxc_clean)
  • Utility for initialization of GTM (global transaction manager, utility called initgtm)
  • Relation-size functions
  • Regression stabilization

The project can be followed on Source Forge.
And we use a couple of GIT repositories for development:

Postgres-XC tutorial at PGCon in Ottawa this May will use a 1.0 version, so be sure to touch this beta version to have an idea of what is Postgres-XC before coming!
Since the last release, a special effort has been made on stabilization and performance improvement, but be sure to give your feedback in order to provide a stable 1.0 release for PostgreSQL community.

When using a database cluster where nodes are completely dependent on the network behavior like it is the case for shared-nothing based architecture, an essential step in integrating a new application is its design to maximize the performance and enhance the cluster strengths knowing its capabilities. The problem with shared-nothing architectures is that queries that may run very quickly in a single instance environment might take a bunch of time in your cluster, especially for queries having aggregates that need global results for all the nodes in the cluster or queries needing subsequent results from internal sub-queries. The nightmare of most of database developers trying to optimize an application is always to face queries of the type “SELECT *”, fetching all the tuples of a table in a single shot. Such queries are already costly for single database instances, but just don’t imagine how much you might load your application when fetching all the tuples of your table with additional node layers forcing your database to fetch results from multiple nodes.
I would say by experience that there are three guidelines when customizing an application for a cluster:

  • Choose the number of nodes in cluster and servers where they are located to minimize overall network load.
  • Customize queries such as they request results from a minimal number of nodes.
  • Design the data distribution strategy of your tables to minimize the results to be fetched to a single place for join conditions.

After this short digression, the aim of this post is not to discuss deeply about such conditions, it is to enlighten a feature that your application needs to target when you design its database. Well, the name of this feature is in the title, called “Fast Query Shipping”. It is not really a feature in itself, more a goal applications should try to reach as much as possible to improve their performance on database cluster softwares. Fast query shipping (FQS) is the ability for a cluster to evaluate if a query can be completely shipped to a remote node in the cluster, making it a simple send and receive, minimizing the plan cost on the node planning the query and the data transfer cost because the data fetched from remote node with such a query is minimized to exactly what the application targets.

To be honest, googling “Fast Query shipping” does not bring any result except on Postgres-XC. The basic implementation of this feature has been done with the following commit.
commit 191d55ebf1faf897aed51f1b5fdcd71ec3ccdc6c
Author: Ashutosh Bapat
Date: Thu Feb 2 16:59:04 2012 +0530
 
Add the support for Fast Query Shipping (FQS), a method to identify
whether a query can be sent to the datanode/s as it is for evaluation and do so
if deemed fit. In such cases, we create a plan with a single RemoteQuery node
corresponding to the query and avoid the planning phase on coordinator.
 
A query tree walker analyses all the nodes in the query tree and finds out the
conditions under which the query is shippable and detects presence of
expressions which can not be evaluated on the datanode. It looks at the
relations involved in the query and deducts whether JOINs between these
relations can be evaluated on a single datanode.
 
Adds testcases xc_FQS and xc_FQS_join to test the fast query shipping
functionality and make it independent of cluster configuration.

So, in this case, an extension of PostgreSQL planner has been done exclusively for Postgres-XC to evaluate if a query is entirely shippable to its dedicated remote node. This planning step determines the list of target nodes where to launch the query. The query can be basically shipped as depending on a lot of conditions like analysis of clauses, but basically you cannot ship a query if it contains expressions that cannot be evaluated on a remote node. A simple expression following that is the next value of a sequence, or timestamps. In a more general way it is a volatile or stable functions. There are also other expressions that cannot be shipped like window functions, GROUP BY clauses, aggregates, etc. Sometimes you may be able to ship entirely a query having an aggregate function, but targeting a single query. Well, there are a lot of cases possible, and you might look at the code in details if you are interested about each corner case.

Let’s have a look at some simple test cases with replicated and hash tables:
postgres=# create table rep (a int) distribute by replication;
CREATE TABLE
postgres=# create table hash (a int) distribute by hash(a);
CREATE TABLE

For a table replicated on all nodes, shipping results from a single node is sufficient for the simple “select *”. For a distribute table, all the nodes are targetted and results are sent back as such. This is in such configuration that you will get the most efficient queries running in a cluster environment.
postgres=# explain select * from rep;
QUERY PLAN
----------------------------------------------------------------------------
Data Node Scan on "__REMOTE_FQS_QUERY__" (cost=0.00..0.00 rows=0 width=0)
Node/s: dn1
(2 rows)
postgres=# explain select * from hash;
QUERY PLAN
----------------------------------------------------------------------------
Data Node Scan on "__REMOTE_FQS_QUERY__" (cost=0.00..0.00 rows=0 width=0)
Node/s: dn1, dn2
(2 rows)

A session parameter called enable_fast_query_shipping is available to set switch this feature to on/off. Let’s see what happens.
postgres=# SET enable_fast_query_shipping TO false;
SET
postgres=# explain select * from hash;
QUERY PLAN
-------------------------------------------------------------------
Result (cost=0.00..1.01 rows=1000 width=4)
-> Data Node Scan on hash (cost=0.00..1.01 rows=1000 width=4)
Node/s: dn1, dn2
(3 rows)
postgres=# explain select * from rep;
QUERY PLAN
------------------------------------------------------------------
Result (cost=0.00..1.01 rows=1000 width=4)
-> Data Node Scan on rep (cost=0.00..1.01 rows=1000 width=4)
Node/s: dn1
(3 rows)

Here what happens is that the query is not using any FQS features, so what happens is that you do not directly fetch the results from the node, but you also materialize them on the Coordinator where query is launched before sending them back to client.

This was a small introduction of the fast query shipping feature, just do not forget to test it.

The release of Postgres-XC 0.9.7 has been done in Source Forge and in GIT.
Source forge page is here.
The source tarball and manuals are available here.

Postgres-XC 0.9.7, based on PostgreSQL 9.1, contains a bunch of new features:

  • SELECT INTO/CREATE TABLE AS
  • INSERT SELECT complete support
  • Node subsets and sub clusters (data of table distributed on a portion of nodes)
  • Cluster node management with SQL and catalog extensions
  • Dynamic reload of pool connection information
  • Window functions
  • DML remote planning (partial expression push-down)
  • DEFAULT values (non-shippable expressions: volatile functions, etc.) => Use of DEFAULT nextval(‘seq’) available.
  • GTM Standby synchronous, asynchronous recovery

Compared to Postgres-XC 0.9.6, in 0.9.7 the cluster setting is mainly SQL-based and does not use GUC params anymore. So be sure to read the Install manual before creating a cluster.
HTML documentation and man pages are also incorporated in the downloadable tarball, and the reference PDF document is built from the same SGML source.

The code is also available on GIT:
git://postgres-xc.git.sourceforge.net/gitroot/postgres-xc/postgres-xc
A new maintenance branch called XC0_9_7_PG9_1 only for 0.9.7.

This week, a great feature has been added by commit 8a05756, completed by commit caf1554 in Postgres-XC GIT repository.
commit 8a05756a702051d55a35ec3f4953f381f977b53a
Author: Pavan Deolasee Date: Wed Dec 14 09:35:53 2011 +0530
 
Implement support for CREATE TABLE AS, SELECT INTO and INSERT INTO
statements. We start by fixing the INSERT INTO support. For every result
relation, we now build a corresponding RemoteQuery node so that the
inserts can be carried out at the remote datanodes. Subsequently, at
the coordinator at execution time, instead of inserting the resulting tuples
in a local heap, we invoke remote execution and insert the rows in the
remote datanodes. This works nicely even for prepared queries, multiple
values clause for insert as well as any other mechanism of generating
tuples.
 
We use this infrastructure to then support CREATE TABLE AS SELECT (CTAS).
The query is transformed into a CREATE TABLE statement followed by
INSERT INTO statement and then run through normal planning/execution.
 
There are many regression cases that need fixing because these statements
now work correctly. This patch fixes many of them. Few might still be
failing, but they seem unrelated to the work itself and might be a
side-effect. We will fix them once this patch gets in.

Simply, this is the support for CREATE TABLE AS and SELECT INTO. All the possible combinations of INSERT SELECT are also possible whatever the type of table used.

Let’s see through a couple of examples with this cluster of 1 Coordinator and 4 Datanodes.
postgres=# select oid,node_name,node_type from pgxc_node;
oid | node_name | node_type
-------+-----------+-----------
11133 | coord1 | C
16384 | dn1 | D
16385 | dn2 | D
16386 | dn3 | D
16387 | dn4 | D
(5 rows

Let’s create a table and populate it with some data.
postgres=# create table a as select generate_series(1,100);
INSERT 0 100
postgres=# select count(*) from a;
count
-------
100
(1 row)

The data is distributed through the cluster of the 4 Datanodes.
postgres=# execute direct on node dn4 'select count(*) from a';
count
-------
27
(1 row)
postgres=# execute direct on node dn3 'select count(*) from a';
count
-------
19
(1 row)
postgres=# execute direct on node dn2 'select count(*) from a';
count
-------
31
(1 row)
postgres=# execute direct on node dn1 'select count(*) from a';
count
-------
23
(1 row)

CREATE TABLE AS is not only limited to global tables, you can define a distribution type, a subset of nodes, and of course the table can be unlogged or temporary. Here the table is distributed by round robin on datanodes dn1 and dn2.
postgres=# create table c distribute by round robin to node dn1,dn2 as select * from b;
INSERT 0 100
postgres=# execute direct on node dn1 'select count(*) from c';
count
-------
50
(1 row)
postgres=# execute direct on node dn2 'select count(*) from c';
count
-------
50
(1 row)
postgres=# execute direct on node dn3 'select count(*) from c';
count
-------
0
(1 row)
postgres=# execute direct on node dn4 'select count(*) from c';
count
-------
0
(1 row)

However, SELECT INTO does not have any extension for distribution type and node subsets. The reason for that is because SELECT INTO is by default a SELECT query, CREATE TABLE AS is a DDL. So in this case table created is distributed by hash on all the nodes.
postgres=# select * into d from b;
INSERT 0 100
postgres=# select pclocatortype,nodeoids from pgxc_class where pcrelid = 'd'::regclass;
-[ RECORD 1 ]-+------------------------
pclocatortype | H
nodeoids | 16384 16385 16386 16387

Yeah, that rocks.

©2010-2013 Michael Paquier All content is ©Copyright of Otacoo.com 2010-2013. Privacy Policy - Terms of Use