A new set of APIs for foreign data wrappers has been added to allow writable operations on foreign sources. This feature has been committed by Tom Lane a couple of days ago.
commit 21734d2fb896e0ecdddd3251caa72a3576e2d415
Author: Tom Lane
Date: Sun Mar 10 14:14:53 2013 -0400
 
Support writable foreign tables.
 
This patch adds the core-system infrastructure needed to support updates
on foreign tables, and extends contrib/postgres_fdw to allow updates
against remote Postgres servers. There's still a great deal of room for
improvement in optimization of remote updates, but at least there's basic
functionality there now.
 
KaiGai Kohei, reviewed by Alexander Korotkov and Laurenz Albe, and rather
heavily revised by Tom Lane.

Based on the documentation, the implementation is still very basic as nothing is done with clause shippability. Just to give some notions about that: roughly a clause in a SELECT query (LIMIT, OFFSET, GROUP BY, HAVING, ORDER BY, etc.) is shippable if this clause can be entirely evaluated on remote server, making less processing happening on local server, and reducing the tuple selectivity. A direct consequence of clause shippability limitation is that UPDATE and DELETE queries can take quite a long time if they are run on many rows because query is run in two steps:

  • Scan remote table and fetch back to local server the tuples to be manipulated
  • Process UPDATE or DELETE based on the tuples fetched

INSERT does not need such scan as in this case new data is simply sent to the remote table, the tuple values being computed before sending the query (even for immutable functions). Not really performant but it is the safest approach. Postgres-XC has similar and more advanced features for foreign DDL planning and execution in its core (some of them implemented by me), have a look for example at this article I wrote a while ago.

It is possible to test writable foreign tables with postgres_fdw as it has been extended to support this new feature. So let’s give it a try with two postgres servers using ports 5432 and 5433. Server with port 5432 has postgres_fdw installed and will interact with the remote server running under port 5433. In order to get the basics of postgres_fdw, you can refer to this article written a couple of weeks ago.

Now, it is time to test the feature. First let’s create a table on remote server.
$ psql -p 5433 -c "CREATE TABLE aa_remote (a int, b int)" postgres
CREATE TABLE

Then it is necessary to create a foreign table on the local server.
postgres=# CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', port '5433', dbname 'postgres');
CREATE SERVER
postgres=# CREATE USER MAPPING FOR PUBLIC SERVER postgres_server OPTIONS (password '');
CREATE USER MAPPING
postgres=# CREATE FOREIGN TABLE aa_foreign (a int, b int) SERVER postgres_server OPTIONS (table_name 'aa_remote');
CREATE FOREIGN TABLE

Then let’s test the new feature by performing some DML operations on the foreign table from local server.
postgres=# INSERT into aa_foreign values (1,2);
INSERT 0 1
postgres=# select * from aa_foreign;
a | b
---+---
1 | 2
(1 row)
postgres=# update aa_foreign set b = 3;
UPDATE 1
postgres=# select * from aa_foreign;
a | b
---+---
1 | 3
(1 row)

Everything is going well on local side, and on remote side what happened?
$ psql -p 5433 -c 'SELECT * FROM aa_remote' --dbname postgres
a | b
---+---
1 | 3
(1 row)

So the data of the remote table has been correctly changed from local server.

Just before the tests, I explained that a scan is done for UPDATE and DELETE before actually running the DML, you can get more details about that with EXPLAIN.
postgres=# explain verbose update aa_foreign set b = 3;
QUERY PLAN
-----------------------------------------------------------------------------------
Update on public.aa_foreign (cost=100.00..182.27 rows=2409 width=10)
 Remote SQL: UPDATE public.aa_remote SET b = $2 WHERE ctid = $1
 -> Foreign Scan on public.aa_foreign (cost=100.00..182.27 rows=2409 width=10)
  Output: a, 3, ctid
  Remote SQL: SELECT a, NULL, ctid FROM public.aa_remote FOR UPDATE
(5 rows)

In the case of postgres_fdw, selectivity of tuple is done with ctid of tuple, which ensures tuple uniqueness. Note that if you implement your own foreign data wrapper, you might need to use columns having primary keys for selectivity of tuples.

There are also a couple of things to be aware of when using this feature.

  • There are risk of data incompatibility for data formatted with GUC parameters. This has been mentionned in the community but try for example to manipulate servers with different settings of datesyle…
  • Transactions are open on remote server using repeatable read.
  • UPDATE and DELETE can be costly if scan is done with a good-old-fashioned sequential scan, but well that’s a known thing
  • Things I forgot…

This week, I spent a long time working on this commit. Just by looking at the date, commit happened before leaving for week-end :)
commit 8ef0c48acadec3c9888d302888a7d279d82323e5
Author: Michael P
Date: Fri Jan 13 16:05:00 2012 +0900
 
Improve target list selection for remote DML queries
 
This commit makes remote DML planning generally available
for replicated and hash tables. There are still issues
related to node selection for round robin tables though.
The target list of UPDATE and DELETE using coordinator quals
was set to fetch only CTID when generating SELECT in their
inner plan generated by create_remotequery_plan.
 
Their target list is rewritten to include the columns in quals
so as to be able to evaluate those quals correctly on Coordinator.
In addition remote planning for UPDATE has been improved to be
able to target correct node when launching query.
 
A new regression test called xc_remote is added, it uses the
parameter enable_fast_query_shipping to force all the queries
to go through standard planner. Tests are done on replicated,
hash and round robin tables.

In all the examples of this article, those two tables are used with the following cluster configuration of Postgres-XC cluster.
db=# select node_name, node_type from pgxc_node; -- 1 Coordinator, 2 Datanodes
node_name | node_type
-----------+-----------
coord1 | C
dn1 | D
dn2 | D
(3 rows)
db=# create table aa (a int, b timestamp) distribute by hash(a);
CREATE TABLE
db=# create table bb (a int, b timestamp) distribute by replication;
CREATE TABLE

For database clusters in general, it is essential to have an efficient and consistent way to manage queries on both local and remote nodes. Efficiency is important to reduce data load on the system. Consistency is even more important to avoid dirty data in your database. So, about queries in general, let’s use an example. SELECT queries may contain expressions that can be evaluated on remote nodes. A common example for that is when the expression is a constant.
db=# explain verbose select * from aa where a = 1;
QUERY PLAN
---------------------------------------------------
Data Node Scan (cost=0.00..0.00 rows=0 width=0)
  Output: a, b
  Node/s: dn1
  Remote query: SELECT a, b FROM aa WHERE (a = 1)
(4 rows)

In this case the query can be completely shipped to the remote node, returning correct results.

Expressions that cannot be pushed down are those who need to be evaluated on local nodes with all the necessary data fetched from remote nodes. For example, let’s take the replicated table bb. We want to select data on it with a time-based expression. Each node of the cluster (at least in the case of Postgres-XC) is located on a different server, each server having a different time line.So, is the following SQL shippable?
SELECT a from bb where b < now();
The answer is no. What is necessary to do is to get all the tuples (a,b) from table bb (a is necessary to send back result), and then apply the time based condition on all the results (explaining why b is necessary).
This results in the following plan.
db=# explain verbose select a from bb where b < now();
QUERY PLAN
-----------------------------------------------------------------
 Result (cost=0.00..1.01 rows=1000 width=4)
 Output: a
 -> Data Node Scan on bb (cost=0.00..1.01 rows=1000 width=4)
   Output: a, b
   Node/s: dn1
   Remote query: SELECT a, b FROM ONLY bb WHERE true
   Coordinator quals: (bb.b < now())
(7 rows)

Well, Postgres-XC has already a lot of mechanisms to manage SELECT and INSERT queries. But what was missing are the parts related to UPDATE and DELETE. So the new functionality committed this week allows to use complex expressions.
For example, in the case of update, you can run sequence and time based updates needing local node evaluation to run consistently.
db=# insert into bb values (1,now());
INSERT 0 1
db=# insert into bb values (2,now());
INSERT 0 1
db=# insert into bb values (3,now());
INSERT 0 1
db=# select * from bb;
a | b
---+---------------------------------
1 | Fri Jan 13 06:26:32.872665 2012
2 | Fri Jan 13 06:26:38.261489 2012
3 | Fri Jan 13 06:26:40.943182 2012
(3 rows)
db=# update bb set a = nextval('seq'), b = now();
UPDATE 3
db=# select * from bb;
a | b
---+---------------------------------
1 | Fri Jan 13 06:28:01.273496 2012
2 | Fri Jan 13 06:28:01.273496 2012
3 | Fri Jan 13 06:28:01.273496 2012
(3 rows)
db=# explain verbose update bb set a = nextval('seq'), b = now();
QUERY PLAN
-----------------------------------------------------------------------
 Update on public.bb (cost=0.00..11.01 rows=1000 width=6)
 Node/s: dn1, dn2
 Remote query: UPDATE public.bb SET a = $1, b = $2 WHERE ctid = $3
 -> Result (cost=0.00..11.01 rows=1000 width=6)
   Output: nextval('seq'::regclass), now(), ctid
   -> Data Node Scan on bb (cost=0.00..1.01 rows=1000 width=6)
      Output: ctid
      Node/s: dn1
      Remote query: SELECT ctid FROM ONLY bb WHERE true
(9 rows)

You need here to select all the data to be updated from remote nodes, then you have to apply the time base expression (now) and the sequence value (nextval), and finally push those values to dedicated remote nodes.

This works also with WHERE clauses using non-shippable expressions.
db=# explain verbose update bb set a = nextval('seq'), b = now() WHERE b < now();
QUERY PLAN
---------------------------------------------------------------------------------
 Update on public.bb (cost=0.00..11.02 rows=1000 width=14)
  Node/s: dn1, dn2
  Remote query: UPDATE public.bb SET a = $1, b = $2 WHERE b = $3 AND ctid = $4
  -> Result (cost=0.00..11.02 rows=1000 width=14)
    Output: nextval('seq'::regclass), now(), b, ctid
    -> Data Node Scan on bb (cost=0.00..1.01 rows=1000 width=14)
      Output: b, ctid
      Node/s: dn1
      Remote query: SELECT b, ctid FROM ONLY bb WHERE true
      Coordinator quals: (bb.b < now())
(10 rows)

Here what is added is a condition to pre-select a subset of rows. Such operation is costly though because you have to fetch all the rows of the table first in inner plan.

The same kind of crazy SQL are also possible for DELETE with mixing shippable and non-shippable expressions.
db=# explain verbose delete from bb where a = 2 and b < now();
QUERY PLAN
------------------------------------------------------------------------------
 Delete on public.bb (cost=0.00..1.02 rows=1000 width=18)
 Node/s: dn1, dn2
 Remote query: DELETE FROM public.bb WHERE a = $1 AND b = $2 AND ctid = $3
 -> Result (cost=0.00..1.02 rows=1000 width=18)
   Output: a, b, ctid
   -> Data Node Scan on bb (cost=0.00..1.02 rows=1000 width=18)
     Output: a, b, ctid
     Node/s: dn1
     Remote query: SELECT a, b, ctid FROM ONLY bb WHERE (a = 2)
     Coordinator quals: (bb.b < now())
(10 rows)

You can notice here that the constant expression "a = 2" is shipped in the most inner plan, improving query efficiency by that much.
A lot of things are now possible, and all this stuff will be included in release 0.9.7!

©2010-2013 Michael Paquier All content is ©Copyright of Otacoo.com 2010-2013. Privacy Policy - Terms of Use