A new set of APIs for foreign data wrappers has been added to allow writable operations on foreign sources. This feature has been committed by Tom Lane a couple of days ago.
commit 21734d2fb896e0ecdddd3251caa72a3576e2d415
Author: Tom Lane
Date: Sun Mar 10 14:14:53 2013 -0400
 
Support writable foreign tables.
 
This patch adds the core-system infrastructure needed to support updates
on foreign tables, and extends contrib/postgres_fdw to allow updates
against remote Postgres servers. There's still a great deal of room for
improvement in optimization of remote updates, but at least there's basic
functionality there now.
 
KaiGai Kohei, reviewed by Alexander Korotkov and Laurenz Albe, and rather
heavily revised by Tom Lane.

Based on the documentation, the implementation is still very basic as nothing is done with clause shippability. Just to give some notions about that: roughly a clause in a SELECT query (LIMIT, OFFSET, GROUP BY, HAVING, ORDER BY, etc.) is shippable if this clause can be entirely evaluated on remote server, making less processing happening on local server, and reducing the tuple selectivity. A direct consequence of clause shippability limitation is that UPDATE and DELETE queries can take quite a long time if they are run on many rows because query is run in two steps:

  • Scan remote table and fetch back to local server the tuples to be manipulated
  • Process UPDATE or DELETE based on the tuples fetched

INSERT does not need such scan as in this case new data is simply sent to the remote table, the tuple values being computed before sending the query (even for immutable functions). Not really performant but it is the safest approach. Postgres-XC has similar and more advanced features for foreign DDL planning and execution in its core (some of them implemented by me), have a look for example at this article I wrote a while ago.

It is possible to test writable foreign tables with postgres_fdw as it has been extended to support this new feature. So let’s give it a try with two postgres servers using ports 5432 and 5433. Server with port 5432 has postgres_fdw installed and will interact with the remote server running under port 5433. In order to get the basics of postgres_fdw, you can refer to this article written a couple of weeks ago.

Now, it is time to test the feature. First let’s create a table on remote server.
$ psql -p 5433 -c "CREATE TABLE aa_remote (a int, b int)" postgres
CREATE TABLE

Then it is necessary to create a foreign table on the local server.
postgres=# CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', port '5433', dbname 'postgres');
CREATE SERVER
postgres=# CREATE USER MAPPING FOR PUBLIC SERVER postgres_server OPTIONS (password '');
CREATE USER MAPPING
postgres=# CREATE FOREIGN TABLE aa_foreign (a int, b int) SERVER postgres_server OPTIONS (table_name 'aa_remote');
CREATE FOREIGN TABLE

Then let’s test the new feature by performing some DML operations on the foreign table from local server.
postgres=# INSERT into aa_foreign values (1,2);
INSERT 0 1
postgres=# select * from aa_foreign;
a | b
---+---
1 | 2
(1 row)
postgres=# update aa_foreign set b = 3;
UPDATE 1
postgres=# select * from aa_foreign;
a | b
---+---
1 | 3
(1 row)

Everything is going well on local side, and on remote side what happened?
$ psql -p 5433 -c 'SELECT * FROM aa_remote' --dbname postgres
a | b
---+---
1 | 3
(1 row)

So the data of the remote table has been correctly changed from local server.

Just before the tests, I explained that a scan is done for UPDATE and DELETE before actually running the DML, you can get more details about that with EXPLAIN.
postgres=# explain verbose update aa_foreign set b = 3;
QUERY PLAN
-----------------------------------------------------------------------------------
Update on public.aa_foreign (cost=100.00..182.27 rows=2409 width=10)
 Remote SQL: UPDATE public.aa_remote SET b = $2 WHERE ctid = $1
 -> Foreign Scan on public.aa_foreign (cost=100.00..182.27 rows=2409 width=10)
  Output: a, 3, ctid
  Remote SQL: SELECT a, NULL, ctid FROM public.aa_remote FOR UPDATE
(5 rows)

In the case of postgres_fdw, selectivity of tuple is done with ctid of tuple, which ensures tuple uniqueness. Note that if you implement your own foreign data wrapper, you might need to use columns having primary keys for selectivity of tuples.

There are also a couple of things to be aware of when using this feature.

  • There are risk of data incompatibility for data formatted with GUC parameters. This has been mentionned in the community but try for example to manipulate servers with different settings of datesyle…
  • Transactions are open on remote server using repeatable read.
  • UPDATE and DELETE can be costly if scan is done with a good-old-fashioned sequential scan, but well that’s a known thing
  • Things I forgot…

Prior to PostgreSQL 9.3, there are two levels of locks allowing to control DML operations on a given set of rows for a transaction by using SELECT FOR SHARE and FOR UPDATE. Such locks taken on rows in a transaction block allow blocking INSERT/DELETE/UPDATE on those rows.
There is also a protocol between those lock levels. FOR UPDATE is equivalent to an exclusive lock on the row selected, meaning that no other backend can take a FOR UPDATE lock on the same row and waits until the other other transaction finishes. FOR SHARE means that all the other backends can take a FOR SHARE lock on those rows. No FOR UPDATE locks can be taken on rows already locked with FOR SHARE. It is also possible to use the NOWAIT option, making the server return an error if there is a wait situation.

PostgreSQL 9.3 introduces two new levels of locks: FOR KEY SHARE and FOR NO KEY UPDATE. This feature has been committed thanks to the perseverance of Alvaro Herrera after two years of effort. Really congratulations to Alvaro!
commit 0ac5ad5134f2769ccbaefec73844f8504c4d6182
Author: Alvaro Herrera
Date: Wed Jan 23 12:04:59 2013 -0300
 
Improve concurrency of foreign key locking
 
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE". UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.
 
Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.

The main point of this feature is to reduce lock contention for foreign key triggers, as now those ones use FOR KEY SHARE instead of FOR SHARE. Also, UPDATE commands that do not update columns related to the key of the tuple now take now a FOR NO KEY UPDATE, explaining the name of the lock. With this level of locking, UPDATE queries that do not involve columns of the tuple key can perform concurrently.

Honestly, with now 4 levels of locks, it is becoming complicated to remember which operation blocks or allows the other on the same tuple. So let’s make a couple of tests to determine what blocks what with a simple table with some data:
postgres=# CREATE TABLE aa AS SELECT 1 AS a;
SELECT 1

The test scenario is pretty simple: two client sessions trying to take a lock on the same tuple. Session 1 launches its commands first, then session 2, the goal being to see if session 2 takes the lock or waits for it.
Session 1:
BEGIN;
SELECT * FROM aa FOR $LOCK;

Then session 2 does that
SELECT * FROM aa FOR $LOCK;
$LOCK can be either FOR SHARE, FOR UPDATE, FOR NO KEY UPDATE or FOR KEY SHARE.

Here are the results:

Session 1
Locks UPDATE NO KEY UPDATE SHARE KEY SHARE
Session 2 UPDATE Waits Waits Waits Waits
NO KEY UPDATE Waits Waits Waits OK
SHARE Waits Waits OK OK
KEY SHARE Waits OK OK OK

I hope this table helps. Have fun.

This post has as goal to provide basics to help you understanding how work triggers in PostgreSQL.
A trigger is the possibility to associate an automatic operation to a table in case a write event happens on this given table.

Here is the synopsis of this query.
CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } { event [ OR ... ] }
  ON table
  [ FROM referenced_table_name ]
  { NOT DEFERRABLE | [ DEFERRABLE ] { INITIALLY IMMEDIATE | INITIALLY DEFERRED } }
  [ FOR [ EACH ] { ROW | STATEMENT } ]
  [ WHEN ( condition ) ]
  EXECUTE PROCEDURE function_name ( arguments )

The parts that are essential to get the basics are written in strong characters.

  • event, this is the database operation that will cause the trigger to fire. In Postgres 9.2 and prior versions, this can occur for INSERT, UPDATE, DELETE and TRUNCATE
  • table, this is the database table where the event has to occur
  • EXECUTE PROCEDURE function_name ( arguments ), this is the operation that is launched by trigger after being fired. The procedure can be customized with data related to the table or other things depending on the circumstances trigger is fired. Have a look here for more details about trigger procedures.

Triggers have many usages. Defined on the given table of your database, you can set up a trigger to launch automatic operations on a table each time an event is done on it. Once fired, this trigger will execute an automatic procedure that will perform a list of operations predefined by user. So this limits the amount of code you need to write on the application side, limiting the possibility of bugs in your own code while using Postgres server robustness.

But let’s take a simple example: an address book.
Let’s imagine that you are managing your address book with Postgres. Really for simplicity’s sake, your system manages only the names and addresses of people you know.
The people you know have a unique name, but they can have multiple addresses, as you might register their main home address and work address for example. So, your system will have the following basic schema:
CREATE TABLE users (id int PRIMARY KEY, name varchar(256));
CREATE TABLE address (id_user int, address text);

Let’s suppose that you are a lucky guy and that you know where I live and where is my workplace (some of this data is perhaps wrong).
postgres=# INSERT INTO users VALUES (1, 'Michael P');
INSERT 0 1
postgres=# INSERT INTO address VALUES (1, 'Work in Tokyo, Japan');
INSERT 0 1
postgres=# INSERT INTO address VALUES (1, 'Live in San Francisco, California');
INSERT 0 1

Then you can recover my addresses easily.
postgres=# SELECT address FROM users JOIN address
postgres=# ON (users.id = address.id_user) WHERE name = 'Michael P';
address
-----------------------------------
Work in Tokyo, Japan
Live in San Francisco, California
(2 rows)

However it happens that you are not caring anymore about me and that you wish to delete my data from your address book. Intuitively, deleting an entry from an address book is simply removing the wanted name. But, if you do that the address data will remain. Of course you can let your application manage the deletion for both tables “users” and “address”, but you will need to send 2 SQL queries. This is a waste of resource as you need to go twice to your database to perform the complete deletion. In this case at least it is.

Triggers can allow you to simplify the deletion operation by automatizing the data deletion on table “addresses” if a user is deleted from your address book. You need to create the following objects in order to do that.
CREATE FUNCTION delete_address() RETURNS TRIGGER AS $_$
  BEGIN
    DELETE FROM address WHERE address.id_user = OLD.id;
    RETURN OLD;
  END $_$ LANGUAGE 'plpgsql';

This function is set to delete the addresses for a given user ID.
Then create the trigger event. What is necessary here is to launch the previous function each time an entry is removed from table “users”, explaining the clause “FOR EACH ROW”. The address clean up is also done before the actual DELETE happens on table “users”.
CREATE TRIGGER delete_user_address BEFORE DELETE ON users FOR EACH ROW EXECUTE PROCEDURE delete_address();

Let’s test the entry deletion.
postgres=# DELETE FROM users WHERE name = 'Michael P';
DELETE 1
postgres=# select * from address;
id_user | address
---------+---------
(0 rows)

And my address data has completely disappeared from your database thanks to the trigger.
Have fun with this feature.

One of the features that has been really improved the last couple of weeks is the stabilization of remote query planning for DML for Postgres-XC standard planner. And this has consequences on rules, because a rule is fired on Coordinators by design, and you need to provide a global way to plan queries correctly with remote nodes. Just to recall, a rule is the possibility to define an alternative action when doing an INSERT, UPDATE or DELETE on a table.
Another important point is that the query of a rules is not planned at the moment of the rule creation, but after rule is fired, however it doesn’t change the fact that a correct query planning is needed at a moment or another.

A rule can for example be used to define DML on views.
A view is roughly a projection of table data into a wanted shape, and it is by default not possible to define DML actions on it.

Let’s take an example, here are a simple table and a simple view.
postgres=# CREATE TEMP TABLE t1 (a int PRIMARY KEY, b int) DISTRIBUTE BY HASH(a);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for table "t1"
CREATE TABLE
postgres=# INSERT INTO t1 VALUES (1,1),(2,2),(3,3),(4,4);
INSERT 0 4
postgres=# CREATE VIEW t1_v AS SELECT a AS a_v, b AS b_v FROM t1;
NOTICE: view "t1_v" will be a temporary view
CREATE VIEW

When trying to UPDATE the view, you will get the following error.
postgres=# UPDATE t1_v SET b_v = 2 WHERE a_v = 3;
ERROR: cannot update view "t1_v"
HINT: You need an unconditional ON UPDATE DO INSTEAD rule or an INSTEAD OF UPDATE trigger.

So let’s define a view on it and check that an UPDATE on a view is possible.
postgres=# CREATE RULE t1_upd AS ON UPDATE TO t1_v DO INSTEAD UPDATE t1 SET b = new.b_v WHERE a = old.a_v AND b = old.b_v;
CREATE RULE
postgres=# UPDATE t1_v SET b_v = 2 WHERE a_v = 3;
UPDATE 1
postgres=# select * from t1_v;
a_v | b_v
-----+-----
1 | 1
2 | 2
4 | 4
3 | 2
(4 rows)

So yes, RULES are now completely supported in Postgres-XC. And it is included in 1.0. The secret of how it works? The thing that took me 4 weeks to figure out?
Well, some extensions have been added in standard planner for DELETE, INSERT and UPDATE remote planning (ModifyTable used as TopPlan). For people willing to look at the code, all the secrets are located in functions create_remoteinsert_plan, create_remoteupdate_plan and create_remotedelete_plan of createplan.c. Those functions have been built and adapted to former scan plan of PostgreSQL to react as a wrapper for inner remote table scans within PostgreSQL standard planner. The trick is to create a DML query generated based on the scan plans generated for each tables that has to be updated, deleted or insert.

One of the limitations? The use of constraints.
Depending on the distribution strategy used, you may not be able to check the consistency of constraints globally.
But this is another story…

Postgres-XC, write-scalable multi-master symetric cluster based on PostgreSQL, version XC 1.0beta1 has been released.
This beta version is based on PostgreSQL 9.1beta2, and all the fixes of PostgreSQL 9.1 stable branch will be backported in Postgres-XC 1.0 stabilized version.
It is under PostgreSQL license.

You can download the binary directly from here.
This tarball includes all the HTML and man documentation.

A PDF file containing all the references is also available here: Reference PDF.

Compared to version 0.9.7, the following features have been added:

  • Fast query shipping (FQS), quick identification of expressions in a query that can be pushed down to remote nodes
  • SERIAL types
  • TABLESPACE
  • Utility to clean up 2PC transactions in cluster (pgxc_clean)
  • Utility for initialization of GTM (global transaction manager, utility called initgtm)
  • Relation-size functions
  • Regression stabilization

The project can be followed on Source Forge.
And we use a couple of GIT repositories for development:

Postgres-XC tutorial at PGCon in Ottawa this May will use a 1.0 version, so be sure to touch this beta version to have an idea of what is Postgres-XC before coming!
Since the last release, a special effort has been made on stabilization and performance improvement, but be sure to give your feedback in order to provide a stable 1.0 release for PostgreSQL community.

©2010-2013 Michael Paquier All content is ©Copyright of Otacoo.com 2010-2013. Privacy Policy - Terms of Use