Just yesterday I committed that.
commit 2aea0c2e0e01031f5dd4260b6985dc0ed4eadc50
Author: Michael P
Date: Tue Nov 15 09:54:54 2011 +0900
 
 Support for data distribution among a subset of datanodes
 
 CREATE TABLE has been extended with the following clause:
 CREATE TABLE ...
 [ TO ( GROUP groupname | NODE nodename [, ... ] ) ]
 
 This clause allows to distribute data among a subset of nodes
 listed by a node list, or a group alias.
 Node groups can be defined with CREATE NODE GROUP.
 
 The base structure for this support was added with commit
 56a90674444df1464c8e7012c6113efd7f9bc7db, but check of mapping of
 subsets of node list was still missing for the management of join
 push down and materialization evaluation in planner.

So what the hell is it??? Simply a feature that allows you to better control the data distributed among your Postgres-XC cluster.

Let’s take an example of a cluster with 2 Coordinators and 4 Datanodes.
postgres=# select oid,node_name from pgxc_node;
oid | node_name
-------+-----------
11133 | coord1
11134 | coord2
11135 | dn1
11136 | dn2
11137 | dn3
11138 | dn4
(6 rows)

Prior to this functionality, creating a table forced you to distribute the data among all the datanodes of your cluster.
postgres=# create table test (a int);
CREATE TABLE
postgres=# select nodeoids from pgxc_class where pcrelid = 'test'::regclass;
nodeoids
-------------------------
11135 11136 11137 11138
(1 row)

Now, CREATE TABLE has a new clause extension to be able to create a table only on a subset of nodes.
This is documented here.
The new clause is written as:
CREATE TABLE
...
[ TO ( GROUP groupname | NODE nodename [, ... ] ) ]

So you can specify a list of node names or a node group. A node group is simply an alias for a node list.
Let’s try it.

  • test12_rep is a replicated table whose data is distributed in nodes 1 and 2
  • test34_hash is a hash table whose data is distributed in nodes 3 and 4
  • test234_rep is a replicated table whose data is distributed in nodes 2, 3 and 4

postgres=# create table test12_rep (a int) distribute by replication to node dn1,dn2;
CREATE TABLE
postgres=# create table test34_hash (a int) distribute by hash(a) to node dn3,dn4;
CREATE TABLE
postgres=# create node group dn234 with dn2,dn3,dn4;
CREATE NODE GROUP
postgres=# create table test234_rep (a int) distribute by replication to group dn234;
CREATE TABLE
-- Check the node subset for distribution
postgres=# select nodeoids from pgxc_class where pcrelid = 'test12_rep'::regclass;
nodeoids
-------------
11135 11136
(1 row)
postgres=# select nodeoids from pgxc_class where pcrelid = 'test34_hash'::regclass;
nodeoids
-------------
11137 11138
(1 row)
postgres=# select nodeoids from pgxc_class where pcrelid = 'test234_rep'::regclass;
nodeoids
-------------------
11136 11137 11138
(1 row)

Now let’s insert some data.
postgres=# insert into test12_rep values (1),(2),(3);
INSERT 0 3
postgres=# insert into test234_rep values (1),(2),(3);
INSERT 0 3
postgres=# insert into test34_hash values (1),(2),(3);
INSERT 0 3

Then is data of test12_rep correctly distributed?
postgres=# execute direct on node dn1 'select * from test12_rep';
a
---
1
2
3
(3 rows)
postgres=# execute direct on node dn2 'select * from test12_rep';
a
---
1
2
3
(3 rows)
postgres=# execute direct on node dn3 'select * from test12_rep';
a
---
(0 rows)
postgres=# execute direct on node dn4 'select * from test12_rep';
a
---
(0 rows)

test12_rep is only replicated in nodes dn1 and dn2 only.

Let’s do the same checks for test234_rep and test34_hash.
--First for test34_hash
postgres=# execute direct on node dn1 'select * from test34_hash';
a
---
(0 rows)
postgres=# execute direct on node dn2 'select * from test34_hash';
a
---
(0 rows)
postgres=# execute direct on node dn3 'select * from test34_hash';
a
---
1
2
(2 rows)
postgres=# execute direct on node dn4 'select * from test34_hash';
a
---
3
(1 row)
--Then for test234_rep
postgres=# execute direct on node dn1 'select * from test234_rep';
a
---
(0 rows)
postgres=# execute direct on node dn2 'select * from test234_rep';
a
---
1
2
3
(3 rows)
postgres=# execute direct on node dn3 'select * from test234_rep';
a
---
1
2
3
(3 rows)
postgres=# execute direct on node dn4 'select * from test234_rep';
a
---
1
2
3
(3 rows)

So test234_rep is correctly replicated in nodes 2, 3 and 4. test34_hash is correctly hash-partitioned in nodes 3 and 4.

Now let’s do some join and push down checks.
postgres=# explain verbose select a from test34_hash join test234_rep using (a);
QUERY PLAN
-------------------------------------------------------------------
 Data Node Scan (Node Count [2]) (cost=0.00..0.00 rows=0 width=0)
  Output: test34_hash.a
(2 rows)

In this case replicated table test234_rep is completely mapped by test34_hash so a push down is possible to nodes 3 and 4 directly.

postgres=# explain verbose select a from test34_hash join test12_rep using (a);
QUERY PLAN
------------------------------------------------------
 Nested Loop (cost=0.00..2.04 rows=1 width=4)
  Output: test34_hash.a
  Join Filter: (test34_hash.a = test12_rep.a)
  -> Materialize (cost=0.00..1.01 rows=1 width=4)
      Output: test34_hash.a
      -> Data Node Scan (Node Count [2]) on test34_hash (cost=0.00..1.01 rows=1000 width=4)
        Output: test34_hash.a
  -> Materialize (cost=0.00..1.01 rows=1 width=4)
      Output: test12_rep.a
      -> Data Node Scan (Node Count [1]) on test12_rep (cost=0.00..1.01 rows=1000 width=4)
        Output: test12_rep.a
(11 rows)

In this case test34_hash and test12_rep are distributed on a disjoint list of nodes, so performing a join needs to first fetch data from Datanodes then materialize it on Coordinator.

There is still no way to change the table distribution type or the node list after table creation. This is one of the next plans, based on ALTER TABLE this time.

After long weeks of battle, this week this commit has happened in Postgres-XC’s GIT repository.
commit 56a90674444df1464c8e7012c6113efd7f9bc7db
Author: Michael P
Date: Thu Oct 27 10:57:30 2011 +0900
 
Support for Node and Node Group DDL
 
Node information is not anymore supported by node number using
GUC parameters but node names.
Node connection information is taken from a new catalog table
called pgxc_node. Node group information can be found in pgxc_group.
 
Node connection information is taken from catalog when user session
begins and sticks with it for the duration of the session. This brings
more flexibility to the cluster settings. Cluster node information can
now be set when node is initialized with initdb using cluster_nodes.sql
located in share directory.
 
This commits adds support for the following new DDL:
- CREATE NODE
- ALTER NODE
- DROP NODE
- CREATE NODE GROUP
- DROP NODE GROUP
 
The following parameters are deleted from postgresql.conf:
- num_data_nodes
- preferred_data_nodes
- data_node_hosts
- data_node_ports
- primary_data_node
- num_coordinators
- coordinator_hosts
- coordinator_ports
pgxc_node_id is replaced by pgxc_node_name to identify the node-self.
 
Documentation is added for the new queries. Functionalities such as
EXECUTE DIRECT, CLEAN CONNECTION use node names instead of node numbers now.

So what is it about? Until now Postgres-XC has only used a heavy configuration to set up node connection information. There were 8 parameters dedicated to Coordinators and Datanodes, and those parameters had to follow a special format.
Now, the following SQL queries can be issued to set up cluster connection information, and information is cached once user session is up.
For the time being, a file called cluster_nodes.sql has to be set in share folder for initdb. But soon functionalities will be added to update pooler connection information based on node information update, insert or deletion.
This brings a lot of simplicity in cluster setting. And now, nodes are not identified by their position number in a GUC string, but by a unique global name that maintains consistency in the whole cluster.

Here are some examples of cluster settings.
1 Coordinator and 2 Datanodes:
CREATE NODE coord1 WITH (HOSTIP = 'localhost', COORDINATOR MASTER, NODEPORT = $COORD1_PORT);
CREATE NODE dn1 WITH (HOSTIP = 'localhost', NODE MASTER, NODEPORT = $DN1_PORT, PREFERRED);
CREATE NODE dn2 WITH (HOSTIP = 'localhost', NODE MASTER, NODEPORT = $DN2_PORT, PRIMARY);

2 Coordinators and 2 Datanodes:
CREATE NODE coord2 WITH (HOSTIP = 'localhost', COORDINATOR MASTER, NODEPORT = $COORD2_PORT);
CREATE NODE coord1 WITH (HOSTIP = 'localhost', COORDINATOR MASTER, NODEPORT = $COORD1_PORT);
CREATE NODE dn2 WITH (HOSTIP = 'localhost', NODE MASTER, NODEPORT = $DN2_PORT, PRIMARY);
CREATE NODE dn1 WITH (HOSTIP = 'localhost', NODE MASTER, NODEPORT = $DN1_PORT, PREFERRED);

So, what happens in the cluster for 2 Datanodes and 2 Coordinators?
postgres=# select oid,* from pgxc_node;
-[ RECORD 1 ]----+----------
oid | 11133
node_name | coord1
node_type | C
node_related | 0
node_port | 5432
node_host | localhost
nodeis_primary | f
nodeis_preferred | f
-[ RECORD 2 ]----+----------
oid | 11134
node_name | coord2
node_type | C
node_related | 0
node_port | 5452
node_host | localhost
nodeis_primary | f
nodeis_preferred | f
-[ RECORD 3 ]----+----------
oid | 11135
node_name | dn1
node_type | D
node_related | 0
node_port | 15451
node_host | localhost
nodeis_primary | f
nodeis_preferred | t
-[ RECORD 4 ]----+----------
oid | 11136
node_name | dn2
node_type | D
node_related | 0
node_port | 15452
node_host | localhost
nodeis_primary | t
nodeis_preferred | f

Other functionalities now also work with node names, like EXECUTE DIRECT and CLEAN CONNECTION:
postgres=# clean connection to node dn1 for database postgres;
CLEAN CONNECTION
postgres=# execute direct on node dn1 'select oid,* from pgxc_node where node_type = ''D''';
-[ RECORD 1 ]----+----------
oid | 11135
node_name | dn1
node_type | D
node_related | 0
node_port | 15451
node_host | localhost
nodeis_primary | f
nodeis_preferred | t
-[ RECORD 2 ]----+----------
oid | 11136
node_name | dn2
node_type | D
node_related | 0
node_port | 15452
node_host | localhost
nodeis_primary | t
nodeis_preferred | f

©2010-2013 Michael Paquier All content is ©Copyright of Otacoo.com 2010-2013. Privacy Policy - Terms of Use