IF EXISTS and IF NOT EXISTS are clauses allowing to return a notice message instead of an error if a DDL query running on a given object already exists or not depending on the DDL action done. If a given query tries to create an object when IF NOT EXISTS is specified, a notice message is returned to client if the object has already been created and nothing is done on server side. If the object is altered or dropped when IF EXISTS is used, a notice message is returned back to client if the object does not exist and nothing is done.

Here is what simply happens when a table that exists is created:
postgres=# CREATE TABLE IF NOT EXISTS aa (a int);
CREATE TABLE
postgres=# CREATE TABLE IF NOT EXISTS aa (a int);
NOTICE: relation "aa" already exists, skipping
CREATE TABLE

Similarly, when dropping this table based on its existence.
postgres=# DROP TABLE IF EXISTS aa;
DROP TABLE
postgres=# DROP TABLE IF EXISTS aa;
NOTICE: table "aa" does not exist, skipping
DROP TABLE

Prior to 9.3, PostgreSQL already proposed this feature with many objects: tables, index, functions, triggers, language, etc. Such SQL extensions are useful when running several times the same script several times and avoiding errors on environments already installed.

9.3 introduces some new flavors of IF [NOT] EXISTS completing a bit more the set of objects already supported.

  • CREATE SCHEMA [IF NOT EXISTS]
  • ALTER TYPE ADD VALUE [IF NOT EXISTS]
  • Extension of DROP TABLE IF NOT EXISTS such as it succeeds if the specified schema does not exists

Note also that the new materialized views are also supported with IF [NOT] EXISTS for CREATE, ALTER and DROP.

The extension of CREATE SCHEMA with IF NOT EXISTS is pretty simple. Similarly to the other objects, command succeeds if the schema already exists and a notice message about the existence of schema is sent back to client.
postgres=# CREATE SCHEMA foo;
ERROR: schema "foo" already exists
postgres=# CREATE SCHEMA IF NOT EXISTS foo;
NOTICE: schema "foo" already exists, skipping
CREATE SCHEMA

Note that subsequent schema elements cannot be used with this option.
postgres=# CREATE SCHEMA IF NOT EXISTS foo CREATE TABLE aa (a int);
ERROR: CREATE SCHEMA IF NOT EXISTS cannot include schema elements
LINE 1: CREATE SCHEMA IF NOT EXISTS foo CREATE TABLE aa (a int);

The second addition, ALTER TYPE ADD VALUE [IF NOT EXISTS] is useful in the case of enumeration types to condition the addition of new values.
postgres=# CREATE TYPE character_type AS ENUM ('warrior', 'priest', 'sorcerer');
CREATE TYPE
postgres=# ALTER TYPE character_type ADD VALUE IF NOT EXISTS 'magician';
ALTER TYPE
postgres=# ALTER TYPE character_type ADD VALUE IF NOT EXISTS 'magician';
NOTICE: enum label "magician" already exists, skipping
ALTER TYPE

The last improvement is also a nice thing to have. Here is what you could obtain prior to 9.3 when trying to use DROP TABLE IF EXISTS on a table using a schema that did not exist.
postgres=# DROP TABLE IF EXISTS foosch.foo;
ERROR: schema "foosch" does not exist

And here is what you get now:
postgres=# DROP TABLE IF EXISTS foosch.foo;
NOTICE: table "foo" does not exist, skipping
DROP TABLE

Those are definitely nice additions, especially the new extension of IF NOT EXISTS on schemas which was really missing in the existing set.

Continuing on the coverage of new JSON features added in Postgres 9.3, and after writing about JSON data generation and JSON operators, let’s now focus on some new functions that can be used for the parsing of JSON data.

The are many new functions introduced:

  • json_each, json_each_text
  • json_extract_path, json_extract_path_text
  • json_object_keys
  • json_populate_record, json_populate_recordset
  • json_array_length
  • json_array_elements

The following set of data is used in all the examples of this post,.
postgres=# CREATE TABLE aa (a int, b json);
CREATE TABLE
postgres=# INSERT INTO aa VALUES (1, '{"f1":1,"f2":true,"f3":"Hi I''m \"Daisy\""}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (2, '{"f1":2,"f2":false,"f3":"Hi I''m \"Dave\""}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (3, '{"f1":3,"f2":true,"f3":"Hi I''m \"Popo\""}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (4, '{"f1":{"f11":11,"f12":12},"f2":2}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (5, '{"f1":[1,"Robert \"M\""],"f2":[2,"Kevin \"K\"",false]}');
INSERT 0 1

So now let’s begin. The most valuable functions might be json_each and json_each_text which can be used to expand JSON data as key/value records.
postgres=# SELECT * FROM json_each((SELECT b FROM aa WHERE a = 1));
key | value
-----+--------------------
f1 | 1
f2 | true
f3 | "Hi I'm \"Daisy\""
(3 rows)

The difference between json_each and json_each_text is that the former returns values as legal JSON format and the latter returns it as text.
postgres=# SELECT * FROM json_each_text((SELECT b FROM aa WHERE a = 1));
key | value
-----+----------------
f1 | 1
f2 | true
f3 | Hi I'm "Daisy"
(3 rows)

This operation is effective only on the outermost field.
postgres=# SELECT * FROM json_each((SELECT b FROM aa WHERE a = 4)) WHERE key = 'f1';
key | value
-----+---------------------
f1 | {"f11":11,"f12":12}
(1 row)

And you can also apply this operation on some inner fields by selecting directly an inner JSON field or using some WITH mechanism.
SELECT * FROM json_each((SELECT b->'f1' FROM aa WHERE a = 4));
key | value
-----+-------
f11 | 11
f12 | 12
(2 rows)

json_extract_path and json_extract_path_text can be used to extract a field value based on some given keys, or a chain or keys, equivalent to what the operators “->” and “->>” can respectively do.
postgres=# SELECT json_extract_path(b, 'f1') AS f1a, b->'f1' AS f1b FROM aa WHERE a = 4;
f1a | f1b
---------------------+---------------------
{"f11":11,"f12":12} | {"f11":11,"f12":12}
(1 row)
postgres=# SELECT json_extract_path(b, 'f1', 'f12') AS f12a, b->'f1'->'f12' AS f12b FROM aa WHERE a = 4;
f12a | f12b
------+------
12 | 12
(1 row)

json_object_keys retrieves the set of keys of a given JSON object on the outermost object. As it returns the field names of all the tuples scanned, be sure to group the results or to select a limited number of tuples.
postgres=# SELECT json_object_keys(b) FROM aa GROUP BY 1 ORDER BY 1;
json_object_keys
------------------
f1
f2
f3
(3 rows)
postgres=# SELECT json_object_keys(b->'f1') FROM aa WHERE a = 4;
json_object_keys
------------------
f11
f12
(2 rows)

Next, json_populate_record can help in casting a JSON record into a given type.
postgres=# CREATE TYPE aat AS (f1 int, f2 bool, f3 text);
CREATE TYPE
postgres=# SELECT * FROM json_populate_record(null::aat, (SELECT b FROM aa WHERE a = 1)) AS popo;
f1 | f2 | f3
----+----+----------------
1 | t | Hi I'm "Daisy"
(1 row)

This operation can only be used on a single row.
postgres=# SELECT * FROM json_populate_record(null::aat, (SELECT b FROM aa WHERE a = 1 OR a = 2)) AS popo;
ERROR: more than one row returned by a subquery used as an expression

Similarly to json_populate_record, json_populate_recordset can be used on a set of records. It can become particularly powerful when combined with json_agg.
postgres=# SELECT * FROM json_populate_recordset(null::aat, (SELECT json_agg(b) FROM aa WHERE a < 4)) AS popo;
f1 | f2 | f3
----+----+----------------
1 | t | Hi I'm "Daisy"
2 | f | Hi I'm "Dave"
3 | t | Hi I'm "Popo"
(3 rows)

Note that this operation does not work on nested objects, aka when the JSON fields are not strictly the same for each row.
postgres=# SELECT * FROM json_populate_recordset(null::aat, (SELECT json_agg(b) FROM aa WHERE a = 1 OR a = 4), false) AS popo;
ERROR: cannot call json_populate_recordset on a nested object

Finally there are two functions focused on the manipulation and analysis of JSON arrays. The first function is called json_array_length. With this you can get the number of elements in a JSON array.
SELECT json_array_length(b->'f1') FROM aa WHERE a = 5;
json_array_length
-------------------
2
(1 row)
postgres=# SELECT json_array_length(b->'f2') FROM aa WHERE a = 5;
json_array_length
-------------------
3
(1 row)

If used on an object that is not an array, this function complains with a nice error message.
postgres=# SELECT json_array_length(b->'f1') FROM aa WHERE a = 1;
ERROR: cannot get array length of a scalar
postgres=# SELECT json_array_length(b->'f1') FROM aa WHERE a = 4;
ERROR: cannot get array length of a non-array

The second one is json_array_elements which expends a JSON array to a set of elements.
postgres=# SELECT json_array_elements(b->'f1') FROM aa WHERE a = 5;
json_array_elements
---------------------
1
"Robert \"M\""
(2 rows)
postgres=# SELECT json_array_elements(b->'f1') FROM aa WHERE a = 1;
ERROR: cannot call json_array_elements on a scalar
postgres=# SELECT json_array_elements(b->'f1') FROM aa WHERE a = 4;
ERROR: cannot call json_array_elements on a non-array

Combined with the new JSON features for data generation and operators, parsing functions complete the new set of tools implemented in Postgres 9.3 here to leverage the manipulation of JSON data directly on server side. The addition of such features continues the morphing of PostgreSQL from a database software to a database platform, JSON features making it stepping more in the field of NoSQL and document-oriented systems. So now, if you want to create an application which is JSON-oriented, simply use Postgres!

Postgres 9.3 is going to be a great release for JSON data type. After having a look at the new functions for data generation, let’s look at the new JSON features that the commit below brings.
commit a570c98d7fa0841f17bbf51d62d02d9e493c7fcc
Author: Andrew Dunstan
Date: Fri Mar 29 14:12:13 2013 -0400
 
Add new JSON processing functions and parser API.
 
The JSON parser is converted into a recursive descent parser, and
exposed for use by other modules such as extensions. The API provides
hooks for all the significant parser event such as the beginning and end
of objects and arrays, and providing functions to handle these hooks
allows for fairly simple construction of a wide variety of JSON
processing functions. A set of new basic processing functions and
operators is also added, which use this API, including operations to
extract array elements, object fields, get the length of arrays and the
set of keys of a field, deconstruct an object into a set of key/value
pairs, and create records from JSON objects and arrays of objects.
 
Catalog version bumped.
 
Andrew Dunstan, with some documentation assistance from Merlin Moncure.

Based on stored JSON data, this commit introduces a new layer of APIs, operators and functions that can be used to manipulate and process JSON data. There are 4 new operators and 8 new functions (hopefully I counted right), so as there is a lot of content this post is only focused on the new operators.

The following set of data is used for all the examples presented in this post with some subsets of data, arrays and plain variables.
postgres=# CREATE TABLE aa (a int, b json);
CREATE TABLE
postgres=# INSERT INTO aa VALUES (1, '{"f1":1,"f2":true,"f3":"Hi I''m \"Daisy\""}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (2, '{"f1":{"f11":11,"f12":12},"f2":2}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (3, '{"f1":[1,"Robert \"M\"",true],"f2":[2,"Kevin \"K\"",false]}');
INSERT 0 1

The first operator is “->”, that can be used to fetch field values directly from JSON data. It can be used with a text value identifying the key of field.
postgres=# SELECT b->'f1' AS f1, b->'f3' AS f3 FROM aa WHERE a = 1;
f1 | f3
----+--------------------
1 | "Hi I'm \"Daisy\""
(1 row)

Multiple keys can also be used in chain to retrieve data or another JSON subset of data.
postgres=# SELECT b->'f1'->'f12' AS f12 FROM aa WHERE a = 2;
f12
-----
12
(1 row)
postgres=# SELECT b->'f1' AS f1 FROM aa WHERE a = 2;
f1
---------------------
{"f11":11,"f12":12}
(1 row)

In a more interesting way, when an integer is used as key, you can fetch data directly in a stored array, like that for example:
postgres=# SELECT b->'f1'->0 as f1_0 FROM aa WHERE a = 3;
f1_0
------
1
(1 row)

The second operator added is “->>”. Contrary to “->” that returns a JSON legal text, “->>” returns plain text.
postgres=# SELECT b->>'f3' AS f1 FROM aa WHERE a = 1;
f1
----------------
Hi I'm "Daisy"
(1 row)
postgres=# SELECT b->'f3' AS f1 FROM aa WHERE a = 1;
f1
--------------------
"Hi I'm \"Daisy\""
(1 row)

Similarly to “->”, it is possible to use either an integer or a text as key. For an integer, the key represents the position of element in an array.
postgres=# SELECT b->'f1'->>1 as f1_0 FROM aa WHERE a = 3;
f1_0
------------
Robert "M"
(1 row)
postgres=# SELECT b->'f1'->1 as f1_0 FROM aa WHERE a = 3;
f1_0
----------------
"Robert \"M\""
(1 row)

Of course, you cannot fetch data from an array using a field name.
postgres=# SELECT b->'f1'->>'1' as f1_0 FROM aa WHERE a = 3;
ERROR: cannot extract field from a non-object

As well as you cannot fetch a field using an element number.
postgres=# SELECT b->1 as f1_0 FROM aa WHERE a = 3;
ERROR: cannot extract array element from a non-array

The last 2 operators added are “#>” and “#>>”. With those ones, it is possible to fetch directly an element in an array without using a combo of the type “column->’$FIELD’->$INT_INDEX. This can make your queries far more readable when manipulating arrays in JSON.
postgres=# SELECT b#>'{f1,1}' as f1_0 FROM aa WHERE a = 3;
f1_0
----------------
"Robert \"M\""
(1 row)
postgres=# SELECT b#>>'{f1,1}' as f1_0 FROM aa WHERE a = 3;
f1_0
------------
Robert "M"
(1 row)

“#>” fetches text data in a legal JSON format, and “#>>” fetches data as plain text.

In short, those operators are good meat for brain, and nice additions for many applications.

Postgres 9.2 has introduced JSON as a server data type. At this point, the data was simply stored on server side with integrated wrappers checking that data had a correct JSON format. It was a good first step in order to store directly JSON data on server side but core features in 9.2 have its limitations in terms of JSON data manipulation and transformation.

Two new sets of JSON features have been added to PostgreSQL 9.3 planned to be released this year: functions related to data generation and a new set of APIs for data processing. The one this post deals with the ability to generate JSON data based on existing data types. The second set of features (operators and new processing functions) will be explained in a future post.

So… Functions for JSON data generation have been added by this commit.
commit 38fb4d978c5bfc377ef979e2595e3472744a3b05
Author: Andrew Dunstan
Date: Sun Mar 10 17:35:36 2013 -0400
 
JSON generation improvements.
 
This adds the following:
 
 json_agg(anyrecord) -> json
 to_json(any) -> json
 hstore_to_json(hstore) -> json (also used as a cast)
 hstore_to_json_loose(hstore) -> json
 
The last provides heuristic treatment of numbers and booleans.
 
Also, in json generation, if any non-builtin type has a cast to json,
that function is used instead of the type's output function.
 
Andrew Dunstan, reviewed by Steve Singer.
Catalog version bumped.

The first function called to_json permits to return a given value as valid JSON.
postgres=# create table aa (a bool, b text);
CREATE TABLE
postgres=# INSERT INTO aa VALUES (true, 'Hello "Darling"');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (false, NULL);
INSERT 0 1
postgres=# SELECT to_json(a) AS bool_json, to_json(b) AS txt_json FROM aa;
bool_json | txt_json
-----------+---------------------
true | "Hello \"Darling\""
false |
(2 rows)

Boolean values are returned as plain true/false, texts are quoted as valid JSON fields.

json_agg is a function that can transform a record into a JSON array.
postgres=# SELECT json_agg(aa) FROM aa;
json_agg
---------------------------------------
[{"a":true,"b":"Hello \"Darling\""}, +
{"a":false,"b":null}]
(1 row)

The other tools for data generation are included in the contrib module hstore. Do you remember? This module can be used to store key/value pairs in a single table column. It is now possible to cast hstore data as json with some native casting or with function hstore_to_json.
postgres=# CREATE TABLE aa (id int, txt hstore);
CREATE TABLE
postgres=# INSERT INTO aa VALUES (1, 'f1=>t, f2=>2, f3=>"Hi", f4=>NULL');
INSERT 0 1
postgres=# SELECT id, txt::json, hstore_to_json(txt) FROM aa;
id | txt | hstore_to_json
----+------------------------------------------------+------------------------------------------------
1 | {"f1": "t", "f2": "2", "f3": "Hi", "f4": null} | {"f1": "t", "f2": "2", "f3": "Hi", "f4": null}
(1 row)

Note that in this case boolean and numerical values are treated as plain text when casted.

hstore_to_json_loose can enforce the conversion of boolean and numerical values to a better format, like that:
postgres=# SELECT id, hstore_to_json_loose(txt) FROM aa;
id | hstore_to_json_loose
----+-----------------------------------------------
1 | {"f1": true, "f2": 2, "f3": "Hi", "f4": null}
(1 row)

And now boolean and integer values inserted previously have a better look, no?

Having such tools natively in Postgres core server is really a nice addition for data manipulation and transformation of values into legal JSON.
However, you need to know that this set of tools is only the top of the iceberg for the JSON features added in 9.3… There are also new operators and APIs, which will be covered in more details with examples in one of my next posts. So… TBC.

This week, I spent a long time working on this commit. Just by looking at the date, commit happened before leaving for week-end :)
commit 8ef0c48acadec3c9888d302888a7d279d82323e5
Author: Michael P
Date: Fri Jan 13 16:05:00 2012 +0900
 
Improve target list selection for remote DML queries
 
This commit makes remote DML planning generally available
for replicated and hash tables. There are still issues
related to node selection for round robin tables though.
The target list of UPDATE and DELETE using coordinator quals
was set to fetch only CTID when generating SELECT in their
inner plan generated by create_remotequery_plan.
 
Their target list is rewritten to include the columns in quals
so as to be able to evaluate those quals correctly on Coordinator.
In addition remote planning for UPDATE has been improved to be
able to target correct node when launching query.
 
A new regression test called xc_remote is added, it uses the
parameter enable_fast_query_shipping to force all the queries
to go through standard planner. Tests are done on replicated,
hash and round robin tables.

In all the examples of this article, those two tables are used with the following cluster configuration of Postgres-XC cluster.
db=# select node_name, node_type from pgxc_node; -- 1 Coordinator, 2 Datanodes
node_name | node_type
-----------+-----------
coord1 | C
dn1 | D
dn2 | D
(3 rows)
db=# create table aa (a int, b timestamp) distribute by hash(a);
CREATE TABLE
db=# create table bb (a int, b timestamp) distribute by replication;
CREATE TABLE

For database clusters in general, it is essential to have an efficient and consistent way to manage queries on both local and remote nodes. Efficiency is important to reduce data load on the system. Consistency is even more important to avoid dirty data in your database. So, about queries in general, let’s use an example. SELECT queries may contain expressions that can be evaluated on remote nodes. A common example for that is when the expression is a constant.
db=# explain verbose select * from aa where a = 1;
QUERY PLAN
---------------------------------------------------
Data Node Scan (cost=0.00..0.00 rows=0 width=0)
  Output: a, b
  Node/s: dn1
  Remote query: SELECT a, b FROM aa WHERE (a = 1)
(4 rows)

In this case the query can be completely shipped to the remote node, returning correct results.

Expressions that cannot be pushed down are those who need to be evaluated on local nodes with all the necessary data fetched from remote nodes. For example, let’s take the replicated table bb. We want to select data on it with a time-based expression. Each node of the cluster (at least in the case of Postgres-XC) is located on a different server, each server having a different time line.So, is the following SQL shippable?
SELECT a from bb where b < now();
The answer is no. What is necessary to do is to get all the tuples (a,b) from table bb (a is necessary to send back result), and then apply the time based condition on all the results (explaining why b is necessary).
This results in the following plan.
db=# explain verbose select a from bb where b < now();
QUERY PLAN
-----------------------------------------------------------------
 Result (cost=0.00..1.01 rows=1000 width=4)
 Output: a
 -> Data Node Scan on bb (cost=0.00..1.01 rows=1000 width=4)
   Output: a, b
   Node/s: dn1
   Remote query: SELECT a, b FROM ONLY bb WHERE true
   Coordinator quals: (bb.b < now())
(7 rows)

Well, Postgres-XC has already a lot of mechanisms to manage SELECT and INSERT queries. But what was missing are the parts related to UPDATE and DELETE. So the new functionality committed this week allows to use complex expressions.
For example, in the case of update, you can run sequence and time based updates needing local node evaluation to run consistently.
db=# insert into bb values (1,now());
INSERT 0 1
db=# insert into bb values (2,now());
INSERT 0 1
db=# insert into bb values (3,now());
INSERT 0 1
db=# select * from bb;
a | b
---+---------------------------------
1 | Fri Jan 13 06:26:32.872665 2012
2 | Fri Jan 13 06:26:38.261489 2012
3 | Fri Jan 13 06:26:40.943182 2012
(3 rows)
db=# update bb set a = nextval('seq'), b = now();
UPDATE 3
db=# select * from bb;
a | b
---+---------------------------------
1 | Fri Jan 13 06:28:01.273496 2012
2 | Fri Jan 13 06:28:01.273496 2012
3 | Fri Jan 13 06:28:01.273496 2012
(3 rows)
db=# explain verbose update bb set a = nextval('seq'), b = now();
QUERY PLAN
-----------------------------------------------------------------------
 Update on public.bb (cost=0.00..11.01 rows=1000 width=6)
 Node/s: dn1, dn2
 Remote query: UPDATE public.bb SET a = $1, b = $2 WHERE ctid = $3
 -> Result (cost=0.00..11.01 rows=1000 width=6)
   Output: nextval('seq'::regclass), now(), ctid
   -> Data Node Scan on bb (cost=0.00..1.01 rows=1000 width=6)
      Output: ctid
      Node/s: dn1
      Remote query: SELECT ctid FROM ONLY bb WHERE true
(9 rows)

You need here to select all the data to be updated from remote nodes, then you have to apply the time base expression (now) and the sequence value (nextval), and finally push those values to dedicated remote nodes.

This works also with WHERE clauses using non-shippable expressions.
db=# explain verbose update bb set a = nextval('seq'), b = now() WHERE b < now();
QUERY PLAN
---------------------------------------------------------------------------------
 Update on public.bb (cost=0.00..11.02 rows=1000 width=14)
  Node/s: dn1, dn2
  Remote query: UPDATE public.bb SET a = $1, b = $2 WHERE b = $3 AND ctid = $4
  -> Result (cost=0.00..11.02 rows=1000 width=14)
    Output: nextval('seq'::regclass), now(), b, ctid
    -> Data Node Scan on bb (cost=0.00..1.01 rows=1000 width=14)
      Output: b, ctid
      Node/s: dn1
      Remote query: SELECT b, ctid FROM ONLY bb WHERE true
      Coordinator quals: (bb.b < now())
(10 rows)

Here what is added is a condition to pre-select a subset of rows. Such operation is costly though because you have to fetch all the rows of the table first in inner plan.

The same kind of crazy SQL are also possible for DELETE with mixing shippable and non-shippable expressions.
db=# explain verbose delete from bb where a = 2 and b < now();
QUERY PLAN
------------------------------------------------------------------------------
 Delete on public.bb (cost=0.00..1.02 rows=1000 width=18)
 Node/s: dn1, dn2
 Remote query: DELETE FROM public.bb WHERE a = $1 AND b = $2 AND ctid = $3
 -> Result (cost=0.00..1.02 rows=1000 width=18)
   Output: a, b, ctid
   -> Data Node Scan on bb (cost=0.00..1.02 rows=1000 width=18)
     Output: a, b, ctid
     Node/s: dn1
     Remote query: SELECT a, b, ctid FROM ONLY bb WHERE (a = 2)
     Coordinator quals: (bb.b < now())
(10 rows)

You can notice here that the constant expression "a = 2" is shipped in the most inner plan, improving query efficiency by that much.
A lot of things are now possible, and all this stuff will be included in release 0.9.7!

©2010-2013 Michael Paquier All content is ©Copyright of Otacoo.com 2010-2013. Privacy Policy - Terms of Use