Continuing on the coverage of new JSON features added in Postgres 9.3, and after writing about JSON data generation and JSON operators, let’s now focus on some new functions that can be used for the parsing of JSON data.

The are many new functions introduced:

  • json_each, json_each_text
  • json_extract_path, json_extract_path_text
  • json_object_keys
  • json_populate_record, json_populate_recordset
  • json_array_length
  • json_array_elements

The following set of data is used in all the examples of this post,.
postgres=# CREATE TABLE aa (a int, b json);
CREATE TABLE
postgres=# INSERT INTO aa VALUES (1, '{"f1":1,"f2":true,"f3":"Hi I''m \"Daisy\""}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (2, '{"f1":2,"f2":false,"f3":"Hi I''m \"Dave\""}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (3, '{"f1":3,"f2":true,"f3":"Hi I''m \"Popo\""}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (4, '{"f1":{"f11":11,"f12":12},"f2":2}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (5, '{"f1":[1,"Robert \"M\""],"f2":[2,"Kevin \"K\"",false]}');
INSERT 0 1

So now let’s begin. The most valuable functions might be json_each and json_each_text which can be used to expand JSON data as key/value records.
postgres=# SELECT * FROM json_each((SELECT b FROM aa WHERE a = 1));
key | value
-----+--------------------
f1 | 1
f2 | true
f3 | "Hi I'm \"Daisy\""
(3 rows)

The difference between json_each and json_each_text is that the former returns values as legal JSON format and the latter returns it as text.
postgres=# SELECT * FROM json_each_text((SELECT b FROM aa WHERE a = 1));
key | value
-----+----------------
f1 | 1
f2 | true
f3 | Hi I'm "Daisy"
(3 rows)

This operation is effective only on the outermost field.
postgres=# SELECT * FROM json_each((SELECT b FROM aa WHERE a = 4)) WHERE key = 'f1';
key | value
-----+---------------------
f1 | {"f11":11,"f12":12}
(1 row)

And you can also apply this operation on some inner fields by selecting directly an inner JSON field or using some WITH mechanism.
SELECT * FROM json_each((SELECT b->'f1' FROM aa WHERE a = 4));
key | value
-----+-------
f11 | 11
f12 | 12
(2 rows)

json_extract_path and json_extract_path_text can be used to extract a field value based on some given keys, or a chain or keys, equivalent to what the operators “->” and “->>” can respectively do.
postgres=# SELECT json_extract_path(b, 'f1') AS f1a, b->'f1' AS f1b FROM aa WHERE a = 4;
f1a | f1b
---------------------+---------------------
{"f11":11,"f12":12} | {"f11":11,"f12":12}
(1 row)
postgres=# SELECT json_extract_path(b, 'f1', 'f12') AS f12a, b->'f1'->'f12' AS f12b FROM aa WHERE a = 4;
f12a | f12b
------+------
12 | 12
(1 row)

json_object_keys retrieves the set of keys of a given JSON object on the outermost object. As it returns the field names of all the tuples scanned, be sure to group the results or to select a limited number of tuples.
postgres=# SELECT json_object_keys(b) FROM aa GROUP BY 1 ORDER BY 1;
json_object_keys
------------------
f1
f2
f3
(3 rows)
postgres=# SELECT json_object_keys(b->'f1') FROM aa WHERE a = 4;
json_object_keys
------------------
f11
f12
(2 rows)

Next, json_populate_record can help in casting a JSON record into a given type.
postgres=# CREATE TYPE aat AS (f1 int, f2 bool, f3 text);
CREATE TYPE
postgres=# SELECT * FROM json_populate_record(null::aat, (SELECT b FROM aa WHERE a = 1)) AS popo;
f1 | f2 | f3
----+----+----------------
1 | t | Hi I'm "Daisy"
(1 row)

This operation can only be used on a single row.
postgres=# SELECT * FROM json_populate_record(null::aat, (SELECT b FROM aa WHERE a = 1 OR a = 2)) AS popo;
ERROR: more than one row returned by a subquery used as an expression

Similarly to json_populate_record, json_populate_recordset can be used on a set of records. It can become particularly powerful when combined with json_agg.
postgres=# SELECT * FROM json_populate_recordset(null::aat, (SELECT json_agg(b) FROM aa WHERE a < 4)) AS popo;
f1 | f2 | f3
----+----+----------------
1 | t | Hi I'm "Daisy"
2 | f | Hi I'm "Dave"
3 | t | Hi I'm "Popo"
(3 rows)

Note that this operation does not work on nested objects, aka when the JSON fields are not strictly the same for each row.
postgres=# SELECT * FROM json_populate_recordset(null::aat, (SELECT json_agg(b) FROM aa WHERE a = 1 OR a = 4), false) AS popo;
ERROR: cannot call json_populate_recordset on a nested object

Finally there are two functions focused on the manipulation and analysis of JSON arrays. The first function is called json_array_length. With this you can get the number of elements in a JSON array.
SELECT json_array_length(b->'f1') FROM aa WHERE a = 5;
json_array_length
-------------------
2
(1 row)
postgres=# SELECT json_array_length(b->'f2') FROM aa WHERE a = 5;
json_array_length
-------------------
3
(1 row)

If used on an object that is not an array, this function complains with a nice error message.
postgres=# SELECT json_array_length(b->'f1') FROM aa WHERE a = 1;
ERROR: cannot get array length of a scalar
postgres=# SELECT json_array_length(b->'f1') FROM aa WHERE a = 4;
ERROR: cannot get array length of a non-array

The second one is json_array_elements which expends a JSON array to a set of elements.
postgres=# SELECT json_array_elements(b->'f1') FROM aa WHERE a = 5;
json_array_elements
---------------------
1
"Robert \"M\""
(2 rows)
postgres=# SELECT json_array_elements(b->'f1') FROM aa WHERE a = 1;
ERROR: cannot call json_array_elements on a scalar
postgres=# SELECT json_array_elements(b->'f1') FROM aa WHERE a = 4;
ERROR: cannot call json_array_elements on a non-array

Combined with the new JSON features for data generation and operators, parsing functions complete the new set of tools implemented in Postgres 9.3 here to leverage the manipulation of JSON data directly on server side. The addition of such features continues the morphing of PostgreSQL from a database software to a database platform, JSON features making it stepping more in the field of NoSQL and document-oriented systems. So now, if you want to create an application which is JSON-oriented, simply use Postgres!

PostgreSQL 9.2 has introduced a new feature related to JSON with a built-in data type. So you can now store inside your database directly JSON fields without the need of an external format checker as it is now directly inside Postgres core. The feature has been added by this commit.
commit 5384a73f98d9829725186a7b65bf4f8adb3cfaf1
Author: Robert Haas
Date: Tue Jan 31 11:48:23 2012 -0500
 
Built-in JSON data type.
 
Like the XML data type, we simply store JSON data as text, after checking
that it is valid. More complex operations such as canonicalization and
comparison may come later, but this is enough for now.
 
There are a few open issues here, such as whether we should attempt to
detect UTF-8 surrogate pairs represented as \uXXXX\uYYYY, but this gets
the basic framework in place.

A couple of system functions have also been added later to output some row or array data directly as json.
commit 39909d1d39ae57c3a655fc7010e394e26b90fec9
Author: Andrew Dunstan
Date: Fri Feb 3 12:11:16 2012 -0500
 
Add array_to_json and row_to_json functions.
 
Also move the escape_json function from explain.c to json.c where it
seems to belong.
 
Andrew Dunstan, Reviewed by Abhijit Menon-Sen.

What actually Postgres core does with JSON fields is to store them as text fields (so maximum size of 1GB) and top of that a string format check can be performed directly in core. Let’s use that in a practical use case, a table storing a list of shop items for an RPG game. In an RPG game, there are several types of items, each of them having different fields for its statistics. For example a sword will have an attack value, and a shield a defense value, the opposite being unlogical (except if the sword has a magical defense bonus and the shield some fire protection for example…). Well, what I meant is that you do not need to create multiple tables for each item type but you can possibly store this data in a unique item table thanks to the flexibility of JSON. With the format check done now in Postgres core, you do not need either to perform the string format check on application side.

postgres=# CREATE TABLE rpg_items (c1 serial, data json);
CREATE TABLE
postgres=# INSERT INTO rpg_items (data) VALUES
postgres-# ('{"name":"sword","buy":"500","sell":"200","description":"basic sword","attack":"10"}');
INSERT 0 1
postgres=# INSERT INTO rpg_items (data) VALUES
postgres-# ('{"name":"shield","buy":"200","sell":"80","description":"basic shield","defense":"7"}');
INSERT 0 1
postgres=# SELECT * FROM rpg_items;
c1 | data
----+--------------------------------------------------------------------------------------
1 | {"name":"sword","buy":"500","sell":"200","description":"basic sword","attack":"10"}
2 | {"name":"shield","buy":"200","sell":"80","description":"basic shield","defense":"7"}
(2 rows)

In case of a format error you will obtain something similar to this:
postgres=# INSERT INTO rpg_items (data) VALUES ('{"name":"dummy","buy":"200","ppo"}');
ERROR: invalid input syntax for type json
LINE 1: INSERT INTO rpg_items (data) VALUES ('{"name":"dummy","buy":...

Then, you can also manipulate existing tables and output its data to client as JSON.
postgres=# CREATE TABLE rpg_items_defense (c1 serial, buy int, sell int, description text, defense int);
CREATE TABLE
postgres=# INSERT INTO rpg_items_defense (buy, sell,description, defense)
postgres-# VALUES (200, 80, 'basic shield', 7);
INSERT 0 1
postgres=# SELECT row_to_json(row(buy,sell,description,defense)) FROM rpg_items_defense;
row_to_json
-----------------------------------------------
{"f1":200,"f2":80,"f3":"basic shield","f4":7}
(1 row)

The field names have default values generated automatically by Postgres.

Or output array values as JSON.
postgres=# CREATE TABLE rpg_items_attack(int serial, fields int[], description text);
CREATE TABLE
postgres=# INSERT INTO rpg_items_attack (fields, description) VALUES
postgres-# ('{500,200,10}','basic sword');
INSERT 0 1
postgres=# SELECT row_to_json(row(array_to_json(fields), description)) FROM rpg_items_attack;
row_to_json
----------------------------------------
{"f1":[500,200,10],"f2":"basic sword"}
(1 row)

This is of course not the only solution possible. Take care of making the good choice when designing your application.

Arrays can be created easily in PostgreSQL using the additional syntax [] when defining a column of a table.
CREATE TABLE aa (a int primary key, b int[]);
CREATE TABLE bb (a int primary key, b varchar(5)[]);

Arrays follow some special grammar. You can insert array data directly with ‘{data1,data2}’ format or by using things like ARRAY[data1,data2].
postgres=# INSERT INTO aa VALUES (1, '{1,2,3,4}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (2, ARRAY[1,2,3,4]);
INSERT 0 1
postgres=# select * from aa;
a | b
---+-----------
1 | {1,2,3,4}
2 | {1,2,3,4}
(2 rows)

An array in postgres does not have any dimension restrictions. You can create arrays with multiple dimensions if desired.
postgres=# INSERT INTO aa VALUES (3, '{{1,2},{3,4}}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (4, ARRAY[ARRAY[1,2],ARRAY[3,4]]);
INSERT 0 1
postgres=# select * from aa;
a | b
---+---------------
1 | {1,2,3,4}
2 | {1,2,3,4}
3 | {{1,2},{3,4}}
4 | {{1,2},{3,4}}
(4 rows)

A special function called array_dims allows to get dimensions of an array.
postgres=# select a, array_dims(b) from aa;
a | array_dims
---+------------
1 | [1:4]
2 | [1:4]
3 | [1:2][1:2]
4 | [1:2][1:2]
(4 rows)

An array length can be obtained by array_length.
postgres=# select array_length(b,1) from aa where a = 1;
array_length
--------------
5
(1 row)

There are another couple of useful functions like:

  • array_append, array_prepend, to add values directly to an array
  • array_cat, to assemble arrays

Here is an example.
postgres=# update aa set b = array_append(b, 5) where a = 1;
UPDATE 1
postgres=# select * from aa where a = 1;
a | b
---+-------------
1 | {1,2,3,4,5}
(1 row)

The contribution module int_array contains additional functions on integer arrays to sort elements.

The last function that looks useful for array manipulation are unnest and array_string. array_string returns data of a array as a string (Oh!) with a given separator.
postgres=# select array_to_string(b,';') from aa where a = 1;
array_to_string
-----------------
1;2;3;4;5
(1 row)

This is particularly useful for array manipulation on application side.

unnest decomposes array into single elements. This can be used to refer to foreign tables in IN clauses for example.
postgres=# select unnest(b) from aa where a = 1;
unnest
--------
1
2
3
4
5
(5 rows)
postgres=# create table cc (a int, b char(2));
CREATE TABLE
postgres=# insert into cc values (1, 'Aa'), (2, 'Bb'), (3, 'Cc'), (4, 'Dd'), (6, 'Ff');
INSERT 0 5
postgres=# select b from cc where a in (select unnest(b) from aa where a = 1);
b
----
Aa
Bb
Cc
Dd
(4 rows)

unnest is implemented internally since postgres 8.4. If you use an older version, you can defined it with that.
CREATE OR REPLACE FUNCTION unnest(anyarray)
RETURNS SETOF anyelement AS
$BODY$
SELECT $1[i] FROM
generate_series(array_lower($1,1),
array_upper($1,1)) i;
$BODY$
LANGUAGE 'sql' IMMUTABLE;

Hope you enjoyed this post.

©2010-2013 Michael Paquier All content is ©Copyright of Otacoo.com 2010-2013. Privacy Policy - Terms of Use