Continuing the coverage of the new JSON features added in Postgres 9.3, and after writing about JSON data generation and JSON operators, let’s now focus on the new functions that can be used for parsing JSON data.

There are many new functions introduced:

  • json_each, json_each_text
  • json_extract_path, json_extract_path_text
  • json_object_keys
  • json_populate_record, json_populate_recordset
  • json_array_length
  • json_array_elements

The following set of data is used in all the examples of this post.
postgres=# CREATE TABLE aa (a int, b json);
CREATE TABLE
postgres=# INSERT INTO aa VALUES (1, '{"f1":1,"f2":true,"f3":"Hi I''m \"Daisy\""}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (2, '{"f1":2,"f2":false,"f3":"Hi I''m \"Dave\""}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (3, '{"f1":3,"f2":true,"f3":"Hi I''m \"Popo\""}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (4, '{"f1":{"f11":11,"f12":12},"f2":2}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (5, '{"f1":[1,"Robert \"M\""],"f2":[2,"Kevin \"K\"",false]}');
INSERT 0 1

So now let’s begin. The most valuable functions might be json_each and json_each_text which can be used to expand JSON data as key/value records.
postgres=# SELECT * FROM json_each((SELECT b FROM aa WHERE a = 1));
key | value
-----+--------------------
f1 | 1
f2 | true
f3 | "Hi I'm \"Daisy\""
(3 rows)

The difference between json_each and json_each_text is that the former returns values as valid JSON and the latter returns them as text.
postgres=# SELECT * FROM json_each_text((SELECT b FROM aa WHERE a = 1));
key | value
-----+----------------
f1 | 1
f2 | true
f3 | Hi I'm "Daisy"
(3 rows)
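
To make the difference concrete outside of the server, here is a rough Python model of the two functions. It only illustrates the semantics shown above and is not how Postgres implements them: the function names are reused for readability, and the sketch assumes that values are simply re-serialized with Python's json module.

```python
import json

def json_each(doc):
    """Yield (key, value) pairs where each value is still JSON text."""
    for key, value in json.loads(doc).items():
        # Re-serialize the value, so strings keep their quotes and escapes.
        yield key, json.dumps(value, separators=(",", ":"))

def json_each_text(doc):
    """Yield (key, value) pairs where string values are plain text."""
    for key, value in json.loads(doc).items():
        if isinstance(value, str):
            # Strings lose their surrounding quotes and escape sequences.
            yield key, value
        else:
            yield key, json.dumps(value, separators=(",", ":"))

# Same document as the row with a = 1.
row1 = r'''{"f1":1,"f2":true,"f3":"Hi I'm \"Daisy\""}'''

for key, value in json_each(row1):
    print(key, "|", value)
for key, value in json_each_text(row1):
    print(key, "|", value)
```

Only the string field f3 differs between the two loops, which is the same thing the two psql outputs above show.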

This operation only expands the outermost level of the JSON object.
postgres=# SELECT * FROM json_each((SELECT b FROM aa WHERE a = 4)) WHERE key = 'f1';
key | value
-----+---------------------
f1 | {"f11":11,"f12":12}
(1 row)

You can also apply this operation to inner fields, by directly selecting an inner JSON field or by using a WITH clause.
postgres=# SELECT * FROM json_each((SELECT b->'f1' FROM aa WHERE a = 4));
key | value
-----+-------
f11 | 11
f12 | 12
(2 rows)

json_extract_path and json_extract_path_text can be used to extract a field value based on a given key, or a chain of keys, equivalent to what the operators “->” and “->>” can respectively do.
postgres=# SELECT json_extract_path(b, 'f1') AS f1a, b->'f1' AS f1b FROM aa WHERE a = 4;
f1a | f1b
---------------------+---------------------
{"f11":11,"f12":12} | {"f11":11,"f12":12}
(1 row)
postgres=# SELECT json_extract_path(b, 'f1', 'f12') AS f12a, b->'f1'->'f12' AS f12b FROM aa WHERE a = 4;
f12a | f12b
------+------
12 | 12
(1 row)
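
The idea of walking a chain of keys can be sketched in a few lines of Python. This is a simplified model and not the server code: it assumes that the path only contains object keys, and it uses None to play the role of SQL NULL when the path does not exist.

```python
import json

def json_extract_path(doc, *path):
    """Follow a chain of keys and return the JSON text found at the end."""
    current = json.loads(doc)
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return None  # stands in for SQL NULL in this sketch
        current = current[key]
    return json.dumps(current, separators=(",", ":"))

# Same document as the row with a = 4.
row4 = '{"f1":{"f11":11,"f12":12},"f2":2}'

print(json_extract_path(row4, "f1"))
print(json_extract_path(row4, "f1", "f12"))
print(json_extract_path(row4, "f1", "missing"))
```

Each extra argument descends one level, in the same way that chaining “->” does in the queries above.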

json_object_keys retrieves the set of keys found at the outermost level of a given JSON object. As it returns the field names of all the tuples scanned, be sure to group the results or to select a limited number of tuples.
postgres=# SELECT json_object_keys(b) FROM aa GROUP BY 1 ORDER BY 1;
json_object_keys
------------------
f1
f2
f3
(3 rows)
postgres=# SELECT json_object_keys(b->'f1') FROM aa WHERE a = 4;
json_object_keys
------------------
f11
f12
(2 rows)

Next, json_populate_record can help in casting a JSON record into a given type.
postgres=# CREATE TYPE aat AS (f1 int, f2 bool, f3 text);
CREATE TYPE
postgres=# SELECT * FROM json_populate_record(null::aat, (SELECT b FROM aa WHERE a = 1)) AS popo;
f1 | f2 | f3
----+----+----------------
1 | t | Hi I'm "Daisy"
(1 row)

This operation can only be used on a single row.
postgres=# SELECT * FROM json_populate_record(null::aat, (SELECT b FROM aa WHERE a = 1 OR a = 2)) AS popo;
ERROR: more than one row returned by a subquery used as an expression

Similarly to json_populate_record, json_populate_recordset can be used on a set of records. It can become particularly powerful when combined with json_agg.
postgres=# SELECT * FROM json_populate_recordset(null::aat, (SELECT json_agg(b) FROM aa WHERE a < 4)) AS popo;
f1 | f2 | f3
----+----+----------------
1 | t | Hi I'm "Daisy"
2 | f | Hi I'm "Dave"
3 | t | Hi I'm "Popo"
(3 rows)
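
As a rough mental model, populating a set of records means mapping each JSON object of an array onto the fields of a fixed type. The Python sketch below mimics that with a named tuple called Aat standing in for the composite type aat. The helper name populate_recordset and its behavior are assumptions made for illustration: keys missing from an object become None, extra keys are ignored, no type conversion is attempted, and nested values are rejected.

```python
import json
from typing import NamedTuple, Optional

class Aat(NamedTuple):
    # Mirrors the composite type aat (f1 int, f2 bool, f3 text).
    f1: Optional[int]
    f2: Optional[bool]
    f3: Optional[str]

def populate_recordset(doc):
    """Turn a JSON array of flat objects into a list of Aat rows."""
    rows = []
    for obj in json.loads(doc):
        for value in obj.values():
            if isinstance(value, (dict, list)):
                raise ValueError("nested values are not supported in this sketch")
        # Keys missing from the object become None, extra keys are ignored.
        rows.append(Aat(*(obj.get(name) for name in Aat._fields)))
    return rows

# Hand-written equivalent of json_agg(b) for the rows with a = 1 and a = 2.
doc = r'''[{"f1":1,"f2":true,"f3":"Hi I'm \"Daisy\""},
           {"f1":2,"f2":false,"f3":"Hi I'm \"Dave\""}]'''

for row in populate_recordset(doc):
    print(row)
```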

Note that this operation does not work on nested objects, that is, when the value of a field is itself a JSON object, as is the case for f1 in the row with a = 4.
postgres=# SELECT * FROM json_populate_recordset(null::aat, (SELECT json_agg(b) FROM aa WHERE a = 1 OR a = 4), false) AS popo;
ERROR: cannot call json_populate_recordset on a nested object

Finally there are two functions focused on the manipulation and analysis of JSON arrays. The first function is called json_array_length. With this you can get the number of elements in a JSON array.
postgres=# SELECT json_array_length(b->'f1') FROM aa WHERE a = 5;
json_array_length
-------------------
2
(1 row)
postgres=# SELECT json_array_length(b->'f2') FROM aa WHERE a = 5;
json_array_length
-------------------
3
(1 row)

If used on a value that is not an array, this function complains with a nice error message.
postgres=# SELECT json_array_length(b->'f1') FROM aa WHERE a = 1;
ERROR: cannot get array length of a scalar
postgres=# SELECT json_array_length(b->'f1') FROM aa WHERE a = 4;
ERROR: cannot get array length of a non-array

The second one is json_array_elements, which expands a JSON array into a set of elements.
postgres=# SELECT json_array_elements(b->'f1') FROM aa WHERE a = 5;
json_array_elements
---------------------
1
"Robert \"M\""
(2 rows)
postgres=# SELECT json_array_elements(b->'f1') FROM aa WHERE a = 1;
ERROR: cannot call json_array_elements on a scalar
postgres=# SELECT json_array_elements(b->'f1') FROM aa WHERE a = 4;
ERROR: cannot call json_array_elements on a non-array

Combined with the new JSON features for data generation and the new operators, the parsing functions complete the set of tools implemented in Postgres 9.3 for manipulating JSON data directly on the server side. The addition of such features continues the morphing of PostgreSQL from database software into a database platform, with the JSON features taking it a step further into the field of NoSQL and document-oriented systems. So now, if you want to create an application which is JSON-oriented, simply use Postgres!

Postgres 9.2 introduced JSON as a server data type. At that point, the data was simply stored on the server side, with integrated wrappers checking that the data had a correct JSON format. It was a good first step towards storing JSON data directly on the server side, but the core features in 9.2 have their limitations in terms of JSON data manipulation and transformation.

Two new sets of JSON features have been added to PostgreSQL 9.3, which is planned to be released this year: functions related to data generation and a new set of APIs for data processing. This post deals with the first set, the ability to generate JSON data based on existing data types. The second set of features (operators and new processing functions) will be explained in a future post.

So… Functions for JSON data generation have been added by this commit.
commit 38fb4d978c5bfc377ef979e2595e3472744a3b05
Author: Andrew Dunstan
Date: Sun Mar 10 17:35:36 2013 -0400
 
JSON generation improvements.
 
This adds the following:
 
 json_agg(anyrecord) -> json
 to_json(any) -> json
 hstore_to_json(hstore) -> json (also used as a cast)
 hstore_to_json_loose(hstore) -> json
 
The last provides heuristic treatment of numbers and booleans.
 
Also, in json generation, if any non-builtin type has a cast to json,
that function is used instead of the type's output function.
 
Andrew Dunstan, reviewed by Steve Singer.
Catalog version bumped.

The first function, to_json, returns a given value as valid JSON.
postgres=# create table aa (a bool, b text);
CREATE TABLE
postgres=# INSERT INTO aa VALUES (true, 'Hello "Darling"');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (false, NULL);
INSERT 0 1
postgres=# SELECT to_json(a) AS bool_json, to_json(b) AS txt_json FROM aa;
bool_json | txt_json
-----------+---------------------
true | "Hello \"Darling\""
false |
(2 rows)

Boolean values are returned as plain true/false, and text values are quoted as valid JSON strings.

json_agg is an aggregate function that transforms a set of records into a JSON array.
postgres=# SELECT json_agg(aa) FROM aa;
json_agg
---------------------------------------
[{"a":true,"b":"Hello \"Darling\""}, +
{"a":false,"b":null}]
(1 row)
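
The two outputs above treat the NULL value differently: to_json on a NULL column returns nothing, while inside the aggregated record the same value shows up as a JSON null. A small Python model makes that contrast visible. It is only an approximation of the behavior displayed above, with None standing in for SQL NULL, and it does not reproduce the exact whitespace that the server puts between array elements.

```python
import json

def to_json(value):
    """Serialize a single value; a missing value stays missing."""
    if value is None:
        return None  # SQL NULL in, SQL NULL out (as in the psql output)
    return json.dumps(value)

def json_agg(records):
    """Serialize a list of records as one JSON array."""
    # Inside a record, a missing value is written as the JSON literal null.
    return json.dumps(records, separators=(",", ":"))

# Same content as table aa (a bool, b text).
rows = [
    {"a": True, "b": 'Hello "Darling"'},
    {"a": False, "b": None},
]

for row in rows:
    print(to_json(row["a"]), "|", to_json(row["b"]))
print(json_agg(rows))
```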

The other tools for data generation are included in the contrib module hstore. Do you remember? This module can be used to store key/value pairs in a single table column. It is now possible to convert hstore data to json, either with a native cast or with the function hstore_to_json.
postgres=# CREATE TABLE aa (id int, txt hstore);
CREATE TABLE
postgres=# INSERT INTO aa VALUES (1, 'f1=>t, f2=>2, f3=>"Hi", f4=>NULL');
INSERT 0 1
postgres=# SELECT id, txt::json, hstore_to_json(txt) FROM aa;
id | txt | hstore_to_json
----+------------------------------------------------+------------------------------------------------
1 | {"f1": "t", "f2": "2", "f3": "Hi", "f4": null} | {"f1": "t", "f2": "2", "f3": "Hi", "f4": null}
(1 row)

Note that in this case boolean and numerical values are treated as plain text when cast.

hstore_to_json_loose can enforce the conversion of boolean and numerical values to a better format, like this:
postgres=# SELECT id, hstore_to_json_loose(txt) FROM aa;
id | hstore_to_json_loose
----+-----------------------------------------------
1 | {"f1": true, "f2": 2, "f3": "Hi", "f4": null}
(1 row)
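
The kind of heuristic involved can be approximated in Python as follows. This is a guess at the general idea based on the outputs above, not a port of the hstore code: the rule set (“t” and “f” become booleans, anything that looks like a JSON number becomes a number, everything else stays a string) and the helper name loose_value are assumptions of this sketch.

```python
import json
import re

# A JSON number: optional minus, integer part, optional fraction and exponent.
JSON_NUMBER = re.compile(r"-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?")

def loose_value(text):
    """Guess a better JSON type for one hstore value (always text or NULL)."""
    if text is None:
        return None
    if text == "t":
        return True
    if text == "f":
        return False
    match = JSON_NUMBER.fullmatch(text)
    if match:
        is_integer = match.group(2) is None and match.group(3) is None
        return int(text) if is_integer else float(text)
    return text

def hstore_to_json(pairs):
    """Strict conversion: every non-null value stays a JSON string."""
    return json.dumps(pairs)

def hstore_to_json_loose(pairs):
    """Loose conversion: booleans and numbers lose their quotes."""
    return json.dumps({key: loose_value(value) for key, value in pairs.items()})

# Same content as the hstore value inserted above.
pairs = {"f1": "t", "f2": "2", "f3": "Hi", "f4": None}

print(hstore_to_json(pairs))
print(hstore_to_json_loose(pairs))
```

A heuristic like this cannot tell a string that happens to contain “t” from a real boolean, which is a good reason to keep the strict conversion as the default cast.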

And now boolean and integer values inserted previously have a better look, no?

Having such tools natively in the Postgres core server is really a nice addition for data manipulation and for the transformation of values into valid JSON.
However, you need to know that this set of tools is only the tip of the iceberg for the JSON features added in 9.3… There are also new operators and APIs, which will be covered in more detail with examples in one of my next posts. So… TBC.

Setting up logging for a PostgreSQL server using syslog on a Linux machine is intuitive, especially with logging systems like syslog-ng: you just need to put the correct parameters in the right place.

First, you need to set up the system side, by adding the following settings in /etc/syslog-ng/syslog-ng.conf (or similar, don’t hesitate to customize that with your own paths).
destination postgres { file("/var/log/pgsql"); };
filter f_postgres { facility(local0); };
log { source(src); filter(f_postgres); destination(postgres); };

This will send all the logs of the PostgreSQL server to /var/log/pgsql. Be sure to combine that with some solution for rotating log files, to avoid a single file becoming too large. Then reload syslog-ng with a command similar to the following (it varies depending on the distribution used, here Arch Linux).
systemctl reload syslog-ng

Then, you need to add the following settings in postgresql.conf.
log_destination = 'syslog' # Can specify multiple destinations
syslog_facility='LOCAL0'
syslog_ident='postgres'

Based on the documentation, syslog_facility can be set from LOCAL0 to LOCAL7.
Don’t forget that you can also specify multiple log destinations. For example, when using stderr and syslog at the same time, simply do this:
log_destination = 'stderr,syslog'

Finally, reload the parameters of the server and you are done.
pg_ctl reload -D $PGDATA
Note that restarting the server is not necessary.

PostgreSQL 9.3 adds a new feature related to monitoring with the commit below.
commit ac2e9673622591319d107272747a02d2c7f343bd
Author: Robert Haas
Date: Wed Jan 23 10:58:04 2013 -0500
 
pg_isready
 
New command-line utility to test whether a server is ready to
accept connections.
 
Phil Sorber, reviewed by Michael Paquier and Peter Eisentraut

Called pg_isready, this utility pings a given server to get its status. It is a simple wrapper around PQping that can be called directly and customized with a set of options.

Here are the possible options.
$ pg_isready --help
pg_isready issues a connection check to a PostgreSQL database.
 
Usage:
 pg_isready [OPTION]...
 
Options:
 -d, --dbname=DBNAME database name
 -q, --quiet run quietly
 -V, --version output version information, then exit
 -?, --help show this help, then exit
 
Connection options:
 -h, --host=HOSTNAME database server host or socket directory
 -p, --port=PORT database server port
 -t, --timeout=SECS seconds to wait when attempting connection, 0 disables (default: 3)
 -U, --username=USERNAME database username

This feature is really easy to use, for example in the case of a server online.
$ pg_isready -p 5432 -h localhost
localhost:5432 - accepting connections

For a server offline, sending no response back.
$ pg_isready -p 5433 -h localhost
localhost:5433 - no response

For a server rejecting connections.
$ pg_isready -p 5432 -h $SERVER_IP
$SERVER_IP:5432 - rejecting connections

The utility also has a quiet mode, so scripts can use the exit status of pg_isready to check the server activity. Once again with the previous examples.
$ pg_isready -p 5432 -h localhost -q; echo $?
0
$ pg_isready -p 5433 -h localhost -q; echo $?
2

The exit status is 0 for a server accepting connections, 1 when connections are rejected and 2 when no response comes back from the server. Finally, 3 is the result if an internal error happens, like a wrong option being specified.
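
For scripts written in something other than shell, the same exit statuses can be wrapped in a small helper. The Python sketch below is one possible way to do it, under the assumption that pg_isready is available in PATH. The names PG_ISREADY_STATUS, describe_status and check_server are made up for this example, and only options listed in the help output above are used.

```python
import subprocess

# Exit statuses of pg_isready, as described above.
PG_ISREADY_STATUS = {
    0: "accepting connections",
    1: "rejecting connections",
    2: "no response",
    3: "internal error (for example, a wrong option)",
}

def describe_status(code):
    """Translate a pg_isready exit status into a readable message."""
    return PG_ISREADY_STATUS.get(code, f"unexpected exit status {code}")

def check_server(host="localhost", port=5432, timeout=3):
    """Run pg_isready in quiet mode and return its exit status.

    Raises FileNotFoundError if pg_isready is not installed.
    """
    result = subprocess.run(
        ["pg_isready", "-q", "-h", host, "-p", str(port), "-t", str(timeout)],
        check=False,  # a non-zero status is an answer here, not a failure
    )
    return result.returncode
```

A monitoring script could then call describe_status(check_server("localhost", 5432)) and raise an alert on anything other than “accepting connections”.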

It is honestly more intuitive to have such a wrapper in core than something that uses a query of the type “SELECT 1” to check the activity of a server. In summary, it is one of those little things that can make your life as a PostgreSQL user easier.

A foreign-data wrapper (FDW) in a Postgres server allows you to fetch data from a foreign entity or a foreign server. In this case, the Postgres planner and executor have the notion of what is called a foreign scan, which can call customized routines to fetch data that is not directly stored inside the Postgres server itself.

The core code of Postgres includes one FDW, file_fdw, and postgres_fdw is planned to be included as well at some point (9.3 discussions).

Since PostgreSQL 9.1, an FDW can be installed with CREATE EXTENSION. There are many existing FDW modules that are developed and maintained by the community. Among them are:

  • oracle_fdw, to fetch data from an Oracle server
  • mysql_fdw, to fetch data from a MySQL server
  • pgsql_fdw (or sometimes postgres_fdw), to fetch data from another Postgres server
  • twitter_fdw, to fetch data from a Twitter server

Note: Once I thought about a git FDW as git is itself a NoSQL database managing concurrency of commits and branches its own way… But got no time to design or code it.

By the way, the FDW this post is focused on is called redis_fdw, which allows you to fetch data from a foreign Redis server and materialize it directly on the Postgres side. Before you continue reading this post, be sure that you already have a Redis server and a Postgres server running.
Here both the Redis and Postgres servers run on a local machine, with 6379 and 5432 respectively as port numbers (default values).

Then it is time to install redis_fdw. First fetch the code.
mkdir $REDIS_SRC
cd $REDIS_SRC
git init
git remote add origin https://github.com/dpage/redis_fdw.git
git fetch origin
git checkout master

Then install it. Please note that the current version of the code does not compile with Postgres 9.2 and later versions, so for this post the Postgres server is 9.1.X.
make install USE_PGXS=1
This will add redis_fdw.so in the lib folder of the pgsql install folder, and redis_fdw.control and redis_fdw--1.0.sql in share/extension.
Then finalize installation on the server by using CREATE EXTENSION.
postgres=# CREATE EXTENSION redis_fdw;
CREATE EXTENSION
postgres=# \dx redis_fdw
List of installed extensions
Name | Version | Schema | Description
-----------+---------+--------+--------------------------------------------------
redis_fdw | 1.0 | public | Foreign data wrapper for querying a Redis server
(1 row)

Then create the foreign server, its attached foreign table and a user mapping for remote connectivity (you can also refer to the redis_fdw README for additional details).
postgres=# CREATE SERVER redis_server
postgres-# FOREIGN DATA WRAPPER redis_fdw
postgres-# OPTIONS (address '127.0.0.1', port '6379');
CREATE SERVER
postgres=#
postgres=# CREATE FOREIGN TABLE redis_db0 (key text, value text)
postgres-# SERVER redis_server
postgres-# OPTIONS (database '0');
CREATE FOREIGN TABLE
postgres=# CREATE USER MAPPING FOR PUBLIC
postgres-# SERVER redis_server
postgres-# OPTIONS (password '');
CREATE USER MAPPING

On the Redis server side, let’s add a couple of keys with some values.
# redis-cli
redis 127.0.0.1:6379> set foo bar
OK
redis 127.0.0.1:6379> set foo2 bar2
OK

Finally, it is possible to query the Redis data directly from Postgres.
postgres=# EXPLAIN VERBOSE SELECT * FROM redis_db0 WHERE key = 'foo2' OR key = 'foo';
QUERY PLAN
-----------------------------------------------------------------------------
Foreign Scan on public.redis_db0 (cost=10.00..12.00 rows=2 width=64)
Output: key, value
Filter: ((redis_db0.key = 'foo2'::text) OR (redis_db0.key = 'foo'::text))
Foreign Redis Database Size: 2
(4 rows)
postgres=# SELECT * FROM redis_db0 WHERE key = 'foo2' OR key = 'foo';
key | value
------+-------
foo | bar
foo2 | bar2
(2 rows)

And the set of key/value pairs defined on the Redis side has been fetched correctly.

Please note that the redis_fdw code should not yet be used in a production environment. I found, for example, that it crashes when the EXPLAIN query above is launched twice in a row. However, I think it is a good entry point to understand the possible Redis/Postgres interactions. It would also be worth stabilizing it and realigning it with the Postgres master core code at some point.

©2010-2013 Michael Paquier. All content is ©Copyright of Otacoo.com 2010-2013.