A couple of days ago I saw an email about MVCC catalog access on the PostgreSQL hackers mailing list. When evaluating if a given heap tuple of a table is valid or not, PostgreSQL uses a transaction snapshot which is more or less a list of active transactions IDs, but this snapshot can be used in many flavors to check if the heap tuple satisfies a given visibility condition or not. One of the tuple validation methods using snapshots is called SnapshotNow which is not MVCC-safe. Why? Because the validation using SnapshotNow includes the effects of all the committed transactions as of the instant when validation is done, while MVCC validation considers the transactions that were in progress or unstarted when snapshot was taken as uncommitted.

Validation with SnapshotNow is good for performance as it limits the interactions within backends when operations are done on system tables, and consistency is protected by the necessary lock levels taken when performing a DDL operation.

Trying to solve such problems is particularly interesting for operations like concurrent DDLs (and even for other things) that need to be performed with low level locks, meaning that such DDLs can run while other database operations like common write/read process on a table. The first set of features coming to my mind that could be implemented on top of MVCC catalogs are: REINDEX CONCURRENTLY, CLUSTER CONCURRENTLY and ALTER TABLE CONCURRENTLY. Note that those operations can now be performed out-of-the-box up to a certain level with extensions like pg_reorg or pg_repack, but even those operations need to take an exclusive lock on the objects reorganized when performing switch on the old and new objects.

MVCC catalog access is a necessary milestone in making such features bullet-proof from the consistency point of view, but people are concerned about the performance impact it could have. Using the set of tests created and written by Robert Haas on the mail previously linked, I did some measurements with a 32-core machine to try to have results with large snapshots by having open transactions with up to 2000 connections. The results of those measurements are posted on the mailing list, but they are in a raw shape, not really user-friendly. So I thought that having them in a better reshaped on tables with performance comparison would be nice.

In order to test the backup startup, the following test is done.
$ time for s in `seq 1 1000`; do rm -f
bin/pgsql/master/pg_internal.init && psql -c 'SELECT 2+2' >/dev/null;
done

The measurements have the following characteristics.

  • Tests done while 250, 500, 1000 or 2000 remain open with a transaction ID acquired
  • Test 1 is the creation and drop of 100,000 within 1 backend
  • Test 2 is measure the time to open a backend and run a simple query on it with no relcache file
  • Performance difference percentage is evaluated with a simple formula: abs(nonMVCC – VMCC)/avg(nonVMCC, MVCC)

Those measurements do not really take advantage of the multi-core architecture of the mechine used as they run on a single backend, but well… That’s a good machine…

After re-treating the data…


Series of DROP/CREATE
CREATE (ms) DROP (ms)
Connections MVCC Non-MVCC % perf MVCC Non-MVCC % perf
250 26904.755 24554.849 9.133012372 32891.556 29755.146 10.01300914
500 29105.713 25872.886 11.76031059 33674.336 30434.019 10.10887582
1000 33281.246 28178.21 16.60618669 36618.166 31747.451 14.24901936
2000 39987.815 28708.095 32.83956789 43157.006 32510.057 28.14156802
Backend startup time
Connections MVCC real CPU (s) Non-MVCC real CPU (s) % perf
250 9.297 9.124 1.878291081
500 9.592 9.221 3.94408122
1000 9.746 9.302 4.661906762
2000 9.974 9.113 9.021847331

With a really high number of connections, the performance loss is close 30% for the DDL case and 10% for the connection case. Those test cases are really the worst scenarios possible when working with DDL, as you would hardly find someone dropping that many objects at the same time (btw why not?), and users would use a connection pooling facility before reaching such a number of active connections. For a *low* number of open connections, this performance loss is really low for the backend startup case. For the DDL case, a loss of 10% can be seen even with a low number of connections but just by thinking the possible functionality gain if MVCC catalogs are implemented this looks acceptable, no?

Among one of the many new features implemented in 9.3, pg_dump now offers the possibility to perform parallel dumps. This feature has been introduced by the commit below.
commit 9e257a181cc1dc5e19eb5d770ce09cc98f470f5f
Author: Andrew Dunstan
Date: Sun Mar 24 11:27:20 2013 -0400
 
Add parallel pg_dump option.
 
New infrastructure is added which creates a set number of workers
(threads on Windows, forked processes on Unix). Jobs are then
handed out to these workers by the master process as needed.
pg_restore is adjusted to use this new infrastructure in place of the
old setup which created a new worker for each step on the fly. Parallel
dumps acquire a snapshot clone in order to stay consistent, if
available.
 
The parallel option is selected by the -j / --jobs command line
parameter of pg_dump.
 
Joachim Wieland, lightly editorialized by Andrew Dunstan

This is an extremely nice improvement of pg_dump as it allows accelerating the speed a dump is taken, particularly for machines having multiple cores as the load can be shared among separate threads.

Note that this option only works with the format called directoyy that can be specified with option -Fd or –format=directory, which outputs the database dump as a directory-format archive. A new option -j/–jobs can also be used to define the number of jobs that will run in parallel when performing the dump.

When using parallel pg_dump, it is important to remember that n+1 connections are opened to the server, n being the number of jobs defined, with an extra master connection to control the shared locks taken on the objects dumped. So be sure that max_connections is set up to a number high enough in accordance to the number of jobs that are planned.

Thanks to synchronized snapshots shared among the backends managed by the jobs, the dump is taken consistently ensuring that all the jobs share the same data view. However, as synchronized snapshots are only available since PostgreSQL 9.2, you need to be sure that no external sessions are doing any DML or DDL when performing a dump on servers whose version is lower than 9.2. It is also necessary to specify the option –no-synchronized-snapshots in this case.

Now, using a server having 16 cores, let’s check how this feature performs. For this test, the schema of the database dumped is extremely simple: 16 tables with a constant size of approximately 200MB each (5000000 rows with a single int4 column), for a database having a total size of 3.2GB. Tests are conducted with 1, 2, 4, 8 and 16 jobs, so in the case of 16 jobs one table would be dumped by a unique job running on a single connection. This is of course an unrealistic schema for a production database, but here the point is to give an idea of how this feature can speed up a dump in an ideal case. 5 successive runs are done for each case.

Each test case has been run with the following command:
time pg_dump -Fd -f $DUMP_DIRECTORY -j $NUM_JOBS $DATABASE_NAME

Jobs – Runs(s) 1 2 3 4 5 Avg
1 56.714 54.385 54.242 59.300 57.705 56.47
2 27.023 26.207 27.211 26.112 25.206 26.35
4 12.641 12.797 12.484 12.604 12.486 12.60
8 7.641 7.013 7.913 7.081 6.702 6.27
16 5.086 5.045 5.079 5.216 5.054 5.10

As expected, dump time is halved each time job number is doubled with this ideal database schema. However, due to some I/O disk bottleneck, the time gain is not that important with a high number of jobs. For example, in those series of tests, there is not much difference between 8 and 16 jobs, so be always aware of the I/O your dump disk can manage at most and choose carefully the number of jobs used for dumps based on that.

Postgres 9.3 is going to be a great release for JSON data type. After having a look at the new functions for data generation, let’s look at the new JSON features that the commit below brings.
commit a570c98d7fa0841f17bbf51d62d02d9e493c7fcc
Author: Andrew Dunstan
Date: Fri Mar 29 14:12:13 2013 -0400
 
Add new JSON processing functions and parser API.
 
The JSON parser is converted into a recursive descent parser, and
exposed for use by other modules such as extensions. The API provides
hooks for all the significant parser event such as the beginning and end
of objects and arrays, and providing functions to handle these hooks
allows for fairly simple construction of a wide variety of JSON
processing functions. A set of new basic processing functions and
operators is also added, which use this API, including operations to
extract array elements, object fields, get the length of arrays and the
set of keys of a field, deconstruct an object into a set of key/value
pairs, and create records from JSON objects and arrays of objects.
 
Catalog version bumped.
 
Andrew Dunstan, with some documentation assistance from Merlin Moncure.

Based on stored JSON data, this commit introduces a new layer of APIs, operators and functions that can be used to manipulate and process JSON data. There are 4 new operators and 8 new functions (hopefully I counted right), so as there is a lot of content this post is only focused on the new operators.

The following set of data is used for all the examples presented in this post with some subsets of data, arrays and plain variables.
postgres=# CREATE TABLE aa (a int, b json);
CREATE TABLE
postgres=# INSERT INTO aa VALUES (1, '{"f1":1,"f2":true,"f3":"Hi I''m \"Daisy\""}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (2, '{"f1":{"f11":11,"f12":12},"f2":2}');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (3, '{"f1":[1,"Robert \"M\"",true],"f2":[2,"Kevin \"K\"",false]}');
INSERT 0 1

The first operator is “->”, that can be used to fetch field values directly from JSON data. It can be used with a text value identifying the key of field.
postgres=# SELECT b->'f1' AS f1, b->'f3' AS f3 FROM aa WHERE a = 1;
f1 | f3
----+--------------------
1 | "Hi I'm \"Daisy\""
(1 row)

Multiple keys can also be used in chain to retrieve data or another JSON subset of data.
postgres=# SELECT b->'f1'->'f12' AS f12 FROM aa WHERE a = 2;
f12
-----
12
(1 row)
postgres=# SELECT b->'f1' AS f1 FROM aa WHERE a = 2;
f1
---------------------
{"f11":11,"f12":12}
(1 row)

In a more interesting way, when an integer is used as key, you can fetch data directly in a stored array, like that for example:
postgres=# SELECT b->'f1'->0 as f1_0 FROM aa WHERE a = 3;
f1_0
------
1
(1 row)

The second operator added is “->>”. Contrary to “->” that returns a JSON legal text, “->>” returns plain text.
postgres=# SELECT b->>'f3' AS f1 FROM aa WHERE a = 1;
f1
----------------
Hi I'm "Daisy"
(1 row)
postgres=# SELECT b->'f3' AS f1 FROM aa WHERE a = 1;
f1
--------------------
"Hi I'm \"Daisy\""
(1 row)

Similarly to “->”, it is possible to use either an integer or a text as key. For an integer, the key represents the position of element in an array.
postgres=# SELECT b->'f1'->>1 as f1_0 FROM aa WHERE a = 3;
f1_0
------------
Robert "M"
(1 row)
postgres=# SELECT b->'f1'->1 as f1_0 FROM aa WHERE a = 3;
f1_0
----------------
"Robert \"M\""
(1 row)

Of course, you cannot fetch data from an array using a field name.
postgres=# SELECT b->'f1'->>'1' as f1_0 FROM aa WHERE a = 3;
ERROR: cannot extract field from a non-object

As well as you cannot fetch a field using an element number.
postgres=# SELECT b->1 as f1_0 FROM aa WHERE a = 3;
ERROR: cannot extract array element from a non-array

The last 2 operators added are “#>” and “#>>”. With those ones, it is possible to fetch directly an element in an array without using a combo of the type “column->’$FIELD’->$INT_INDEX. This can make your queries far more readable when manipulating arrays in JSON.
postgres=# SELECT b#>'{f1,1}' as f1_0 FROM aa WHERE a = 3;
f1_0
----------------
"Robert \"M\""
(1 row)
postgres=# SELECT b#>>'{f1,1}' as f1_0 FROM aa WHERE a = 3;
f1_0
------------
Robert "M"
(1 row)

“#>” fetches text data in a legal JSON format, and “#>>” fetches data as plain text.

In short, those operators are good meat for brain, and nice additions for many applications.

Postgres 9.2 has introduced JSON as a server data type. At this point, the data was simply stored on server side with integrated wrappers checking that data had a correct JSON format. It was a good first step in order to store directly JSON data on server side but core features in 9.2 have its limitations in terms of JSON data manipulation and transformation.

Two new sets of JSON features have been added to PostgreSQL 9.3 planned to be released this year: functions related to data generation and a new set of APIs for data processing. The one this post deals with the ability to generate JSON data based on existing data types. The second set of features (operators and new processing functions) will be explained in a future post.

So… Functions for JSON data generation have been added by this commit.
commit 38fb4d978c5bfc377ef979e2595e3472744a3b05
Author: Andrew Dunstan
Date: Sun Mar 10 17:35:36 2013 -0400
 
JSON generation improvements.
 
This adds the following:
 
 json_agg(anyrecord) -> json
 to_json(any) -> json
 hstore_to_json(hstore) -> json (also used as a cast)
 hstore_to_json_loose(hstore) -> json
 
The last provides heuristic treatment of numbers and booleans.
 
Also, in json generation, if any non-builtin type has a cast to json,
that function is used instead of the type's output function.
 
Andrew Dunstan, reviewed by Steve Singer.
Catalog version bumped.

The first function called to_json permits to return a given value as valid JSON.
postgres=# create table aa (a bool, b text);
CREATE TABLE
postgres=# INSERT INTO aa VALUES (true, 'Hello "Darling"');
INSERT 0 1
postgres=# INSERT INTO aa VALUES (false, NULL);
INSERT 0 1
postgres=# SELECT to_json(a) AS bool_json, to_json(b) AS txt_json FROM aa;
bool_json | txt_json
-----------+---------------------
true | "Hello \"Darling\""
false |
(2 rows)

Boolean values are returned as plain true/false, texts are quoted as valid JSON fields.

json_agg is a function that can transform a record into a JSON array.
postgres=# SELECT json_agg(aa) FROM aa;
json_agg
---------------------------------------
[{"a":true,"b":"Hello \"Darling\""}, +
{"a":false,"b":null}]
(1 row)

The other tools for data generation are included in the contrib module hstore. Do you remember? This module can be used to store key/value pairs in a single table column. It is now possible to cast hstore data as json with some native casting or with function hstore_to_json.
postgres=# CREATE TABLE aa (id int, txt hstore);
CREATE TABLE
postgres=# INSERT INTO aa VALUES (1, 'f1=>t, f2=>2, f3=>"Hi", f4=>NULL');
INSERT 0 1
postgres=# SELECT id, txt::json, hstore_to_json(txt) FROM aa;
id | txt | hstore_to_json
----+------------------------------------------------+------------------------------------------------
1 | {"f1": "t", "f2": "2", "f3": "Hi", "f4": null} | {"f1": "t", "f2": "2", "f3": "Hi", "f4": null}
(1 row)

Note that in this case boolean and numerical values are treated as plain text when casted.

hstore_to_json_loose can enforce the conversion of boolean and numerical values to a better format, like that:
postgres=# SELECT id, hstore_to_json_loose(txt) FROM aa;
id | hstore_to_json_loose
----+-----------------------------------------------
1 | {"f1": true, "f2": 2, "f3": "Hi", "f4": null}
(1 row)

And now boolean and integer values inserted previously have a better look, no?

Having such tools natively in Postgres core server is really a nice addition for data manipulation and transformation of values into legal JSON.
However, you need to know that this set of tools is only the top of the iceberg for the JSON features added in 9.3… There are also new operators and APIs, which will be covered in more details with examples in one of my next posts. So… TBC.

PostgreSQL 9.3 comes with a pretty cool feature called materialized views. It has been created by Kevin Grittner and committed by the same person not so long ago.
commit 3bf3ab8c563699138be02f9dc305b7b77a724307
Author: Kevin Grittner
Date: Sun Mar 3 18:23:31 2013 -0600
 
Add a materialized view relations.
 
A materialized view has a rule just like a view and a heap and
other physical properties like a table. The rule is only used to
populate the table, references in queries refer to the
materialized data.
 
This is a minimal implementation, but should still be useful in
many cases. Currently data is only populated "on demand" by the
CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements.
It is expected that future releases will add incremental updates
with various timings, and that a more refined concept of defining
what is "fresh" data will be developed. At some point it may even
be possible to have queries use a materialized in place of
references to underlying tables, but that requires the other
above-mentioned features to be working first.
 
Much of the documentation work by Robert Haas.
Review by Noah Misch, Thom Brown, Robert Haas, Marko Tiikkaja
Security review by KaiGai Kohei, with a decision on how best to
implement sepgsql still pending.

What is a materialized view? In short, it is the mutant of a table and a view. A view is a projection of data in a given relation and has no storage. A table is well… A table…
Between that, a materialized view is a projection of table data and has its own storage. It uses a query to fetch its data like a view, but this data is stored like a common table. The materialized view can also be refreshed with updated data by running once again the query it uses for its projection, or have its data truncated. In the last case it is left in an non-scannable state. Also, as a materialized view has its proper storage, it can use tablespaces and its own indexes. Note that it can also be an unlogged relation.

This feature introduces four new SQL commands:

CREATE, ALTER and DROP are common DDL commands here to manipulate the definition of materialized views. What is important here is the new command REFRESH (its name has been a long debate inside the community). This command can be used to update the materialized view with fresh data by running once again the scanning query. Note that REFRESH can also be used to *truncate* (not really though) the data of the relation by running it with the clause WITH NO DATA.

Materialized views have their own advantages in many scenarios: faster access to data than needs to be brought from a remote server (read a file on postgres server through file_fdw, etc.), using data that needs to be refreshed periodically (cache system), projecting data with embedded ORDER BY from a large table, running an expensive join in background periodically, etc.

I can also imagine some nice combinations with data refresh and custom background workers. Who said that automatic data refresh on a materialized view was not possible?

Now let’s have a look at how it works.
postgres=# CREATE TABLE aa AS SELECT generate_series(1,1000000) AS a;
SELECT 1000000
postgres=# CREATE VIEW aav AS SELECT * FROM aa WHERE a <= 500000;
CREATE VIEW
postgres=# CREATE MATERIALIZED VIEW aam AS SELECT * FROM aa WHERE a <= 500000;
SELECT 500000

Here is the size that each relation uses.
postgres=# SELECT pg_relation_size('aa') AS tab_size, pg_relation_size('aav') AS view_size, pg_relation_size('aam') AS matview_size;
tab_size | view_size | matview_size
----------+-----------+--------------
36249600 | 0 | 18137088
(1 row)

A materialized view uses storage (here 18M), as much as it needs to store the data it fetched from its parent table (with size of 36M) when running the view query.

The refresh of a materialized view can be controlled really easily.
postgres=# DELETE FROM aa WHERE a <= 500000;
DELETE 500000
postgres=# SELECT count(*) FROM aam;
count
--------
500000
(1 row)
postgres=# REFRESH MATERIALIZED VIEW aam;
REFRESH MATERIALIZED VIEW
postgres=# SELECT count(*) FROM aam;
count
-------
0
(1 row)

The new status of table aa is effective on its materialized view aam only once REFRESH has been kicked. Note that at the time of this post, REFRESH uses an exclusive lock (ugh...).

A materialized view can also be set as not scannable thanks to the clause WITH NO DATA of REFRESH.
postgres=# REFRESH MATERIALIZED VIEW aam WITH NO DATA;
REFRESH MATERIALIZED VIEW
postgres=# SELECT count(*) FROM aam;
ERROR: materialized view "aam" has not been populated
HINT: Use the REFRESH MATERIALIZED VIEW command.

There is a new catalog table to help you find the current state of materialized views called pg_matviews.
postgres=# SELECT matviewname, isscannable FROM pg_matviews;
matviewname | isscannable
-------------+-------------
aam | f
(1 row)

It is also not possible to run DML queries on it. This makes sense as the data this view has might not reflect the current state of its parent relation(s). On the contrary, a simple view runs its underlying query each time it is needed, so a parent table could be modified through it (per se updatable views).
postgres=# INSERT INTO aam VALUES (1);
ERROR: cannot change materialized view "aam"
postgres=# UPDATE aam SET a = 5;
ERROR: cannot change materialized view "aam"
postgres=# DELETE FROM aam;
ERROR: cannot change materialized view "aam"

Now, a couple of words about performance improvement and degradation you can have with materialized views as you can manipulate indexes on those relations. For example, it is easily possible to improve queries on the materialized views without caring about the schema of its parent relations.
postgres=# EXPLAIN ANALYZE SELECT * FROM aam WHERE a = 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------
Seq Scan on aam (cost=0.00..8464.00 rows=1 width=4) (actual time=0.060..155.934 rows=1 loops=1)
Filter: (a = 1)
Rows Removed by Filter: 499999
Total runtime: 156.047 ms
(4 rows)
postgres=# CREATE INDEX aam_ind ON aam (a);
CREATE INDEX
postgres=# EXPLAIN ANALYZE SELECT * FROM aam WHERE a = 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Index Only Scan using aam_ind on aam (cost=0.42..8.44 rows=1 width=4) (actual time=2.096..2.101 rows=1 loops=1)
Index Cond: (a = 1)
Heap Fetches: 1
Total runtime: 2.196 ms
(4 rows)

Take care also that indexes and constraint (materialized views can have constraints!) of the parent relation are not copied with the materialized view. For example, a fast query scanning some table's primary key might finish with a deadly sequential scan if it is run on an underlying materialized view based on this table.
postgres=# INSERT INTO bb VALUES (generate_series(1,100000));
INSERT 0 100000
postgres=# EXPLAIN ANALYZE SELECT * FROM bb WHERE a = 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Index Only Scan using bb_pkey on bb (cost=0.29..8.31 rows=1 width=4) (actual time=0.078..0.080 rows=1 loops=1)
Index Cond: (a = 1)
Heap Fetches: 1
Total runtime: 0.159 ms
(4 rows)
postgres=# CREATE MATERIALIZED VIEW bbm AS SELECT * FROM bb;
SELECT 100000
postgres=# EXPLAIN ANALYZE SELECT * FROM bbm WHERE a = 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Seq Scan on bbm (cost=0.00..1776.00 rows=533 width=4) (actual time=0.144..41.873 rows=1 loops=1)
Filter: (a = 1)
Rows Removed by Filter: 99999
Total runtime: 41.935 ms
(4 rows)

Such designs are of course not recommended on a production system, only be aware that bad designs will badly impact your application performance (that's always the case btw).

It is really a nice thing to have particularly for caching applications! So enjoy!

©2010-2013 Michael Paquier All content is ©Copyright of Otacoo.com 2010-2013. Privacy Policy - Terms of Use