PostgreSQL 9.4 Beta 1 Released

gdulli · on May 15, 2014

I have an 8TB database with 10 bn rows, each with multiple json columns where most of the real data is stored... so jsonb day is like christmas for me.

gwintrob · on May 15, 2014

Mind explaining your setup a bit more and why you're not using NoSQL? That sounds painful for Postgres.

gdulli · on May 15, 2014

We have an organization full of analysts who know SQL very well and write complex queries. Moving them all to a mapreduce query language proprietary to mongo, riak, or cassandra (and then being locked into it) would have been much more painful. Postgres has worked out very well, and I had confidence in it from prior experience which influenced the decision. With proper partioning in the schema and a json type we have the best of both worlds (SQL+NoSQL) as far as this system needs.

meritt · on May 15, 2014

Or alternately: Why wasn't the JSON parsed into normal form if the intent was to search & filter across that data.

gdulli · on May 15, 2014

To get the same flexibility you'd want from a nosql data store. Without giving up the simpler, more open access methods. The schema is partly predictable (and indexed or fielded on those parts) and partly unpredictable.

ddorian43 · on May 15, 2014

why not add the partly predicted keys as columns ?

also jsonb will require a little more storage

is 8tb in 1 tablespace/db ?

darkhorn · on May 15, 2014

johanneskanybal · on May 15, 2014

After playing around in nosql land for the last 3 years I've landed at postgresql as my go-to sql alternative although the market will keep being fragmented.

edit: json support in combination with a less fanatic approach to normalization can approximate some of the benefits a nosql model can while maintaining the bulk of the tried-and-tested flows in a relational database.

rpedela · on May 15, 2014

SQL alternative?

symfoniq · on May 15, 2014

I think he probably means that certain Postgres features like hstore and JSON support allow it to be used in situations where he once would have chosen a no-SQL solution. This has been my own experience.

johanneskanybal · on May 15, 2014

sql alternative for nosql. @symf: yea pretty much.

tbarbugli · on May 15, 2014

Postgresql is getting better and better! I am happy lately every release has some more feature / improvement about replication.

alternize · on May 15, 2014

looking especially forward to JSONB, the extended array manipulation functions and the improved GIN index!

giulianob · on May 15, 2014

I can't wait until they support manipulating JSON directly (e.g. insert object to end of a JSON array)

ddorian43 · on May 15, 2014

i think there were functions for the new hstore(that became jsonb) but coudn't make it in core so will probably/hopefully be released as an extension

ukd1 · on May 15, 2014

non blocking refresh on materialized views is going to be massive! I wonder when it's going GA and Heroku rollout.

elchief · on May 15, 2014

Should be GA in August/September if the present mimics the past.

vfclists · on May 15, 2014

When is PostgreSQL going to get its own modern language which is functional/object-oriented/homoiconic/has-all-new-fangled-goodies built in, with a GUI language thrown in for good measure, so I can do everything in PostgreSQL without having to turn to another language?

audreyt · on May 15, 2014

Well, http://pgre.st/ was posted 167 days ago here — https://news.ycombinator.com/item?id=6818284 — that's how we do everything in PostgreSQL without having to turn to another language :-)

It's like Kanso/CouchApp for CouchDB, and has support for the functional/object-oriented/new-fangled LiveScript language on top of the v8 engine.

Still working on the homoiconicity part, though!

jeffdavis · on May 16, 2014

By "homoiconic" do you mean that the language would be expressed as a table (unordered multiset of tuples)?

Interesting thought experiment, but it sounds pretty radical.

vfclists · on May 16, 2014

To be honest I have no idea what homoiconic means.

I think it means that everything is the same, ie whether a function in the case of functional languages like Haskell, or an object in languages like Smalltalk.

jeffdavis · on May 16, 2014

http://en.m.wikipedia.org/wiki/Homoiconic

"In a homoiconic language the primary representation of programs is also a data structure in a primitive type of the language itself. This makes metaprogramming easier..."

For lisp, that means programs are are also lists, so can be manipulated with macros easily (because lisp macros are also lisp, which is good at working with lists).

For a relational language, that would mean the programs are also relations, which would be a little weird because relations are unordered.

azinman2 · on May 15, 2014

Going to be such a great release. Congrats PostgreSQL team! I love your product!

polskibus · on May 15, 2014

I'm particularly excited about improvements in materialised views that will make Postgresql better suited to OLAP usage scenarios.

corford · on May 15, 2014

pg_prewarm looks interesting but the docs are currently a little opaque as to how best to use it (for me at least!): http://www.postgresql.org/docs/devel/static/pgprewarm.html

wfn · on May 15, 2014

Oh, is it now a built-in module in 9.4beta?

I've used it in 9.1 (downloading the patch, applying it, recompiling postgres..) - have found this article re: postgres caching mechanisms (and pg_prewarm use in particular (though re: pg_prewarm, most of it is about applying the patch itself, etc.)) useful:

http://raghavt.blogspot.com/2012/04/caching-in-postgresql.ht...

P.S. here are some more comments from the author: http://www.postgresql.org/message-id/CA+TgmoZihvzFW6n6pPwzDO... (btw, I've found the pgsql-hackers to be a great mailing list indeed!)

corford · on May 15, 2014

Thanks wfn, those links are really helpful.

alternize · on May 15, 2014

very interesting indeed, thanks for the pointer

bradleysmith · on May 15, 2014

"ALTER SYSTEM SET, which enables modifications to postgresql.conf from the SQL command line and from remote clients, easing administration tasks"

This is fantastic. We're always adding and removing ip addresses and restarting the server with each update was a pain.

icebraining · on May 15, 2014

Why did you restart the server? You can just send SIGHUP to have it reload the file without restarting.

bradleysmith · on May 15, 2014

Actually, I wasn't restarting the server. The pain was reloading that file, as it would break work that other people were doing at the time. I'm looking into whether modifying the postgresql.conf file from a query line will let us do what we were having a need to do.

bradleysmith · on May 15, 2014

Hah! Because I'm self-taught, and I didn't know I could.

Appreciate it!

joevandyk · on May 15, 2014

There's a number of options that you can change in postgresql without having to restart the whole server.

http://www.postgresql.org/docs/9.3/static/auth-pg-hba-conf.h... explains that you can just do a reload after changing pg_hba.conf.

http://www.postgresql.org/docs/9.3/static/config-setting.htm... explains more.

bradleysmith · on May 15, 2014

thanks much, digging through both.

joevandyk · on May 15, 2014

Make sure to have your backup routine archive the custom config file.

gfisher · on May 15, 2014

The refresh concurrently materialized views is huge. I am glad that it was introduced since I currently have to use tables with functions to create/drop to get the same functionality.

nevi-me · on May 15, 2014

Am I right in saying that JSONB is another way to say BSON?

icebraining · on May 15, 2014

No, they're different formats: http://pgeoghegan.blogspot.pt/2014/03/what-i-think-of-jsonb....

justplay · on May 15, 2014

waiting for upset improvement.

pilif · on May 15, 2014

In that regard, have a look at http://www.depesz.com/2012/06/10/why-is-upsert-so-complicate... which is a very good article explaining why you have to continue to wait.

Getting MERGE right is one of these "pick two" things. You get to chose between "fast", "doesn't corrupt data" and "doesn't throw errors at times" but you only get to pick two.

ajtaylor · on May 16, 2014

Pretty much every Pg related post on depesz.com is a gold mine. You should read them all, especially where he explains upcoming features in new Pg releases. I've learned a ton simply be browsing through his site, which is very readable with great examples to put the concepts into practice.

spitfire · on May 15, 2014

I pick correct semantics NOW and fix performance issues at your leisure.

upsert has been an embarrassment for Postgres for far too long now. Particularly obnoxious was the "upsert is hard! Roll your own" stance taken.

jeffdavis · on May 15, 2014

A completely reasonable stance.

However, it's not so much a performance question as one of concurrency. The easiest way to get the right semantics is by locking the entire destination table against concurrent writes. Maybe that's fine for a lot of users. But it has caused some hesitation, because other users might be disappointed.

It would be fair to say "just do it" though.

audreyt · on May 16, 2014

Part of the motivation behind PgREST is to get a reusable and generic enough upsert implementation as a stored procedure; see https://github.com/pgrest/pgrest/blob/master/test/upsert.ls for details.

azinman2 · on May 15, 2014

That's a really great blog post, as well as the most informative blog I've seen on postgresql in general! Thanks for the link!

andyl · on May 15, 2014

With 9.4, what will be the best option for multi-master replication? Will 3rd party tools like bucardo remain the preferred method, or will there be another strategy that is more integrated with PG?

jeffdavis · on May 15, 2014

The logical decoding feature is a very important foundation that will help multi-master solutions. No multi-master out-of-the-box today though.

EDIT: logical decoding allows you to get access to a stream of logical data changes, e.g. inserts, updates, deletes, in a robust and efficient way.

rpedela · on May 15, 2014

My understanding is you still need the 3rd party tools. The new replication features lay the foundation for other replication strategies like multi-master, but they are not implemented yet.

rdtsc · on May 15, 2014

In general on the CAP spectrum where does multi-master replication with Postgres fall in?

DrJokepu · on May 15, 2014

Replication in PostgreSQL is actually highly configurable so it depends on how you set it up: http://www.postgresql.org/docs/9.3/static/high-availability....

gatehouse · on May 15, 2014

CP, ~A, AFAIK