Using PostgreSQL for JSON Storage (crunchydata.com)
71 points by craigkerstiens on April 30, 2020 | hide | past | favorite | 28 comments


> Let's get the users last name.

    SELECT json_content ##> {person, last_name} FROM mytable;
> The #> or ##> is the JSON path navigator with the difference being #> returns JSON and the ##> returns the JSON text value.

That should be #>> (two ">"), not ##> (two "#").

The official docs are really good and they include a number of inline examples showing the difference between the operators: https://www.postgresql.org/docs/current/functions-json.html

The bottom example is missing a colon on the cast as well:

> SELECT json_content FROM mytable WHERE json_content @> '{"status": "Awesome"}':jsonb;

Should be two colons ("::") for an explicit cast: '{"status": "Awesome"}'::jsonb
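Put together, the two corrected queries from the article would look like this (assuming, as the article does, a table `mytable` with a jsonb column `json_content`):

```sql
-- #>> returns the value as text; #> would return it as JSON
SELECT json_content #>> '{person, last_name}' FROM mytable;

-- Two colons ("::") for the explicit cast to jsonb
SELECT json_content FROM mytable
WHERE json_content @> '{"status": "Awesome"}'::jsonb;
```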


I don't like those seemingly arbitrary json operators. ->> #>> ?& @>. IMHO, they'd be easier to understand with proper function names.
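For what it's worth, Postgres does ship named-function equivalents for some of the operators, so you can avoid the symbols if you prefer (sketch, assuming a jsonb column `json_content` on `mytable`):

```sql
-- Operator form:
SELECT json_content #>> '{person, last_name}' FROM mytable;

-- Equivalent named-function form:
SELECT jsonb_extract_path_text(json_content, 'person', 'last_name')
FROM mytable;
```

The catch is that only some operators have such equivalents, and the operator forms are what the planner matches against GIN indexes.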


They’d have to invent a syntax for infix function calls akin to Haskell’s backticks which other people would complain about.

(I’m happy we don’t have to write queries in yaml, only half joking.)



And Postgres has a ton of `JSON_` and `JSONB_` functions, so the weird operators are even more baffling.


Why have infix anything? Functions have names and they're always the first symbol in a function call. SQL also has a convention of using `PLAIN ENGLISH WORDS` which declaratively describe the result set or operation. The weird JSONB operators are just... way out of place.


I guess PostgreSQL doesn't do everything for me :D Thanks for catching those and I will be updating the post soon.

Did you get a chance to try the hands on exercise?


Should be fixed now - thanks!


Not to complain, but it seems the Web has become a huge spam bucket with useful nuggets few and far between.

I have seen such articles a gazillion times till now.

An article that talks about a subject, very very superficially, includes an obvious example, and concludes with a generic statement.

Maybe the "users are the content creators" is not such a good idea after all.


I don't disagree that this article isn't news to anyone who follows Postgres and has used it. Yet I also have a conversation about once a week with someone who has no idea Postgres has any form of JSON support, much less what's possible with it. There are a lot more beginners out there in the world, and this type of content is useful to them.

A deep dive of how GIN indexing works for JSON might be a useful nugget for those that want to dive really deep, but there is also a place for plenty of beginner content and things that may seem obvious to you.


I understand your point. But my contention is that there is a lot of spam, especially in the beginner content. So much so that real useful information is nowhere to be found in search engines.

I have recently started learning React and when I search for a topic, I have to wade through a myriad of articles that are basically "Get Node, Get React, CRA new, etc. etc." Nothing of actual use, but search engines pick up because of SEO I guess.

Finding something of value is becoming increasingly difficult. On top of that, I use DDG as my primary search engine and it doesn't help.

For articles regarding PGSQL, their documentation is the ultimate source and I frankly see no reason for these articles to exist.


I hear you on value of articles diving deeper. The Postgres docs are awesome, but I find them most valuable as reference docs once you already know how to do something. I find if you're trying to learn how to do a new thing they're not quite as useful.

There is definitely some value in filling the gap between a hello world of JSON and exactly how to use a specific operator. I was going to follow up by asking whether there are any specific Postgres topics you might find useful (I'd be happy to start personally writing a few), but for you it sounds like the docs are good enough :)


Thank you for the gesture.

I love the PGSQL docs. I usually experiment my way through them; most of the time I am solving a problem and will learn a bit of the underlying mechanism, quirks, and performance tips before going ahead with a method.

For those, I find those docs more than sufficient.

I found the same with Django docs too.

The worst offender is NodeJS, and to a lesser extent, React.

Edit: If only there were a way for users to annotate comments onto the core docs, with upvoting/downvoting of the comments, we could have better documentation.

I have seen primitive versions of such mechanisms in PHP docs.


Thanks for the feedback. I think the article was really trying to whet people's appetite for the hands-on tutorial linked in the article. I wrote the blog post just to give people enough to:

1. Know they can actually store, query, and update JSON in PostgreSQL

2. Get curious and dig in deeper with the free hands-on training.


This type of article is just as important as the deeper ones. It's not "obvious" to everyone; most non-Postgres users don't know about its JSON features (yes, most! https://xkcd.com/1053/)


I use JSONB heavily in a large application in production for many years. A few things to be careful of:

Unbounded data: this will cause performance problems as the size of your table grows. One single 8 MB JSON value in one row can bring down the performance of an entire collection.

which leads to

Schema: keep your documents structured and validate them before they touch the database, at least the parts your indexes care about. GIN indexes can get pretty big.

Indexes: GIN indexes are fairly costly to build on large data sets. You may need to consider partitioning and careful use of GIN indexes to maintain fast query speeds with online updates.
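One way to enforce the "validate before it touches the database" point above is a CHECK constraint on the shape of the document. A minimal sketch, with hypothetical table and column names (`orders`, `doc`):

```sql
-- Reject rows whose jsonb payload is missing the keys the
-- application (and its indexes) depend on, or has them mistyped.
ALTER TABLE orders
  ADD CONSTRAINT orders_doc_shape CHECK (
    jsonb_typeof(doc -> 'status') = 'string'
    AND jsonb_typeof(doc -> 'line_items') = 'array'
  );
```

This only guards the keys you name; fuller validation (e.g. JSON Schema) still belongs in the application layer.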


Thanks for this great information. The hands-on tutorial linked in the article addresses some of these constraints you talk about.


At my office, we lean heavily on Postgres' JSONB capabilities, especially for archived data that is occasionally accessed, and to handle data dumps from other services. The JSONB traversal operators are intuitive and searching is fast enough to be useful. Yet another reason that I reach for Postgres first time, every time.


And you can index the fields in the jsonb too! PG all the things!
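For example (sketch, reusing the article's hypothetical `mytable` / `json_content` names):

```sql
-- GIN index supporting containment (@>) and existence (?) queries
-- across the whole document:
CREATE INDEX mytable_json_gin ON mytable USING GIN (json_content);

-- Expression (B-tree) index on one extracted field, for equality
-- and ordering on that field:
CREATE INDEX mytable_status_idx ON mytable ((json_content ->> 'status'));
```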


I use JSONB heavily for a few things:

* Tags. Basically the same as array of strings, which postgres also supports, but I don't need to remember a different set of operators.

* Polymorphic data. Instead of countless nullable fields that represent the union of all fields of all subtypes, I just include a single (or in some cases, a couple) JSONB columns whose schema varies by type. The schema is enforced by types at the application layer.

* Truly schemaless data. Sometimes I just need to keep track of an arbitrary JSON blob defined by some other service (say, an Order from shopify). Just dump it in my table as-is and now I can search orders.
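The polymorphic-data pattern above can be sketched roughly like this (table and column names are made up for illustration):

```sql
-- Shared fields stay relational; the per-type payload lives in a
-- single jsonb column whose shape varies by event_type.
CREATE TABLE events (
  id         bigserial PRIMARY KEY,
  event_type text NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now(),
  payload    jsonb NOT NULL
);

-- Each type carries a different payload shape; the application
-- layer enforces the schema for each one.
INSERT INTO events (event_type, payload)
VALUES ('order_placed', '{"order_id": 42, "total_cents": 1999}');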

Postgres' json functions and operators aren't wholly intuitive (updating bits of nested data can be hard; it's really a lot better if you can just update whole json fields at once) but overall it works pretty well.
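On updating bits of nested data: `jsonb_set` is the usual tool, but note it returns a whole new document, so the entire column value gets rewritten (sketch, assuming `mytable` has an `id` column):

```sql
-- Replace one nested value; the full json_content value is
-- rewritten on disk even though only one key changed.
UPDATE mytable
SET json_content = jsonb_set(json_content, '{person, last_name}', '"Smith"')
WHERE id = 1;
```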


> Polymorphic data. Instead of countless nullable fields that represent the union of all fields of all subtypes, I just include a single (or in some cases, a couple) JSONB columns whose schema varies by type. The schema is enforced by types at the application layer.

I am surprised there aren't many database interface libraries that build upon this. This is a very powerful aspect of JSONB in PGSQL.


Also, because JSONB sits so close to a relational data model, I've found it's a bit easier to pick up some relational data work when needed, or vice versa.

Accounting for user balances vis a vis some clearing and payment processing accounts? Let's do that with ref integrity and relational model.

Developers want to add all sorts of cruft and notes and weird data to the user record or order record they may want to look back on once in a while or display somehow, give them JSONB.

The alternative seems to be a document DB that folks stretch a hundred ways to Sunday trying to shoehorn a relational model in, or vice versa (exploding schemas and slow dev velocity as whoever manages the DB complains).

And with AWS offering hosted PostgreSQL on RDS, with t3.micro instances as an option, it's really easy to reach for this hammer. I just do a 1-year reserved instance at ~$5/month for development, and you can deploy to any size.


Postgres admittedly cheated a bit at first with the basic JSON support, validation of JSON but then storing it in a text field didn't quite make it a full document database. But that was also 8 years ago... Even then, there is still a valid use case for JSON if you're not looking to parse it or want to preserve the whitespace; this is handy when recording API input/output for logs.

JSONB came just 2 years later and very much gives more document capabilities. Being a compressed format, the ability to index it, query based on keys and values... all those things make it incredibly handy. It's very rare for me to have an application with Postgres these days that doesn't leverage JSONB in some way.


> Postgres admittedly cheated a bit at first with the basic JSON support, validation of JSON but then storing it in a text field didn't quite make it a full document database.

Did anybody claim it was back then?

> Being a compressed format

I'd argue it's a structured format, rather than a compressed format. I.e. it allows traversal without (re-)parsing the json from text. But most of the time the jsonb version won't be (meaningfully) smaller.
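The structured-vs-verbatim distinction is easy to see by casting the same literal both ways (the exact jsonb key ordering is an implementation detail, so treat the output as illustrative):

```sql
-- json keeps the input text verbatim; jsonb normalizes it:
SELECT '{"b": 2,  "a": 1, "a": 3}'::json  AS as_json,
       '{"b": 2,  "a": 1, "a": 3}'::jsonb AS as_jsonb;
-- as_json  preserves whitespace, key order, and the duplicate key;
-- as_jsonb drops whitespace and keeps only the last duplicate key.
```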


Hi Andres... I think I did back then ;)

Though to be fair, yes, the PostgreSQL project has never officially said it's a full document database, to my knowledge.


This is probably premature optimization, but what's the performance cost of updating a deeply nested field inside a jsonb? What about indexing it?

(compared to storing in some normalized form or in other popular document stores)


I made a linter for PG with a rule that will complain if you use JSON instead of JSONB: [link redacted]


Love it!



