
Transactional semantics are problematic with a lot of nosql databases, but I've used a few, and you can work around this with some kind of consistency check using content hashes. Postgres is pretty nice these days for a wide variety of use cases, including nosql ones. And it does transactions pretty nicely.

Regarding the object relational impedance mismatch, check here: https://en.wikipedia.org/wiki/Object-relational_impedance_mi...

In short, there are lots of things you'd do differently in an OO domain model vs. properly normalized tables. A lot of what ORMs do is take object relations and map them to some kind of table structure. You either compromise on your OO design to reduce the number of tables, or you compromise on the database side and end up with way too many tables and joins (basically most uses of ORMs I've encountered in the wild).

For reporting, you can of course go for a hybrid document/column-based approach. I've done that. In a pinch you can even extract some data from the JSON in an SQL query using whatever built-in functions the database provides. Kind of tedious and ugly, but I've done it.
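A minimal sketch of what that looks like, using SQLite's `json_extract()` as a stand-in (Postgres would use the `->>` operator or `jsonb_path_query` instead); the table and field names here are made up for illustration:

```python
import sqlite3

# One JSON document per row; pull fields out with path expressions in plain SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, doc TEXT)")
conn.execute(
    "INSERT INTO docs (doc) VALUES (?)",
    ('{"name": "Alice", "address": {"city": "Utrecht"}}',),
)

# Tedious but workable: dig into the document with JSON path expressions.
row = conn.execute(
    "SELECT json_extract(doc, '$.name'), json_extract(doc, '$.address.city') "
    "FROM docs"
).fetchone()
print(row)  # ('Alice', 'Utrecht')
```

It works, but every field you report on needs its own path expression, which is exactly the tedium described above.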

Or you can use something that was actually built to do reporting properly. I do a lot of stuff in Elasticsearch with aggregations, and it kind of blows most SQL databases out of the water for this sort of thing if you know what you are doing. In a pinch I can do some SQL queries, and I've also used things like Amazon Athena (against JSON or CSV in S3 buckets) as well. Awesome stuff, but limited. Either way, if that's a requirement, I'd optimize the database schema for it.

But for the kind of stuff people end up doing, where they have an employee class and a customer class that are both persons with addresses and a lot of data that is basically only ever going to be fetched by person id and never queried on, I'll take a document approach every time vs. doing joins between a dozen tables. I also like to denormalize things into documents. Having a category table and then linking categories by id is a common pattern in relational databases. Or you can just decide that the category id is a string containing some kind of URN or string representation of the category, and put those directly in a column or in the JSON. You lose the referential integrity check on the foreign key, of course; but then you should not rely on your database to do input validation, so that check would be somewhat redundant anyway.
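A sketch of that "fetch by id, never query on it" pattern, again with SQLite standing in: one table, one JSON blob per person, and the category denormalized to a URN-style string instead of a foreign key. All names and the URN format are invented for illustration:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (id TEXT PRIMARY KEY, doc TEXT NOT NULL)")

person = {
    "name": "Alice",
    "addresses": [{"type": "home", "city": "Utrecht"}],
    "category": "urn:example:category:customer",  # inline string, no join
}
conn.execute("INSERT INTO persons VALUES (?, ?)", ("p1", json.dumps(person)))

# The only access path needed: a primary-key lookup, no joins at all.
doc = json.loads(
    conn.execute("SELECT doc FROM persons WHERE id = ?", ("p1",)).fetchone()[0]
)
print(doc["category"])  # urn:example:category:customer
```

The trade-off is exactly as stated: no referential integrity on the category string, so validation has to live in the application.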



> Postgres is pretty nice these days for a wide variety of use cases, including nosql ones.

Um... what? Do you mean to say that Postgres does a pretty good job as a document store? (Not synonymous with "nosql".)

Despite what that Wikipedia article says, most (if not all) of the "impedance mismatches" it describes apply to most document stores as well. I would be curious to hear which of the mismatches described in that article you think are avoided by using Postgres as a document store. In my mind, the reason for using a document store is to have flexibility in the structure of your data (which can be a positive or a negative depending on your needs).

> Or you can use something that actually was built to do reporting properly. I do a lot of stuff in Elasticsearch...

Of course there are document stores with good reporting. I was talking specifically about the downside of using Postgres as a document store, given your complaints about its native JSON support being fiddly.

> But for the kind of stuff people end up doing where they have an employee and customer class that are both persons that have addresses and a lot of stuff that is basically only ever going to be fetched by person id and never queried on, I'll take a document approach every time vs. doing joins between a dozen tables. I also like to denormalize things into documents.

I often de-normalize addresses in my tables, but that choice is based on how you want to store and update the data. A separate address table is good if you want address edits to propagate automatically between records. A de-normalized address is good if you want to keep a record of the address as it was actually used. De-normalization is always an option with a relational DB, but normalization is not always easy in some document stores.
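To make the contrast concrete, here is a small sketch with both strategies side by side; the table and column names are invented, and SQLite stands in for whatever relational DB you'd actually use:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: edits propagate to every record referencing the address.
    CREATE TABLE address (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        address_id INTEGER REFERENCES address(id)
    );

    -- De-normalized: each order keeps a snapshot of the address as used.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, ship_city TEXT);
""")
conn.execute("INSERT INTO address VALUES (1, 'Utrecht')")
conn.execute("INSERT INTO customer VALUES (1, 1)")
conn.execute("INSERT INTO orders VALUES (1, 'Utrecht')")

# Customer moves: the normalized side updates in one place...
conn.execute("UPDATE address SET city = 'Leiden' WHERE id = 1")
current = conn.execute(
    "SELECT a.city FROM customer c JOIN address a ON a.id = c.address_id"
).fetchone()[0]
# ...while the de-normalized order still records where it actually shipped.
shipped = conn.execute("SELECT ship_city FROM orders").fetchone()[0]
print(current, shipped)  # Leiden Utrecht
```

Neither is "right"; it depends on whether you want a live reference or a historical record.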

> Having a category table and then linking categories by id is a common pattern in relational databases. Or you can just decide that the category id is a string that contains some kind of urn or string representation of the category and put those directly in a column or in the json. You lose the referential integrity check on the foreign key of course; but then you should not rely on your database to do input validation so that check would be kind of redundant.

I'm not quite sure what you are on about here. You can put constraints on string columns, and you can have a table composed entirely of an indexed string column to point that constraint at. Integer ids are primarily used just to save space. (I don't really see how this is relevant.)
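In other words, nothing stops the URN-style string itself from being the primary key, with a real foreign key pointing at it. A sketch, with invented names (SQLite needs the pragma below to enforce foreign keys; Postgres enforces them by default):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific; off by default
conn.executescript("""
    -- The category id IS the string; no integer surrogate key needed.
    CREATE TABLE category (id TEXT PRIMARY KEY);
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        category_id TEXT REFERENCES category(id)
    );
""")
conn.execute("INSERT INTO category VALUES ('urn:example:category:books')")
conn.execute("INSERT INTO product VALUES (1, 'urn:example:category:books')")

# A dangling category string is still rejected by the database.
try:
    conn.execute("INSERT INTO product VALUES (2, 'urn:example:category:nope')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

So you can keep the readable string in the column and still get the referential integrity check.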

I don't see anything here to justify your assertion:

> A good table structure often makes for a poor domain model and vice versa. The friction you get from the object relational impedance mismatch is best avoided by treating them as two things instead of one.

To be frank, it sounds to me like you ran across a bunch of poorly designed DB schemas (or schemas you didn't understand the design decisions for) and decided that it must be impossible to design good DB schemas and so you just use unstructured document stores instead.



