This isn't remotely true; it turns what looks like an in-memory access into a network round trip. Navigating your database using an idiom of lists and member accesses is a fast path to n+1 queries all over the place, or ORM tuned to eagerly fetch the whole world when you only touch a tiny piece of it.
The closer an ORM's API is to a monad, the happier you'll be. Fundamentally, accessing the database is executing remote code; the more you can package up into the query before it goes off and does anything, the better performance you'll see.
IMO trying to shoehorn objects (in the OO sense, with polymorphism, data hiding and behaviour) into a database is wrong-headed. Data hiding in particular is a wrong mental model for thinking about facts in the database, and the more is hidden, the harder it will be to reason about performance and bulk remote operations generally.
> This isn't remotely true; it turns what looks like an in-memory access into a network round trip. Navigating your database using an idiom of lists and member accesses is a fast path to n+1 queries all over the place, or ORM tuned to eagerly fetch the whole world when you only touch a tiny piece of it.
That isn't true for e.g. Django's ORM, which lazily evaluates the query and only actually accesses the db after that, with a single query, and filtering done in SQL.
This is not the case - with Django's ORM you are still susceptible to the N+1 problem. It can be largely mitigated through `select_related` and `prefetch_related`, but the fundamental issue is there - that simply by accessing an attribute for an object that wasn't already fetched, you can do another database query without it being at all clear in the code that this will happen.
That's true, though select_related is mostly the default, and that behaviour is documented well. In practice, I very seldomly find N+1 cases with Djangos ORM, even when doing relatively complex data analysis.
> Fundamentally, accessing the database is executing remote code; the more you can package up into the query before it goes off and does anything, the better performance you'll see
The queries that most tend to be making just aren't that sophisticated. Relationships tend to be basic.
And I haven't even mentioned stuff that'd really, really hard to express/manage in pure SQL like tree/nested information that has to stay balanced [1]. Thanks django-treebeard/mptt and sqlalchemy-orm-tree.
> The closer an ORM's API is to a monad, the happier you'll be.
Developers using ORM's simply aren't caring about ORM's matching a certain construct. They care that models emit correct representations of their schemas and that the data is retrieved "fast enough".
Take it a different way: the best part about ORM's? They're effective 95% of the time, right out the box, so you end up avoiding time that'd be spent over and prematurely optimizing.
> IMO trying to shoehorn objects (in the OO sense, with polymorphism, data hiding and behaviour) into a database is wrong-headed.
Objects in things like SQLAlchemy declarative and Django Models map perfectly to generated SQL, so they also act as a way to generate migrations. It's that accurate. A lot of the relationships project's need expressed tend to be vanilla joins.
> Data hiding in particular is a wrong mental model for thinking about facts in the database, and the more is hidden, the harder it will be to reason about performance and bulk remote operations generally.
ORM's strive to hit a value sweet-spot in terms of code expressiveness, reducing duplication, handling the bread and butter relationships and types. That covers what most developers really need.
Perhaps there are projects out there not fitting to ORM's. Not all projects are sophisticated data mart projects, but even then, a good share of those still go back to simple joins at the end of the day.
And I've even gone as far as trusting heavy-duty stuff like django-mptt, along with plugins that filter and sort. I don't even look at the queries, all I see is they're running performantly. In all these years, SQL queries have never been a bottleneck. Maybe it's because I'm only storing simple stuff.
An ORM provides easy DB access and various implicit operations. The trade off is that if you don't know what you're doing, your app gonna have bad performance. Seems fair to me.
Also, Django ORM documentation is very clear and you can use Django Debug Toolbar to analyse the raw sql generated.
>This isn't remotely true; it turns what looks like an in-memory access into a network round trip. Navigating your database using an idiom of lists and member accesses is a fast path to n+1 queries all over the place, or ORM tuned to eagerly fetch the whole world when you only touch a tiny piece of it.
Only if you use naive ways of doing this. In .NET at least your overcomplicated expression can be compiled down to the minimum query needed to pull the bits of data you use.
At least with EF it's possible to completely disable lazy loading. I always recommend doing so -- when you're passing entities between methods it's pretty easy to lose track of what's implicitly 'loaded' on the object. Innocent changes to a method that operates on an entity can cause a database round trip...horrible idea.
This isn't remotely true; it turns what looks like an in-memory access into a network round trip. Navigating your database using an idiom of lists and member accesses is a fast path to n+1 queries all over the place, or ORM tuned to eagerly fetch the whole world when you only touch a tiny piece of it.
The closer an ORM's API is to a monad, the happier you'll be. Fundamentally, accessing the database is executing remote code; the more you can package up into the query before it goes off and does anything, the better performance you'll see.
IMO trying to shoehorn objects (in the OO sense, with polymorphism, data hiding and behaviour) into a database is wrong-headed. Data hiding in particular is a wrong mental model for thinking about facts in the database, and the more is hidden, the harder it will be to reason about performance and bulk remote operations generally.