Exactly. I've been in web dev for 20 years but mostly in the front end space. A ...

mjfisher · on May 6, 2022

This might be just over-familiarity on my part, but does SQL really have a strong learning curve, or is it just not used often enough directly these days that people can get by without knowing it?

Standard SQL is a really simple grammar and a very small keyword set - there's basically selecting, updating, deleting, filtering with where, aggregate queries, grouping and joins, and that's like 95% of it. Sub-queries maybe too.

jseban · on May 6, 2022

> This might be just over-familiarity on my part, but does SQL really have a strong learning curve, or is it just not used often enough directly these days that people can get by without knowing it?

I think the problem is that it's declarative instead of imperative, which is really kind of a shock if you are not used to it (you can't go step by step, there's no debugger, there are no branches etc), and also that you have to think in sets in terms of your solution, which is also awkward when you're not used to it.

I think it's definitely worth it though, as nothing we have beats the relational model for CRUD, and there are so many great learning tools online, for example: https://sqlbolt.com/

marcosdumay · on May 6, 2022

SQL has a large learning curve, you can keep learning new thing on it for ears. But not a particularly steep one, you can start using it with very little knowledge, and anything extra your learn immediately improves your situation.

outworlder · on May 6, 2022

> does SQL really have a strong learning curve

Depends. It's easy if all you want to do is select * from whatever;

When you get into subqueries and a whole ton of joins to get the information you need, it can get pretty complicated.

I mean, we have a full university course which was 80% SQL, spanning 6 months.

TL;DR the language is not complicated. Actually using it, can be.

spmurrayzzz · on May 6, 2022

In my experience mentoring entry-level/junior devs, mongo's API anecdotally seems to have a much steeper learning curve over SQL. Once you get past the fundamental CRUD idioms, there are a multitude of implementation details that, if treated as opaque by devs, can introduce significant footguns in even moderate throughput load services.

Some of these details go all the way down to the WiredTiger storage engine, but others are more vanilla (e.g. indexing strategies, atomicity guarantees, causal consistency, etc).

I personally abandoned SQL about a decade ago, but I can appreciate how clean the interface semantics are for even non-technical folks. There are certainly platform-specific implementation details that can matter, especially when you get into the world of partitioning. But largely for most service loads, you're writing queries that satisfy the known index constraints that you imposed on yourself rather than constraints resultant from implementation details.

(I totally realize that even with SQL, that last statement completely changes at a certain scale threshold.)

Xeronate · on May 6, 2022

> I personally abandoned SQL about a decade ago

So you're in the camp that nosql data stores like dynamo/mongo is a good replacement for most SQL workflows? Can you expand on this a bit if you have the time?

spmurrayzzz · on May 8, 2022

I don't actually view decisions like this as binary or mutually exclusive at all. I'm a big proponent of polyglot persistence [1], use the data stores you know and double down on what you know well. I use mongo primarily in my current work, but also have redis, elasticsearch, graphite, and etcd as sidecars in the same ecosystem.

I didn't jettison SQL because there was some fundamental limitation from a storage or scalability perspective. It was clear SQL wasn't going anywhere and would be a good infrastructure bet moving forward.

But initially what drew me to mongo was the clean interop between it and JS (Node.js is my runtime of choice). The shell is written in JS, you query (filter) using objects, you insert and update with objects. This seems like a small thing but this sort of developer experience over time is impactful. Everything feels very native to JS and it does so without any heavy abstractions like ORM/ODM.

After having used it now for 10+ years though, there's much more that I admire about it. Both from a pure architecture lens, but also from API perspective as well as it continues to get better.

To cherry pick an example— The new time series collection feature is a good example of that. For years, folks were using the bucket pattern [2] as a means to optimize query strategies for time series data by lowering timestamp cardinality. Now in v5.0, they give you a native way to specify the same kind of collection but they handle all the storage implementation details related to bucketing for you. Your API to interface with that collection remains mostly the same as any other collection. This sort of community-driven roadmap inertia is attractive to me as an engineer.

(Somewhat of a stream of consciousness here, but hopefully gives you some context as to why I made the switch so long ago)

[1] https://martinfowler.com/bliki/PolyglotPersistence.html [2] https://www.mongodb.com/blog/post/building-with-patterns-the...

hodgesrm · on May 9, 2022

Excellent post. I'm very much in the SQL camp myself, but the clean mapping between MongoDB and Javascript data models is outstanding. If you have a front end that just needs persistence as opposed to complex queries MongoDB is the obvious answer.

spmurrayzzz · on May 9, 2022

Its worth noting, for clarity/posterity, that my prior post is actually discussing mongo in a (mostly) backend distributed systems environment. I find its just as impactful there, not just in more conventional CRUD/browser/full-stack apps.

BackBlast · on May 6, 2022

I'll bite, I have use cases alone these lines. I don't use SQL in new projects.

When iterating on new systems, especially one with live users. I can keep users at different document schemas. If I'm careful I can make it so that all document schema changes don't break old ones yet also allow for new functionality not requiring mass migrations of documents.

CouchDB allows the db to just be exposed to the world directly. Projects where it's reasonable (user owned/controlled data particularly), I can stand up the system with 97% front end code 3% backend. Having the near entirety of your application stack in one place means you can use smaller more specialized teams and your overall areas of concerns are smaller without needing to draw up a formal spec for your data transport.

The whole GraphQL vs REST debate is meaningless when you don't even have to think about your transport stack between the server and the browser. There are other perks to this model such as providing a fully functional website/webapp even while offline. It's trivial to switch between a couchdb backend and local pouchdb copy of your db. Potentially lower bandwidth use while just transferring updated docs instead of consistent queries or asset fetching where the same data moves across the wire over multiple uses (not a win for single visits to a single page style sites). Keeping multiple clients on the same document set in sync without socket.io work.

pmoriarty · on May 6, 2022

"SQL has a strong learning curve which many devs want to avoid"

Really? Of the dozens of languages that I've learned, SQL has been the easiest.

It really feels like it was designed for non-programmers.

reidjs · on May 6, 2022

I started a toy project with the intention of using raw SQL, but I ended up starting to build my own ORM around all the models.

If I have a User and who is trying to create a new Post, with Prisma you eventually set it up to do something like User.createPost(content).

What does createPost method look like with raw SQL? Does it read in from a .sql file that you pass values to?

necovek · on May 6, 2022

The big benefit of ORMs is in the query builders they provide: basically, syntax checking for SQL inside your language of choice, and nicer composition of SQL query parts (to make your code more DRY). Actual mapping to objects is always too heavy in my experience.

However, a slightly unrelated comment on your choice of API design: this approach always introduces an asymmetry in the model that restricts what you can do. If you start allowing post imports that auto-detect authors, you now need a Post.create(content) and Post.setUser(user) too. And then your API users start wondering what's the idiomatic way to create a new post.

The problem is that you are making an early assumption that all posts will belong to a user, yet representing that in an SQL database with database relations, one being independent of the other (User: id, name, email...), and another referencing the first (Post: id, date, user -> User, content...). Your database model allows easy transition to allowing nulls for `user`, yet your API doesn't.

Moving to a more functional API makes this much more natural and less restrictive. Shallow DAOs for User and Post and a function create_post(content, user) may look just like a namespacing difference, but they match your database design more closely. If you want to allow nulls for user in the database, you just do the same in the create_post function.

You can wrap related functions into modules (or classes) — in the domain driven design, most of these would be port/adapter functions, but if your DAO classes are sufficiently shallow, they could be service or domain functions too, etc — they are still ultimately functions (no shared state or side effects).

reidjs · on May 6, 2022

Thanks for clarifying, that makes a lot of sense.

jseban · on May 6, 2022

Just a string in my case, JDBC prepared statement or the equivalent. But if I could really choose freely, I would put all queries as functions/procedures inside the DB to achieve real decoupling from the schema, get consistency with transactions etc, but if I mention that idea, the pitchforks come out and I get chased off the property by the backend developers who become pretty much obsolete in that architecture.

jeffdn · on May 6, 2022

That, or just a string in your application’s code.

The problem with using the ORM as you describe is that when you hit any sort of scale, you need to be doing bulk operations, otherwise your latency goes through the roof, to the point that the number of inefficient queries you are doing can tank the database. I speak from the experience of having seen a database collapse under the load of a backend written in this fashion having request load grow past a certain point — not pretty! The interim solution is to bulkify existing queries and functions in place to the greatest extent possible, while preparing for:

Converting a codebase from having endpoints doing individual ORM operations as described to having proper separation of concerns with a business logic layer between the endpoints and the database is a _massive_ cost. The earlier you implement that, the happier you will be in the longer term. It doesn’t have to be with raw SQL, but many bulk operations are much easier to express with SQL than with the ORM.

cloverich · on May 6, 2022

Doesn't an escape hatch on the ORM provide that though? I seem to remember in both sqlalchemy and (libraries that use) knex being able to dip down into SQL when needed.

foobarian · on May 6, 2022

Coincidentally, modern graphQL backend libraries will do this for you. See e.g. graphql-java, apollo-server, many others.

pycal · on May 6, 2022

If you put PostgREST in front of your Postgres instance, it looks like

POST https://my.website.com

{ title: “Cool New Technology”, article: “I learned a thing today”, user_id: 1234 }

jimbokun · on May 6, 2022

I'm fond of query builder APIs, that only allow you to generate valid queries.

So createPost would just generate the appropriate query with the necessary parameters, and execute it.

eurasiantiger · on May 6, 2022

How many different entity types and relationships between entities does your typical application have?