What makes you think that SQL doesn't have "for loops"? Ever heard of LATERAL jo...

beefield · on Feb 5, 2024

I don't think the point was that there are no for loops in SQL. I think the point was that using for loops is almost always wrong and very inefficient, and that there is a much more efficient way: just use joins instead.

fijiaarone · on Feb 6, 2024

I wonder if anyone realizes that selects and joins in every database are implemented literally with for loops written in C.

The reason people use a relational database is because it has loops that are faster, safer, and more efficient than anything you can write.

fifilura · on Feb 6, 2024

What does it matter? Yeah, at the very bottom of it all there will be a loop. I wouldn't attribute that to C but to the von Neumann architecture.

The point is that by getting rid of loops you remove one way of telling the computer "How" to do it. Start here, do it in this order, combine the data this way.

When "How" is a solved problem, it is a waste of your brain to think about that again. "What" is a better use of your brain cycles.

ako · on Feb 6, 2024

Exactly, SQL is like a very specific natural language used to tell a database what data you need. The database contains AI (the query optimizer) that will use statistics to determine the fastest way to retrieve the data requested. Depending on your query and the data you have, the fastest way to get your data will change. It’s hard to manually write this yourself.

riku_iki · on Feb 6, 2024

> joins in every database are implemented literally with for loops written in C.

Different kinds of loops can have very different costs, e.g. two nested loops with quadratic time:

    for i in t1:
        for j in t2:

vs. a sort + merge join with n log n time.
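To make the contrast above concrete, here is a minimal sketch of both strategies as an equi-join over two plain Python lists. The function names and the sample lists are made up for illustration; a real engine works on pages and indexes, not lists, and picks between these (and hash joins) based on statistics.

```python
def nested_loop_join(t1, t2):
    """Compare every pair of rows: len(t1) * len(t2) comparisons."""
    return [(a, b) for a in t1 for b in t2 if a == b]


def sort_merge_join(t1, t2):
    """Sort both inputs (O(n log n)), then walk them with two cursors."""
    left, right = sorted(t1), sorted(t2)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:
            key = left[i]
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end] == key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end] == key:
                j_end += 1
            # Emit one output row per pairing of the two runs.
            out.extend((key, key) for _ in range((i_end - i) * (j_end - j)))
            i, j = i_end, j_end
    return out


t1 = [3, 1, 4, 1, 5]
t2 = [1, 5, 9, 1]
nested = nested_loop_join(t1, t2)
merged = sort_merge_join(t1, t2)
print(sorted(nested) == merged)
```

Both return the same rows; only the amount of work differs. Note that with many duplicate keys the output itself can be large, so the n log n figure covers the sorting and scanning, not the size of the result.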

fifilura · on Feb 5, 2024

This is not what I call a for loop but more akin to a list comprehension.

And this is the point I was trying to make.

Instead of starting with the "how", learn to do the "what".

3rd3 · on Feb 5, 2024

What is the point of the cross join? This would work as well:

   SELECT loop.value, loop.value * loop.value
   FROM generate_series(1,5) AS loop(value)

ComputerGuru · on Feb 5, 2024

For this example, nothing. It would be useful where neither of the two SELECT queries is a subset/superset of the other. (Not saying you didn't know that.)

3rd3 · on Feb 5, 2024

Could you give an example?

ako · on Feb 5, 2024

This will be useful if you have a table with some related history records, e.g., products with product price history, and you want to get the latest price. The lateral join would get all prices for the current product, sort them by date and then pick the top row.
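A rough sketch of that latest-price lookup, run against SQLite through Python's `sqlite3` module so it is self-contained. The table and column names (`products`, `price_history`, `valid_from`) are invented for the example. SQLite has no LATERAL, so the executed query emulates the "for each product, sort its prices by date and take the top row" step with a correlated subquery on `rowid`; the PostgreSQL LATERAL form is shown in a comment only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE price_history (product_id INTEGER, valid_from TEXT, price REAL);
    INSERT INTO products VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO price_history VALUES
        (1, '2024-01-01', 9.99),
        (1, '2024-02-01', 10.49),
        (2, '2023-12-15', 25.00);
""")

# PostgreSQL form, for comparison only (not executed here):
#   SELECT p.name, latest.valid_from, latest.price
#   FROM products p
#   CROSS JOIN LATERAL (
#       SELECT h.valid_from, h.price
#       FROM price_history h
#       WHERE h.product_id = p.id
#       ORDER BY h.valid_from DESC
#       LIMIT 1
#   ) AS latest;

# SQLite emulation: the correlated subquery picks the newest history row
# for each product, and the join pulls in that row's columns.
latest_prices = conn.execute("""
    SELECT p.name, h.valid_from, h.price
    FROM products p
    JOIN price_history h
      ON h.rowid = (
          SELECT h2.rowid
          FROM price_history h2
          WHERE h2.product_id = p.id
          ORDER BY h2.valid_from DESC
          LIMIT 1
      )
    ORDER BY p.id
""").fetchall()
print(latest_prices)
```

The point of LATERAL here is that the subquery can see `p.id` from the left-hand side and still return several columns at once.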

3rd3 · on Feb 5, 2024

How does a lateral (cross) join compare to a window function in your example?
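For comparison, a sketch of the window-function version of the same "latest price per product" query. The schema is hypothetical, and it assumes the SQLite library behind Python's `sqlite3` is 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE price_history (product_id INTEGER, valid_from TEXT, price REAL);
    INSERT INTO price_history VALUES
        (1, '2024-01-01', 9.99),
        (1, '2024-02-01', 10.49),
        (2, '2023-12-15', 25.00);
""")

# Number each product's rows from newest to oldest, then keep row 1.
latest = conn.execute("""
    SELECT product_id, valid_from, price
    FROM (
        SELECT product_id, valid_from, price,
               ROW_NUMBER() OVER (
                   PARTITION BY product_id
                   ORDER BY valid_from DESC
               ) AS rn
        FROM price_history
    )
    WHERE rn = 1
    ORDER BY product_id
""").fetchall()
print(latest)
```

The window function ranks every history row and then filters, while the lateral form does one small top-1 lookup per product. Which one is faster likely depends on the data, the available indexes, and the planner, so it is worth checking the query plan rather than assuming.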

ses1984 · on Feb 6, 2024

Is a lateral join more efficient than just saying where product = current sort by date desc ?

ComputerGuru · on Feb 5, 2024

LATERAL was first available in PostgreSQL 9.3 (2013), but it is still not available in SQLite.

fifilura · on Feb 5, 2024

SQLite has RECURSIVE, so you can generate a table with all numbers using something like:

    WITH RECURSIVE cnt(x) AS (
      SELECT 1
      UNION ALL
      SELECT x+1 FROM cnt
      LIMIT 5
    )
    SELECT x FROM cnt;

And then do a regular CROSS JOIN on that table.
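A small runnable sketch of that approach, using Python's `sqlite3` module. The `letters` table is a made-up stand-in for whatever you would actually join the numbers against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The recursive CTE plays the role of generate_series(1, 5).
squares = conn.execute("""
    WITH RECURSIVE cnt(x) AS (
        SELECT 1
        UNION ALL
        SELECT x + 1 FROM cnt
        LIMIT 5
    )
    SELECT x, x * x FROM cnt ORDER BY x
""").fetchall()
print(squares)

# A regular CROSS JOIN against the generated numbers.
conn.executescript("""
    CREATE TABLE letters (name TEXT);
    INSERT INTO letters VALUES ('a'), ('b');
""")
pairs = conn.execute("""
    WITH RECURSIVE cnt(x) AS (
        SELECT 1
        UNION ALL
        SELECT x + 1 FROM cnt
        LIMIT 3
    )
    SELECT l.name, c.x
    FROM letters AS l
    CROSS JOIN cnt AS c
    ORDER BY l.name, c.x
""").fetchall()
print(pairs)
```

This covers the number-generation case only. It does not let the right-hand side refer to columns from the left-hand side, which is the part LATERAL adds.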

ComputerGuru · on Feb 5, 2024

But generating a table with all numbers was just a throwaway example of the LATERAL JOIN syntax/use, and (hopefully) not what you'd actually use it for in the real world.

It's not clear to me that (mathematically) a lateral join can be reduced to a recursive CTE, or whether the performance of a recursive CTE would be acceptable in the cases where it does work as a substitute.

neuronexmachina · on Feb 5, 2024

gavinray · on Feb 5, 2024

They're incredibly useful -- any time that you want to both:

- Reference data from the previous part of the query (the "left-hand side")

- Return multiple columns

The only way you can achieve it is with LATERAL/CROSS APPLY.

Regular correlated subqueries can only return a single column, so something like this doesn't work:

  SELECT
    loop.val, (SELECT loop.val * loop.val, 'second column') AS squared
  FROM
    (SELECT loop.val FROM generate_series(1,5) AS loop(val)) as loop

You'd get:

   error: subquery must return only one column

ttfkam · on Feb 5, 2024

I love LATERALs, but this still fits within set theory and a bulk application rather than an iterative for-loop. It may even be implemented as a for-loop within the engine, but SQL being declarative abstracts that away from the query interface.

It's sets all the way down. A set of f(x) is still a set.

gavinray · on Feb 5, 2024

Let's get even more cursed then:

    CREATE TEMP TABLE temp_results(value int, value_squared int);

    DO $$
    DECLARE
        r int;
    BEGIN
        FOR r IN SELECT generate_series FROM generate_series(1,5)
        LOOP
            INSERT INTO temp_results VALUES (r, r * r);
        END LOOP;
    END$$;

    SELECT * FROM temp_results;

fifilura · on Feb 5, 2024

Ew, you win, I guess. I've already started thinking about what I would write in that code review comment ;)

koolba · on Feb 5, 2024

At the very least, mention that re-running this in the same connection would include the prior results as well, because the temp table's rows are preserved across commands within the same connection.
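A quick way to see that effect, sketched with SQLite through Python's `sqlite3` as a stand-in for a Postgres session. The `fill` helper is hypothetical and simply mimics the loop; in both systems a temp table lives for as long as the connection does, so a second run appends to the first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TEMP TABLE temp_results (value INTEGER, value_squared INTEGER)"
)


def fill(connection):
    """Row-by-row inserts, mimicking the FOR ... LOOP block."""
    for r in range(1, 6):
        connection.execute(
            "INSERT INTO temp_results VALUES (?, ?)", (r, r * r)
        )


def count_rows(connection):
    return connection.execute("SELECT COUNT(*) FROM temp_results").fetchone()[0]


fill(conn)
first_count = count_rows(conn)

fill(conn)  # same connection: the earlier rows are still there
second_count = count_rows(conn)

# One possible guard: clear the table before refilling it.
conn.execute("DELETE FROM temp_results")
fill(conn)
third_count = count_rows(conn)

print(first_count, second_count, third_count)
```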

ttfkam · on Feb 5, 2024

Ha! plpgsql's seemingly sole purpose is to inject imperative code into a set-based environment. Probably does it more smoothly than most pl languages, but that's at the cost of imperative clarity.

But you're right. Postgres does allow for-loops like this. (They're also slower than the equivalent set-oriented approach.)
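For reference, a sketch of what the set-oriented equivalent of that loop could look like: one INSERT ... SELECT instead of five single-row inserts. It runs on SQLite via Python's `sqlite3`, with a recursive CTE standing in for `generate_series`; the PostgreSQL form is in a comment and is not executed here. No timing claims are made by this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TEMP TABLE temp_results (value INTEGER, value_squared INTEGER)"
)

# PostgreSQL form, for comparison only (not executed here):
#   INSERT INTO temp_results
#   SELECT v, v * v FROM generate_series(1, 5) AS g(v);

# SQLite form: a single statement fills the whole table.
conn.execute("""
    WITH RECURSIVE cnt(x) AS (
        SELECT 1
        UNION ALL
        SELECT x + 1 FROM cnt
        LIMIT 5
    )
    INSERT INTO temp_results
    SELECT x, x * x FROM cnt
""")

set_based_rows = conn.execute(
    "SELECT value, value_squared FROM temp_results ORDER BY value"
).fetchall()
print(set_based_rows)
```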

samstave · on Feb 5, 2024

If anyone is interested, I had GPT explain this SQL to me, and it was really helpful, as I couldn't parse that with my level of SQL...

https://chat.openai.com/share/931b1778-6393-4e86-94b4-b3b5a5...