Hacker News new | past | comments | ask | show | jobs | submit login

What makes you think that SQL doesn't have "for loops"?

Ever heard of LATERAL joins/CROSS APPLY?

  SELECT loop.value, x.squared
  FROM generate_series(1,5) AS loop(value)
  CROSS JOIN LATERAL (SELECT loop.value * loop.value AS squared) AS x;



I don't think that the point was that there are no for loops in SQL. I think the point was that almost always using for loops is wrong and super inefficient and there is a much more efficient way to just use joins instead.


I wonder if anyone realizes that selects and joins in every database are implemented literally with for loops written in c.

The reason people use a relational database is because it has loops that are faster, safer, and more efficient than anything you can write.


What does it matter? Yeah, at the very bottom of it all there will be a loop. I wouldn't attribute that to C but to the von Neumann architecture.

The point is that by getting rid of loops you remove one way of telling the computer "How" to do it. Start here, do it in this order, combine the data this way.

When "How" is a solved problem, it is a waste of your brain to think about that again. "What" is a better use of your brain cycles.


Exactly, SQL is like a very specific natural language used to tell a database what data you need. The database contains AI (the query optimizer) that will use statistics to determine the fastest way to retrieve the data requested. Depending on your query and the data you have the fastest way to get your data will change. It’s hard to manually write this yourself.


> joins in every database are implemented literally with for loops written in c.

different kind of loops can be different, e.g. 2 nested loop with quadratic time:

for i in t1: for j in t2:

vs sort + merge join with n log n time.


This is not what I call a for loop but more akin to a list comprehension.

And this is the point I was trying to make.

Instead of start with the "how", learn to do the "what".


What is the point of the cross join? This would work as well:

   SELECT loop.value, loop.value * loop.value
   FROM generate_series(1,5) AS loop(value)


For this example, nothing. It would be useful where neither of the two SELECT queries is a subset/superset of the other. (Not saying you didn't know that.)


Could you give an example?


This will be useful if you have a table with some related history records, e.g., products with product price history, and you want to get the latest price. The lateral join would get all prices for the current product, sort them by date and then pick the top row.


How does a lateral (cross) join compare to a window function in your example?


Is a lateral join more efficient than just saying where product = current sort by date desc ?


LATERAL first available in PostgreSQL 9.3 (2013), but still not available in SQLite.


SQLite has RECURSIVE, so you can generate a table with all numbers using something like:

WITH RECURSIVE cnt(x) AS ( SELECT 1 UNION ALL SELECT x+1 FROM cnt LIMIT 5 ) SELECT x FROM cnt;

And then do a regular CROSS JOIN on that table.


But generating a table with all numbers was just a throwaway example of the LATERAL JOIN syntax/use, and (hopefully) not what you'd actually use it for in the real world.

It's not clear to me that (mathematically) a lateral join can be reduced to a recursive cte (and if the performance of a recursive cte would be acceptable for the cases where it does work as a substitute).


TIL


They're incredibly useful -- any time that you want to both:

- Reference data from the previous part of the query (the "left-hand side")

- Return multiple columns

The only way you can achieve it is with LATERAL/CROSS APPLY.

Regular correlated subqueries can only return a single column, so something like this doesn't work:

  SELECT
    loop.val, (SELECT loop.val * loop.val, 'second column') AS squared
  FROM
    (SELECT loop.val FROM generate_series(1,5) AS loop(val)) as loop
You'd get:

   error: subquery must return only one column


I love LATERALs, but this still fits within set theory and a bulk application rather than an iterative for-loop. It may even be implemented as a for-loop within the engine, but SQL being declarative abstracts that away from the query interface.

It's sets all the way down. A set of f(x) is still a set.


Lets get even more cursed then:

    CREATE TEMP TABLE temp_results(value int, value_squared int);

    DO $$
    DECLARE
        r int;
    BEGIN
        FOR r IN SELECT generate_series FROM generate_series(1,5)
        LOOP
            INSERT INTO temp_results VALUES (r, r * r);
        END LOOP;
    END$$;

    SELECT * FROM temp_results;


Uw you win, I guess. Already started to think what I would write in that code review comment ;)


At the very least mention that re-running the SELECT in the same connection would include the prior results as well because they are preserved across commands within the same connection.


Ha! plpgsql's seemingly sole purpose is to inject imperative code into a set-based environment. Probably does it more smoothly than most pl languages, but that's at the cost of imperative clarity.

But you're right. Postgres does allow for-loops like this. (They're also slower than the equivalent set-oriented approach.)


If anyone is interested, I had gpt explain this SQL to me - and it was really helpful, as I couldnt parse that with my level of SQL...

https://chat.openai.com/share/931b1778-6393-4e86-94b4-b3b5a5...




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: