
Sure, but in both cases you are running a real risk of producing incorrect data

If you're a product lead and you ask an LLM to produce a script that gets that output, you still should verify the output is correct

Otherwise you run a real risk of seeming like an idiot later when you give a report on "tickets closed in the past week" and your data is completely wrong. "Why hasn't John closed any tickets this week? Is he slacking off?"... "What? He closed more tickets than anyone..." And then it turns out that the unreliable LLM script excluded him for whatever reason
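
For a concrete (entirely made-up) flavor of how that happens, here's a toy Python sketch where a plausible-looking generated filter silently drops one person's tickets; the data, field names, and the closed_by-is-None edge case are all invented for illustration:

    from datetime import datetime, timedelta, timezone

    now = datetime.now(timezone.utc)

    # Invented ticket data: "closed_by" is None when a bot closed the ticket
    # on John's behalf -- exactly the kind of edge case that skews a report.
    tickets = [
        {"closed_by": "alice", "closed_at": now - timedelta(days=2)},
        {"closed_by": None,    "closed_at": now - timedelta(days=1)},  # John's
    ]

    cutoff = now - timedelta(days=7)
    per_person = {}
    for t in tickets:
        # Plausible-looking generated filter: it requires closed_by to be set,
        # so every bot-closed ticket quietly vanishes from the totals.
        if t["closed_by"] and t["closed_at"] > cutoff:
            per_person[t["closed_by"]] = per_person.get(t["closed_by"], 0) + 1

    print(per_person)  # {'alice': 1} -- John appears to have closed nothing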

Of course, I understand that people are not actually going to be this careful; more and more people are trusting LLM output without verifying it, because it's "right enough" that we're becoming complacent



You're absolutely right. You need to verify the script works, and you need to be able to read the code to see what it's actually doing and whether it passes the smell test (as a sibling commenter said, the same way you would for a code snippet off Stack Overflow). But ultimately, for these bits, which are largely rote "take data from an API, transform it into data format X" tasks, LLMs do a great job of getting at least 95% of the way there, in my experience. In a lot of ways they're the perfect job for LLMs: most of the work is just typing (as in, pressing buttons on a keyboard) and passing the right arguments to an API, so why not outsource that to an LLM and verify the output?
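
To make that concrete, a typical instance of that kind of task looks something like the sketch below; the endpoint, parameters, and field names are invented stand-ins for whatever API you're actually pulling from:

    import csv
    import requests

    # Invented endpoint and fields -- stand-ins for the real API.
    resp = requests.get("https://api.example.com/tickets", params={"status": "closed"})
    resp.raise_for_status()
    tickets = resp.json()

    # Transform the JSON payload into "data format X" -- here, a flat CSV.
    with open("tickets.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "assignee", "closed_at"])
        writer.writeheader()
        for t in tickets:
            writer.writerow({k: t.get(k) for k in ("id", "assignee", "closed_at")})

It's exactly the sort of boilerplate where skimming the code and spot-checking the CSV costs far less than typing it all out yourself.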

The challenge comes when dealing with larger systems. An LLM might suggest Library A for accomplishing a task, but if your codebase already uses Library B for that, or has Library A pinned to a 2020 version with a different API, you need to make judgment calls about the right approach to take, and the LLM can't help you there. Same with code style, architecture, how future-proof-but-possibly-YAGNI you want your design to be, etc.

I don't think "vibe coding" or making large changes across big code bases really works (or will ever really work), but I do think LLMs are useful for isolated tasks and it's a mistake to totally dismiss them.


> so why not outsource that to an LLM and verify the output?

I mean sure, why not. My argument isn't that it doesn't work, it's that it doesn't really save time

If you try to have it do big changes, you will be swamped reviewing those changes for correctness for a long time while you build a mental model of the work

If you have it do small changes, the actual performance improvement is marginal at best, because small changes already don't take much time or effort to create

I really think that LLM-coding has largely just shifted "time spent typing" to "time spent reviewing"

Yes, past a certain size, reviewing is faster than typing. But LLMs still aren't producing terribly good output for large amounts of code


I disagree that it doesn't save time, at least for some classes of problems.

As a concrete recent example, I had to write a Python script which checked for any Postgres tables where the primary key was of type 'INT' and printed out the max value of the ID for each table. I know broadly how to do this, but I'd have to double-check which information_schema table to use, the right names of the columns, etc. Plus a refresher on direct use of psycopg2 and the cursor API. Plus the typing itself. I just put that request into an LLM and it gave me exactly what I needed; it took about 30-60 seconds total. Between the research and the typing, that's easily 10 minutes saved, maybe closer to 20 really.
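
For reference, the general shape of that script (not the exact code the LLM produced; the connection string is a placeholder and the information_schema joins may need tweaking for your setup) is roughly:

    import psycopg2

    # Placeholder connection string -- substitute your own DSN.
    conn = psycopg2.connect("dbname=mydb user=me host=localhost")

    with conn, conn.cursor() as cur:
        # Find primary-key columns whose type is a plain 4-byte integer.
        cur.execute("""
            SELECT kcu.table_schema, kcu.table_name, kcu.column_name
            FROM information_schema.table_constraints tc
            JOIN information_schema.key_column_usage kcu
              ON tc.constraint_name = kcu.constraint_name
             AND tc.table_schema = kcu.table_schema
            JOIN information_schema.columns c
              ON c.table_schema = kcu.table_schema
             AND c.table_name = kcu.table_name
             AND c.column_name = kcu.column_name
            WHERE tc.constraint_type = 'PRIMARY KEY'
              AND c.data_type = 'integer'
        """)
        for schema, table, column in cur.fetchall():
            # Print the current max ID for each matching table.
            cur.execute(f'SELECT max("{column}") FROM "{schema}"."{table}"')
            (max_id,) = cur.fetchone()
            print(f"{schema}.{table}.{column}: {max_id}")

    conn.close()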

And I mean, no, this example isn't worth the $10 trillion or whatever the economy thinks AI is worth, but given that it exists, I'm happy to take advantage of it.


I don't see a lot of value in "saving 10-20 minutes here and there" tbh

Especially since I'm not ever likely to see any benefit from my employer for that extra productivity


> you still should verify the output is correct

And that's a problem with the workflow, not a problem with the LLM.

It's no different from verifying that the information from your Google search or the Stack Overflow answer you found actually works. But for some reason there are people who have higher expectations of LLM output.


People aren't trying to produce entire codebases in 10 minutes using Stack Overflow, or giving it free rein to refactor the whole codebase



