I find it amazing that the same ideas pop up around the same time. For example, I work on test generation and went down the same path. I tried to find bugs by prompting "Find bugs in this code and implement tests to show it.", but that didn't get me far. Then I switched to property (invariant) testing, like you, but in my case I ask the AI: "Based on the whole codebase, make the property tests.", and then I fuzz random actions on the stateful objects and run the property tests over and over again.
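For anyone curious, here's a minimal sketch of that loop using Hypothesis's stateful testing; the BoundedCounter class and the never_negative invariant are just illustrative stand-ins, not from the parent's project:

    # Sketch: fuzz random actions on a stateful object and re-check an invariant.
    # Requires `pip install hypothesis`; BoundedCounter is a toy stand-in.
    from hypothesis import strategies as st
    from hypothesis.stateful import RuleBasedStateMachine, invariant, rule

    class BoundedCounter:
        """Toy stateful object: a counter that must never go negative."""
        def __init__(self):
            self.value = 0

        def add(self, n):
            self.value += n

        def reset(self):
            self.value = 0

    class CounterMachine(RuleBasedStateMachine):
        def __init__(self):
            super().__init__()
            self.counter = BoundedCounter()

        @rule(n=st.integers(min_value=0, max_value=100))
        def add(self, n):
            self.counter.add(n)

        @rule()
        def reset(self):
            self.counter.reset()

        @invariant()
        def never_negative(self):
            # The property that gets re-checked after every random action.
            assert self.counter.value >= 0

    # Collected by pytest; Hypothesis drives random sequences of rules.
    TestCounter = CounterMachine.TestCase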
At first I also wanted to automate everything, but over time I realized that the best split is about 10% human work to 90% AI.
I'd have much more confidence in an AI codebase where the human has chosen the property tests, than a human codebase where the AI has chosen the property tests.
Tests are executable specs. That is the last thing you should offload to an LLM.
Also, a poorly designed test suite makes your codebase extremely painful to change. A well-designed test suite with good abstractions makes it easy to change code, and on top of that, it makes new tests extremely fast to write.
I think the whole idea of getting LLMs to write the tests comes from a pandemic of under-abstracted, labour-intensive test suites. And that just makes the problem worse.
Perhaps it comes from the viewpoint that tests are a chore or grunt work: something you have to do but don't really view as interesting or important.
(like how I describe what git should do and I get the LLM to give me the magic commands with all the confusing nouns and verbs and dashes in the right place).
Yeah, I like writing elegant test abstractions much more than I like writing clumsy, verbose unit tests, and there's an inverse relationship between the two. Maybe people just never want to bother refactoring a test suite, so early shortcuts turn into walls of boilerplate.
While I agree in theory, the problem I have is that the humans I've worked with are much worse at writing tests than they are at writing the implementation. Maybe it's motivation or experience, but test quality is generally much worse than implementation quality -- at least in my experience.
An under-explored approach is to collect data on human usage of the app (from production and from internal testers) and feed that into your generative test inputs.
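A rough sketch of how that could plug into property tests (the recorded_payloads.jsonl file, the field names, and handle_search are all hypothetical):

    # Sketch: seed property-test inputs with recorded production / internal-tester
    # usage, mixed with purely random generation. File name and schema are made up.
    import json
    from hypothesis import given, strategies as st

    with open("recorded_payloads.jsonl") as f:
        RECORDED = [json.loads(line) for line in f]

    random_payloads = st.fixed_dictionaries({
        "user_id": st.integers(min_value=1),
        "query": st.text(max_size=50),
    })

    # Mix of replayed real inputs and freshly generated ones.
    payloads = st.one_of(st.sampled_from(RECORDED), random_payloads)

    @given(payloads)
    def test_search_never_crashes(payload):
        handle_search(payload)  # placeholder for the system under test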
Another idea I'm exploring is AI + mutation testing (https://en.wikipedia.org/wiki/Mutation_testing). It should help the AI generate tests with genuinely full coverage.
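For anyone who hasn't seen it: mutation testing deliberately plants small bugs (mutants) and checks that the test suite kills them. A hand-rolled toy version of the idea, with a clamp function as a made-up example (real tools like mutmut or Cosmic Ray automate the mutation step across a codebase):

    # Toy mutation test: flip an operator in the function under test and
    # verify the suite notices.

    def clamp(x, lo, hi):            # original implementation
        return max(lo, min(x, hi))

    def clamp_mutant(x, lo, hi):     # mutant: min/max swapped
        return min(lo, max(x, hi))

    def run_suite(clamp_impl):
        """Tiny 'test suite'; returns True if all assertions pass."""
        try:
            assert clamp_impl(5, 0, 10) == 5
            assert clamp_impl(-3, 0, 10) == 0
            assert clamp_impl(42, 0, 10) == 10
            return True
        except AssertionError:
            return False

    assert run_suite(clamp) is True          # suite passes on the real code
    assert run_suite(clamp_mutant) is False  # and kills the mutant -> good coverage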