If any non-trivial ask of an LLM also requires the prompts/scaffolding to be listed and independently verified along with its output, then its utility is severely diminished. These tools should be saving us time, not giving us extra homework.
That isn't what I'm saying. I'm saying you can't make a blanket statement that LLMs in general aren't fit for some particular task. There are certainly tasks where no LLM is competent, but for others, some LLMs might be suitable while others are not. At least some level of detail beyond "they used an LLM" is required to know whether a) there was user error involved, or b) an inappropriate tool was chosen.
Are they? Every foundation model release includes benchmarks showing different levels of performance across different task domains. I don't think I've seen any model advertised by the org that created it as either perfect or even equally competent across all domains.
The secondary-market snake oil salesmen <cough>Manus</cough>? That's another matter entirely, and a very high degree of skepticism toward their claims is certainly warranted. But that's no different from many other huckster-saturated domains.
People like Zuckerberg go around claiming most of their code will be written by AI starting sometime this year. Other companies are hearing that and using it as a reason (or false cover) for layoffs. The reality is that LLMs still have a way to go before replacing experienced devs, and even when they start getting there, there will be a period where we're learning what we can and can't trust them with and how to use them effectively and responsibly. Feels like at least a few years from now, but the marketing says it's now.
In many, many cases those problems are resolved by improvements to the model. The point is that making a big deal about LLM fuck-ups in 3-year-old models that don't reproduce in new ones is a complete waste of time and just spreads FUD.
Far better to just get these problems resolved.