This honestly doesn’t surprise me. We have reached a point where it’s becoming clearer and clearer that AGI is nowhere to be seen, while advances in LLMs’ ability to ‘reason’ have slowed to (almost?) a halt.
I hate to say this, but I think the LLM story is going to go the same way as Tesla’s stock - everyone knows it’s completely detached from fundamentals and driven by momentum and hype, but nobody wants to do the right thing.
We're not even solving problems that humanity can solve. There have been several times when I've posed a geometry problem to models that was novel but possible for me to solve on my own, and the LLMs have fallen flat every time. I'm no mathematician, and these are not complex problems, but they're well beyond any AI, even when guided. Instead, they're left to me, my trusty whiteboard, and a non-negligible amount of manual brute-force shuffling of terms until it comes out right.
They're good at the Turing test. But that only marks them as indistinguishable from humans in casual conversation. They are fantastic at that, and at a few other things, to be clear. Quick comprehension of an entire codebase for fast queries is terribly useful. But they are a long way from human-level general intelligence.
I'm pretty sure there are billions of people on Earth unable to solve your geometry problem. That doesn't make them any less human. It's not a benchmark. You should think about something almost any human can do, not a select few. That's the bar. Casual conversation is one example of something almost any human can do.
Any human could do it, given the training. Humans largely choosing not to specialize in this way doesn't make them any less human, nor did I imply that. Humans have the capacity for it; LLMs fall short universally.
What do you mean by "reliably distinguish a computer from a human"? I haven't been surprised even once yet; I always find out eventually when I'm talking to an AI. It's usually easy: they get into loops, forget conversation context, fail to make connections between obvious things, and do make connections between less obvious things. Etc.
Of course they can sound very human-like, but you know you shouldn't be that naive these days.
Also, you of course shouldn't judge based on just a few words.