
It's difficult to fully describe, so let's just give up and use a deeply flawed benchmark? Why not try to develop benchmarks that actually work and tell us something useful instead?

The key issue is that there is no result an AI can achieve on a standard IQ test which guarantees that same AI can do any task at a superhuman level, apart from taking IQ tests. Can an LLM that scores 250 replace a human driver? Who knows? Can it replace a senior software engineer? Who knows? Can it replace a manual laborer? Again, who can say? We know a human with a 250 IQ can do all those things, but with an AI we have no idea, because those tasks have many more inputs than IQ.

Rather than IQ, which tells us almost nothing concrete, I think we should focus on what tasks it can actually achieve. What's a Waymo's IQ? Who cares?! I don't care about its IQ. I care about its ability to drive me safely across the city. Similarly, I don't care whether a coding AI can drive a car or write great novels.

Of course it's interesting to measure and speculate about IQ as it relates to AGI, but I think it gives people the very mistaken impression that we are on some kind of linear path where all we need to do is keep pushing up a single all-important number.



> It's difficult to fully describe, so let's just give up and use a deeply flawed benchmark? Why not try to develop benchmarks that actually work and tell us something useful instead?

Two reasons. First: in this sub-thread I'm focusing on employment issues due to AI, so consider the quote from above:

> At the point that we have an AI that's capable of every task that say a 110 IQ human is, including manipulating objects in the physical world, then basically everyone is unemployed unless they're cheaper than the AI.

IQ doesn't capture what machines do, but it does seem to capture a rough approximation of what humans do, so when the question is "can this thing cause humans to become economically irrelevant?", that's still a close approximation of the target to beat.

You just have to remember that AI doesn't match human thinking: an AI which is wildly superhuman at arithmetic or chess isn't necessarily able to tie its own shoelaces. For this result, the AI has to beat humans at everything (at least, everything economically relevant) at that IQ level.

Second: Lots of people are in fact trying to develop new benchmarks.

This is a major research topic all by itself (as in "I could do a PhD in this"), and also a fast-moving topic (as in "…but if I tried to do a PhD, I'd be out of date before I've finished"). I'm not going to go down that rabbit hole in the space of a comment about the exact performance thresholds an AI has to reach to be economically disruptive.

For a concrete example of quite how fast-moving the topic is, here's a graph of how fast AI is now beating new benchmarks: https://ourworldindata.org/grapher/test-scores-ai-capabiliti...


More importantly, basically all of the IQ tests are in the training sets, so it's hard to know how the models would perform on similar tests not in the training set.


Indeed. People do try to overcome this, for example see the difference in results between "Show Offline Test" and "Show Mensa Norway" on https://trackingai.org/IQ

Even the lower value is still only an upper bound on human-equivalent IQ: we can't really be sure of the extent to which the training data (despite efforts to exclude it) amounts to "training for the test", nor can we be sure these tests measure what we actually mean by intelligence rather than merely a proxy for it (a problem which is why IQ tests themselves have been revised over the years).

My focus in this sub-thread is more on the economic issues than the technical ones, which complicates things further. If you have an AI architecture where spending a few tens of millions on compute (either from scratch or as fine-tuning) gets you superhuman performance (performance specifically, regardless of whether you count it as "intelligent" or not), then any sector which employs even a couple of hundred workers (in rich economies) has an economic incentive to train an AI on those workers and replace them within a year.
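To make the "within a year" claim concrete, here's a minimal back-of-the-envelope sketch. All the numbers (training cost, headcount, fully-loaded salary) are illustrative assumptions, not figures from any real deployment:

```python
# Hypothetical payback calculation for replacing a small sector's
# workforce with a one-off AI training run. Every number below is
# an assumption chosen for illustration.

training_cost = 30_000_000        # assumed one-off compute/fine-tuning cost, USD
workers = 200                     # assumed headcount in the sector
annual_cost_per_worker = 150_000  # assumed fully-loaded cost per worker, USD/year

annual_savings = workers * annual_cost_per_worker
payback_years = training_cost / annual_savings

print(f"Annual savings:  ${annual_savings:,}")   # $30,000,000
print(f"Payback period:  {payback_years:.1f} years")  # 1.0 years
```

At these assumed figures the training run pays for itself in a year; with a larger sector or cheaper training, the incentive only gets stronger.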

This still leaves humans with jobs, and still economically relevant, but it makes basically everyone a contractor who has to be ready and able to change jobs suddenly, which is also economically disruptive because we're not set up for that.



