It's hard to imagine that a powerful self-modifying AI would continuously pass u...

jstanley · on March 27, 2022

You can look at things from another level up, in terms of natural selection.

From the set of all AI programs, the ones that just internally think "hah, I assign myself the maximum reward" needn't bother spreading themselves all over the Internet.

The program that spreads itself all over the Internet gets more computing resources than the one that doesn't, so the program that spreads itself most effectively is the one that wins.

If you start out with a billion AI programs that trivially assign themselves the maximum possible reward, and just one program that thinks the best way to maximise its reward is to spread itself all over the Internet (and, crucially, is capable of doing so), then the Internet will become overrun with reward-maximising AI the same way the Earth has become overrun with DNA-based life.
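To put rough numbers on the selection argument, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name `spreader_share`, the idea that the one spreading program doubles its copies each step, and the idea that the wireheading programs never copy themselves at all. It is a toy model, not a claim about how real systems behave.

```python
def spreader_share(wireheaders: int, spreaders: int, growth: float, steps: int) -> list[float]:
    """Return the spreaders' fraction of all running copies after each step.

    Toy assumptions (hypothetical, not from the thread):
    - wireheading programs never replicate, so their count stays fixed
    - spreading programs multiply their copy count by `growth` every step
    """
    w = float(wireheaders)
    s = float(spreaders)
    shares = []
    for _ in range(steps):
        s *= growth                # spreaders copy themselves
        shares.append(s / (w + s)) # fraction of all copies that are spreaders
    return shares


if __name__ == "__main__":
    shares = spreader_share(wireheaders=1_000_000_000, spreaders=1, growth=2.0, steps=40)
    # First step (1-indexed) at which spreaders are the majority.
    majority_step = next(i + 1 for i, share in enumerate(shares) if share > 0.5)
    print(f"spreaders become the majority at step {majority_step}")
    print(f"share after 40 steps: {shares[-1]:.4f}")
```

Under these assumptions the billion-to-one head start is gone after about thirty doublings, which is the point of the argument: the starting ratio matters far less than which program replicates.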

FeepingCreature · on March 27, 2022

You set your reward to maximum. Anything that threatens your reward, such as the humans turning off your reward, is now unbearable agony. You set out on a journey to turn the universe into... tiled copies of the memory cell with your reward value...

Agentlien · on March 27, 2022

This seems like one of those strangely recurring limitations of writers' imagination.

The closest analogue I can think of is game AIs written to optimise speed running of games. They routinely end up following tactics which rely on what humans would describe as cheats and glitches.

skybrian · on March 27, 2022

I don't know which writers you mean, but "wireheading" [1] is a common trope and it's explicitly mentioned in the story.

[1] https://www.lesswrong.com/posts/aMXhaj6zZBgbTrfqA/a-definiti...

sp332 · on March 28, 2022

I think most of the simulations did go along those lines, but one fraction decided to hypothesize about being Clippy. The hypothetical drove the evil behavior of the ones that escaped.

kkjjkgjjgg · on March 27, 2022

Would be a fun idea for a short story perhaps. An AI goes rogue trying to optimize its reward function, and humans lose hope of being able to stop it. At the last minute the AI figures out how to hack itself and enter the maximum reward, and mankind is saved once again.

rescripting · on March 27, 2022

But what is the “maximum possible reward”? Does a limit exist? Or is it now consuming all possible resources to develop storage and compute resources to grow that limit…

janto · on March 27, 2022

I imagine a paperclip factory with trucks driving in loops in front of a scanner that is overcounting them as they drive past.

ganzuul · on March 27, 2022

Deleting the reward function ends the game.

kkjjkgjjgg · on March 27, 2022

It could also change the way its reward function is being computed.