
It's interesting that, while the system can learn lots of different games, you still have to hand it a reward function specific to each game. This may seem obvious, and not much of a limitation; after all, lots of things that we think of as intelligent to varying degrees (different human beings, as well as members of other species) have wildly different ideas of what constitutes "reward", so no particular reward function can be an inherent part of the definition of intelligence.
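To make that concrete, here is roughly what "a reward function specific to each game" means in practice. This is a hypothetical sketch in Python, not the actual interface from the paper; the state fields and numbers are made up:

    # Hypothetical sketch: the learning algorithm is game-agnostic, but each
    # game still needs a hand-written mapping from game state to reward.

    def pacman_reward(prev, cur):
        # Reward score gained (pellets, ghosts eaten); punish losing a life.
        r = cur.score - prev.score
        if cur.lives < prev.lives:
            r -= 100
        return r

    def pong_reward(prev, cur):
        # +1 when we score, -1 when the opponent does.
        return (cur.our_score - prev.our_score) - (cur.their_score - prev.their_score)

    # Only this table changes from game to game.
    REWARD_FUNCTIONS = {"pacman": pacman_reward, "pong": pong_reward}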

But you don't have to explicitly tell a human what the reward function for Pac-Man is. Show a human the game and they'll figure it out. Which makes me wonder if, while there is some room for variability in reward functions, there might be some basic underlying reward computation that is inherent in intelligence. I can't find the link just now, but I read an article a few months ago (might've even been here on HN) about a system that demonstrated the appearance of intelligent behavior by trying to minimize entropy and maximize its possible choices for as long as possible within a given world model.
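For what it's worth, the "maximize its possible choices" part of that article can be sketched in a few lines of Python. This is just my illustration of the idea; the toy world model, action set, and horizon are assumptions, not details from the article:

    def reachable_states(world_model, state, actions, horizon):
        # Breadth-first enumeration of states reachable within `horizon` steps
        # of a known, deterministic world model (a function: state, action -> state).
        seen = {state}
        frontier = {state}
        for _ in range(horizon):
            frontier = {world_model(s, a) for s in frontier for a in actions} - seen
            seen |= frontier
        return seen

    def openness(world_model, state, actions, horizon=5):
        # Prefer states from which many distinct futures remain reachable.
        return len(reachable_states(world_model, state, actions, horizon))

An agent that picks actions to keep this number high tends to avoid dead ends and hold its options open, which is the sort of behaviour the article described as looking purposeful.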



Pac-Man was designed as a game for humans, with a priori knowledge of what kinds of things humans find rewarding. Thus the goal is obvious because it was designed to be similar to other human goals. Eat the food, don't get eaten. For this reason, it's not at all special that humans can determine the goal of the game.


Yeah. Try sticking a more abstract game like Go in front of a random person and see how that works out. Without being taught the rules, a human will have absolutely no idea how to proceed. This would put a human in pretty much the same boat as a computer.


A friend of mine had a meta-game he'd play with his stepfather. His stepfather would buy a new game, but not tell him the rules. They'd play this game until he figured it out and consistently trounced his stepdad. Then his stepdad would buy a new game.


Wow, that's a great idea. Sounds like loads of fun.


Secure the largest amount of territory and capture enemy groups? Seems pretty human :p


Not to Edward Lasker: "The rules of go are so elegant, organic and rigorously logical that if intelligent life forms exist elsewhere in the universe they almost certainly play go."


You got all that from looking at a 19x19 grid?


I guess that the reward systems that humanity has evolved are complicated and numerous. We've got the basics (food, shelter), the more complicated basics (sex with a suitable mate, companionship) and a million other factors: curiosity, intellectual challenge, positive and negative feedback, power, agency, etc.

My thoughts are that if they were to take such a direction with this AI, they'd give it the basics and let it evolve and learn its own complicated reward structure. When you're trying to get a monkey to play Pac-Man, you bribe him with a capful of Ribena - he doesn't care about fun intellectual challenges, but sweet liquids motivate the hell out of him.

(This is the state of actual monkey research - Ribena is monkey crack.)


To start on this you would want a system that got rewarded based on how well it was able to predict aspects of its environment. This would have to go hand in hand with preferring stimulation, so you would also need something like a preference for inputs that maximize relative entropy with respect to the model it has learned so far.
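A minimal sketch of that second piece, assuming a simple count-based model over discrete observations (the class and setup are mine, purely for illustration). "Relative entropy" here is the KL divergence between the model's predictive distribution before and after seeing the new input, so inputs the model already predicts well earn almost nothing:

    import math
    from collections import Counter

    class SurpriseReward:
        # Toy predictive model over a fixed discrete alphabet of observations.
        # The intrinsic reward for an observation is how far it moves the model:
        # D_KL(new distribution || old distribution).

        def __init__(self, alphabet):
            self.counts = Counter({o: 1 for o in alphabet})  # Laplace-style prior

        def _dist(self):
            total = sum(self.counts.values())
            return {o: c / total for o, c in self.counts.items()}

        def observe(self, obs):
            # `obs` is assumed to be a symbol from the alphabet.
            before = self._dist()
            self.counts[obs] += 1
            after = self._dist()
            return sum(p * math.log(p / before[o]) for o, p in after.items())

    model = SurpriseReward("ab")
    print(model.observe("a"))  # fairly novel: noticeable reward
    print(model.observe("a"))  # repeats of "a" earn less and less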


That article on entropy minimization also claimed that a single equation could be the basis of a wide range of intelligent behaviours.
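Assuming the article in question is the one on causal entropic forces (Wissner-Gross and Freer, PRL 2013) - which seems to fit the description - the "single equation" is, roughly:

    F(X_0, \tau) = T_c \, \nabla_X S_c(X, \tau) \big|_{X = X_0}

i.e. a force pushing the system up the gradient of S_c, the entropy over its possible future paths out to a time horizon tau, with T_c setting the strength of the drive to keep future options open.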

https://news.ycombinator.com/item?id=5579047


For tasks that do not reward us biologically (unlike, say, eating or sleeping), we ultimately depend on other people to supply the reward, be that money, acceptance, praise, or whatever.


Yeah, this strikes me as similar to - and basically no more intelligent than - Eurisko. Such is the progress we have made on AI in 40 years.


No. AIXI is unique, in the technical sense of that word. It's similar to Eurisko in that there are goals and things. If you get any deeper, the similarity ends.



