
Andrej Karpathy had some awesome tweets on this yesterday; you can see the full thread via the links below, but these two were my favorites:

"Yes AlphaGo only won by 0.5, but it was not at all a close game. It's an artifact of its training objective."[0]

and:

"it prefers to win by 0.5 with 99.9999999% chance instead of 10.0 with 99.99% chance."[1]

[0] https://twitter.com/karpathy/status/867075706827689985
[1] https://twitter.com/karpathy/status/867077807779717121
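To make the tweet's trade-off concrete, here's a toy comparison in Python (the margins and probabilities are from the tweets; the objective names and code are just my illustration):

    # Two candidate lines of play, as in Karpathy's example.
    lines = {
        "win by 0.5":  {"margin": 0.5,  "p_win": 0.999999999},
        "win by 10.0": {"margin": 10.0, "p_win": 0.9999},
    }

    # AlphaGo's objective: probability of winning; the margin is ignored.
    by_win_prob = max(lines, key=lambda k: lines[k]["p_win"])

    # Alternative objective: expected point margin (ignoring losses, for simplicity).
    by_expected_margin = max(lines, key=lambda k: lines[k]["margin"] * lines[k]["p_win"])

    print(by_win_prob)         # win by 0.5
    print(by_expected_margin)  # win by 10.0

The losing probability drops from 1e-4 to 1e-9 between the two lines, a factor of 100,000, so under a pure win-probability objective the 0.5-point line wins easily.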



I know it's a totally different game, but it would be interesting to see AlphaGo play a game optimized for the expected number of points instead of low-variance wins. I assume it would require retraining.


I would really love to see an AlphaGo with even the minimal tweak of maximize points while maintaining win % > 95%.
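Something like this, maybe (a hypothetical sketch, not anyone's real code; and since the real value network only outputs a win probability, as the parent comment says, you'd have to retrain it with some kind of margin estimate too):

    WIN_PROB_FLOOR = 0.95

    def pick_move(candidates):
        """candidates: (move, win_prob, expected_margin) tuples, e.g. from
        a value head plus a hypothetical margin head."""
        safe = [c for c in candidates if c[1] >= WIN_PROB_FLOOR]
        if safe:
            # Among sufficiently safe moves, maximize the point margin.
            return max(safe, key=lambda c: c[2])[0]
        # If nothing clears the floor, fall back to pure win probability.
        return max(candidates, key=lambda c: c[1])[0]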

I really wouldn't be shocked if, in those final moves, it was optimizing for 5 or 6 "9"s of success probability. While that's still true to its programming, it also obfuscates how good it really is.

That said, there is some benefit to not completely embarrassing a human player who was previously known as the best and has volunteered to be beaten repeatedly on international TV.

Going into this Ke Jie knew he was going to lose, and still agreed.

Speaking as an inexperienced Go player, I would not be shocked if the new AlphaGo were 500-1000 Elo points better than Ke Jie, with its true skill masked by its ultra-conservative style of play.

The original AlphaGo lost only a single game to Lee Sedol, and that was essentially just a bug, not a sign that it was "weaker". DeepMind has said this completely retrained AlphaGo is significantly stronger than the original.

[1] Partial support for the possibly 1000-point Elo advantage of AlphaGo: http://en.chessbase.com/post/alphago-vs-lee-sedol-history-in...


> that was essentially just a bug

I'm not sure that's right; everything was working correctly, it just didn't read out a low-probability move very deeply.


I'd agree it was an algorithmic bug and not an implementation bug.

From my reading, it essentially got AlphaGo into a state where it was no longer reading the board correctly. The algorithmic bug: play a decent but extremely improbable move, and AlphaGo won't know how to respond.

Or, by a similar argument, I think most people would say that if Lee Sedol hadn't played that move and the game had continued "normally", he would've lost like the other games. The rarity of the move is why he won, not its "strength".

Essentially, they trained it on overly specific data. Their main fix was to retrain the next version from "scratch" instead of from moves humans are likely to make.
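For the curious, this is roughly why one improbable move could break it: per the AlphaGo papers, the tree search weights exploration by the policy network's prior, so a move the (human-trained) network assigns near-zero probability barely gets searched at all. A toy version of the selection rule (the constants and numbers here are made up):

    import math

    def puct_score(q, prior, child_visits, parent_visits, c_puct=1.5):
        """Roughly the PUCT rule from the AlphaGo papers: mean value plus
        an exploration bonus scaled by the policy network's prior."""
        return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

    # An unvisited move the network thinks is nearly impossible gets a tiny
    # bonus, so the search budget flows to "human-plausible" moves instead.
    print(puct_score(q=0.0, prior=0.0001, child_visits=0, parent_visits=10000))  # ~0.015
    print(puct_score(q=0.0, prior=0.2,    child_visits=0, parent_visits=10000))  # 30.0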


If I remember correctly, the problem wasn't that it didn't see the move; the problem was that in response to the move it played a few really bad exchanges, like the stupid-looking wedge in the bottom-left and adding stones to a dead group on the right. Playing bad exchanges when behind is a bad idea even from AlphaGo's perspective. It's still not correct to call it a bug, though.


It really bugs me whenever I see this held up as evidence of AlphaGo playing in some fundamentally nonhuman way. It's a basic tenet of Go strategy that, if you are ahead on points, you play conservatively. Professional players are very good at counting territory and assessing the point values of various moves, especially in the endgame. They, too, will take a smaller margin and a safer win over a larger but riskier one.


Sure, but:

A) Conservative play (by humans) tends to maintain the margin, not shrink it

B) Show me just one human player who would not waver under the psychological pressure of leading by only 0.5 points


Would that mean that a slight change in board evaluation rules would require AlphaGo to retrain all of its strategies?


So, effectively, it's playing with its food?



