
You should pencil out on a napkin just how long "more time" is. Here, I'll get you started:

1600 inferences per move * 1ms per inference * 250 moves/game * 30M games played = 12B seconds, or about 140k days. MuZero with Gumbel brought the 1600 down to ~40, but either way, you need some more scale.
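If you want to play with the numbers, here is the same napkin math as a rough Python sketch. The function name and defaults are just mine for illustration, and it assumes every inference runs one after another on a single device with no batching or parallelism, which is exactly the assumption that more scale is meant to break.

```python
SECONDS_PER_DAY = 86_400


def selfplay_seconds(sims_per_move,
                     seconds_per_inference=1e-3,   # 1 ms per inference
                     moves_per_game=250,
                     games=30_000_000):
    """Total serial inference time for self-play, in seconds.

    Assumes fully sequential inference on one device (no batching).
    """
    return sims_per_move * seconds_per_inference * moves_per_game * games


if __name__ == "__main__":
    # 1600 is the original search budget; ~40 is the Gumbel figure above.
    for sims in (1600, 40):
        secs = selfplay_seconds(sims)
        days = secs / SECONDS_PER_DAY
        print(f"{sims:>5} sims/move: {secs:.3g} s = {days:,.0f} days "
              f"= {days / 365:,.1f} years")
```

Even at ~40 inferences per move this comes out to about 300M seconds, or several thousand days of serial compute.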

It turns out a lot of the difficulties, judgment calls, and implementation details involve data pipelining. Some of those choices affect the final skill ceiling you reach. Which ones? How much? Are they path dependent? Well, you'll need to run it more than once...


