
Those were trained on human play. This had to figure it out from scratch.



Ah, is this full RL?

I was reading something about LLMs earlier and was thinking that an LLM could probably write a simple case-based script for controlling a player that could achieve a decent success rate.


Yes, it's RL from scratch with sparse rewards.



