
> it's difficult for a designer to design something as complex as the designer themselves

AlphaGo ... hello? It beat its creators at Go, and a few months later the top players. I don't think supervised learning can ever surpass its creators in generalization capability, but RL can.

The key ingredient is learning in an environment, which is like a "dynamic dataset". Humans discovered science the same way - hypothesis, experiment, conclusion, rinse and repeat, all possible because we had access to the physical environment in all its glory.

It's like the difference between reading all books about swimming (supervised) and having a pool (RL). You learn to actually swim from the water, not the book.

A coding agent's environment is a compiler + CPU, which is pretty cheap and fast compared to robotics, which requires expensive hardware, and dialogue agents, which can't be evaluated outside their training data without humans in the loop. So I have high hopes for its future.
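To make the "compiler + CPU as environment" point concrete, here is a rough sketch of what the reward signal could look like. Everything in it is an assumption for illustration: the names `reward`, `TESTS`, `GOOD` and `BAD` are made up, the pass/fail reward is the simplest choice I could think of, and a real setup would need proper sandboxing rather than a bare subprocess.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def reward(candidate_source: str, test_source: str, timeout_s: float = 5.0) -> float:
    """Hypothetical reward: 1.0 if the candidate passes the tests, else 0.0."""
    with tempfile.TemporaryDirectory() as tmp:
        # Write the candidate and its tests into one throwaway script.
        script = Path(tmp) / "episode.py"
        script.write_text(candidate_source + "\n\n" + test_source + "\n")
        try:
            # The interpreter and CPU are the whole "environment" here.
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # Treat a hang as a failed episode.
            return 0.0
    # A failed assertion or syntax error gives a non-zero exit code.
    return 1.0 if result.returncode == 0 else 0.0


# Toy task (illustrative only): implement add(a, b).
TESTS = "assert add(2, 3) == 5\nassert add(-1, 1) == 0"
GOOD = "def add(a, b):\n    return a + b"
BAD = "def add(a, b):\n    return a - b"

if __name__ == "__main__":
    print(reward(GOOD, TESTS), reward(BAD, TESTS))
```

No human is in the loop: the feedback comes from actually running the code, which is what makes this kind of environment cheap to query many times.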


