
Uhh, not all, but maybe most in production. You can use any optimization technique you want to train the weights, including things like evolutionary algorithms or simulated annealing, which are entirely different from what's listed here. Evolutionary-style methods may be SOTA for continuous-control reinforcement learning problems... Consider how backprop, hill climbing, or L-BFGS performs on something basic like cart pole.
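For a concrete sense of what this looks like, here is a minimal evolution-strategies sketch for cart pole with a linear policy. It assumes the classic OpenAI Gym API (reset() returning an observation, step() returning a 4-tuple; newer gymnasium releases differ), and the hyperparameters (sigma, lr, pop) are illustrative, not tuned:

    # Minimal evolution-strategies sketch: CartPole with a linear policy.
    # Assumes the classic OpenAI Gym API; hyperparameters are illustrative.
    import gym
    import numpy as np

    def episode_return(env, w):
        """Run one episode with a linear policy: action = 1 if w.obs > 0."""
        obs = env.reset()
        total = 0.0
        for _ in range(500):
            action = int(np.dot(w, obs) > 0)
            obs, reward, done, _ = env.step(action)
            total += reward
            if done:
                break
        return total

    env = gym.make("CartPole-v1")
    rng = np.random.default_rng(0)
    w = np.zeros(4)                 # one weight per observation dimension
    sigma, lr, pop = 0.1, 0.05, 20  # noise scale, step size, population size

    for gen in range(50):
        noise = rng.standard_normal((pop, 4))
        returns = np.array([episode_return(env, w + sigma * n) for n in noise])
        # Standard ES update: move weights along the return-weighted noise.
        advantage = (returns - returns.mean()) / (returns.std() + 1e-8)
        w += lr / (pop * sigma) * noise.T @ advantage
        if gen % 10 == 0:
            print(f"gen {gen}: mean return {returns.mean():.1f}")

No gradients through the policy at all; the update direction is estimated purely from sampled episode returns.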


Evolutionary algorithms are SOTA for continuous control? I've never seen a paper claiming that. Which of them can even compare to soft actor-critic?

I believe gradient descent is king everywhere interesting right now (e.g. CV, NLP, RL).



I have seen this paper. The blog post does not report that they were able to train a better agent than A3C, only that ES allowed them to use more compute to train in parallel.



