...hmm, that runs counter to my understanding (limited though it may be...), which was partly formed by this paper: https://arxiv.org/abs/1712.09913

TLDR - loss landscapes are nasty, but you can tame them with skip connections.




These two papers don't necessarily contradict each other, but perhaps my description was a bit sloppy.

Sagun et al. (and derivative works) focus only on the Hessian along the trajectory followed by gradient descent, while Li et al. take a broader look at the loss surface as a whole.
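To make the distinction concrete, here's a toy sketch (my own, not code from either paper) on a hypothetical 2-parameter loss: a Li et al.-style 1D slice of the loss along a direction through a point, versus a Sagun et al.-style look at the Hessian spectrum at that single point. Real versions operate on millions of network parameters (Li et al. also use filter-wise normalization of the directions, skipped here).

```python
import numpy as np

# Hypothetical toy "loss" with 2 parameters: nonconvex in w1, convex in w2.
def loss(w):
    w1, w2 = w
    return (w1**2 - 1)**2 + 0.5 * w2**2

# Li et al.-style view: evaluate the loss along a direction d through w0,
# giving a 1D slice of the surface as a whole (no filter normalization here).
def loss_slice(w0, d, alphas):
    d = d / np.linalg.norm(d)
    return np.array([loss(w0 + a * d) for a in alphas])

# Sagun et al.-style view: the Hessian spectrum at a single point on the
# trajectory, via central finite differences (fine for 2 parameters; large
# networks use Hessian-vector products instead).
def hessian_eigs(w0, eps=1e-4):
    n = len(w0)
    H = np.zeros((n, n))
    I = np.eye(n)
    for i in range(n):
        for j in range(n):
            ei, ej = eps * I[i], eps * I[j]
            H[i, j] = (loss(w0 + ei + ej) - loss(w0 + ei - ej)
                       - loss(w0 - ei + ej) + loss(w0 - ei - ej)) / (4 * eps**2)
    return np.linalg.eigvalsh(H)

w_min = np.array([1.0, 0.0])          # a minimum of the toy loss
alphas = np.linspace(-2.0, 2.0, 101)
vals = loss_slice(w_min, np.array([1.0, 0.0]), alphas)
eigs = hessian_eigs(w_min)
print(eigs)
```

The slice shows the surface's shape away from the minimum (here, the bump between the two minima of the double well), while the eigenvalues only describe curvature at the point itself; that locality is why the two analyses can tell different-sounding stories without conflicting.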



