

muppet_frog on Oct 20, 2020 | on: Why deep learning works even though it shouldn’t

Hmm, that runs counter to my understanding (limited though it may be), which was partially formed by this paper: https://arxiv.org/abs/1712.09913

TLDR - loss landscapes are nasty, but you can tame them with skip connections.

blackbear_ on Oct 21, 2020

These two papers do not necessarily contradict each other, but perhaps my description was a bit sloppy.

Sagun et al. (and derivative works) focus only on the Hessian along the trajectory followed by gradient descent, while Li et al. take a broader look at the loss surface as a whole.
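
To make the distinction concrete, here is a rough sketch on a made-up toy loss (not taken from either paper). The names `loss`, `grad`, `hessian`, `hessian_spectra_along_gd`, and `loss_slice` are all hypothetical. The first helper records Hessian eigenvalues only at the points gradient descent actually visits; the second evaluates the loss on a 2D slice through parameter space around a chosen point. Note that Li et al. use filter-normalized random directions on real networks, whereas this sketch simply assumes plain unit-norm random directions.

```python
import numpy as np

# Hypothetical toy non-convex loss: a quadratic bowl plus sinusoidal ripples.
def loss(w):
    return 0.5 * np.sum(w ** 2) + np.sum(np.sin(3.0 * w))

def grad(w):
    return w + 3.0 * np.cos(3.0 * w)

def hessian(w):
    # The toy loss is separable across coordinates, so its Hessian is diagonal.
    return np.diag(1.0 - 9.0 * np.sin(3.0 * w))

def hessian_spectra_along_gd(w0, lr=0.05, steps=50):
    """Local view: Hessian eigenvalues only at the iterates visited by GD."""
    w = np.array(w0, dtype=float)
    spectra = []
    for _ in range(steps):
        spectra.append(np.linalg.eigvalsh(hessian(w)))
        w = w - lr * grad(w)
    return w, np.array(spectra)

def loss_slice(center, d1, d2, radius=1.0, n=21):
    """Global-ish view: loss values on a 2D plane through `center`."""
    alphas = np.linspace(-radius, radius, n)
    grid = np.empty((n, n))
    for i, a in enumerate(alphas):
        for j, b in enumerate(alphas):
            grid[i, j] = loss(center + a * d1 + b * d2)
    return alphas, grid

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_final, spectra = hessian_spectra_along_gd([2.0, -1.5, 0.5])
    # Two random unit directions (an assumption; not filter-normalized).
    d1, d2 = rng.normal(size=(2, 3))
    d1, d2 = d1 / np.linalg.norm(d1), d2 / np.linalg.norm(d2)
    alphas, grid = loss_slice(w_final, d1, d2)
    print("eigenvalue range along trajectory:", spectra.min(), spectra.max())
    print("loss range on the 2D slice:", grid.min(), grid.max())
```

The trajectory statistics say nothing about regions the optimizer never visits, while the slice can show bumps far from the path that gradient descent took, which is roughly why the two kinds of result can both hold.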
