These two papers don't necessarily contradict each other, though my description was perhaps a bit sloppy.
Sagun et al. (and derivative works) focus only on the Hessian along the trajectory followed by gradient descent, whereas Li et al. take a broader look at the loss surface as a whole.
TL;DR: loss landscapes are nasty, but you can tame them with skip connections.
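
To make the "along the trajectory" vs. "surface as a whole" distinction concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption made for illustration: the two-parameter `loss` is made up (it is not from either paper), and finite differences stand in for the Hessian machinery you would need on a real network. With only two parameters, a plain grid over them plays the role of the low-dimensional slices used to visualize real loss surfaces.

```python
import numpy as np

def loss(w):
    # Hypothetical non-convex toy loss in two parameters (illustration only).
    x, y = w
    return (x**2 - 1.0)**2 + 0.5 * y**2 + 0.3 * np.sin(3.0 * x) * np.sin(3.0 * y)

def grad(w, eps=1e-5):
    # Central finite-difference gradient.
    g = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (loss(w + e) - loss(w - e)) / (2 * eps)
    return g

def hessian(w, eps=1e-4):
    # Finite-difference Hessian, symmetrized to remove numerical asymmetry.
    n = w.size
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros_like(w)
        e[i] = eps
        H[:, i] = (grad(w + e) - grad(w - e)) / (2 * eps)
    return 0.5 * (H + H.T)

# View 1: curvature only at the points gradient descent actually visits.
w0 = np.array([0.3, 1.5])
w = w0.copy()
trajectory_eigs = []
for step in range(200):
    trajectory_eigs.append(np.linalg.eigvalsh(hessian(w)))
    w = w - 0.05 * grad(w)
trajectory_eigs = np.array(trajectory_eigs)  # shape (200, 2)

# View 2: the loss evaluated over a whole region, visited or not.
xs = np.linspace(-2.0, 2.0, 81)
ys = np.linspace(-2.0, 2.0, 81)
surface = np.array([[loss(np.array([x, y])) for x in xs] for y in ys])
```

`trajectory_eigs` only tells you about curvature where the optimizer went, while `surface` also covers regions it never reached, which is why the two views can give different impressions without contradicting each other.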