
It can, but sin(x) has an infinite number of extrema, and the gradient vanishes at each of them. Activations will get stuck at 1 and -1 (x = π/2, 3π/2, ...). They use x + (1/a)*sin²(x) instead, chosen to be monotonic, which fixes this.
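To make the vanishing-gradient point concrete, here is a rough sketch in plain Python. It takes the formula exactly as written above, x + (1/a)*sin²(x), and the function names (`sin_grad`, `act`, `act_grad`) are made up for illustration. One caveat under that reading: the derivative is 1 + (1/a)*sin(2x), so it only stays non-negative when |a| >= 1.

```python
import math

def sin_grad(x):
    # d/dx sin(x) = cos(x); zero at every extremum of sin.
    return math.cos(x)

def act(x, a=1.0):
    # Activation as written in the comment: x + (1/a) * sin^2(x).
    return x + (1.0 / a) * math.sin(x) ** 2

def act_grad(x, a=1.0):
    # d/dx [x + (1/a) sin^2(x)] = 1 + (1/a) * 2 sin(x) cos(x)
    #                           = 1 + (1/a) * sin(2x)
    return 1.0 + (1.0 / a) * math.sin(2.0 * x)

# Sample points from -10 to 10, including values near the extrema of sin.
xs = [i * 0.01 for i in range(-1000, 1001)]

# Gradient of plain sin at x = pi/2 (an extremum): essentially zero.
g_sin = sin_grad(math.pi / 2)

# Gradient of the modified activation at the same point (a = 2): about 1.
g_act = act_grad(math.pi / 2, a=2.0)

# Smallest slope over the sampled range for a = 2; stays at or above 1 - 1/a.
min_slope = min(act_grad(x, a=2.0) for x in xs)
```

So at x = π/2 plain sin passes back essentially no gradient, while the modified activation still has slope close to 1, and with a = 2 the slope never drops below 0.5 anywhere on the sampled range.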

Or you need to optimize without using gradients.



