Sure, but you do need to take into account that fitting a neural network usually amounts to finding the maximum-likelihood estimate. From a Bayesian perspective you're ignoring the prior, and you risk overfitting by considering nothing but the single most probable parameter setting. Most of the usual tricks for avoiding overfitting don't translate cleanly to the Bayesian perspective.
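
To make that concrete, here's a minimal sketch (the linear-Gaussian model and all names are just illustrative, not any particular library's API): the usual training loss is a negative log-likelihood, and the log-prior term a Bayesian objective would include simply isn't there.

    import numpy as np

    def neg_log_lik(w, X, y):
        # The usual training objective: negative log-likelihood of a
        # linear-Gaussian model with unit noise, i.e. squared error.
        return 0.5 * np.sum((X @ w - y) ** 2)

    def neg_log_post(w, X, y, tau=1.0):
        # What a Bayesian objective adds: the negative log of a
        # N(0, tau^2 I) prior on the weights. Minimizing this gives the
        # MAP estimate rather than the MLE.
        return neg_log_lik(w, X, y) + 0.5 * np.sum(w**2) / tau**2

Even the MAP version still keeps only the single most probable weight setting, which is the deeper problem here.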

You can actually recover from this a bit. I saw a paper once where they used the Hessian to approximate the posterior as a Gaussian centered at the maximum-likelihood estimate. Can't remember what that paper was called, unfortunately.
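
For what it's worth, that trick is generally known as the Laplace approximation: treat the negative log posterior as roughly quadratic around its mode, so the posterior is approximately Gaussian with covariance equal to the inverse Hessian there. A minimal sketch on a toy Gaussian model (the model and numbers are my own illustration, not from whatever paper that was):

    import numpy as np
    from scipy.optimize import minimize

    # Toy model: estimate the mean and log-std of a Gaussian from data.
    rng = np.random.default_rng(0)
    data = rng.normal(loc=2.0, scale=1.5, size=50)

    def neg_log_lik(theta):
        mu, log_sigma = theta
        sigma = np.exp(log_sigma)
        # Negative log-likelihood up to an additive constant.
        return 0.5 * np.sum(((data - mu) / sigma) ** 2) + data.size * log_sigma

    # Step 1: find the maximum-likelihood estimate (the posterior mode
    # under a flat prior).
    res = minimize(neg_log_lik, x0=np.zeros(2), method="BFGS")

    # Step 2: Laplace approximation -- posterior ~ N(theta_hat, H^{-1}),
    # where H is the Hessian of the negative log posterior at the mode.
    # BFGS already maintains an estimate of H^{-1}, reused here for brevity.
    cov = res.hess_inv
    posterior_samples = rng.multivariate_normal(res.x, cov, size=1000)

In a real network you'd compute (or approximate) the Hessian of the training loss at the trained weights instead; the samples then give you cheap approximate posterior uncertainty without retraining.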


