Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

If training is trivially fast that allows you to iterate on architecture choices, hyperparameters, choices which data to include, etc

Of course that only works if the trial runs are representative of what your full scale model will look like. But within those constraints optimising training time seems very valuable





Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: