
Nowadays, even the term "epoch" is not well defined. Traditionally it meant one full pass over the training set, but datasets are so massive today that many now define an epoch as X steps, where a step is one minibatch (of whatever size) drawn from the training set. So 1 epoch is just X random minibatches. I'd guess the logic is that, with datasets this large, a full pass is impractical, so you sample as much data per step as fits in VRAM.
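To make the contrast concrete, here's a minimal Python sketch of the two conventions. The function names and the `train_step` callback are hypothetical, just for illustration:

  import random

  # Traditional epoch: one shuffled full pass over the training set.
  def traditional_epoch(dataset, batch_size, train_step):
      random.shuffle(dataset)
      for i in range(0, len(dataset), batch_size):
          train_step(dataset[i:i + batch_size])

  # Step-based "epoch": a fixed number of randomly sampled minibatches,
  # independent of dataset size. Batches may overlap across steps.
  def step_based_epoch(dataset, batch_size, steps_per_epoch, train_step):
      for _ in range(steps_per_epoch):
          train_step(random.sample(dataset, batch_size))

With the step-based definition, "epochs trained" no longer tells you how much of the dataset the model has actually seen.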

Karpathy's Zero To Hero series also uses this convention.


