Hacker News
totoglazer on Sept 26, 2019 | on: At Tech’s Leading Edge, Worry About a Concentratio...
Look into the modern NLP models: BERT and its many derivatives, RoBERTa, XLNet. Training any of these requires on the order of a terabyte of data and generally takes days on multiple TPUs. You often can’t even fine-tune on a single GPU without some clever tricks.
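The comment doesn’t name the tricks, but one commonly used technique for squeezing large effective batch sizes onto a single GPU is gradient accumulation: gradients from several small micro-batches are summed before a single optimizer step, so memory only ever holds one micro-batch. A minimal sketch with a toy one-parameter linear model and hand-derived MSE gradients (the model, data, and function names here are illustrative, not from the comment):

```python
# Gradient accumulation sketch: sum gradients over micro-batches,
# then take one optimizer step -- mathematically equivalent to one
# step on the full batch, but with the memory footprint of a
# single micro-batch.

def grad_mse(w, xs, ys):
    # Gradient of mean((w*x - y)^2) with respect to w over the batch.
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulate_step(w, micro_batches, lr=0.1):
    # Weight each micro-batch gradient by its size, average over the
    # total, and apply a single SGD update.
    total = sum(len(xs) for xs, _ in micro_batches)
    g = sum(len(xs) * grad_mse(w, xs, ys) for xs, ys in micro_batches) / total
    return w - lr * g

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated by the true weight w = 2
w0 = 0.0

# Split the batch of 4 into two micro-batches of 2.
micro = [(xs[:2], ys[:2]), (xs[2:], ys[2:])]
w_accum = accumulate_step(w0, micro)

# Compare against one step on the full batch at once.
w_full = w0 - 0.1 * grad_mse(w0, xs, ys)
print(abs(w_accum - w_full) < 1e-12)  # True: the two updates coincide
```

Frameworks like PyTorch express the same idea by calling `backward()` on each micro-batch (which accumulates into `.grad`) and invoking the optimizer step only every N micro-batches.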