Hacker News
activatedgeek
on Aug 17, 2020
on: Karpathy's MinGPT
I think the argument here is about pedagogy, not performance.
karpathy
on Aug 17, 2020
minGPT is actually quite performant too; the "min" refers to the breadth of supported functionality (e.g., the absence of support for various additional conditioning, exotic masking, masked LMs, finetuning, pruning, etc.).
minimaxir
on Aug 17, 2020
GPT training performance on the CPU is funny. The vocab size and context window size have a massive effect on both speed and accuracy.
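A rough back-of-the-envelope sketch of the speed side of this, assuming a standard GPT-style decoder (4x MLP expansion, untied LM head) and counting only the large matrix multiplies. The function name and the example sizes are illustrative and are not taken from minGPT; layer norms, softmax, and the embedding lookup are ignored, so treat the numbers as orders of magnitude only.

```python
def approx_forward_macs(n_layer, n_embd, block_size, vocab_size):
    """Approximate multiply-adds for one forward pass over one full sequence.

    Assumption: standard GPT block with Q/K/V/output projections (4*d^2),
    a 4x-wide MLP (8*d^2), and full attention over the context window.
    """
    d, t = n_embd, block_size
    # Per token, per layer: projections + MLP, plus attention scores (t*d)
    # and the attention-weighted sum of values (t*d).
    per_token_layer = 12 * d * d + 2 * t * d
    # Per token: the final projection onto the vocabulary.
    per_token_head = d * vocab_size
    return t * (n_layer * per_token_layer + per_token_head)


if __name__ == "__main__":
    # Hypothetical small model, chosen only for illustration.
    base = approx_forward_macs(n_layer=4, n_embd=128, block_size=128, vocab_size=256)
    big_vocab = approx_forward_macs(n_layer=4, n_embd=128, block_size=128, vocab_size=50257)
    long_ctx = approx_forward_macs(n_layer=4, n_embd=128, block_size=1024, vocab_size=256)
    print(f"base:            {base:,}")
    print(f"larger vocab:    {big_vocab:,}")
    print(f"longer context:  {long_ctx:,}")
```

In this estimate the vocabulary term grows linearly with vocab size and can dominate a small model, while the context window enters both linearly (more tokens) and quadratically (attention), so doubling it more than doubles the cost.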
activatedgeek
on Aug 17, 2020
Sure thing! I only meant to imply the relative ordering of considerations.