Could this be distributed? Put all those mining GPUs to work. A lot of people like participating in public projects like this. I would!


>> GPT-3 took 355 years to train

> Could this be distributed? Put all those mining GPUs to work.

Nope. It's a strictly O(n) process. If it weren't for the foresight of George Patrick Turnbull in 1668, we would not be anywhere close to these amazing results today.


Why would an O(n) algorithm not be able to be distributed?


I couldn't find any references to George Patrick Turnbull. Is that an ancestor of yours? If so, the comment seems rather subjective.


They're being facetious about the '355 years to train' thing. ;)


OK haha good one then. Mine was a bit too subtle.


In theory, yes. "Hogwild!" is one approach to parallel training: in essence, each worker is given a chunk of data, computes the gradients, and sends them to a central authority. The authority accumulates the gradients and periodically pushes new weights back to the workers.
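For a concrete picture, here's a toy sketch of the lock-free flavor of this. None of it is from the thread: it uses plain numpy, an invented least-squares problem, and Python threads standing in for separate machines.

    import threading
    import numpy as np

    # Toy Hogwild!-style run: several threads apply SGD updates to a
    # shared weight vector with no locking. The data and problem are
    # made up for illustration. (Under CPython's GIL this is more
    # illustrative than truly parallel.)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10))
    true_w = rng.normal(size=10)
    y = X @ true_w + 0.01 * rng.normal(size=1000)

    w = np.zeros(10)   # shared weights; updated lock-free by all workers
    lr = 0.01

    def worker(shard):
        global w
        for i in shard:
            grad = (X[i] @ w - y[i]) * X[i]   # gradient for one example
            w -= lr * grad                    # racy in-place update, tolerated by design

    shards = np.array_split(rng.permutation(len(X)), 4)
    threads = [threading.Thread(target=worker, args=(s,)) for s in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print("distance from true weights:", np.linalg.norm(w - true_w))

The Hogwild! result is that, for sparse-enough problems, you can skip the locks entirely and the occasional clobbered update doesn't prevent convergence.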

There is also Federated Learning, which seemed to be taking off for a while, but interest has since declined rapidly.
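For contrast, the core of a federated-averaging (FedAvg) round is just a size-weighted mean of locally trained weights. Again a toy sketch with an invented model and data, not anything from the thread:

    import numpy as np

    # Toy FedAvg: each client trains locally on its own data, then the
    # server averages the resulting weights, weighted by dataset size.

    rng = np.random.default_rng(1)
    true_w = rng.normal(size=5)

    def make_client(n):
        X = rng.normal(size=(n, 5))
        return X, X @ true_w + 0.01 * rng.normal(size=n)

    clients = [make_client(n) for n in (50, 200, 80)]

    def local_train(w, X, y, lr=0.05, steps=20):
        w = w.copy()
        for _ in range(steps):
            w -= lr * X.T @ (X @ w - y) / len(y)   # full-batch gradient step
        return w

    w_global = np.zeros(5)
    for _ in range(10):                             # communication rounds
        local = [local_train(w_global, X, y) for X, y in clients]
        sizes = np.array([len(y) for _, y in clients])
        w_global = np.average(local, axis=0, weights=sizes)

    print("distance from true weights:", np.linalg.norm(w_global - true_w))

The appeal was that raw data never leaves the clients; only weights move, which is why it looked promising for exactly the kind of volunteer-GPU setup the parent asks about.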


Exactly. This is inevitable, IMHO. There is no way people will be OK with depending on a few walled-garden models.



