
Here is a C implementation of GPT-2. https://bellard.org/nncp/gpt2tc.html

I don't disagree that hardware acceleration is key in enabling these models, but I still find it interesting how simple the core techniques are.
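For a sense of how simple: here is a minimal sketch of single-head causal attention, the core op in GPT-2. This is not gpt2tc's actual code (its source isn't published on the page), and the sizes and names are purely illustrative:

    #include <math.h>
    #include <stdio.h>

    #define T 4   /* sequence length (illustrative) */
    #define D 8   /* head dimension (illustrative) */

    /* out[t] = sum over s<=t of softmax(q[t].k[s] / sqrt(D)) * v[s] */
    static void attention(const float q[T][D], const float k[T][D],
                          const float v[T][D], float out[T][D])
    {
        for (int t = 0; t < T; t++) {
            float w[T], wsum = 0.0f;
            for (int s = 0; s <= t; s++) {   /* causal mask: past and self only */
                float dot = 0.0f;
                for (int d = 0; d < D; d++)
                    dot += q[t][d] * k[s][d];
                w[s] = expf(dot / sqrtf((float)D));
                wsum += w[s];
            }
            for (int d = 0; d < D; d++) {
                float acc = 0.0f;
                for (int s = 0; s <= t; s++)
                    acc += (w[s] / wsum) * v[s][d];
                out[t][d] = acc;
            }
        }
    }

    int main(void)
    {
        float q[T][D] = {{0}}, k[T][D] = {{0}}, v[T][D] = {{0}}, out[T][D];
        q[0][0] = k[0][0] = v[0][0] = 1.0f;
        attention(q, k, v, out);
        printf("out[0][0] = %f\n", out[0][0]);
        return 0;
    }

The real model is basically this stacked a few dozen times, multi-headed, with learned projections, layer norms, and MLPs in between.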



Is there anything Bellard hasn’t done?


He hasn't released blueprints for a cheap homemade 1 MW fusion reactor yet, but otherwise, yep, it seems he's covered everything else.


He didn't rewrite everything in Rust.


Once he picks up Rust, he'll become a 10x programmer.


What a loss. To be downgraded that much.


That's not really a C implementation of GPT-2 since it cannot be used to do the thing everyone cares about: self-supervised learning from text. In fact, it doesn't even use the weights in the same way GPT-2 does, so it's not clear how close it is to GPT-2's inference mode. The source isn't even on the page.


Hm, I notice the source code is missing? Or did I overlook it?


This is very cool, thanks for sharing! From the readme (https://bellard.org/nncp/readme-gpt2tc.txt), the program benchmarks very comparably to CMIX, which is the top algorithm on the Large Text Compression Benchmark (http://mattmahoney.net/dc/text.html). I'm guessing that any GPT implementation would be ineligible for the benchmark because of its file size, but it's impressive nonetheless.
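For anyone wondering how a language model becomes a compressor (the standard approach, and what the readme describes, is arithmetic coding driven by the model's predictions): an arithmetic coder spends roughly -log2(p) bits on a symbol the model assigns probability p, so the compressed size is about the model's cross-entropy on the text. A tiny sketch with made-up probabilities:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* p[i]: the model's predicted probability of the i-th token
           that actually occurred (values invented for illustration) */
        const double p[] = { 0.60, 0.05, 0.30, 0.90, 0.12 };
        const int n = sizeof(p) / sizeof(p[0]);
        double bits = 0.0;
        for (int i = 0; i < n; i++)
            bits += -log2(p[i]);   /* ideal code length for token i */
        printf("total %.2f bits, %.2f bits/token\n", bits, bits / n);
        return 0;
    }

A better model means higher p values and fewer bits; the catch, as noted above, is that the benchmark counts the decompressor's size too, which is where GPT-2's weights would hurt.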



