
What’s required to run the model?


The biggest GPT-2 (1.5B params) takes about 10 GB of VRAM, meaning it runs on an RTX 2080 Ti (11 GB) or the 12 GB version of the RTX 3080.
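A minimal sketch of the arithmetic, assuming the Hugging Face transformers library (my pick, not mentioned above; any loader works):

    # Load the 1.5B-param GPT-2 XL and estimate its fp32 weight footprint.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("gpt2-xl")  # the biggest GPT-2
    n_params = sum(p.numel() for p in model.parameters())
    # 4 bytes per fp32 parameter is ~6 GB of weights alone; activations and
    # CUDA overhead push real usage toward the ~10 GB figure above.
    print(f"{n_params / 1e9:.2f}B params, ~{n_params * 4 / 2**30:.1f} GiB fp32")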


What's the largest language model I can run on a 3090 with 24 GB of VRAM?


It depends on precision: you can fit a ~5B-parameter model at fp32, or at most a ~11B model at fp16. Int8 is really bad for real-world use cases, so I won't count it.
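The back-of-envelope math behind those numbers (a sketch, not a benchmark; real usage also needs headroom for activations, the KV cache, and framework overhead):

    # Weight memory alone: params * bytes-per-param.
    BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

    def weight_gb(params_b: float, dtype: str) -> float:
        """GB of weight memory for a model with params_b billion parameters."""
        return params_b * BYTES_PER_PARAM[dtype]

    for dtype, params_b in [("fp32", 5), ("fp16", 11)]:
        print(f"~{params_b}B @ {dtype}: {weight_gb(params_b, dtype):.0f} GB of 24 GB")
    # fp32: 5 * 4 = 20 GB; fp16: 11 * 2 = 22 GB -- both close to the 3090's limit.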

But if you are looking for ChatGPT- or GPT-3-level performance, don't waste your time: all small GPT-3-like LLMs (below at least ~60B params) are useless for any real-world use case; they are just toys.


If you specifically mean a general LLM trained on a general language corpus with instruction finetuning, this is correct.

Fortunately, very few real-world use cases need to be this general.

If you are training an LLM on a domain-specific corpus or finetuning on specific downstream tasks, even relatively tiny models at 330M params are definitely useful and not “toys”: they can accurately perform tasks such as semantic text search, document summarization, and named entity recognition.
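For instance, a sketch of off-the-shelf NER with a ~110M-param BERT-family encoder, assuming Hugging Face transformers; the checkpoint name is an illustrative community model, not one named in this thread:

    from transformers import pipeline

    # Token classification (NER) with a small finetuned encoder.
    ner = pipeline("ner", model="dslim/bert-base-NER",
                   aggregation_strategy="simple")
    for ent in ner("Hugging Face was founded in New York City."):
        print(ent["entity_group"], ent["word"], round(float(ent["score"]), 3))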


> If you specifically mean a general LLM trained on a general language corpus with instruction finetuning, this is correct.

Yes, thanks, that's what I meant.

> If you are training an LLM on a domain-specific corpus or finetuning on specific downstream tasks, even relatively tiny models at 330M params are definitely useful and not “toys”: they can accurately perform tasks such as semantic text search, document summarization, and named entity recognition.

Agreed, the BERT family is a good example here.


Okay, thank you. Perfect response.



