
I'm seeing ~38-42 tokens/s on a 4090 with a fresh build of llama.cpp under Fedora 42 on my personal machine.

(-t 32 -ngl 100 -c 8192 -fa -ctk q8_0 -ctv q8_0 -m models/gemma-3-27b-it-qat-q4_0.gguf)
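For anyone trying to reproduce this, here is a rough sketch of those flags as a full command line, with a note on what each one does. The binary name `llama-cli` is my assumption (the flags above don't say which llama.cpp tool was used), and `build_command` is just a hypothetical helper for illustration. Newer llama.cpp builds may expect a value after `-fa`, so check `--help` on your build.

```python
import shlex

# Flags as quoted above; a value of None marks a bare switch.
FLAGS = [
    ("-t", "32"),        # CPU threads
    ("-ngl", "100"),     # layers to offload to the GPU (a large value is commonly used to mean "all")
    ("-c", "8192"),      # context size in tokens
    ("-fa", None),       # enable flash attention
    ("-ctk", "q8_0"),    # KV cache data type for K
    ("-ctv", "q8_0"),    # KV cache data type for V
    ("-m", "models/gemma-3-27b-it-qat-q4_0.gguf"),  # model file
]


def build_command(binary, flags):
    """Flatten (flag, value) pairs into an argv list, skipping None values."""
    argv = [binary]
    for flag, value in flags:
        argv.append(flag)
        if value is not None:
            argv.append(value)
    return argv


if __name__ == "__main__":
    # "llama-cli" is an assumed binary name; adjust the path to your build.
    print(shlex.join(build_command("llama-cli", FLAGS)))
```

The resulting list can be handed straight to `subprocess.run` if you want to script repeated runs.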


