Hacker News
a_e_k | 6 days ago | on: Gemma 3 QAT Models: Bringing AI to Consumer GPUs
I'm seeing ~38-42 tokens per second (tps) on a 4090 with a fresh build of llama.cpp under Fedora 42 on my personal machine.
(-t 32 -ngl 100 -c 8192 -fa -ctk q8_0 -ctv q8_0 -m models/gemma-3-27b-it-qat-q4_0.gguf)
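For anyone who wants to reproduce this, here is a rough Python sketch that spells out the flags quoted above and assembles them into a command line. The flag values and model path are copied from the comment; the flag descriptions are my reading of llama.cpp's common options, and the binary name `llama-cli` is an assumption, since the comment doesn't say which llama.cpp program was run. `-fa` is passed as a bare switch exactly as quoted, which may not match every llama.cpp version.

```python
import shlex

# (flag, value, meaning). Values are taken verbatim from the comment;
# the meanings are my understanding of llama.cpp's common options.
FLAGS = [
    ("-t", "32", "number of CPU threads"),
    ("-ngl", "100", "layers to offload to the GPU; a value at or above "
                    "the model's layer count offloads all of them"),
    ("-c", "8192", "context size in tokens"),
    ("-fa", None, "enable flash attention (bare switch, as quoted)"),
    ("-ctk", "q8_0", "KV cache data type for keys"),
    ("-ctv", "q8_0", "KV cache data type for values"),
    ("-m", "models/gemma-3-27b-it-qat-q4_0.gguf", "path to the GGUF model"),
]


def build_command(binary: str = "llama-cli") -> list[str]:
    """Return an argv list; `binary` is an assumed name, not from the comment."""
    argv = [binary]
    for flag, value, _meaning in FLAGS:
        argv.append(flag)
        if value is not None:
            argv.append(value)
    return argv


if __name__ == "__main__":
    # Print each flag with its meaning, then the assembled command.
    for flag, value, meaning in FLAGS:
        print(f"{flag} {value or ''}".strip().ljust(45), meaning)
    print(shlex.join(build_command()))
```

The list form can be handed straight to `subprocess.run` if you'd rather launch it from Python than paste it into a shell.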