
For that GPU, the best Gemma 3 model you'll be able to run (with GPU-only inference) is the 4-bit quantized 12B-parameter model: https://ollama.com/library/gemma3:12b
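For example, here's a minimal sketch of querying it with the official `ollama` Python client (pip install ollama); it assumes the Ollama server is running locally and the model has already been pulled:

    import ollama

    # gemma3:12b is the 4-bit quantized 12B model, small enough
    # to fit entirely in VRAM on this GPU.
    response = ollama.chat(
        model="gemma3:12b",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(response["message"]["content"])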

You could offload some of the layers to the CPU and run the 4-bit 27B model instead, but inference would be much slower.
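With Ollama that means capping the number of layers sent to the GPU via the num_gpu option. A sketch under the same assumptions as above; the layer count of 24 is just a placeholder you'd tune to your actual VRAM:

    import ollama

    # Offload only part of the 4-bit 27B model to the GPU; the
    # remaining layers run on the CPU, so generation is much slower.
    response = ollama.chat(
        model="gemma3:27b",
        messages=[{"role": "user", "content": "Hello"}],
        options={"num_gpu": 24},  # layers on GPU; tune to available VRAM
    )
    print(response["message"]["content"])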


