Hacker News

Within GGUF (and some other formats), each tensor gets its own quantisation type. Embedding layers, for example, are usually more sensitive to quantisation and so are often kept at Q8 or FP16 while the bulk of the weights sit at lower bit-widths. You can see the per-tensor types by running gguf-dump on a model file, or by clicking the GGUF icon on a model page on Hugging Face.
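A toy sketch of why this matters (this is plain per-tensor symmetric rounding, not GGUF's actual block-scaled formats): quantising the same weights at 8 bits versus 4 bits shows the precision gap that makes people keep sensitive tensors like embeddings at higher bit-widths.

```python
import numpy as np

def fake_quantize(w, bits):
    # Symmetric per-tensor quantisation: scale to ~2^(bits-1)-1 integer
    # levels, round, then dequantise back to float. A toy stand-in for
    # GGUF's block formats, which use a separate scale per small block.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)

err8 = np.abs(w - fake_quantize(w, 8)).mean()  # Q8-like: ~255 levels
err4 = np.abs(w - fake_quantize(w, 4)).mean()  # Q4-like: ~15 levels

print(f"mean abs error at 8 bits: {err8:.5f}")
print(f"mean abs error at 4 bits: {err4:.5f}")
```

The 4-bit error is roughly an order of magnitude larger, which is tolerable for most weight matrices but noticeably hurts layers whose outputs feed everything downstream.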

