Ollama uses llama.cpp as its backend, so any changes needed to support a new model architecture or tokenizer have to land there first.
Then the model needs to be converted to GGUF (the model format that llama.cpp uses), quantized, tested, and uploaded to the model registry.
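For a concrete sense of what that conversion step usually looks like, here's a minimal sketch driving the standard llama.cpp tools from Python. The checkout and model paths are placeholders, and it assumes you've built llama.cpp locally so that `convert_hf_to_gguf.py` and the `llama-quantize` binary are available:

```python
import subprocess
from pathlib import Path

# Assumed locations -- adjust for your own setup.
LLAMA_CPP = Path("~/src/llama.cpp").expanduser()      # local llama.cpp checkout
MODEL_DIR = Path("~/models/my-hf-model").expanduser() # HF-format weights

# Step 1: convert the HF checkpoint to an unquantized (f16) GGUF file.
f16_gguf = MODEL_DIR / "model-f16.gguf"
subprocess.run(
    [
        "python", str(LLAMA_CPP / "convert_hf_to_gguf.py"),
        str(MODEL_DIR),
        "--outfile", str(f16_gguf),
        "--outtype", "f16",
    ],
    check=True,
)

# Step 2: quantize the f16 GGUF down to a smaller format (Q4_K_M here).
q4_gguf = MODEL_DIR / "model-q4_k_m.gguf"
subprocess.run(
    [
        str(LLAMA_CPP / "build/bin/llama-quantize"),
        str(f16_gguf),
        str(q4_gguf),
        "Q4_K_M",
    ],
    check=True,
)
```

From there the quantized file can be smoke-tested directly with llama.cpp, or pointed at from an Ollama Modelfile's `FROM` line, before anything gets pushed to the registry.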
So there's some length to the pipeline, but the devs on both projects keep things running pretty smoothly, and I'm regularly impressed at how quickly both get updated to support new models.