Hacker News

Tried the E4B model in Ollama and it's totally broken when interpreting images. The output depends only on the text prompt (consistent across runs in that sense), but otherwise completely unrelated to the image.

Works fine with regular Gemma 3 4B, so I'll assume it's something on Ollama's side. edit: yep, text-only for now[1]. Would be nice if that were a bit more prominent instead of buried in a ticket...

Don't feel like compiling llama.cpp myself, so I'll have to wait to try your GGUFs there.

[1]: https://github.com/ollama/ollama/issues/10792#issuecomment-3...



Oh I don't think multimodal works yet - it's text only for now!



