
Local LLM support? (Llama/Ollama)


We have tried many of the current open-source models, but unfortunately the only one whose capability comes close to GPT-4 is DeepSeek, and DeepSeek can't follow our specified format and is very sensitive to prompt changes.


The other problem is latency: DeepSeek 34B on A100s seems slower than GPT-4, though perhaps it will be better on H100s.


The Ollama folks just announced API compatibility with OpenAI's endpoints:

https://ollama.ai/blog/openai-compatibility

So apparently, yeah.
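
For anyone who wants to try it, here's a rough sketch of what that looks like with the openai Python client (v1+) pointed at a local Ollama server on its default port; the model name is just an example of something you'd have pulled locally:

    from openai import OpenAI

    # Point the standard OpenAI client at Ollama's OpenAI-compatible endpoint.
    client = OpenAI(
        base_url="http://localhost:11434/v1",
        api_key="ollama",  # required by the client, but ignored by Ollama
    )

    resp = client.chat.completions.create(
        model="llama2",  # whatever model you've pulled with `ollama pull`
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(resp.choices[0].message.content)

The point being that existing OpenAI-client code only needs base_url (and a dummy api_key) changed.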


Would be nice if, rather than adhering to a specific very-closed-source company's API first, things were developed against open standards, or at least through something like litellm.
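
litellm's pitch is exactly that kind of abstraction: one completion() call, with the backend selected by a prefix on the model string. A rough sketch, assuming a local Ollama server on the default port:

    from litellm import completion

    # Same call shape for any backend; the "ollama/" prefix routes it locally.
    resp = completion(
        model="ollama/llama2",
        api_base="http://localhost:11434",  # only needed for the local case
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(resp.choices[0].message.content)

Swapping the model string to "gpt-4" (with the usual OPENAI_API_KEY set) would route the same call to OpenAI instead.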


It probably makes it easier for many companies to move off of OpenAI, since they won't need to drastically alter their codebases.



