Hacker News

Looks like no quantized options with llama.cpp?

https://github.com/ggerganov/llama.cpp/issues/1602
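
For context, quantizing a model with llama.cpp normally follows a two-step convert-then-quantize workflow. The sketch below mirrors that existing pipeline for LLaMA-family models; the model directory and file names are hypothetical, and Falcon-specific conversion is exactly what the linked issue tracks, so these steps do not yet work for Falcon-40B:

```shell
# Hypothetical workflow mirroring llama.cpp's usual pipeline
# (paths and model directory are examples, not Falcon-ready).

# 1. Convert the Hugging Face checkpoint to a ggml f16 file
python3 convert.py models/falcon-40b/

# 2. Quantize to 4-bit (q4_0), cutting memory use roughly 4x vs f16
./quantize models/falcon-40b/ggml-model-f16.bin \
           models/falcon-40b/ggml-model-q4_0.bin q4_0
```

Until the converter understands Falcon's architecture (e.g. its multi-query attention layout), step 1 is the blocker, and no quantized variants can be produced.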



We're very much looking forward to seeing Falcon-40B support in llama.cpp. For production use cases, this is also highly relevant: https://huggingface.co/blog/sagemaker-huggingface-llm




