
What kind of issues did you have with streaming? I also set up ollama on fly.io and had no trouble getting streaming to work.
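For anyone who does hit streaming problems: ollama streams its replies as newline-delimited JSON over a chunked HTTP response, so the client has to consume the body incrementally instead of waiting for the full reply. Here's a minimal sketch in Python, assuming the server address and the model name "llama3" as placeholders (substitute your fly.io app URL and whatever model you actually pulled):

    # Minimal sketch: stream tokens from an ollama server.
    # OLLAMA_URL and the model name are placeholders, not the poster's setup.
    import json
    import requests

    OLLAMA_URL = "http://localhost:11434"  # e.g. your fly.io app URL

    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": True},
        stream=True,  # keep the connection open and read the chunked body
    )
    # ollama emits one JSON object per line until a chunk with "done": true
    for line in resp.iter_lines():
        if line:
            chunk = json.loads(line)
            print(chunk.get("response", ""), end="", flush=True)
            if chunk.get("done"):
                break

If streaming stalls, the usual culprit is a proxy or client buffering the whole response; reading line by line as above avoids that on the client side.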

For the LLM itself, I just used a custom startup script that downloaded the model once ollama was up. It's the same thing I'd do on a local cluster, though. I'm not sure how fly could make it better unless they offered direct integration with ollama or some other inference server.
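A sketch of what that startup flow can look like, assuming ollama's standard HTTP API (GET / as a liveness check, POST /api/pull to fetch weights); the URL, model name, and retry budget are all assumptions, not the poster's actual script:

    # Hypothetical startup-time model fetch: poll until the ollama server
    # answers, then ask it to pull the model weights.
    import time
    import requests

    OLLAMA_URL = "http://localhost:11434"  # placeholder
    MODEL = "llama3"                       # placeholder model name

    # Wait for the server: ollama answers GET / with 200 once it's up.
    for _ in range(60):
        try:
            if requests.get(OLLAMA_URL, timeout=2).ok:
                break
        except requests.ConnectionError:
            pass
        time.sleep(1)
    else:
        raise SystemExit("ollama never came up")

    # /api/pull downloads the model; it streams progress lines as
    # newline-delimited JSON until a final success status.
    with requests.post(f"{OLLAMA_URL}/api/pull",
                       json={"name": MODEL}, stream=True) as r:
        for line in r.iter_lines():
            if line:
                print(line.decode())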


