hereonout2 on Sept 12, 2023, on: Fine-tune your own Llama 2 to replace GPT-3.5/4

Doesn't this depend a lot on your application though? Not every workload needs low latency and massive horizontal scalability. Take their example of running the LLM over the 2 million recipes and saving $23k over GPT-4. That could easily be 2 million documents in some back-end system running in a batch. Many people would wait a few days or weeks for a job like that to finish if it offered significant savings.

That's a fairer use case.

It also demonstrates, though, why the economics are complicated and why there's no one-size-fits-all answer.