Having only a basic knowledge of how GPT works under the hood: is it not computationally expensive to prepend these instructions to every single prompt? I mean, is there a way to build the model with these instructions already "built in" somehow?
It is expensive, yes. Fine-tuning is a way to encode instructions without having to resubmit them every time. You also have to resubmit _past turns_ of the conversation so that the agent has “memory”, so that’s also quite wasteful.
OpenAI is allegedly launching some big changes on Nov 6 that’ll make that less wasteful, but I don’t think there’s a ton of info out there yet on what exactly that’ll be.
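To make the resubmission point concrete, here's a rough sketch of a chat loop using the openai Python client; the model name and prompts are just placeholders, but the shape is the same for any chat API: every request resends the system prompt plus all earlier turns.

```python
# Sketch of per-turn resubmission: each request carries the system prompt
# plus every previous turn, so the model reprocesses the same tokens again
# and again. Model name and prompts are placeholders.
from openai import OpenAI

client = OpenAI()
history = [{"role": "system", "content": "You are a terse assistant."}]

def ask(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    # The entire history is sent every time; only the last turn is new.
    response = client.chat.completions.create(model="gpt-4", messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask("Summarize attention in one line."))
print(ask("Now do it again, but shorter."))  # this call also resends turn 1
```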
Not really. Most of it can be cached, and prompt processing is quite fast anyway. See vLLM for an open-source implementation that has most of the optimizations needed to serve many users.
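If you want to see the caching side, vLLM's offline API exposes it directly. A minimal sketch, assuming automatic prefix caching is enabled and using an example model name:

```python
# Minimal prefix-caching sketch with vLLM: requests that share the same
# instruction prefix reuse its KV cache instead of recomputing it.
# The model name is just an example.
from vllm import LLM, SamplingParams

SYSTEM_PROMPT = "You are a helpful assistant. Answer in one sentence.\n\n"

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", enable_prefix_caching=True)
params = SamplingParams(max_tokens=64)

prompts = [SYSTEM_PROMPT + q for q in (
    "What is the capital of France?",
    "What is the capital of Japan?",
)]

# The shared prefix is processed once; only the differing question suffixes
# need fresh prefill work.
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```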
Yes, you fine-tune the model on your example conversations, and the probability of the model replying in the style of those examples increases.
You'll need to feed it roughly 1,000 to 100,000 example conversations covering various styles of input and output to have a noticeable effect, though, and that's not cheap.
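For a sense of what that looks like in practice, here's a rough sketch using OpenAI's fine-tuning endpoints from the Python client; the file name, base model, and toy example are placeholders:

```python
# Sketch of chat fine-tuning: each training example is one full conversation,
# written out as JSONL, then uploaded and used to start a fine-tuning job.
# File name, base model, and the toy example are placeholders.
import json
from openai import OpenAI

examples = [
    {"messages": [
        {"role": "system", "content": "You answer like a pirate."},
        {"role": "user", "content": "Where is the treasure?"},
        {"role": "assistant", "content": "Arr, beneath the old oak, matey."},
    ]},
    # ...repeat for the ~1,000+ conversations mentioned above
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

client = OpenAI()
uploaded = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=uploaded.id, model="gpt-3.5-turbo")
print(job.id)  # poll this job until the fine-tuned model is ready
```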