
Can someone explain what DSPy does that fine-tuning doesn’t? Structured IO, optimized for better results. Sure. But why not just go straight to the weights, instead of trying to optimize the few-shot space?


The main idea behind DSPy is that often you can’t modify the weights, but you can perhaps modify the prompts. DSPy’s original primary use case was multi-LLM-agent systems, where you have a chain / graph of LLM calls (perhaps mostly or all to OpenAI GPT) and some metric (perhaps vague) that you want to increase. While the idea may seem a bit weird, there have been various success stories, such as a UoT team winning a medical-notes-oriented competition using DSPy: https://arxiv.org/html/2404.14544v1


It has multiple optimization strategies. One is optimizing the few-shot list. Another is to let the model write candidate prompts and pick the best one based on the given eval. I find the latter much more intriguing, although I have no idea how practical it is.
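
To make the first strategy concrete, here is a rough sketch of searching over few-shot lists against an eval. This is not DSPy's API: toy_model, score and best_few_shot are made-up names, and toy_model is a stand-in for a real LLM call that just copies the label of the most similar demo. The same loop would apply to the second strategy if the candidates were model-written prompts instead of demo lists.

```python
from itertools import combinations

def toy_model(demos, question):
    """Hypothetical stand-in for an LLM: copies the label of the most similar demo."""
    if not demos:
        return "unknown"
    q_words = set(question.lower().split())
    # Pick the demo whose input shares the most words with the question.
    best = max(demos, key=lambda d: len(q_words & set(d[0].lower().split())))
    return best[1]

def score(demos, devset):
    """The eval metric: fraction of dev examples answered correctly."""
    hits = sum(toy_model(demos, q) == gold for q, gold in devset)
    return hits / len(devset)

def best_few_shot(pool, devset, k):
    """Try every k-sized demo list from the pool and keep the highest-scoring one."""
    return max(combinations(pool, k), key=lambda demos: score(demos, devset))

# Candidate demonstrations (input, label) to choose the few-shot list from.
pool = [
    ("great movie loved it", "positive"),
    ("loved the acting", "positive"),
    ("terrible plot hated it", "negative"),
    ("boring and terrible", "negative"),
]
# Small held-out set that the metric is computed on.
devset = [
    ("i loved it", "positive"),
    ("i hated it", "negative"),
    ("terrible acting", "negative"),
]

best = best_few_shot(pool, devset, k=2)
print(best, score(best, devset))
```

Exhaustive enumeration only works for a tiny pool like this one; with a real model every candidate costs a batch of LLM calls, so you would presumably sample candidates rather than try them all. Note that nothing here touches weights, which is the point: the only thing being optimized is what goes into the prompt.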



