This is super cool! I wonder if you could do a similar thing, but choosing between a collection of prompts for a task based on the input. Similar to dynamic few-shot prompting, but replacing the entire prompt instead of just the examples.
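Roughly what I have in mind, as a toy sketch rather than anything from an existing library: keep a small collection of full prompt templates, score each one against the incoming input, and send the best match. Everything here is hypothetical (the `PROMPTS` collection, the `select_prompt` helper, the keyword descriptions), and the bag-of-words cosine similarity is just a stand-in for whatever embedding or learned selector you would use in practice.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical prompt collection: each full prompt template is paired with a
# short keyword description of the inputs it is meant for.
PROMPTS = {
    "math": {
        "description": "solve calculate number equation sum arithmetic",
        "template": "Solve the problem step by step.\n\nProblem: {input}\nAnswer:",
    },
    "code": {
        "description": "python function bug code error program",
        "template": "You are a careful programmer.\n\nTask: {input}\nCode:",
    },
    "general": {
        "description": "",
        "template": "Answer the question concisely.\n\nQuestion: {input}\nAnswer:",
    },
}


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_prompt(user_input, prompts=PROMPTS, fallback="general"):
    """Return (name, filled prompt) for the template whose description best matches the input."""
    query = bag_of_words(user_input)
    best_name, best_score = fallback, 0.0
    for name, entry in prompts.items():
        score = cosine(query, bag_of_words(entry["description"]))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, prompts[best_name]["template"].format(input=user_input)


name, prompt = select_prompt("Calculate the sum of 3 and 4")
print(name)
print(prompt)
```

The difference from dynamic few-shot selection is only in what gets swapped: the selector returns a whole template instead of a handful of examples to splice into a fixed one.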
I agree this is an interesting direction. I think it's on the roadmap for DSPy [https://github.com/stanfordnlp/dspy], but right now they mainly focus on optimizing the in-context examples.