No, not really. As I posted in the other thread, there are quite a few historical examples of why the big labs won't capture the entire market, even though they will likely push to publish something like this soon. I also think reinforcement fine-tuning has an advantage on the data-control side: our platform lets you self-host the reward function, so we only need the prompts; everything else can, in principle, stay on the user's side.
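Roughly, the setup looks like the sketch below. This is a minimal, hypothetical illustration, not the platform's actual API: the route, payload fields, and scoring rule are placeholders. The point is only that the trainer posts prompt/completion pairs and gets scores back, while the reward logic and any private data never leave the user's machine.

```python
# Minimal sketch of a self-hosted reward endpoint (hypothetical shapes:
# the route, payload fields, and scoring logic are placeholders, not the
# platform's real API). The training service POSTs prompt/completion
# pairs; the reward logic and any private data stay on the user's side.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def reward(prompt: str, completion: str) -> float:
    # Placeholder scoring rule; a real reward function could call
    # private models, databases, or business rules kept on-premises.
    return 1.0 if completion.strip() else 0.0


class RewardHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        scores = [reward(ex["prompt"], ex["completion"]) for ex in body["samples"]]
        payload = json.dumps({"scores": scores}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)


if __name__ == "__main__":
    # The trainer only ever sees the prompts/completions it sent and the scores it gets back.
    HTTPServer(("0.0.0.0", 8000), RewardHandler).serve_forever()
```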